This article demonstrates a basic approach to dissecting Linux ELF binaries. Let’s start…
The ELF Format:
The Executable and Linkable Format (ELF, formerly called the Extensible Linking Format) is a common standard file format for executables, object code, shared libraries and core dumps on Linux. An ELF file is divided into sections. For an executable program, these include the text section for the code, the data section for global variables and the .rodata section, which usually holds constant strings. The ELF file contains headers that describe how these sections should be laid out in memory. Unlike many proprietary executable file formats, ELF is flexible and extensible, and it is not bound to any particular processor or instruction set architecture. This has allowed it to be adopted by many different operating systems on many different platforms. The file is described by three kinds of header, each defined as a struct in elf.h:
- ELF header – Elf32_Ehdr/Elf64_Ehdr struct.
- Program header – Elf32_Phdr/Elf64_Phdr struct.
- Section header – Elf32_Shdr/Elf64_Shdr struct.
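To make the role of the ELF header concrete, here is a minimal Python sketch that unpacks an Elf64_Ehdr with the standard struct module. It assumes a 64-bit little-endian file (as on x86-64 Linux); the names EHDR_FORMAT, EHDR_FIELDS and parse_elf64_header are illustrative and not part of any library. The field order follows the Elf64_Ehdr declaration in elf.h.

```python
import struct

# Elf64_Ehdr for a little-endian 64-bit file: 16-byte e_ident followed by
# 13 integer fields, 64 bytes in total. "<" disables padding.
EHDR_FORMAT = "<16sHHIQQQIHHHHHH"
EHDR_FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_elf64_header(data):
    """Return the Elf64_Ehdr fields found at the start of `data` as a dict.

    Assumption: `data` holds a 64-bit little-endian ELF file.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    values = struct.unpack_from(EHDR_FORMAT, data, 0)
    return dict(zip(EHDR_FIELDS, values))
```

The e_phoff and e_shoff fields give the file offsets of the program header table and the section header table, so the ELF header is the starting point for locating the other two. Something like parse_elf64_header(open("/bin/ls", "rb").read()) could be used to inspect a real binary.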
Dissecting the ELF:
For demonstration purposes, the Linux “ls” utility is used as the example.
First, run the file command against it (“file /bin/ls”), which reports the file type:
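Next, dump the ELF file (/bin/ls) with the hd command. The first 16 bytes form the e_ident array; its first four bytes (7f 45 4c 46, that is 0x7f followed by the ASCII characters “ELF”) are the magic number that identifies an ELF file.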
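The identification bytes at the start of the dump can also be decoded programmatically. The sketch below is illustrative only: describe_ident and the two lookup tables are hypothetical names, and only the class and data-encoding values defined in elf.h (ELFCLASS32/ELFCLASS64 and ELFDATA2LSB/ELFDATA2MSB) are mapped.

```python
ELF_MAGIC = b"\x7fELF"

# Values taken from elf.h: EI_CLASS is byte 4, EI_DATA is byte 5.
EI_CLASS_NAMES = {1: "ELF32", 2: "ELF64"}
EI_DATA_NAMES = {1: "little endian", 2: "big endian"}


def describe_ident(ident):
    """Decode the leading 16 identification bytes of an ELF file."""
    if len(ident) < 16:
        raise ValueError("need at least 16 bytes")
    if ident[:4] != ELF_MAGIC:
        raise ValueError("magic number mismatch: not an ELF file")
    return {
        "class": EI_CLASS_NAMES.get(ident[4], "unknown"),
        "data": EI_DATA_NAMES.get(ident[5], "unknown"),
        "version": ident[6],   # EI_VERSION
        "osabi": ident[7],     # EI_OSABI
    }
```

Feeding it the first 16 bytes of a file, for example open("/bin/ls", "rb").read(16), yields the same class and byte-order information that the file command summarizes.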
The meaning of every field and byte value in an ELF file can be looked up in the elf.h header. Part of it is shown in the screenshot below:
Now, let’s create a simple hello world program in C:
Save it as a.c.
To produce an ELF file without linking, compile with “gcc a.c -c”. This creates the object file a.o, which is itself an ELF file.
The readelf command is used to analyze ELF files. Run it against the object file produced by the compiler (a.o), for example “readelf -h a.o” to print the ELF header, as shown in the screenshot below:
Now run hexdump (“hd a.o”) against a.o. The bytes can be interpreted with the help of elf.h:
Note: The first line of the dump holds the 16-byte e_ident array, which begins with the magic number. On the second line, the first two bytes (01 00) represent “e_type”. Read as a little-endian 16-bit value they equal 1, which elf.h defines as ET_REL, “Relocatable file”.
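The byte-order step can be checked with a few lines of Python. This is a small sketch, not a full parser: decode_e_type is a hypothetical helper, and the table lists only the five basic e_type values with the descriptions used in elf.h.

```python
# e_type values and descriptions as listed in elf.h.
E_TYPE_NAMES = {
    0: "ET_NONE (No file type)",
    1: "ET_REL (Relocatable file)",
    2: "ET_EXEC (Executable file)",
    3: "ET_DYN (Shared object file)",
    4: "ET_CORE (Core file)",
}


def decode_e_type(raw, byteorder="little"):
    """Turn the two raw e_type bytes into (value, description)."""
    value = int.from_bytes(raw, byteorder)
    return value, E_TYPE_NAMES.get(value, "unknown")


# The bytes "01 00" from the hexdump of a.o, least significant byte first.
e_type = decode_e_type(bytes.fromhex("0100"))
```

On a big-endian target the same value would be stored as 00 01, which is why the byte order recorded in e_ident has to be known before any multi-byte field is read.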
The next two bytes (3e 00) hold “e_machine”, which identifies the architecture. Read as a little-endian value this is 0x3e, which is 62 in decimal (3*16^1 + 14*16^0). In elf.h, 62 is EM_X86_64, the x86-64 architecture.
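The hexadecimal arithmetic and the lookup can be reproduced as follows. The helper names hex_to_int and machine_name are made up for this sketch, and the table contains only a handful of the EM_* constants from elf.h.

```python
def hex_to_int(digits):
    """Expand a hex string positionally: "3e" -> 3*16**1 + 14*16**0."""
    total = 0
    for power, digit in enumerate(reversed(digits.lower())):
        total += "0123456789abcdef".index(digit) * 16 ** power
    return total


# A few e_machine values from elf.h (not the full list).
E_MACHINE_NAMES = {3: "EM_386", 40: "EM_ARM", 62: "EM_X86_64", 183: "EM_AARCH64"}


def machine_name(raw, byteorder="little"):
    """Map the two raw e_machine bytes to an EM_* constant name."""
    return E_MACHINE_NAMES.get(int.from_bytes(raw, byteorder), "unknown")
```

Here hex_to_int("3e") performs the same place-value expansion as the text, and machine_name(bytes.fromhex("3e00")) looks the result up in the table.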
In this way an ELF file can be dissected and analyzed. Other parts of the file can also be examined with the following readelf commands:
- Sections – readelf -S a.o
- Relocation – readelf -r a.o
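As a rough idea of what a section listing involves, the sketch below walks the section header table and resolves each section name through the section name string table. It is a simplified illustration that assumes a 64-bit little-endian file and does no bounds checking; section_names, EHDR and SHDR are hypothetical names, and readelf itself reports far more detail.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr, 64 bytes


def section_names(data):
    """Return the section names of a 64-bit little-endian ELF image."""
    ehdr = EHDR.unpack_from(data, 0)
    e_shoff = ehdr[6]                                   # offset of the table
    e_shentsize, e_shnum, e_shstrndx = ehdr[11], ehdr[12], ehdr[13]
    if e_shnum == 0:
        return []

    # Read every Elf64_Shdr entry from the section header table.
    headers = [
        SHDR.unpack_from(data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]

    # The section at index e_shstrndx holds the NUL-terminated names.
    strtab_offset, strtab_size = headers[e_shstrndx][4], headers[e_shstrndx][5]
    strtab = data[strtab_offset:strtab_offset + strtab_size]

    names = []
    for header in headers:
        start = header[0]                               # sh_name
        end = strtab.index(b"\0", start)
        names.append(strtab[start:end].decode("ascii"))
    return names
```

Called as section_names(open("a.o", "rb").read()), it should return the same names that appear in the Name column of the readelf section listing, starting with the empty name of the null section at index 0.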
My next post will cover the basics of reverse engineering with the “Radare2” framework.