An Intro to ELF File Format–Part 2 Sections and Segments

This is a follow-up to Part 1 of An Intro to ELF File Format.

Sections

We saw the section headers in the section header table in Part 1. Now we discuss sections in detail.

Below is a list of commonly seen section types. (Note that this is not a complete list.)

  • NULL: this marks the section header as inactive. It does not have an associated section.
  • PROGBITS: program content, including code, data, debugging info etc.
  • SYMTAB and DYNSYM: the section holds a symbol table. SYMTAB provides symbols for link editing, though it may also be used for dynamic linking. DYNSYM holds only a minimal set of dynamic linking symbols.
  • STRTAB: holds a string table.
  • RELA: relocation entries with explicit addends.
  • REL: relocation entries without explicit addends.
  • DYNAMIC: information for dynamic linking.
  • HASH: holds a symbol hash table for looking up a symbol in an ELF file quickly.
  • NOTE: information that marks the file.
  • NOBITS: like PROGBITS, but without occupying space in the file. Used for BSS data allocated at program load time.
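
As a rough illustration of the list above, the sketch below maps the numeric sh_type codes defined by the System V gABI to the names used here. The helper names section_type_name and occupies_file_space are made up for this example, and only the types listed above are covered.

```python
# sh_type codes as defined by the System V gABI, limited to the section
# types listed above.
SECTION_TYPES = {
    0: "NULL",
    1: "PROGBITS",
    2: "SYMTAB",
    3: "STRTAB",
    4: "RELA",
    5: "HASH",
    6: "DYNAMIC",
    7: "NOTE",
    8: "NOBITS",
    9: "REL",
    11: "DYNSYM",
}


def section_type_name(sh_type):
    """Return the name of a section type code, or its hex value if unlisted."""
    return SECTION_TYPES.get(sh_type, hex(sh_type))


def occupies_file_space(sh_type):
    """NULL has no associated section and NOBITS takes no bytes in the file."""
    return section_type_name(sh_type) not in ("NULL", "NOBITS")


print(section_type_name(1), occupies_file_space(1))  # PROGBITS True
print(section_type_name(8), occupies_file_space(8))  # NOBITS False
```

Codes that are not in the table, such as OS-specific or processor-specific ones, are shown in hex rather than guessed at.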

We can see the section type in the section header screenshots from Part 1. For example, .interp is a section of type PROGBITS.

Sections also have flags associated with them. Below is a list of commonly seen flags.

  • WRITE (W): the section contains data that is writable when loaded.
  • ALLOC (A): the section occupies memory when the program is loaded.
  • EXECINSTR (X): the section contains executable machine code.
  • MERGE (M): the data in the section may be merged to eliminate duplication.
  • STRINGS (S): the data in the section consists of null-terminated character strings.
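
The flags are stored as bits in the sh_flags field of a section header. A minimal sketch of decoding them into the letters that readelf prints, using the bit values from the System V gABI (the function name decode_section_flags is hypothetical, and only the flags listed above are handled):

```python
# sh_flags bits from the System V gABI, paired with the letters readelf prints.
SECTION_FLAGS = [
    (0x1, "W"),   # SHF_WRITE
    (0x2, "A"),   # SHF_ALLOC
    (0x4, "X"),   # SHF_EXECINSTR
    (0x10, "M"),  # SHF_MERGE
    (0x20, "S"),  # SHF_STRINGS
]


def decode_section_flags(sh_flags):
    """Return one letter for every listed flag bit that is set."""
    return "".join(letter for bit, letter in SECTION_FLAGS if sh_flags & bit)


print(decode_section_flags(0x3))   # WA
print(decode_section_flags(0x6))   # AX
print(decode_section_flags(0x30))  # MS
```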

Below is a list of commonly seen sections.

Name | Type | Attributes | Notes
--- | --- | --- | ---
.bss | NOBITS | ALLOC and WRITE | holds uninitialized data. The system initializes the data with zeros before the program starts to run.
.comment | PROGBITS | MERGE and STRINGS | some remarks about the file
.data | PROGBITS | ALLOC and WRITE | holds initialized data
.debug | PROGBITS | none | holds information for symbolic debugging
.dynamic | DYNAMIC | ALLOC (WRITE is processor specific) | holds dynamic linking info
.dynstr | STRTAB | ALLOC | strings needed for dynamic linking, most commonly the names associated with dynamic linking symbol table entries
.dynsym | DYNSYM | ALLOC | symbol table for dynamic linking
.fini | PROGBITS | ALLOC and EXECINSTR | process termination code
.got | PROGBITS | ALLOC and WRITE | Global Offset Table (GOT)
.hash | HASH | ALLOC | symbol hash table
.init | PROGBITS | ALLOC and EXECINSTR | process initialization code
.interp | PROGBITS | ALLOC only if the file has a loadable segment that includes the section | path name of a program interpreter
.note | NOTE | none |
.plt | PROGBITS | ALLOC and EXECINSTR | Procedure Linkage Table (PLT)
.rel<name> | REL | ALLOC only if the file has a loadable segment that includes relocation | relocation info; <name> indicates the section to which the relocations apply
.rela<name> | RELA | ALLOC only if the file has a loadable segment that includes relocation | relocation info; <name> indicates the section to which the relocations apply
.rodata | PROGBITS | ALLOC | read-only data
.shstrtab | STRTAB | none | section names
.strtab | STRTAB | ALLOC only if the file has a loadable segment that includes the symbol string table | strings, most commonly the names associated with symbol table entries
.symtab | SYMTAB | ALLOC only if the file has a loadable segment that includes the symbol table | symbol table
.text | PROGBITS | ALLOC and EXECINSTR | executable code of a program

Segments

Executable and shared object ELF files are static representations of programs. To execute a program, the system uses these files to create a dynamic representation, the process image. A process image consists of segments that hold text, data, stack, etc., created from the segments in the ELF files.

We can get various information about the segments of an ELF file from its program headers. Below is a screenshot of the program headers in the test executable file.

$ readelf -l test

Each record has several fields.

  • Type: the segment type.
  • Offset: the location of the first byte of the segment relative to the beginning of the file.
  • VirtAddr: the virtual address of the first byte of the segment in memory.
  • PhysAddr: only meaningful on systems where physical addressing is relevant.
  • FileSiz: the size in bytes of the segment's file image.
  • MemSiz: the size in bytes of the segment's memory image.
  • Flg: the segment permission flags (read, write, execute).
  • Align: the alignment of the segment in the file and in memory. Values of 0 or 1 mean no alignment is needed. Otherwise, VirtAddr % Align == Offset % Align.
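
To make the fields more concrete, here is a minimal sketch that unpacks one program header entry and checks the alignment rule. It assumes a little-endian 64-bit file (the Elf64_Phdr layout); the function names and the sample values are made up for illustration and are not taken from the test executable.

```python
import struct

# Elf64_Phdr for a little-endian 64-bit file (an assumption of this sketch):
# p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align.
PHDR64 = struct.Struct("<IIQQQQQQ")
FIELDS = ("Type", "Flg", "Offset", "VirtAddr", "PhysAddr",
          "FileSiz", "MemSiz", "Align")


def parse_phdr64(data, offset=0):
    """Unpack one program header entry into a dict keyed by readelf's names."""
    return dict(zip(FIELDS, PHDR64.unpack_from(data, offset)))


def flags_string(flg):
    """PF_R = 4, PF_W = 2, PF_X = 1; readelf shows execute permission as E."""
    return "".join(ch if flg & bit else " "
                   for bit, ch in ((4, "R"), (2, "W"), (1, "E")))


def alignment_ok(ph):
    """Check VirtAddr % Align == Offset % Align unless Align is 0 or 1."""
    if ph["Align"] in (0, 1):
        return True
    return ph["VirtAddr"] % ph["Align"] == ph["Offset"] % ph["Align"]


def zero_fill_bytes(ph):
    """Bytes present in memory but not in the file; they hold zeros (.bss)."""
    return ph["MemSiz"] - ph["FileSiz"]


# Made-up values for a writable LOAD segment, not taken from a real file.
raw = PHDR64.pack(1, 6, 0x2E10, 0x3E10, 0x3E10, 0x220, 0x228, 0x1000)
ph = parse_phdr64(raw)
print(repr(flags_string(ph["Flg"])), alignment_ok(ph), zero_fill_bytes(ph))
```

When MemSiz is larger than FileSiz, the extra bytes follow the initialized part of the segment and are defined to hold zeros, which is how NOBITS sections such as .bss get their space.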

Below is a list of commonly seen segment types in ELF files.

  • NULL: unused
  • LOAD: a loadable segment.
  • DYNAMIC: dynamic linking info
  • INTERP: the segment specifies the path to an interpreter. If present, it must precede any loadable segment entry.
  • NOTE: location and size of auxiliary info.
  • PHDR: specifies the size and location of the program header table itself, both in the file image and in the memory image of the program.
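
The segment types above correspond to p_type codes in the System V gABI. The sketch below names them and checks the ordering rule for INTERP; the function names are hypothetical and the input is simply a list of p_type codes in program header table order.

```python
# p_type codes from the System V gABI, limited to the types listed above.
SEGMENT_TYPES = {
    0: "NULL",
    1: "LOAD",
    2: "DYNAMIC",
    3: "INTERP",
    4: "NOTE",
    6: "PHDR",
}


def segment_type_name(p_type):
    """Return the name of a segment type code, or its hex value if unlisted."""
    return SEGMENT_TYPES.get(p_type, hex(p_type))


def interp_precedes_loads(p_types):
    """True if no INTERP entry exists or it comes before every LOAD entry."""
    names = [segment_type_name(t) for t in p_types]
    if "INTERP" not in names or "LOAD" not in names:
        return True
    # index() returns the first occurrence, so this compares INTERP with the
    # earliest LOAD entry.
    return names.index("INTERP") < names.index("LOAD")


print(interp_precedes_loads([6, 3, 1, 1, 2, 4]))  # True
print(interp_precedes_loads([1, 3, 1]))           # False
```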

A segment consists of one or more sections. The two most commonly seen segments are the text segment and the data segment (both are of LOAD type).

A text segment contains read-only instructions and data. It usually includes the following sections: .text, .rodata, .hash, .dynsym, .dynstr, .plt, .rel.got. In the screenshot above, the third entry (with LOAD type) is a text segment.

A data segment contains writable data and instructions. It usually includes the following sections: .data, .dynamic, .got, .bss. In the screenshot above, the fourth entry (with LOAD type) is a data segment.
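
One simplified way to picture how sections end up inside segments is an address containment test: an ALLOC section belongs to a segment when its address range lies within the segment's memory range. The sketch below works under that assumption (readelf applies more detailed rules), and every address, size and name in it is invented for illustration.

```python
SHF_ALLOC = 0x2  # section occupies memory when the program is loaded
PF_W = 0x2       # segment is writable


def section_in_segment(sec, seg):
    """Simplified test: an ALLOC section's address range lies in the segment."""
    if not sec["flags"] & SHF_ALLOC:
        return False
    start = sec["addr"]
    end = start + sec["size"]
    return seg["vaddr"] <= start and end <= seg["vaddr"] + seg["memsz"]


def load_segment_kind(seg):
    """Call a writable LOAD segment 'data' and a read-only one 'text'."""
    return "data" if seg["flags"] & PF_W else "text"


# Hypothetical layout, not read from a real file.
segments = [
    {"vaddr": 0x1000, "memsz": 0x200, "flags": 0x5},  # R E
    {"vaddr": 0x3E10, "memsz": 0x228, "flags": 0x6},  # RW
]
sections = [
    {"name": ".text", "addr": 0x1040, "size": 0x100, "flags": 0x6},
    {"name": ".data", "addr": 0x4000, "size": 0x10, "flags": 0x3},
    {"name": ".bss", "addr": 0x4010, "size": 0x8, "flags": 0x3},
    {"name": ".comment", "addr": 0x0, "size": 0x2B, "flags": 0x30},
]

for seg in segments:
    names = [s["name"] for s in sections if section_in_segment(s, seg)]
    print(load_segment_kind(seg), names)
# text ['.text']
# data ['.data', '.bss']
```

The .comment section is not ALLOC, so it is not part of any segment in this sketch.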

References:

1. System V Application Binary Interface, Apr 2001 http://refspecs.linuxbase.org/elf/gabi4+/contents.html

2. The ELF Object File Format: Introduction. 1995. http://www.linuxjournal.com/article/1059?page=0,0

3. The ELF Object File Format by Dissection. 1995. http://www.linuxjournal.com/article/1060

An Intro to ELF File Format–Part 1 File Types and Dual Views

Executable and Linking Format (ELF) is the object format used in UNIX-like operating systems. This post gives a brief introduction to the ELF file format in the context of Linux.
We will use the readelf tool from Binutils to view the content of the ELF files built from the “hello world” C program shown below.

#include <stdio.h>
int main() {
   printf("hello world!\n");
   return 0;
}

Save the code in a file named test.c, and then compile the source code into a *.o file and an executable with the commands below.

$ gcc test.c -c
$ gcc test.o -o test

The first command should produce a file named test.o, and the second command will output a file test.

File Types

Below are the four main types of ELF files we encounter during development.

  • Relocatable files: created by the compiler and assembler, usually with a .o extension, and meant to be processed by the linker to produce executable or library files.
  • Executable files: created by the linker with all relocation done and all symbols resolved, except for shared library symbols that are resolved at run time. An executable file specifies how exec creates a program’s process image.
  • Shared object files: created by the linker, containing the symbol information and runnable code needed by the linker. A shared object can be used by the linker to create another object file, or by the dynamic linker, together with other shared objects and an executable file, to create a process image.
  • Core files: core dump files.

We can check whether a file is an ELF file with the file command, as shown below.

$ file test.o

$ file test

The file command only gives some brief information about the two files. We can use the readelf tool to obtain more info.

$ readelf -h test.o

$ readelf -h test

The -h option displays the ELF file header. The Magic number indicates that the file is an ELF file. From the Type field, we can easily tell which ELF file type the file belongs to.
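
The same two checks can be sketched in a few lines: an ELF file starts with the four magic bytes 0x7f 'E' 'L' 'F', and the 2-byte e_type field follows the 16-byte identification array. The function name describe_elf is made up for this example, and only the class, byte order and type fields are read.

```python
import struct

ELF_MAGIC = b"\x7fELF"
FILE_TYPES = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
CLASSES = {1: "ELF32", 2: "ELF64"}


def describe_elf(header):
    """header: at least the first 18 bytes of a file. None if it is not ELF."""
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        return None
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
    byte_order = ">" if header[5] == 2 else "<"
    # e_type is the 2-byte field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return {
        "class": CLASSES.get(header[4], "unknown"),
        "type": FILE_TYPES.get(e_type, hex(e_type)),
    }


# To try it on the files built earlier:
#     with open("test.o", "rb") as f:
#         print(describe_elf(f.read(18)))
```

We would expect test.o to report REL. Depending on the toolchain defaults, test may report EXEC, or DYN when the compiler produces position-independent executables.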

For a detailed explanation of the ELF header, we can refer to http://refspecs.linuxbase.org/elf/gabi4+/ch4.eheader.html.

Dual Views

ELF provides dual views of the file content: the linking view facilitates program linking, and the execution view facilitates program execution. This is shown in the figure below.

In the linking view, the file content is seen as sections. The section header table gives information about each section’s name, type, address, etc. This view is used by the linker to build programs and libraries, therefore the files used in the linking process must have a section header table, while the program header table is optional. On the other hand, the execution view is used to create a process image, and the file content is seen as segments. For this view, the program header table is mandatory and the section header table is optional.
We can view the section headers and program headers with the readelf command.

$ readelf -S test.o

$ readelf -l test.o

There are no program headers in this file.

$ readelf -S test

$ readelf -l test

The -S option displays the section headers, while the -l option displays the program headers. Note that the -l option also shows the section-to-segment mapping, as seen in the last screenshot above. For more detailed info about section headers, we can refer to http://refspecs.linuxbase.org/elf/gabi4+/ch4.sheader.html.
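
Both tables are located through the ELF header, so we can tell which views a file offers by looking at two offsets: e_shoff for the section header table and e_phoff for the program header table, each of which is zero when the table is absent. The sketch below assumes a little-endian 64-bit header; the function name available_views and the sample header values are invented for illustration.

```python
import struct

# Elf64_Ehdr for a little-endian 64-bit file (an assumption of this sketch):
# e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
# e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx.
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")


def available_views(header):
    """Report which header tables the file has, based on their offsets."""
    fields = EHDR64.unpack_from(header)
    e_phoff, e_shoff = fields[5], fields[6]
    # An offset of zero means the file has no such table.
    return {"linking": e_shoff != 0, "execution": e_phoff != 0}


# Made-up header resembling a relocatable file: a section header table at
# offset 0x2A0 and no program header table.
ident = b"\x7fELF\x02\x01\x01" + bytes(9)
relocatable = EHDR64.pack(ident, 1, 62, 1, 0, 0, 0x2A0, 0, 64, 0, 0, 64, 14, 13)
print(available_views(relocatable))  # {'linking': True, 'execution': False}
```

This matches what readelf -l reports for test.o: the file has section headers but no program headers.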
