ELF structure

Published on 2019-10-26


What is ELF

  • ELF (Executable and Linkable Format)
    • Relocatable File (.o file in Linux)
    • Executable File (a.out in Linux for example)
    • Shared Object File (.so file in Linux)
ELF overview

ELF layout on disk and in memory

Linking view: how the file is organized for linking and for storage on disk
Execution view: how the file is organized for running in memory
  • ELF Header: describes the layout of the whole file.
  • Program Header Table: describes segments and tells the system how to create a process image. Files used to create a process (executables and shared objects) must have one.
  • Section Header Table: describes sections. Files used for linking must have one.
  • Sections organize the file for linking, while segments mark the different permissions that different address ranges need during execution.

ELF Header

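#define EI_NIDENT   16
typedef struct {
    unsigned char   e_ident[EI_NIDENT]; // Magic number and identification bytes
    Elf32_Half      e_type;             // Object file type (relocatable, executable, shared object)
    Elf32_Half      e_machine;          // Target architecture
    Elf32_Word      e_version;          // Object file version
    Elf32_Addr      e_entry;            // Virtual address of the entry point
    Elf32_Off       e_phoff;            // File offset of the program header table
    Elf32_Off       e_shoff;            // File offset of the section header table
    Elf32_Word      e_flags;            // Processor-specific flags
    Elf32_Half      e_ehsize;           // Size of this header in bytes
    Elf32_Half      e_phentsize;        // Size of one program header entry
    Elf32_Half      e_phnum;            // Number of program header entries
    Elf32_Half      e_shentsize;        // Size of one section header entry
    Elf32_Half      e_shnum;            // Number of section header entries
    Elf32_Half      e_shstrndx;         // Index of the section that holds section names
} Elf32_Ehdr;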
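
As a small illustration of how e_ident is used, the sketch below checks the four magic bytes (0x7f 'E' 'L' 'F') and reads the class byte at index 4. The function name elf_class is hypothetical, and the check is intentionally minimal: it does not validate the rest of the header.

```c
#include <stddef.h>

/* Index into e_ident and class values, as defined by the ELF specification. */
enum { EI_CLASS = 4, ELFCLASS32 = 1, ELFCLASS64 = 2 };

/* Hypothetical helper: returns ELFCLASS32 or ELFCLASS64 if buf starts with
 * a plausible ELF identification, and 0 otherwise. */
int elf_class(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < 16)          /* e_ident is EI_NIDENT = 16 bytes */
        return 0;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;                         /* wrong magic number */
    if (buf[EI_CLASS] != ELFCLASS32 && buf[EI_CLASS] != ELFCLASS64)
        return 0;                         /* unknown class */
    return buf[EI_CLASS];
}
```

Passing the first 16 bytes of a 32-bit ELF file should yield 1 (ELFCLASS32); a buffer that starts with any other bytes yields 0.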

Program Header Table

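typedef struct {
    Elf32_Word  p_type;   // Segment type (PT_LOAD, PT_DYNAMIC, ...)
    Elf32_Off   p_offset; // File offset of the segment
    Elf32_Addr  p_vaddr;  // Virtual address of the segment in memory
    Elf32_Addr  p_paddr;  // Physical address, where relevant
    Elf32_Word  p_filesz; // Size of the segment in the file
    Elf32_Word  p_memsz;  // Size of the segment in memory
    Elf32_Word  p_flags;  // Segment's rwx permissions
    Elf32_Word  p_align;  // Alignment in the file and in memory
} Elf32_Phdr;
Elf32_Phdr program_header_table[]; // e_phnum entries at file offset e_phoff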
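
To make the permission bits in p_flags concrete, here is a minimal sketch that turns them into an "rwx" string. The bit values (PF_X = 1, PF_W = 2, PF_R = 4) come from the ELF specification; the function name pflags_to_str is an assumption made for this example.

```c
#include <stdint.h>

/* Permission bits of p_flags, as defined by the ELF specification. */
#define PF_X 0x1u   /* execute */
#define PF_W 0x2u   /* write   */
#define PF_R 0x4u   /* read    */

/* Hypothetical helper: writes three permission characters plus a
 * terminating NUL into out, which must hold at least 4 bytes. */
void pflags_to_str(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'r' : '-';
    out[1] = (p_flags & PF_W) ? 'w' : '-';
    out[2] = (p_flags & PF_X) ? 'x' : '-';
    out[3] = '\0';
}
```

A typical text segment has p_flags = 5, which this sketch renders as "r-x"; a typical data segment has p_flags = 6, rendered as "rw-".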

Section Header Table

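typedef struct {
    Elf32_Word      sh_name;      // Offset of the section name in the section name string table
    Elf32_Word      sh_type;      // Section type (SHT_PROGBITS, SHT_SYMTAB, ...)
    Elf32_Word      sh_flags;     // Attributes (writable, allocated, executable)
    Elf32_Addr      sh_addr;      // Virtual address in memory, or 0 if not loaded
    Elf32_Off       sh_offset;    // File offset of the section contents
    Elf32_Word      sh_size;      // Size of the section in bytes
    Elf32_Word      sh_link;      // Index of a related section; meaning depends on sh_type
    Elf32_Word      sh_info;      // Extra information; meaning depends on sh_type
    Elf32_Word      sh_addralign; // Required alignment
    Elf32_Word      sh_entsize;   // Entry size if the section holds a table, else 0
} Elf32_Shdr;
Elf32_Shdr section_header_table[]; // e_shnum entries at file offset e_shoff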
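
Because sh_name is only an offset into the section name string table (the section selected by e_shstrndx), finding a section by name means comparing strings in that table. The following is a sketch under simplifying assumptions: Shdr32 is a stand-in struct with the same field order as Elf32_Shdr, find_section is a hypothetical helper, and both tables are assumed to be already loaded into memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for Elf32_Shdr: same fields, same order. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset,
             sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Shdr32;

/* Hypothetical helper: returns the index of the section called name,
 * or -1 if no section header matches. */
int find_section(const Shdr32 *shdrs, size_t count,
                 const char *shstrtab, size_t shstrtab_size,
                 const char *name)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t off = shdrs[i].sh_name;
        if (off >= shstrtab_size)
            continue;                     /* offset outside the table */
        if (memchr(shstrtab + off, '\0', shstrtab_size - off) == NULL)
            continue;                     /* name not NUL-terminated */
        if (strcmp(shstrtab + off, name) == 0)
            return (int)i;
    }
    return -1;
}
```

The bounds checks matter when the input is untrusted, since a malformed file can carry an sh_name that points past the end of the table.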

Segments and Sections

A segment usually contains several sections. An overview:

  • Text Segment (includes sections with read-only code and data)
    • .text
    • .rodata
    • .hash
    • .dynsym
    • .dynstr
    • .plt
    • .rel.got
    • ……
  • Data Segment (includes sections with writable code and data)
    • .data
    • .dynamic
    • .got
    • .bss
    • ……

Different segments may overlap, so the same section can belong to more than one segment.
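
One way to picture the section-to-segment mapping is as an address range test: a loaded section belongs to a segment when its address range lies inside the segment's range. The sketch below is a simplified version of that idea; section_in_segment is a hypothetical name, and real tools apply additional rules (for example for sections that are not loaded at all).

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if [sh_addr, sh_addr + sh_size) lies
 * inside [p_vaddr, p_vaddr + p_memsz), and 0 otherwise. */
int section_in_segment(uint32_t sh_addr, uint32_t sh_size,
                       uint32_t p_vaddr, uint32_t p_memsz)
{
    /* 64-bit arithmetic avoids overflow when the sums exceed 32 bits. */
    uint64_t sec_end = (uint64_t)sh_addr + sh_size;
    uint64_t seg_end = (uint64_t)p_vaddr + p_memsz;
    return sh_addr >= p_vaddr && sec_end <= seg_end;
}
```

Running this test for one section against every program header shows directly how a section can fall inside two overlapping segments.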

Description of sections

.text

Holds the main body of the program's instructions.

.init

Initialization instructions.

.init_array

Addresses of initialization functions waiting to be executed at startup.

.fini

Instructions executed when the process exits.

.fini_array

Addresses of functions that are executed when the process is about to exit.

.data

Includes initialized data.

.rodata

Includes read-only data.

.bss : Block Started by Symbol

Holds uninitialized global variables. It takes up no space in the ELF file; the corresponding memory region is filled with zero bytes (\x00) when the process starts executing.
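
The zero fill follows from the program header: a segment's p_memsz may be larger than its p_filesz, and the extra bytes are defined to be zero. A loader-style sketch of that rule is shown below; load_segment is a hypothetical helper that works on plain buffers rather than on mapped pages.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies filesz bytes from the file image into dst
 * and zero-fills the rest up to memsz (the part where .bss lives).
 * Returns 0 on success, -1 if the sizes are inconsistent. */
int load_segment(unsigned char *dst, size_t dst_size,
                 const unsigned char *src, size_t filesz, size_t memsz)
{
    if (filesz > memsz || memsz > dst_size)
        return -1;
    memcpy(dst, src, filesz);                 /* bytes present in the file */
    memset(dst + filesz, 0, memsz - filesz);  /* bytes that exist only in memory */
    return 0;
}
```

With filesz = 3 and memsz = 6, the first three bytes come from the file and the next three are zero, whatever the destination held before.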

.got : Global Offset Table

Has two tables:

  • .got
    • Stores references to global variables.
  • .got.plt
    • Stores references to global functions.

First three entries of .got.plt:

  • GOT[0] : address of .dynamic
  • GOT[1] : address of link_map
  • GOT[2] : address of the _dl_runtime_resolve function

.plt : Procedure Linkage Table

Has two tables:

  • .plt
    • A series of instructions to resolve imported functions' address
  • .plt.got
    • Related to dynamic linking

Instructions in .plt (each call ends up either jumping to the address stored in the function's GOT entry or calling _dl_runtime_resolve(link_map, index)):

PIE disabled:

.PLT0:pushl got_plus_4 // link_map
      jmp   *got_plus_8 // _dl_runtime_resolve
      nop; nop
      nop; nop
.PLT1:jmp   *name1_in_GOT
      pushl $offset // index
      jmp   .PLT0@PC
.PLT2:jmp   *name2_in_GOT
      pushl $offset // index
      jmp   .PLT0@PC
      ...

PIE enabled:

.PLT0:pushl 4(%ebx) // link_map
      jmp   *8(%ebx) // _dl_runtime_resolve
      nop; nop
      nop; nop
.PLT1:jmp   *name1_in_GOT(%ebx)
      pushl $offset // index
      jmp   .PLT0@PC
.PLT2:jmp   *name2_in_GOT(%ebx)
      pushl $offset // index
      jmp   .PLT0@PC
      ...
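
The control flow of these stubs can be modelled in C with a function pointer standing in for one GOT entry. This is only a toy model of lazy binding, not how the dynamic linker is implemented: got_slot, lazy_stub, call_via_plt and real_function are all names invented for the illustration.

```c
typedef int (*func_t)(int);

int lazy_stub(int x);              /* forward declaration */

int resolve_calls = 0;             /* how often the "resolver" ran */
func_t got_slot = lazy_stub;       /* GOT entry initially points back into the PLT */

/* Stands in for the imported function that lives in a shared library. */
int real_function(int x)
{
    return x + 1;
}

/* Stands in for the "pushl $offset; jmp .PLT0" path: resolve the symbol,
 * patch the GOT entry, then continue into the real function. */
int lazy_stub(int x)
{
    resolve_calls++;
    got_slot = real_function;
    return got_slot(x);
}

/* Stands in for "jmp *name1_in_GOT": always jump through the GOT entry. */
int call_via_plt(int x)
{
    return got_slot(x);
}
```

In this model the first call_via_plt goes through lazy_stub and patches got_slot; every later call reaches real_function directly, so resolve_calls stays at 1.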

.strtab : String Table

Stores the strings used in the ELF file, including variable names and function names.

It is not loaded into memory; its subset .dynstr is loaded instead.

Storage format:

Index  +0  +1  +2  +3  +4  +5  +6  +7  +8  +9
0      \0  n   a   m   e   .   \0  V   a   r
10     i   a   b   l   e   \0  a   b   l   e
20     \0  \0  x   x   \0
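
A string is identified by the index of its first byte and runs up to the next \0, so one stored string can serve several names (index 7 and index 11 in the table above share the same bytes). The sketch below encodes that 25-byte example table and looks strings up by index; strtab_get and example_strtab are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* The example table from the text: 24 characters plus the final NUL. */
const char example_strtab[25] = "\0name.\0Variable\0able\0\0xx";

/* Hypothetical helper: returns the string starting at idx, or NULL if idx
 * is out of range or the string is not NUL-terminated inside the table. */
const char *strtab_get(const char *tab, size_t size, size_t idx)
{
    if (idx >= size)
        return NULL;
    if (memchr(tab + idx, '\0', size - idx) == NULL)
        return NULL;
    return tab + idx;
}
```

With this table, index 1 gives "name.", index 7 gives "Variable", indexes 11 and 16 both give "able", and index 24 gives the empty string.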

.symtab : Symbol Table

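Stores information about the symbols in the ELF file.

Symbols include variables and functions.

The symbol table is an array whose elements have the structure below:

typedef struct {
    Elf32_Word      st_name;  // Index of the symbol's name in the string table; 0 means the symbol has no name
    Elf32_Addr      st_value; // Value of the symbol, usually an address
    Elf32_Word      st_size;  // Size of the object the symbol refers to
    unsigned char   st_info;  // Binding (high 4 bits) and type (low 4 bits)
    unsigned char   st_other; // Visibility
    Elf32_Half      st_shndx; // Index of the section the symbol is defined in
} Elf32_Sym;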
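
The st_info byte packs two values: the binding in the high four bits and the type in the low four bits. The sketch below mirrors the ELF32_ST_BIND, ELF32_ST_TYPE and ELF32_ST_INFO macros of the specification as plain functions; the function names are assumptions chosen for this example.

```c
/* Selected binding and type values from the ELF specification. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };

/* Hypothetical helpers equivalent to the specification's macros. */
unsigned st_bind(unsigned char st_info)
{
    return st_info >> 4;          /* high 4 bits: binding */
}

unsigned st_type(unsigned char st_info)
{
    return st_info & 0xf;         /* low 4 bits: type */
}

unsigned char st_make_info(unsigned bind, unsigned type)
{
    return (unsigned char)((bind << 4) | (type & 0xf));
}
```

For example, a global function has st_info = 0x12: binding 1 (STB_GLOBAL) and type 2 (STT_FUNC).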

.dynamic

Programs that use dynamic linking have this section.

It stores the following information:

  • Which dynamic libraries are used
  • Information about .dynsym
  • Information about .dynstr

typedef struct {
    Elf32_Sword     d_tag;
    union {
        Elf32_Word  d_val;
        Elf32_Addr  d_ptr;
    } d_un;
} Elf32_Dyn;
extern Elf32_Dyn _DYNAMIC[];

The section is a series of tag-value pairs. d_tag gives each entry its meaning and determines whether d_un is read as an integer (d_val) or as an address (d_ptr).
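
Reading the section amounts to scanning the array until the terminating DT_NULL entry. The sketch below looks up the first entry with a given tag. The tag values are taken from the ELF specification; Dyn32 is a simplified stand-in for Elf32_Dyn that flattens the union into one 32-bit field, and dyn_find is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* A few d_tag values from the ELF specification. */
enum { DT_NULL = 0, DT_NEEDED = 1, DT_STRTAB = 5, DT_SYMTAB = 6 };

/* Simplified stand-in for Elf32_Dyn: d_val and d_ptr are both 32 bits
 * wide, so the union is flattened into a single field here. */
typedef struct {
    int32_t  d_tag;
    uint32_t d_un;
} Dyn32;

/* Hypothetical helper: returns the first entry whose tag matches, or NULL.
 * The scan stops at DT_NULL or after max entries, whichever comes first. */
const Dyn32 *dyn_find(const Dyn32 *dyn, size_t max, int32_t tag)
{
    for (size_t i = 0; i < max && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == tag)
            return &dyn[i];
    }
    return NULL;
}
```

A DT_NEEDED entry carries an offset into .dynstr that names a required library, while DT_STRTAB and DT_SYMTAB carry the addresses of .dynstr and .dynsym.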

.dynstr

String table for dynamic linking.

.dynsym

Symbol table for dynamic linking.

.rel(a).dyn & .rel(a).plt

Contain relocation information.

.rel(a).dyn is for variables that need relocation.

.rel(a).plt is for functions that need relocation.

typedef struct {
    Elf32_Addr      r_offset;
    Elf32_Word      r_info;
} Elf32_Rel;

typedef struct {
    Elf32_Addr      r_offset;
    Elf32_Word      r_info;
    Elf32_Sword     r_addend;
} Elf32_Rela;

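Elf32_Rel keeps the addend implicitly in the location being relocated, while Elf32_Rela carries it explicitly in r_addend. Which form is used depends on the architecture: i386 uses Rel entries, whereas x86-64 uses Rela entries (with the 64-bit structure Elf64_Rela).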
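
In the 32-bit structures, r_info combines the symbol table index (upper 24 bits) and the relocation type (lower 8 bits). The sketch below mirrors the ELF32_R_SYM, ELF32_R_TYPE and ELF32_R_INFO macros of the specification as plain functions; the function names are assumptions made for this example.

```c
#include <stdint.h>

/* Relocation type used for PLT entries on i386. */
enum { R_386_JMP_SLOT = 7 };

/* Hypothetical helpers equivalent to the specification's macros. */
uint32_t r_sym(uint32_t r_info)
{
    return r_info >> 8;           /* upper 24 bits: symbol table index */
}

uint32_t r_type(uint32_t r_info)
{
    return r_info & 0xff;         /* lower 8 bits: relocation type */
}

uint32_t r_make_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) | (type & 0xff);
}
```

An entry in .rel.plt with r_info = 0x207 therefore refers to symbol 2 of .dynsym and has type 7 (R_386_JMP_SLOT).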