ELF File Structure Analysis (ARM GNU Toolchain)

Published: 2023-10-28 21:10:23  Author: zephyr~

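To learn the objdump and size commands, this post analyzes simple_section.c as a worked example.
The build environment is an ARM cross-compiler running on x86 Ubuntu.
First, compile the file: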
arm-none-eabi-gcc -c simple_section.c

Command reference

objdump

Purpose: display information about the contents of binary (object) files.
arm-none-eabi-objdump --help

Usage: arm-none-eabi-objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
      --disassemble=<sym>  Display assembler contents from <sym>
  -S, --source             Intermix source code with disassembly
      --source-comment[=<txt>] Prefix lines of source code with <txt>
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
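  • -d: disassemble the code sections.
  • -S: disassemble the code sections and interleave the source code with the disassembly; the file must be compiled with -g so that debug information is available.
  • -C: demangle C++ symbol names.
  • -l: insert source file names and line numbers into the disassembly.
  • -j section: operate only on the named section. Several -j options may be given to select several sections.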

size

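Purpose: show the sizes of the .text (code), .data, and .bss sections of an object file.
Run arm-none-eabi-size with -A and with -G separately, then compare how each format reports the code and data sizes.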
arm-none-eabi-size --help

Usage: arm-none-eabi-size [option(s)] [file(s)]
 Displays the sizes of sections inside binary files
 If no input file(s) are specified, a.out is assumed
 The options are:
  -A|-B|-G  --format={sysv|berkeley|gnu}  Select output style (default is berkeley)
  -o|-d|-x  --radix={8|10|16}         Display numbers in octal, decimal or hex
  -t        --totals                  Display the total sizes (Berkeley only)
            --common                  Display total size for *COM* syms
            --target=<bfdname>        Set the binary file format
            @<file>                   Read options from <file>
  -h        --help                    Display this information
  -v        --version                 Display the program's version

readelf

readelf can show more than objdump. For example, arm-none-eabi-readelf -S simple_section.o lists every section, whereas arm-none-eabi-objdump -h shows only the main ones.
arm-none-eabi-readelf --help

Usage: arm-none-eabi-readelf <option(s)> elf-file(s)
 Display information about the contents of ELF format files
 Options are:
  -a --all               Equivalent to: -h -l -S -s -r -d -V -A -I
  -h --file-header       Display the ELF file header
  -l --program-headers   Display the program headers
     --segments          An alias for --program-headers
  -S --section-headers   Display the sections' header
     --sections          An alias for --section-headers
  -g --section-groups    Display the section groups
  -t --section-details   Display the section details
  -e --headers           Equivalent to: -h -l -S
  -s --syms              Display the symbol table
     --symbols           An alias for --syms
  --dyn-syms             Display the dynamic symbol table
  -n --notes             Display the core notes (if present)
  -r --relocs            Display the relocations (if present)
  -u --unwind            Display the unwind info (if present)
  -d --dynamic           Display the dynamic section (if present)
  -V --version-info      Display the version sections (if present)
  -A --arch-specific     Display architecture specific information (if any)
  -c --archive-index     Display the symbol/file index in an archive
  -D --use-dynamic       Use the dynamic section info when displaying symbols
  -x --hex-dump=<number|name>
                         Dump the contents of section <number|name> as bytes
  -p --string-dump=<number|name>
                         Dump the contents of section <number|name> as strings
  -R --relocated-dump=<number|name>
                         Dump the contents of section <number|name> as relocated bytes
  -z --decompress        Decompress section before dumping it

Basic information about each section

arm-none-eabi-objdump -h simple_section.o

simple_section.o:     file format elf32-littlearm

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000008c  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000008  00000000  00000000  000000c0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .bss          00000008  00000000  00000000  000000c8  2**2
                  ALLOC
  3 .rodata       00000004  00000000  00000000  000000c8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .comment      0000004e  00000000  00000000  000000cc  2**0
                  CONTENTS, READONLY
  5 .ARM.attributes 0000002a  00000000  00000000  0000011a  2**0

arm-none-eabi-size -A simple_section.o

simple_section.o  :
section           size   addr
.text              140      0
.data                8      0
.bss                 8      0
.rodata              4      0
.comment            78      0
.ARM.attributes     42      0
Total              280

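.text (code section)

The disassembly below shows that the code section is 0x88 + 4 = 0x8c = 140 bytes long: the last word sits at offset 0x88, and its 4 bytes must be counted too.
This matches the size reported by both arm-none-eabi-objdump -h and arm-none-eabi-size -A.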

arm-none-eabi-objdump -d simple_section.o

simple_section.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <func1>:
   0:	e92d4800 	push	{fp, lr}
   4:	e28db004 	add	fp, sp, #4
   8:	e24dd008 	sub	sp, sp, #8
   c:	e50b0008 	str	r0, [fp, #-8]
  10:	e51b1008 	ldr	r1, [fp, #-8]
  14:	e59f0010 	ldr	r0, [pc, #16]	; 2c <func1+0x2c>
  18:	ebfffffe 	bl	0 <printf>
  1c:	e1a00000 	nop			; (mov r0, r0)
  20:	e24bd004 	sub	sp, fp, #4
  24:	e8bd4800 	pop	{fp, lr}
  28:	e12fff1e 	bx	lr
  2c:	00000000 	.word	0x00000000

00000030 <main>:
  30:	e92d4800 	push	{fp, lr}
  34:	e28db004 	add	fp, sp, #4
  38:	e24dd008 	sub	sp, sp, #8
  3c:	e3a03001 	mov	r3, #1
  40:	e50b3008 	str	r3, [fp, #-8]
  44:	e59f3038 	ldr	r3, [pc, #56]	; 84 <main+0x54>
  48:	e5932000 	ldr	r2, [r3]
  4c:	e59f3034 	ldr	r3, [pc, #52]	; 88 <main+0x58>
  50:	e5933000 	ldr	r3, [r3]
  54:	e0822003 	add	r2, r2, r3
  58:	e51b3008 	ldr	r3, [fp, #-8]
  5c:	e0822003 	add	r2, r2, r3
  60:	e51b300c 	ldr	r3, [fp, #-12]
  64:	e0823003 	add	r3, r2, r3
  68:	e1a00003 	mov	r0, r3
  6c:	ebfffffe 	bl	0 <func1>
  70:	e51b3008 	ldr	r3, [fp, #-8]
  74:	e1a00003 	mov	r0, r3
  78:	e24bd004 	sub	sp, fp, #4
  7c:	e8bd4800 	pop	{fp, lr}
  80:	e12fff1e 	bx	lr
  84:	00000004 	.word	0x00000004
  88:	00000004 	.word	0x00000004

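.data (data section)

The data section holds initialized global variables and initialized local static variables.
For simple_section.c that means only global_init_var and static_var, 8 bytes in total.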

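.rodata (read-only data section)

The string constant "%d\n" used by printf takes four bytes, counting the terminating NUL, and is placed in .rodata.

In the default (Berkeley) output below, text is 144 rather than the 140 shown earlier, because in this format text is the sum of .text and .rodata (140 + 4).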
arm-none-eabi-size simple_section.o

   text	   data	    bss	    dec	    hex	filename
    144	      8	      8	    162	     a2	simple_section.o

.bss (block started by symbol)

The .bss section holds uninitialized global variables and uninitialized local static variables.
For simple_section.c that means only global_uninit_var and static_var2, 8 bytes in total.

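Other sections

Section names that begin with "." are reserved for the system.

  • .rodata1
    Read-only data, such as string constants and global const variables. Same as .rodata.
  • .comment
    Compiler version information, for example the string "GCC: (GNU) 4.2.0".
  • .debug
    Debug information.
  • .dynamic
    Dynamic linking information.
  • .hash
    Symbol hash table.
  • .line
    Line number table for debugging, mapping source lines to the compiled instructions.
  • .note
    Extra compiler information, such as the program's company name and release version.
  • .strtab
    String table, storing the various strings used in the ELF file.
  • .symtab
    Symbol table.
  • .shstrtab
    Section header string table, storing the section names.
  • .plt / .got
    Jump table and global offset table used for dynamic linking.
  • .init / .fini
    Program initialization and termination code; used by C++ global constructors and destructors.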

Custom sections

GCC provides an extension that places a given variable or function in a section you name:
__attribute__((section("name")))

simple_section.c example source

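int printf(const char* format, ...);   /* declared by hand instead of including <stdio.h> */

int global_init_var = 84;   /* initialized global: .data */
int global_uninit_var;      /* uninitialized global: counted in .bss in this build */

void func1(int i)
{
    printf("%d\n", i);      /* the format string is placed in .rodata */
}

int main(void)
{
    static int static_var = 85;   /* initialized local static: .data */
    static int static_var2;       /* uninitialized local static: .bss */
    int a = 1;                    /* ordinary local, kept on the stack */
    int b;                        /* never initialized, so its value is indeterminate */
    func1(static_var + static_var2 + a + b);
    return a;
}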