[[/html/2025/2025-09-11--touching_small_elfs-p1-broken_tools.html|Part 1: Understanding Small ELFs and Fixing Broken Tools.]]

Learning objectives:
- Handcrafted 80-byte Linux x86-64 ELF (with bugs)
- User-space memory violation
- Wrong permission bits in memory mapping
- Broken kernel ELF mapping (what an offset can do)
- Nonexistent ELF mapping (no ''PT_LOAD'')

===[ INTRO ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is part two of debugging small ELF64 binaries (for part one, see [ref1]).
Here we'll look at an 80-byte overlaid x86-64 ELF binary and walk through how
to track down and fix some subtle bugs inside it. So let's ready our delving
skills and delve hard into it!
===[ 80-Byte x86-64 ELF with Bugs ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All chapters in this article use the following 80-byte x86-64 ELF. I've added
a few deliberate bugs, marked ''BUG #''. Each chapter fixes one of them using
the tools and techniques from the previous article [ref1].

-----------------------------[ elf64-fixme.nasm ]------------------------------
BITS 64
        org 0x0000000000000000  ; manual address computation

ehdr:
        db 0x7F                 ;          ;       ; e_ident[EI_MAG]
_code_p1:
_start:
        db 'E'                  ; rex.rb   ; 45    ; e_ident[EI_MAG]
        db 'L'                  ; rex.wr   ; 4C    ; e_ident[EI_MAG]
        db 'F'                  ; rex.rx   ; 46    ; e_ident[EI_MAG]
        ; ^ rex.rx => register extension for the following instruction (ignored)
        syscall                 ; 0F 05    ; read (0, 0, 0) = 0
                                ; rax = 0
                                ; rcx = next instruction
        mov cl, _str-ehdr       ; B1 48
        push rcx                ; 51       ; <-- ORIG #1
        ;db 0xeb                ;          ; <-- BUG #1
        times 6 nop
        ;add eax,0x3e0002       ; 05 02 00 3E 00
        db 0x05                 ; ehdr->e_ident[EI_PAD]
        db 0x02                 ; ehdr->e_type[0]
        db 0x00                 ; ehdr->e_type[1]
        db 0x3e                 ; ehdr->e_machine[0]
        db 0x00                 ; ehdr->e_machine[1]
        pop rsi                 ; 5E
        mov dl, 14              ; B2 0E
        ; mov eax,0x1           ; B8 01 00 00 00
        db 0xb8
phdr:
        db 0x01                 ; ehdr->e_entry ; phdr->p_type   ; <-- ORIG #4
        ;db 0x00                ; ehdr->e_entry ; phdr->p_type   ; <-- BUG #4
        db 0x00                 ; ehdr->e_entry ; phdr->p_type
        db 0x00                 ; ehdr->e_entry ; phdr->p_type
        db 0x00                 ; ehdr->e_entry ; phdr->p_type
        ; jmp short 0x30        ; EB 12
        db 0xeb                 ; ehdr->e_entry ; phdr->p_flags  ; <-- ORIG #2.1
        ;db 0xe0                ; ehdr->e_entry ; phdr->p_flags  ; <-- BUG #2.1
        db 0x12                 ; ehdr->e_entry ; phdr->p_flags  ; will not use
        db 0x00                 ; ehdr->e_entry ; phdr->p_flags
        db 0x00                 ; ehdr->e_entry ; phdr->p_flags
        dq phdr - $$            ; ehdr->e_phoff ; phdr->p_offset
        dq 0x12eb00000000+phdr-ehdr ; ehdr->e_shoff ; phdr->p_vaddr ; <-- ORIG #2.2
        ;dq 0x12e000000000+phdr-ehdr; ehdr->e_shoff ; phdr->p_vaddr ; <-- BUG #2.2
        ;dq 0x000012eb00000000      ; ehdr->e_shoff ; phdr->p_vaddr ; <-- BUG #3
_location__0x30:
        syscall                 ; 0F 05
        mov al, 0x3c            ; B0 3C    ; sys_exit
        syscall                 ; 0F 05
        dw 0x0038               ; ehdr->e_phentsize ; phdr->p_paddr
        dw 0x0001               ; ehdr->e_phnum     ; phdr->p_filesz
        dw 0x0000               ; ehdr->e_shentsize ; phdr->p_filesz
        dw 0x0000               ; ehdr->e_shnum     ; phdr->p_filesz
        dw 0x0000               ; ehdr->e_shstrndx  ; phdr->p_filesz
ehdr_size equ $ - ehdr
        dq 0x0000000000000001   ;                   ; phdr->p_memsz ; <-- ORIG #5
        ;dq 0x0000000000000002  ;                   ; phdr->p_memsz ; <-- BUG #5
_str:
        db 'WORKING', 0x0a      ;                   ; phdr->p_align
-------------------------------------------------------------------------------

Build:

    nasm -f bin elf64-fixme.nasm -o elf64-fixme
    chmod 755 ./elf64-fixme

Running it unmodified should work:

    $ ./elf64-fixme
    WORKING

===[ BUG #1: Classical User-Space Memory Violation ]~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first few bugs are straightforward and can be diagnosed in user space with
''strace''. Let's start with the first. If we comment out ''ORIG #1'' and
uncomment ''BUG #1'', then build and run the binary, it fails with a
''Segmentation fault'':

    $ ./elf64-fixme
    Segmentation fault (core dumped)

When dealing with system errors (and sometimes even logical errors),
''strace'' is an essential debugging tool because it can quickly pinpoint the
source of a problem:

    $ strace ./elf64-fixme
    execve("./elf64-fixme", ["./elf64-fixme"], 0x7ffd64047dc0 /* 42 vars */) = 0
    read(0, NULL, 0) = 0
    --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x12eaffffff9a} ---
    +++ killed by SIGSEGV (core dumped) +++
    Segmentation fault (core dumped)

Here we can see that ''execve(2)'' and the first ''read(2)'' syscall completed
successfully. The program then failed with ''SIGSEGV''. The kernel reports
that the fault was caused by accessing unmapped memory (''SEGV_MAPERR'' --
Address not mapped to an object [ref6]), with the violation address given as
''si_addr=0x12eaffffff9a''.
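
The faulting address can be reproduced by hand from the listing. With
''BUG #1'' enabled, the byte ''0xeb'' (the ''jmp short'' opcode) sits at file
offset 8 and swallows the first ''nop'' (''0x90'') as its signed 8-bit
displacement. A minimal Python sketch of that arithmetic, assuming the load
base ''0x12eb00000000'' from ''ORIG #2.2'' (the helper name
''jmp_short_target'' is made up for illustration):

```python
def jmp_short_target(insn_addr: int, rel8: int) -> int:
    """Target of a 2-byte x86 `jmp short` (opcode EB + signed 8-bit displacement)."""
    disp = rel8 - 0x100 if rel8 & 0x80 else rel8        # sign-extend the displacement
    return (insn_addr + 2 + disp) & 0xFFFFFFFFFFFFFFFF  # relative to the next instruction

LOAD_BASE = 0x12EB00000000   # assumption: base address taken from ORIG #2.2
BUG1_OFFSET = 0x08           # file offset of the byte that replaces `push rcx`

target = jmp_short_target(LOAD_BASE + BUG1_OFFSET, 0x90)  # 0x90 = first nop
print(hex(target))
```

The same helper applied to the intended ''EB 12'' at file offset ''0x1c''
gives ''0x30'', which matches the ''_location__0x30'' label.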

We can examine it further using GDB with the execve wrapper [ref1]:

    $ gdb -ex 'file ./execve_wrapper' -ex 'catch exec' -ex 'run ./elf64-fixme'
    (gdb) x/6i $rip
    => 0x12eb00000001:  rex.RB
       0x12eb00000002:  rex.WR
       0x12eb00000003:  rex.RX syscall
       0x12eb00000006:  mov    cl,0x48
       0x12eb00000008:  jmp    0x12eaffffff9a   <---
       0x12eb0000000a:  nop

Stepping over the first five instructions triggers the error:

    Cannot access memory at address 0x12eaffffff9a

The issue is a jump to ''0x12eaffffff9a'', an unmapped address -- exactly as
''strace'' reported.

Before continuing to the next bug, don't forget to revert the code to its
pristine state.

===[ BUG #2: Wrong Permission Bits ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second bug can also be solved easily in user space (but don't worry, this
luxury won't last). Comment out all ''ORIG #2.*'' lines and uncomment all
''BUG #2.*'' lines, then build and run the binary. It again throws a
''Segmentation fault'' at us:

    $ strace ./elf64-fixme
    execve("./elf64-fixme", ["./elf64-fixme"], 0x7ffc5aabbc10 /* 42 vars */) = 0
    --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x12e000000001} ---
    +++ killed by SIGSEGV (core dumped) +++
    Segmentation fault (core dumped)

This time it tells us that the permissions are wrong: ''SEGV_ACCERR'' (invalid
permissions for mapped object [ref6]). We can confirm this with GDB:

    $ gdb -ex 'file ./execve_wrapper' -ex 'catch exec' -ex 'run elf64-fixme'
    (gdb) x/3i $rip
    => 0x12e000000001:  rex.RB
       0x12e000000002:  rex.WR
       0x12e000000003:  rex.RX syscall
    (gdb) info proc mappings
    Start Addr       End Addr         Size     Offset   Perms   objfile
    0x12e000000000   0x12e000001000   0x1000   0x0      ---p    /elf64-fixme
                                                        ^ No EXEC permission!

NOTE: Modern x86-64 CPUs support ''NX/XD'' (the non-executable bit for pages),
and Linux enables ''EFER.NXE'' (Execute Disable Bit Enable). A page is
executable only if ''NX=0'' in the mapping.
The polarity is the inverse of the user-space interface: when we call
''mmap/mprotect'' with the ''PROT_EXEC'' bit set, the kernel maps those pages
with ''NX'' cleared. Btw, there is no per-page read bit on x86. Our bug
effectively prevents the CPU from fetching an instruction from that page, and
therefore we get ''SEGV_ACCERR''.

Before continuing to the next bug, don't forget to revert the code to its
pristine state.

===[ BUG #3: Broken Kernel ELF Mapping ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We'll stay on the topic of ELF mapping. Bugs are way more interesting when the
kernel is involved. Comment out ''ORIG #2.2'' and uncomment ''BUG #3'', then
build and run it. It fails with ... a ''Segmentation fault'' message:

    $ ./elf64-fixme
    Segmentation fault

Let's run GDB (with the execve wrapper) first, since it fails with a
mysterious error:

    $ gdb -ex 'file ./execve_wrapper' -ex 'catch exec' -ex 'run ./elf64-fixme'
    Cannot find user-level thread for LWP 1006822: generic error
    (gdb) info registers
    Selected thread is running.
    (gdb) continue
    Cannot execute this command while the selected thread is running.

''Selected thread is running'' my ass. In reality, the ''execve()'' syscall
fails, so the program is definitely not running, as the holy ''strace''
confirms:

    $ strace ./elf64-fixme
    execve("./elf64-fixme", ["./elf64-fixme"], ...) = -1 EINVAL (Invalid argument)
    +++ killed by SIGSEGV +++
    Segmentation fault (core dumped)

Why does ''execve'' return ''EINVAL''? Let's check the Linux kernel source
code. Under what conditions can ''load_elf_binary'' return ''EINVAL''? Well,
there are several, but we can narrow them down since we don't care about
''ET_DYN'' or the interpreter (the binary is a static ''ET_EXEC''). That
leaves two candidates:

1. ''elf_map'' returned a bad address [ref2]:

    error = elf_map(bprm->file, load_bias + vaddr, elf_ppnt,
                    elf_prot, elf_flags, total_size);
    if (BAD_ADDR(error)) {
            retval = IS_ERR((void *)error) ?
                    PTR_ERR((void*)error) : -EINVAL;
            goto out_free_dentry;
    }

2. A memory "overflow" occurred [ref3]:

    /*
     * Check to see if the section's size will overflow the
     * allowed task size. Note that p_filesz must always be
     * <= p_memsz so it is only necessary to check p_memsz.
     */
    if (BAD_ADDR(k) || elf_ppnt->p_filesz > elf_ppnt->p_memsz ||
        elf_ppnt->p_memsz > TASK_SIZE ||
        TASK_SIZE - elf_ppnt->p_memsz < k) {
            /* set_brk can never work. Avoid overflows. */
            retval = -EINVAL;
            goto out_free_dentry;
    }

Anyway, we'll be hunting for ''EINVAL''. But first: what number does
''EINVAL'' represent? That way, we know what to look for in GDB:

    $ errno -l | grep -w EINVAL
    EINVAL 22 Invalid argument

Now we can start QEMU and set a breakpoint on the ELF loader function
''load_elf_binary''. When the breakpoint triggers, set another on ''elf_map'',
then finish the function so we can inspect its return value:

    (gdb) hbreak load_elf_binary
    (gdb) c
    (gdb) hbreak elf_map
    (gdb) c
    (gdb) fin
    (gdb) info registers rax
    rax            0xffffffffffffffea  -22

OK, we verified that the error originates in the ''elf_map'' function.
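
GDB prints ''rax'' both as a raw 64-bit value and as a signed number. The
conversion, and the mapping back to an errno name, can be sketched in Python
with the standard ''errno'' module. The helper ''kernel_retval_to_errno'' is a
hypothetical name, and the 4095 bound mirrors the kernel's ''MAX_ERRNO'':

```python
import errno

MAX_ERRNO = 4095  # same bound the kernel uses to tell error codes from addresses

def kernel_retval_to_errno(rax: int):
    """Return the errno name encoded in a 64-bit kernel return value, or None."""
    signed = rax - (1 << 64) if rax & (1 << 63) else rax  # two's complement
    if -MAX_ERRNO <= signed < 0:
        return errno.errorcode.get(-signed)
    return None  # a plain value or address, not an error code

print(kernel_retval_to_errno(0xFFFFFFFFFFFFFFEA))  # value of rax after `fin`
```

A successful ''elf_map'' would instead return the mapped address, for which
the helper yields ''None''.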

From the Linux source code, we see that it has two return points and a
prototype with six arguments [ref4]:

    static unsigned long elf_map(
        struct file *filep,             // rdi
        unsigned long addr,             // rsi
        const struct elf_phdr *eppnt,   // rdx
        int prot,                       // rcx
        int type,                       // r8
        unsigned long total_size        // r9
    )

NOTE: In-kernel function calls follow the System V AMD64 calling convention,
so the fourth argument travels in ''rcx''; ''r10'' takes its place only at the
syscall boundary. The register dumps in this chapter printed ''r10'', so that
row does not show ''prot''. The other five rows (''addr'' and ''offset'' in
particular) are unaffected.

Let's run it again (with a breakpoint on ''elf_map''), but this time check the
argument values, focusing on the ''addr'' argument:

    (gdb) hbreak elf_map
    (gdb) c
    (gdb) info register rdi rsi rdx r10 r8 r9
    rdi  0xffff888005de5300   ; struct file *filep
    rsi  0x12eb00000000       ; unsigned long addr
    rdx  0xffff888006012c80   ; const struct elf_phdr *eppnt
    r10  0x0                  ; (not an argument; prot is in rcx)
    r8   0x100002             ; int type
    r9   0x0                  ; unsigned long total_size

From the source code, we see that ''elf_map'' calls ''vm_mmap'' in two places
(we're interested in the second call; see [ref5]), and these arguments are
passed more or less directly to the ''vm_mmap'' function:

    map_addr = vm_mmap(filep, addr, size, prot, type, off);

The ''vm_mmap'' function prototype is as follows:

    unsigned long vm_mmap(
        struct file *file,      // rdi
        unsigned long addr,     // rsi
        unsigned long len,      // rdx
        unsigned long prot,     // rcx
        unsigned long flag,     // r8
        unsigned long offset    // r9
    )

And here are the arguments when we break on it in GDB:

    (gdb) info register rdi rsi rdx r10 r8 r9
    rdi  0xffff888005de5300   ; struct file *file
    rsi  0x12eb00000000       ; unsigned long addr
    rdx  0x1000               ; unsigned long len
    r10  0xffff888006012c80   ; (not an argument; prot is in rcx)
    r8   0x100002             ; unsigned long flag
    r9   0x18                 ; unsigned long offset   <---

We're almost there. Look at the ''vm_mmap'' function. It has two sanity checks
that return ''-EINVAL'' when they trigger:

    if (unlikely(offset + PAGE_ALIGN(len) < offset))
            return -EINVAL;
    if (unlikely(offset_in_page(offset)))
            return -EINVAL;

Both checks look for an invalid offset. And there is one: ''0x18''! It is not
page-aligned, so the second check fires. Now, how does it get there? For that,
we must go back to ''elf_map''.
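
Before going back, the two ''vm_mmap'' checks can be replayed in user space
with the values from the register dump. This is a rough Python model, assuming
4 KiB pages and 64-bit unsigned wrap-around. The function ''vm_mmap_rejects''
is only an illustration, not kernel code:

```python
PAGE_SIZE = 0x1000           # assumption: 4 KiB pages
U64_MASK = (1 << 64) - 1     # emulate unsigned long wrap-around

def page_align(x: int) -> int:
    """Round up to the next page boundary, like the kernel's PAGE_ALIGN()."""
    return (x + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1) & U64_MASK

def vm_mmap_rejects(offset: int, length: int):
    """Name the check that would return -EINVAL, or None if both pass."""
    if (offset + page_align(length)) & U64_MASK < offset:
        return "overflow"            # offset + PAGE_ALIGN(len) wrapped around
    if offset & (PAGE_SIZE - 1):
        return "offset_in_page"      # offset is not page-aligned
    return None

print(vm_mmap_rejects(0x18, 0x1000))  # r9 and rdx from the vm_mmap breakpoint
print(vm_mmap_rejects(0x0, 0x1000))   # a page-aligned offset, for comparison
```

With the values captured above, only the alignment check fires; the overflow
check would need an offset close to the top of the 64-bit range.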

At the beginning of the function, the ''off'' value is computed as follows:

    unsigned long off = eppnt->p_offset - ELF_PAGEOFFSET(eppnt->p_vaddr);

''eppnt'' is of type ''struct elf_phdr *'' => the program header. This narrows
down the area we should be looking at. There are two likely suspects: either
''phdr->p_offset'' or ''phdr->p_vaddr''. Let's check this insight against the
values in our NASM code:

    phdr->p_offset = 0x18
    phdr->p_vaddr  = 0x000012eb00000000
    -----------------------------------
    off = 0x18 - ELF_PAGEOFFSET(0x000012eb00000000)           =
        = 0x18 - (0x000012eb00000000 & (PAGE_SIZE - 1))       =
        = 0x18 - (0x000012eb00000000 & (0x1000 - 1))          =
        = 0x18 - (0x000012eb00000000 & 0xfff)                 =
        = 0x18 - 0x0                                          =
        = 0x18

Well, there it is. Now we know where the ''0x18'' came from and that
''phdr->p_vaddr'' is wrong. It points to the beginning of the binary instead
of the program header, so its page offset (''0x0'') no longer matches
''phdr->p_offset'' (''0x18''). The ''ORIG #2.2'' line adds ''phdr-ehdr''
(''0x18'') to the address, which makes the subtraction come out as a
page-aligned ''0''.

Before continuing to the next bug, don't forget to revert the code to its
pristine state.

===[ BUG #4: Nonexistent ELF Mapping ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We'll finish the article with -- drum roll, please -- a ''Segmentation fault''
message. Comment out ''ORIG #4'' and uncomment ''BUG #4'', build and run it,
and wait for a shocking surprise:

    $ ./elf64-fixme
    Segmentation fault (core dumped)

This segfault, too, has a different cause than the previous ones.
At first glance, we see that the ''execve'' syscall did not fail and that
there is an issue with memory mapping:

    $ strace ./elf64-fixme
    execve("./elf64-fixme", ["./elf64-fixme"], 0x7ffc427319f0 /* 42 vars */) = 0
    --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x12eb00000000} ---
    +++ killed by SIGSEGV (core dumped) +++
    Segmentation fault (core dumped)

In GDB, we can see that the address ''0x12eb00000000'' was not mapped at all:

    $ gdb -ex 'file ./execve_wrapper' -ex 'catch exec' -ex 'run ./elf64-fixme'
    (gdb) x/10i $rip
    => 0x12eb00000000:  Cannot access memory at address 0x12eb00000000
    (gdb) info proc mappings
    Mapped address spaces:
    Start Addr       End Addr         Size      Offset   Perms   objfile
    0x7ffff7ff9000   0x7ffff7ffd000   0x4000    0x0      r--p    [vvar]
    0x7ffff7ffd000   0x7ffff7fff000   0x2000    0x0      r-xp    [vdso]
    0x7ffffffde000   0x7ffffffff000   0x21000   0x0      rw-p    [stack]

In the previous chapter, we saw that ELF segment mappings are implemented in
the ''elf_map'' function. Let's use ''bpftrace'' to check what it does:

1. Start a new shell to trace the specific PID (otherwise we will get lost
   among system executions):

    $ sh
    $ echo $$
    1013154

2. Trace the PID and the ''elf_map'' function, which we already know maps the
   binary into memory:

    # bpftrace -e 'kr:elf_map { if (pid == $1) { printf ("%d\n%d %s\n", pid, retval, kstack); } }' 1013154
    Attaching 1 probe...

3. Run the program without forking (= replace the shell with our program while
   keeping the same PID):

    $ sh
    $ echo $$
    1013154
    $ exec ./elf64-fixme

4. Check the ''bpftrace'' output... aaaand there is nothing!

(NOTE: Always make sure you can trace other executables so you don't end up
chasing ghosts, especially in cases like this where nothing is printed. It is
better to set two kprobes: one for ''load_elf_binary'' and one for
''elf_map'', and print the PID/TID for both => the ''load_elf_binary'' kprobe
triggers when ''execve'' occurs, and if the ''elf_map'' kprobe doesn't
trigger, we know it never even tried to map the ELF into memory.)

From the absence of output, we can infer that the mapping is broken. Let's run
our ''readsmallelf'' tool [ref1] to check whether this hypothesis is correct:

    $ ./readsmallelf ./elf64-fixme
    ...
    p_type = 00000000 (0)   <--- NO 'PT_LOAD'!
    ...

Well, yes -- on the ''BUG #4'' line, we set ''phdr->p_type'' to ''PT_NULL'',
so there was no ''PT_LOAD'' segment and nothing from the binary was mapped
into memory. Moreover, if we had implemented proper ELF and program header
checks in ''readsmallelf'', we would have caught this much earlier.

Before continuing to the next bug, don't forget to revert the code to its
pristine state.

===[ OUTRO ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As we have seen, most of the time the answer is ''Segmentation fault'' when
working with barely legal small ELFs. Many ways exist to botch ELF creation,
especially when making "smart" decisions. For instance, using
''phdr->p_flags'' and ''phdr->p_type'' as part of the code can make debugging
quite an experience.

Another common error is ''Exec format error'', which indicates a broken ELF
structure. Here, the best approach is to make incremental changes, test them,
and roll back when an executable breaks.

There is one more bug I want to show (''BUG #5''). It's a real treat because
the execution partially works.

NEXT: Stay tuned for the next article.

PREVIOUS: [[/html/2025/2025-09-11--touching_small_elfs-p1-broken_tools.html|Part 1: Understanding Small ELFs and Fixing Broken Tools.]]

Hack the planet!

===[ References ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> [ref1] https://research.h4x.cz/html/2025/2025-09-11--touching_small_elfs-p1-broken_tools.html
> [ref2] https://elixir.bootlin.com/linux/v6.1.137/source/fs/binfmt_elf.c#L1167
> [ref3] https://elixir.bootlin.com/linux/v6.1.137/source/fs/binfmt_elf.c#L1205
> [ref4] https://elixir.bootlin.com/linux/v6.1.137/source/fs/binfmt_elf.c#L365
> [ref5] https://elixir.bootlin.com/linux/v6.1.137/source/fs/binfmt_elf.c#L394
> [ref6] https://www.man7.org/linux/man-pages/man2/sigaction.2.html