.-----------------------------------------------------------------------------.
| Touching Small x86-64 ELFs (Part 1): Broken tools |
'-----------------------------------------------------------------------------'
updated: 2025-10-01
Learning objectives:
- Understand the reduction of a 124-byte ELF64 file to 80 bytes.
- Understand how overlaying the ELF Header and the Program header works.
- Circumvent problems with:
  - ELF tools (readelf, eu-readelf, objdump),
  - radare2,
  - GDB,
  - kprobes (bpftrace).
- Craft a custom readelf.
- Use QEMU to debug an ELF loader.
########################-
##################################
########################################
.####################################.+.-#####
##################################-##+####+#+----.
####################################..-####.####+#-.#.
#################################-.######...#####.#######
###########################+.###+.#####+#-.##+..+######-###.
###########################.###+.##+...+###############+#+####
#### ###############-.####.#######+##..-################.#.#####
#####+#####++..#########..########.#+.################+.#...##+##
+##++--+#+#+..........-####+.##+.+#.#################..#....#####-
###########+#########################################+.+#.....######
###++###+##########################################.+#.......-#####
#####+###+######################################+-####+.......#####
##################+###########+############################....#####
######-######+#############################-...........###....##+##
.#######.-.....##########################.....................### #
.+###### +.-.........############+####...........+####+.......##.#-
.#########..+---++--.....+#########..........+##+..####......###.
####+#####..--......-+.....#######-.....+###+. #####....###
. ##########+..-........+...#######+.......# #####....#
.##########++..-....---.-..########......... . #####.....
#######+#+...+.....-+-..+#######........... -###....
. ######.+.#-.--.....+-.+#######....---....... #......
. ######.-.+.#-.--......++######........--...................
########.-.####-......++###+##............--...............
#####################++-###-##...........................
++#####################+-###.###.........................
#####+###### ###########.###.###..................+-....
+-####+##################.###.###........................
#########################-###.###.......................
-.########################-###-+###....................
#-####+###################-###--###...................
.-.+####+##################- ###--#### ........
# #####+##################. ### -####
+ ########################--### ###-
#..#######+#############+##--###---####
#. #####+################+#---###----###
# ####+###################----###---####
# #####+##################-----###+---###
# -#######+##############+#-------###---###
.# ########################---------+###--##
.-. ########+##############+#------------###+##
.# #####+################+#---------------#+##
.# ########################+-----------.----###--
# ####+####################-----------------###+---
# #########################------------------##------
+. #########################------------------###-------
# ##########################-----------------++##---------
# .#########################+--------.--------#-##----------
#.######+###################--------..-------+--##-----------
#.####+######################---------.----------##+---+------
# ###########################--------------------#-------------
.+###########################--------------------##--------+----
# ###########################--------------------#--------------
#############################------------------++---+-----------
####+#######################+----------------------------------
###+########################---------------.-----+----------++-
############################---------------------++------+--+++
.#+#########################---------------------+--+--###+###+
.##########################----------------------+#########+##-
######+###################-------------------+--+#######+####-
.###############+##########------------------++--+#####++#####-
#####+####################------------------++---###+########-
########################------------------+++---##+#########+
.########################------------------+++---++++#########
.########################-----------------++++---++#######+###
#######################----------------+++++---#############
######################-----------------##+++---###########++
#####################----------------++++++---############+.
#####################---------------+####+++--#######++####+
####+######+####+###---------------++++++++--++++###+#++##+
.#################---------------+++++++++--++++++#+++++##
#########+######---------------+++++++++--+++++++++++++#+
.##############--------------+++++++++++-++++++++++++++#
.############---------------++++++++++--+++++++++++++++
.##########---------------+++++++++++-+++++++++++++++
.########--------------++++++++++++++++++++++++++++.
.--------------+++++++++++++++++++++++++++++
--------------++++++++++++++++++++++++++++++
--------------++++++++++++++++++++++++++++++.
.--------------+++++++++++++++++++++++++++++++
--------------++++++++++++++++++++++++++++++++
In this article, we'll focus on the incredible journey of *debugging* small and
barely legal ELF binaries. The examples here will operate on an 80-byte x86-64
Linux ELF binary, whose source code will appear in Part 2.
If you're looking for how to craft what is probably the smallest x86-64 ELF
(73 bytes), there's a great article by lm978, published by tmp.out: Cramming a
Tiny Program into a Tiny ELF File: A Case Study ~lm978 [ref1]. See also
comments from LegionMammal978 in [ref2] and [ref3]. It's definitely worth
reading.
Here is a crash course on how small ELFs work:
1. An executable ELF needs two headers:
- ELF Header ('EHDR') and
- Program Header ('PHDR').
The ELF Header specifies the type of binary and the location of the Program
Header. The Program Header describes how to load the binary into memory.
2. The Linux kernel is very lenient when checking ELF metadata. It verifies
only the bare minimum.
Here is the full ELF Header ('Elf64_Ehdr') data structure, along with the
fields the Linux kernel actually checks:
THIS FIELD IS CHECKED
Elf64_Ehdr SIZE [BYTES] LINUX QEMU-X86_64
-------------------------------------------------------------
e_ident[EI_MAG] 4 YES YES
e_ident[EI_CLASS] 1 no YES
e_ident[EI_DATA] 1 no YES
e_ident[EI_VERSION] 1 no YES
e_ident[EI_OSABI] 1 no no
e_ident[EI_ABIVERSION] 1 no no
e_ident[EI_PAD] 7 no no
e_type 2 YES YES
e_machine 2 YES YES
e_version 4 no no
e_entry 8 YES YES
e_phoff 8 YES YES
e_shoff 8 no no
e_flags 4 no no
e_ehsize 2 no YES
e_phentsize 2 YES YES
e_phnum 2 YES YES
e_shentsize 2 no no
e_shnum 2 no no
e_shstrndx 2 no no
-------------------------------------------------------------
EHDR's total size = 64
From the Linux kernel's point of view, only 7 fields from the ELF Header are
crucial. The rest is an all-you-can-eat buffet.
The ELF Program Header ('Elf64_Phdr') is much trickier:
Elf64_Phdr SIZE [BYTES] REQUIRED BY LINUX?
-----------------------------------------------------------------------------
p_type 4 REQUIRED. Must be 0x00000001 (PT_LOAD)
p_flags 4 Somewhat. LSB must be 1 (= PF_X exec flag)
p_offset 8 REQUIRED
p_vaddr 8 REQUIRED
p_paddr 8 ignored
p_filesz 8 REQUIRED. Must be >= 1 && <= file_size (see [1])
p_memsz 8 See [1]
p_align 8 ignored
-----------------------------------------------------------------------------
PHDR's total size = 56
[1] 'p_filesz' and 'p_memsz' are the trickiest fields! We can do some
shenanigans with them, but it's very VERY fragile. Notes:
- 'p_filesz' can be less than the actual file size only if the segment fits
within a single memory page ('PAGE_SIZE') [ref5] [ref6].
- The first 'PT_LOAD' segment is kinda special, and the Linux kernel
doesn't like it at all when its 'p_memsz != p_filesz'. In such cases,
most likely 'execve(2)' fails ... until it doesn't [ref7].
- 'p_memsz > p_filesz' can also lead to zeroing out part of the binary
(of course it does).
- 'p_memsz' must be less than 'TASK_SIZE', which is either:
* 47 bits (minus PAGE_SIZE) normally, or
* 56 bits (minus PAGE_SIZE) with 5-level paging enabled [ref9] [ref10].
- One of the following must hold:
* 'p_memsz' is less than the maximum available memory, or
* memory overcommit is enabled ('/proc/sys/vm/overcommit_memory').
- Overcommit can be combined with 5-level paging to gain 9 additional bits of
virtual address space (= from 48 bits to 57 bits). [ref11] [ref12]
For more details, see lm978's article [ref1] and my notes on debugging in
Parts 2 and 3.
3. If we combine the EHDR and PHDR correctly, the binary size will be 120
bytes:
EHDR_size + PHDR_size = 64 + 56 = 120
This can be seen in [ref13].
4. From the tables above, we can see that the Linux kernel doesn't check the
last three EHDR fields: 'e_shentsize, e_shnum, e_shstrndx'. What happens if
we shift the start of the Program Header to the position of 'e_shentsize'?
Well, the kernel takes it without any problems. This overlays EHDR and PHDR,
saving 6 bytes. It's the simplest overlay we can do.
5. But with a great amount of thinking, we can push the overlay even further by
combining fields between EHDR and PHDR. Here's a diagram that helped me wrap my
head around it:
The ELF Header:
EI_MAG e_ident[EI_CLASS]...e_ident[EI_PAD] e_type,mach e_version e_entry e_phoff e_shoff, e_flags, e_ehsize e_phentsize,phnum e_shentsize...shstrndx
|xxxxxxxxx| |---------------------------------| |xxx| |xxx| |---------| |--------------xxxxxxx| |--xxxxxxxxxxxxxxxxxxx| |---------------------------------| |xxx| |xxx| |xxx| |---------------|
7f 45 4c 46 .. .. .. .. .. .. .. .. .. .. .. .. 03 00 3e 00 .. .. .. .. 01 00 00 00 01 00 00 00 .. 00 00 00 00 00 00 00 .. .. .. .. .. .. .. .. .. .. .. .. 40 00 38 00 01 00 .. .. .. .. .. ..
The Program Header:
p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align
01 00 00 00 .1 .. .. .. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .. .. .. .. .. .. .. .. 01 00 00 00 00 00 00 00 78 00 00 00 00 00 00 00 .. .. .. .. .. .. .. ..
|xxxxxxxxx| |-x-------| |--xxxxxxxxxxxxxxxxxxx| |--------------xxxxxxx| |---------------------| |--xxxxxxxxxxxxxxxxxxx| |--------xxxxxxxxxxxxx| |---------------------|
There are several possibilities to overlay the headers. Here's one example that
results in an 80-byte ELF:
EI_MAG e_ident[EI_CLASS]...e_ident[EI_PAD] e_type,mach e_version e_entry e_phoff e_shoff, e_flags, e_ehsize e_phentsize,phnum e_shentsize...shstrndx
|xxxxxxxxx| |---------------------------------| |xxx| |xxx| |---------| |--------------xxxxxxx| |--xxxxxxxxxxxxxxxxxxx| |---------------------------------| |xxx| |xxx| |xxx| |---------------|
7f 45 4c 46 .. .. .. .. .. .. .. .. .. .. .. .. 03 00 3e 00 .. .. .. .. 01 00 00 00 01 00 00 00 .. 00 00 00 00 00 00 00 .. .. .. .. .. .. .. .. .. .. .. .. 40 00 38 00 01 00 .. .. .. .. .. ..
01 00 00 00 .1 .. .. .. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .. .. .. .. .. .. .. .. 01 00 00 00 00 00 00 00 78 00 00 00 00 00 00 00 .. .. .. .. .. .. .. ..
|xxxxxxxxx| |-x-------| |--xxxxxxxxxxxxxxxxxxx| |--------------xxxxxxx| |---------------------| |--xxxxxxxxxxxxxxxxxxx| |--------xxxxxxxxxxxxx| |---------------------|
p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align
Legend:
* '-' / '..' -- can be any byte
* 'x' / '00' -- checked by the kernel => required
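The size arithmetic behind these overlays can be checked with a few lines of C.
This is only a sketch: 'overlaid_size' is a hypothetical helper (not part of
any library), and it assumes that all code lives inside the two headers, so the
file ends where the later of the two headers ends.

```c
#include <stddef.h>
#include <elf.h>

/* Hypothetical helper: size of a file that holds only an Elf64_Ehdr at
 * offset 0 and a single Elf64_Phdr at offset 'phoff' (the value stored in
 * e_phoff). Assumes any code is hidden inside the headers themselves. */
size_t overlaid_size (size_t phoff)
{
    size_t ehdr_end = sizeof (Elf64_Ehdr);          /* 64 */
    size_t phdr_end = phoff + sizeof (Elf64_Phdr);  /* phoff + 56 */

    return phdr_end > ehdr_end ? phdr_end : ehdr_end;
}
```

With no overlay, 'overlaid_size (sizeof (Elf64_Ehdr))' gives 120. Starting the
PHDR at 'offsetof (Elf64_Ehdr, e_shentsize)' (offset 58) gives 114, which is
the 6-byte saving from step 4. Starting it at 'offsetof (Elf64_Ehdr, e_entry)'
(offset 24) gives 80, which matches an 'e_phoff' of 0x18 in an 80-byte file.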
As we can see, the headers can be overlaid pretty hard (see [ref1] for more
overlay options). Now the question is: how do we debug this?
We're facing several problems. Overlaid ELFs are not fully valid, and existing
tools often get confused -- to the point of failing completely.
For example, 'readelf' will straight-up lie:
-----------------------[ Confused readelf is confused ]------------------------
$ readelf --file-header --program-headers ./elf64-80_bytes
ELF Header:
Magic: 7f 45 4c 46 5e 5e fe c2 f6 04 16 ff 75 f8 eb 04
Class: <unknown: 5e>
Data: <unknown: 5e>
Version: 254 <unknown>
OS/ABI: <unknown: c2>
ABI Version: 246
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64 # <-- BS
Version: 0x18eb01b0
Entry point address: 0x1 # <-- BS
Start of program headers: 7 (bytes into file) # <-- BS
Start of section headers: 24 (bytes into file)
Flags: 0x0
Size of this header: 24 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 7
Size of section headers: 0 (bytes)
Number of section headers: 51081
Section header string table index: 1295
readelf: Error: Too many program headers - 0x7 - the file is not that big
-------------------------------------------------------------------------------
'eu-readelf' from elfutils fails completely:
----------------------------[ eu-readelf gives up ]----------------------------
$ eu-readelf -h ./elf64-80_bytes
eu-readelf: failed reading './elf64-80_bytes': not a valid ELF file
-------------------------------------------------------------------------------
'objdump' fails as well:
-----------------------[ Welcome to the club, objdump ]------------------------
$ objdump -x ./elf64-80_bytes
objdump: ./elf64-80_bytes: file format not recognized
-------------------------------------------------------------------------------
'gdb' will refuse to load such an ELF:
--------------------------[ GDB refuses to cooperate ]-------------------------
$ gdb ./elf64-80_bytes
"/a/elf/elf64-80_bytes": not in executable format: file format not recognized
-------------------------------------------------------------------------------
Even 'radare2' is confused:
------------------------[ Like readelf, like radare2 ]-------------------------
$ r2 ./elf64-80_bytes
[0x00000001]> i ~^arch,bits
arch x86
bits 32
[0x00000001]> aa
INFO: Analyze all flags starting with sym. and entry0 (aa)
ERROR: af: Cannot find function at 0x00000001
-------------------------------------------------------------------------------
Attaching kprobes seems to fail as well:
------------------------------[ WTF?! bpftrace ]-------------------------------
# bpftrace -e 'kprobe:load_elf_binary { printf ("hit\n") }'
Attaching 1 probe...
[WARN] libbpf: prog 'kprobe_load_elf_binary_1': BPF program load failed: Invalid argument
[WARN] libbpf: prog 'kprobe_load_elf_binary_1': failed to load: -22
cannot attach kprobe, Cannot assign requested address
ERROR: Error attaching probe: kprobe:load_elf_binary
-------------------------------------------------------------------------------
So what does work? Or what can we fix to make the tools behave properly?
Let's start with kprobes, as they're the easiest to "fix". The problem isn't
that the symbol is inlined or blacklisted; it neither has the 'notrace'
attribute nor is marked with 'NOKPROBE_SYMBOL()'. It's normally traceable.
However, we must use its address instead of the symbol name, because there are
two different addresses associated with the same symbol:
# grep -w load_elf_binary /proc/kallsyms
ffffffff813e2c80 t load_elf_binary
ffffffff813e58d0 t load_elf_binary
# bpftrace -e 'k:$1 { printf ("hit\n") }' 0xffffffff813e2c80
stdin:1:1-6: WARNING: 0xffffffff813e2c80 is not traceable (either
non-existing, inlined, or marked as "notrace"); attaching to it
will likely fail.
Attaching 1 probe...
hit
hit
hit
Both addresses are traceable: one belongs to the 64-bit ELF loader and the
other to the 32-bit (compat) ELF loader. (See [ref14] for more details on
symbols with multiple addresses.)
With its default binary auto-detection, radare2 gets confused in much the same
way as 'readelf': it misidentifies the binary as a 32-bit x86 executable, and
maps and disassembles it accordingly.
Here's a quick example of how to work with overlaid ELFs (see [ref15] for more
details):
$ r2 -n -m 0x700000000 -a x86 -b 64 -c 'f entry0=0x700000001' -c 's entry0' \
./elf64-80_bytes
[0x700000001]> af
[0x700000001]> pdf
;-- entry0:
0x700000001 454c46b20d mov dl, 0xd ; 13
0x700000006 5e pop rsi
0x700000007 5e pop rsi
...
And we're golden.
GDB is extremely picky, but it can be persuaded to cooperate. To do this, we
need to trick it with a valid ELF. We'll create an ELF wrapper binary that
executes our small ELF. It's super simple -- it just calls 'execve(2)' like
so:
-----------------------------[ execve_wrapper.c ]------------------------------
#include <unistd.h>
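int main (int argc, char **argv, char **envp) {
    if (argc < 2) return 2; // usage: ./execve_wrapper <elf> [args...]
    // On success execve(2) does not return; on failure it returns -1.
    return execve (argv[1], argv, envp);
}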
-------------------------------------------------------------------------------
gcc -Wall -std=gnu99 -O0 ./execve_wrapper.c -o ./execve_wrapper
Now we can run 'execve_wrapper' in GDB, catch the 'execve(2)' syscall, and
then we'll be inside the ELF code:
$ gdb -ex 'catch exec' -ex 'run' --args ./execve_wrapper ./elf64-80_bytes
(gdb) x/5i $rip
=> 0x12eb00000001: rex.RB
0x12eb00000002: rex.WR
0x12eb00000003: rex.RX syscall
0x12eb00000006: mov cl,0x48
0x12eb00000008: push rcx
This is great when debugging code that jumps between "usable" parts of the ELF
headers.
Reading the kernel source code is awesome! But sometimes it's even better to
see what it actually does at the binary level (which is also an awesome
experience, as we'll see in Part 3).
The Linux kernel is "easy" to debug within QEMU. There are four essential
components we need (and one optional):
- a bootable kernel,
- kernel debug symbols,
- an initrd with the ELF binary we want to debug,
- QEMU to run the kernel.
- (Optional) The kernel source code for the GDB session. (I'm hard core, so I
usually don't use it.)
Here's the setup for Debian 12 Bookworm:
1. Install a bootable kernel ('vmlinuz') and its debug symbols ('vmlinux'):
apt install linux-image-amd64 linux-image-amd64-dbg
The bootable kernel binary will be in '/boot', e.g.:
/boot/vmlinuz-6.1.0-35-amd64
The debug symbols binary will be somewhere under '/usr/lib/debug', e.g.:
/usr/lib/debug/boot/vmlinux-6.1.0-35-amd64
2. Creating a custom initrd is simple. It's just a compressed 'cpio' archive:
echo ./elf64-80_bytes | cpio --quiet -H newc -o | gzip -5 -n > ./initrd.gz
3. Running QEMU:
qemu-system-x86_64 -accel tcg -smp 1 -machine memory-backend=mem \
-object memory-backend-file,id=mem,size=512M,mem-path=/dev/shm/mem,share=on \
-kernel ./vmlinuz-6.1.0-35-amd64 -initrd ./initrd.gz \
-append 'nopti nokaslr console=tty0 console=ttyS0,115200 rdinit=/elf64-80_bytes' \
-gdb tcp:127.0.0.1:1234 -monitor stdio -S
Notes:
- When debugging, DO NOT use KVM! Let QEMU emulate directly (TCG); otherwise,
you may miss events such as interrupts and context switches.
- Use only one CPU, as SMP can add unnecessary complexity.
- Disable Linux kernel address randomization (KASLR) and page table isolation
(KPTI) [ref16].
- The '-S' flag keeps the guest CPU stopped at startup. This gives you control
over execution: type 'continue' in GDB, and QEMU will unpause.
- For output, use both serial console redirection and the QEMU-GTK console.
This is useful for recording and replaying kernel logs.
- The memory file is very handy when working with memory, as it allows simple
search and setting watchpoints on raw addresses.
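As an illustration of the last note, here is a minimal sketch of a byte-pattern
search that can be pointed at the memory backing file. 'find_bytes' is a
hypothetical helper; it works on any buffer, so the caller is expected to
'mmap(2)' '/dev/shm/mem' (the 'mem-path' from the QEMU command line) read-only
first.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first occurrence of
 * 'needle' in 'mem' at or after offset 'from', or -1 if there is none. */
long long find_bytes (const unsigned char *mem, size_t mem_len, size_t from,
                      const void *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > mem_len)
        return -1;
    for (size_t i = from; i <= mem_len - needle_len; i++)
        if (memcmp (mem + i, needle, needle_len) == 0)
            return (long long) i;
    return -1;
}
```

Searching for the ELF magic ('"\x7f" "ELF"', 4 bytes) followed by a few bytes
unique to our binary locates its copies in guest RAM. Assuming guest RAM is
mapped contiguously from physical address 0 (which should hold for this small
512M setup), the returned offset is the guest physical address.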
4. Running GDB is a little bit tricky, as we need to load the symbols before
connecting to the QEMU GDB stub. (When using GDB on the kernel, it's always a
good idea to use debug symbols. Without them, it's way harder to correlate the
source code with the running kernel.)
Here's how to load symbols ('file'), connect to the QEMU GDB stub ('target
remote'), and set a breakpoint on the 'load_elf_binary' function:
$ gdb
(gdb) file vmlinux-6.1.0-35-amd64
(gdb) target remote 127.0.0.1:1234
(gdb) hbreak load_elf_binary
(gdb) continue
When using '-accel tcg', we don't have to worry about the number of
"hardware" breakpoints ('hbreak') we can set. The emulation provides
"unlimited" hardware breakpoints.
When crafting our own tools, we have the advantage of knowing exactly what *we*
need. In this case, we are working with only two ELF headers: the ELF Header
and the Program Header. That makes it fairly simple to build such an ELF
parser.
We might consider not printing all the fields, since some of them contain code,
but I advise against this, as it makes the tool less flexible. It's better to
define custom conditions (see below).
Here is a simple template for a "readelf" tool:
------------------------------[ readsmallelf.c ]-------------------------------
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <err.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <elf.h>
#define p(s...) fprintf (stderr, s)
// "negated assert"
#define a(x) ((void) ((x) && (errx (1, "%s %d %s(): if (%s) -> [errno=%d] %s", \
__FILE__, __LINE__, __func__, #x, errno, strerror (errno)), 0) ))
int main (int argc, char *argv[])
{
int fd;
struct stat sb;
char *file;
int i;
Elf64_Ehdr *ehdr;
Elf64_Phdr *phdr;
a (argc < 2); // must have at least 1 argument
a ((fd = open (argv[1], O_RDONLY)) < 0);
a (fstat (fd, &sb) == -1);
a ((file = mmap (NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0)) == MAP_FAILED);
ehdr = (Elf64_Ehdr *) file;
// PHDR must fit in the file; checking the size first avoids unsigned underflow
a (sb.st_size < (off_t) sizeof (Elf64_Phdr) || ehdr->e_phoff > sb.st_size - sizeof (Elf64_Phdr));
phdr = (Elf64_Phdr *) (file + ehdr->e_phoff);
p ("e_ident = "); for (i = 0; i < EI_NIDENT; i++) p ("%02x ", ehdr->e_ident[i]); p ("\n");
p ("e_type = %04x (%hu)\n", ehdr->e_type, ehdr->e_type);
p ("e_machine = %04x (%hu)\n", ehdr->e_machine, ehdr->e_machine);
p ("e_version = %08x (%u)\n", ehdr->e_version, ehdr->e_version);
p ("e_entry = %016lx\n", ehdr->e_entry);
p ("e_phoff = %016lx (%lu)\n", ehdr->e_phoff, ehdr->e_phoff);
p ("e_shoff = %016lx (%lu)\n", ehdr->e_shoff, ehdr->e_shoff);
p ("e_flags = %08x\n", ehdr->e_flags);
p ("e_ehsize = %04x (%hu)\n", ehdr->e_ehsize, ehdr->e_ehsize);
p ("e_phentsize = %04x (%hu)\n", ehdr->e_phentsize, ehdr->e_phentsize);
p ("e_phnum = %04x (%hu)\n", ehdr->e_phnum, ehdr->e_phnum);
p ("e_shentsize = %04x (%hu)\n", ehdr->e_shentsize, ehdr->e_shentsize);
p ("e_shnum = %04x (%hu)\n", ehdr->e_shnum, ehdr->e_shnum);
p ("e_shstrndx = %04x (%hu)\n", ehdr->e_shstrndx, ehdr->e_shstrndx);
p ("----------------------------\n");
p ("p_type = %08x (%u)\n", phdr->p_type, phdr->p_type);
p ("p_flags = %08x (%u)\n", phdr->p_flags, phdr->p_flags);
p ("p_offset = %016lx (%lu)\n", phdr->p_offset, phdr->p_offset);
p ("p_vaddr = %016lx\n", phdr->p_vaddr);
p ("p_paddr = %016lx\n", phdr->p_paddr);
p ("p_filesz = %016lx (%lu)\n", phdr->p_filesz, phdr->p_filesz);
p ("p_memsz = %016lx (%lu)\n", phdr->p_memsz, phdr->p_memsz);
p ("p_align = %016lx (%lu)\n", phdr->p_align, phdr->p_align);
return 0;
}
-------------------------------------------------------------------------------
Now we can read all the ELF files that 'readelf', 'eu-readelf', and 'objdump'
fail to parse:
$ gcc -Wall -std=gnu99 readsmallelf.c -o readsmallelf
$ ./readsmallelf ./elf64-80_bytes
e_ident = 7f 45 4c 46 0f 05 b1 48 51 90 90 90 90 90 90 05
e_type = 0002 (2)
e_machine = 003e (62)
e_version = b80eb25e (3087970910)
e_entry = 000012eb00000001
e_phoff = 0000000000000018 (24)
e_shoff = 000012eb00000018 (20800526614552)
e_flags = 3cb0050f
e_ehsize = 050f (1295)
e_phentsize = 0038 (56)
e_phnum = 0001 (1)
e_shentsize = 0000 (0)
e_shnum = 0000 (0)
e_shstrndx = 0000 (0)
----------------------------
p_type = 00000001 (1)
p_flags = 000012eb (4843)
p_offset = 0000000000000018 (24)
p_vaddr = 000012eb00000018
p_paddr = 0038050f3cb0050f
p_filesz = 0000000000000001 (1)
p_memsz = 0000000000000001 (1)
p_align = 0a474e494b524f57 (740646740429000535)
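These values also explain why GDB stopped at 0x12eb00000001: the entry point
lies *below* 'p_vaddr', but still inside the page that gets mapped. A minimal
sketch of the address-to-offset translation follows, assuming 4 KiB pages and
that the loader maps whole pages around the segment. 'vaddr_to_file_offset'
and 'PAGE_SZ' are hypothetical names used only for this illustration.

```c
#include <stdint.h>
#include <elf.h>

#define PAGE_SZ 4096ULL     /* assumption: 4 KiB pages, as on x86-64 */

/* Hypothetical helper: translate a virtual address to a file offset for a
 * single PT_LOAD segment. The mapping is assumed to be page-granular, so
 * addresses before p_vaddr (but in the same page) are valid too.
 * Returns -1 if 'vaddr' is outside the mapped pages. */
int64_t vaddr_to_file_offset (const Elf64_Phdr *ph, uint64_t vaddr)
{
    uint64_t map_start = ph->p_vaddr & ~(PAGE_SZ - 1);
    uint64_t map_end = (ph->p_vaddr + ph->p_filesz + PAGE_SZ - 1)
                       & ~(PAGE_SZ - 1);

    if (vaddr < map_start || vaddr >= map_end)
        return -1;
    /* Unsigned wraparound makes this correct even when vaddr < p_vaddr. */
    return (int64_t) (vaddr - ph->p_vaddr + ph->p_offset);
}
```

For the output above ('p_vaddr' = 0x12eb00000018, 'p_offset' = 0x18,
'p_filesz' = 1), the entry point 0x12eb00000001 translates to file offset 1,
that is, the 'E' of the ELF magic.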
We can even program the checks we want. For instance, we might require that
'phdr->p_type' must be 1, and so on. Or we could go even further and use
'capstone' [ref17] to disassemble sections of interest.
I don't personally use disassembly in readelf, as it seems redundant when I'm
the one writing the code (= lazy excuse), but I do use coloring to highlight
errors:
void c (int cond, char *str) // condition check
{
if (cond)
p ("\t\t\t\033[1;31m<---!!! %s\033[0m", str); // make it RED
p ("\n");
}
...
p ("p_type = %08x (%u)", phdr->p_type, phdr->p_type);
c (phdr->p_type != 1, "PT_LOAD");
p ("p_flags = %08x (%u)", phdr->p_flags, phdr->p_flags);
c (!(phdr->p_flags & 1), "EXEC BIT");
p ("p_vaddr = %016lx", phdr->p_vaddr);
c (phdr->p_vaddr > ((1UL<<47)-4096), "more than available VM"); // [ref18]
p ("p_memsz = %016lx (%lu)", phdr->p_memsz, phdr->p_memsz);
c (phdr->p_filesz != phdr->p_memsz, "p_memsz != p_filesz");
Tools don't need to be general. It's fine to use a dedicated readelf tool for
each type of small ELF we work on. These tools are closely tied to specific
ELFs and serve well for build tests.
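As an example of such a build test, the "LINUX" column of the ELF Header table
can be turned into a small check function. This is a sketch under assumptions:
'check_ehdr' is a hypothetical name, and the accepted values (ET_EXEC or
ET_DYN, EM_X86_64, 'e_phentsize' equal to the size of 'Elf64_Phdr', at least
one program header) reflect my reading of the kernel's ELF loader, not an
official specification. 'e_entry' and 'e_phoff' are used by the kernel rather
than compared against constants, so they are left to per-binary checks.

```c
#include <string.h>
#include <elf.h>

/* Hypothetical build-test helper: returns 0 if the ELF Header passes, or
 * the 1-based number of the first failed check. */
int check_ehdr (const Elf64_Ehdr *e)
{
    if (memcmp (e->e_ident, ELFMAG, SELFMAG) != 0)
        return 1;                                   /* e_ident[EI_MAG] */
    if (e->e_type != ET_EXEC && e->e_type != ET_DYN)
        return 2;                                   /* e_type */
    if (e->e_machine != EM_X86_64)
        return 3;                                   /* e_machine */
    if (e->e_phentsize != sizeof (Elf64_Phdr))
        return 4;                                   /* e_phentsize */
    if (e->e_phnum < 1)
        return 5;                                   /* e_phnum */
    return 0;
}
```

All other fields are deliberately ignored, so the function accepts headers
whose remaining bytes hold code or overlaid PHDR fields.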
The most effective approach I know is to make small, incremental changes and
think hard about what each one does (I know, it hurts). It requires real
understanding, and when we're researching something new, that understanding is
usually missing. Start from a known state and uncover how it works, step by
step.
I always start from a working proof of concept, tweak it slightly, and test
immediately. When a small change fails in an unexpected way, and I cannot
figure out why just by thinking, I turn to tools like GDB, QEMU, radare2, and
the (kernel) source code.
I also take extensive notes. In the beginning, it's mostly chaotic scribbles
and copy-pasted code, behaviors, links, and anything else of interest. After a
while, I write a summary of what I know and understand, and this becomes my new
baseline. Then I repeat the process until I'm done.
In Parts 2 and 3, we'll look at some practical example bugs and how to solve
them.
[ref1] https://tmpout.sh/3/22.html
[ref2] https://news.ycombinator.com/item?id=36662353
[ref3] https://news.ycombinator.com/item?id=42969322
[ref4] https://research.h4x.cz/html/2023/2023-04-17--linux--hiding_namespaces_within_a_fd.html
[ref5] https://elixir.bootlin.com/linux/v6.10.14/source/fs/binfmt_elf.c#L380
[ref6] https://elixir.bootlin.com/linux/v6.10.14/source/fs/binfmt_elf.c#L99
[ref7] https://elixir.bootlin.com/linux/v6.10.14/source/fs/binfmt_elf.c#L1171
[ref9] https://elixir.bootlin.com/linux/v6.10.14/source/arch/x86/include/asm/page_64_types.h#L75
[ref10] https://elixir.bootlin.com/linux/v6.10.14/source/arch/x86/include/asm/page_64_types.h#L57
[ref11] https://en.wikipedia.org/wiki/Intel_5-level_paging
[ref12] https://elixir.bootlin.com/linux/v6.10.14/source/arch/x86/include/asm/page_64_types.h#L56
[ref13] https://research.h4x.cz/html/2019/2019-10-22--linux--minimal_viable_x86_elf64_static_binary.html
[ref14] https://research.h4x.cz/html/2025/2025-06-20--kprobes_kernel_symbol_two_addresses_and_c_templating.html
[ref15] https://research.h4x.cz/html/2025/2025-07-23--radare2_working_with_not-so-valid_elfs.html
[ref16] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
[ref17] https://www.capstone-engine.org/
[ref18] https://elixir.bootlin.com/linux/v6.1.137/source/arch/x86/include/asm/page_64_types.h#L61