===[ Smallest ELF64 binary ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Lately I was inspired by reading how to create the smallest ELF32 binary [ref1], and I wanted to learn how to do something similar for an ELF64 binary on the x86-64 architecture. It is actually not that difficult if you have the specification [ref2],[ref3] and a good assembler (it is even easy to do by hand, but that is needlessly error-prone).

===[ How to construct ELF64 binary ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If we want to create a more or less general ELF64 binary, there are two mandatory ELF64 headers we must use:

1. 'Elf64_Ehdr', which holds all the basic metadata such as magic, offsets, type, ...
2. One entry of 'Elf64_Phdr', which tells the kernel loader that the binary should be loaded into memory as an executable.

Both of them are described in [ref2],[ref3], and if we look at their sizes, we can calculate the total size:

    Elf64_Ehdr + Elf64_Phdr = 64 + 56 = 120

So the binary will be at least 120 bytes long. Now the question is whether we need more bytes for our program, or whether we can "recycle" part of the header and keep the size at 120 bytes.

If we look at 'Elf64_Ehdr', its 'e_ident' array ends with padding [1] that takes up 7 bytes. We can use this for our code. The padding is reserved and should be ignored ([ref2],[ref3]).

Now that we have some space, we can create a small program. 7 bytes is not much, but it should be enough to cleanly terminate our binary. We can do this with the 'exit(2)' syscall. This syscall (or its variant 'exit_group(2)') is called at the end of any correctly finished application. It takes only one argument and, most importantly, it ends the program gracefully, without abnormal termination by the kernel (SIGSEGV, SIGBUS, and so on).

A big limitation is that we have only 7 bytes to work with [1], so we need to cut some corners. In my first working version I sacrificed the exit value. The problem was that the exit was not so graceful: the program exited with whatever value happened to be in 'RDI', so the exit status was effectively arbitrary. The code looked like this:

-----------------------------------[ code ]------------------------------------
B83C000000  mov eax,0x3c
0F05        syscall
-------------------------------------------------------------------------------

We can see that the encoding of 'mov' with an immediate into a 32-bit register (which also zeroes the upper half of the 64-bit register) is rather big: 5 bytes, to be precise. Together with the 2 bytes of the 'syscall' opcode it totals 7 bytes, which is exactly the size of 'e_ident[EI_PAD]'.
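
The size arithmetic and the 7-byte budget can be double-checked with a short Python sketch. The 'struct' format strings are my own transcription of the 'Elf64_Ehdr' and 'Elf64_Phdr' field layouts from [ref2], so treat them as an assumption and compare them against the specification. The names 'EHDR_FMT', 'PHDR_FMT' and 'code_v1' are made up for this illustration.

```python
import struct

# Field layouts transcribed from elf(5); "<" = little endian, no padding.
# Elf64_Ehdr: e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff,
#             e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
#             e_shentsize, e_shnum, e_shstrndx
EHDR_FMT = "<16sHHIQQQIHHHHHH"
# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align
PHDR_FMT = "<IIQQQQQQ"

EI_NIDENT = 16  # total size of e_ident
EI_PAD = 9      # index of the first padding byte inside e_ident

ehdr_size = struct.calcsize(EHDR_FMT)
phdr_size = struct.calcsize(PHDR_FMT)
pad_size = EI_NIDENT - EI_PAD

# First working version: mov eax,0x3c ; syscall
code_v1 = bytes.fromhex("B83C000000" "0F05")

print(ehdr_size, phdr_size, ehdr_size + phdr_size)  # 64 56 120
print(pad_size, len(code_v1))                       # 7 7
```

The second line of output shows why this version fits exactly: the code is as long as the padding, with nothing left over for setting the exit value.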
That is nice, but we can create an equivalent piece of code that is 1 byte smaller:

-----------------------------------[ code ]------------------------------------
31C0        xor eax,eax
B03C        mov al,0x3c
0F05        syscall
-------------------------------------------------------------------------------

One spare byte is not enough on its own, but if we also steal the 1 byte of 'e_ident[EI_ABIVERSION]', which is likewise reserved/unused, we have 2 bytes. That is enough space for an instruction that lets us control the argument for 'exit(2)' (which is loaded from the 'RDI' register). We have two nice possibilities:

-----------------------------------[ code ]------------------------------------
31FF        xor edi,edi         ; rdi = 0
89C7        mov edi,eax         ; rdi = rax
-------------------------------------------------------------------------------

In this particular example we are going to use 'mov edi,eax', because an exit value of 60 (0x3c) stands out more than zero.
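
As a quick sanity check of the byte budget, here is a small Python sketch that concatenates the opcodes listed above and compares their length with the space available. The constant names are hypothetical; the byte values are copied from the listings.

```python
# Machine code bytes copied from the listings above.
XOR_EAX_EAX = bytes.fromhex("31C0")
MOV_AL_3C = bytes.fromhex("B03C")
MOV_EDI_EAX = bytes.fromhex("89C7")
SYSCALL = bytes.fromhex("0F05")

# e_ident[EI_ABIVERSION] (1 byte) + e_ident[EI_PAD] (7 bytes)
BUDGET = 1 + 7

code = XOR_EAX_EAX + MOV_AL_3C + MOV_EDI_EAX + SYSCALL
print(len(code), BUDGET, code.hex())  # 8 8 31c0b03c89c70f05
```

The four instructions use all 8 bytes, so there is no room left for anything else inside 'e_ident'.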

===[ Full code ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(The real code starts at the '_start' label and ends with 'syscall'.)

----------------------------[ smallest_elf64.asm ]-----------------------------
BITS 64
        org 0x0000000000400000

ehdr:                           ; Elf64_Ehdr
        db 0x7F, "ELF"          ; e_ident[EI_MAG]
        db 2                    ; e_ident[EI_CLASS]   = 64 bit ELF
        db 1                    ; e_ident[EI_DATA]    = little endian
        db 1                    ; e_ident[EI_VERSION] = ELF version
        db 0                    ; e_ident[EI_OSABI]   = SysV ABI

; Code:
        ;db 0                   ; e_ident[EI_ABIVERSION] (undef in Linux)
        ;times 7 db 0           ; e_ident[EI_PAD]     <-- [1]
_start:
        ;xor edi, edi           ; RDI = 0 -> arg1 for exit
        xor eax, eax            ; RAX = 0
        mov al, 0x3c            ; RAX = 0x3c = 60 -> 'exit()' syscall
        mov edi, eax            ; RDI = RAX = 60 -> arg1 for exit
        syscall
; End of Code

        dw 2                    ; e_type      = executable
        dw 0x3e                 ; e_machine   = x86-64
        dd 1                    ; e_version   = ELF version
        dq _start               ; e_entry
        dq phdr - $$            ; e_phoff
        dq 0                    ; e_shoff
        dd 0                    ; e_flags
        dw ehdr_size            ; e_ehsize
        dw phdr_size            ; e_phentsize
        dw 1                    ; e_phnum
        dw 0                    ; e_shentsize
        dw 0                    ; e_shnum
        dw 0                    ; e_shstrndx
ehdr_size equ $ - ehdr

phdr:                           ; Elf64_Phdr
        dd 1                    ; p_type      = PT_LOAD
        dd 0x05                 ; p_flags     = PF_X | PF_R = rx
        dq 0                    ; p_offset
        dq $$                   ; p_vaddr
        dq $$                   ; p_paddr
        dq file_size            ; p_filesz
        dq file_size            ; p_memsz
        dq 0x200000             ; p_align
phdr_size equ $ - phdr

file_size equ $ - $$
-------------------------------------------------------------------------------

Time for the test:

-----------------------------------[ code ]------------------------------------
$ nasm smallest_elf64.asm
$ chmod +x ./smallest_elf64
$ ls -l ./smallest_elf64
-rwxr-xr-x 1 root root 120 2022-04-26 06:45:22 smallest_elf64*
$ strace ./smallest_elf64
execve("./smallest_elf64", ["./smallest_elf64"], 0x7ffdc97aad20 /* 44 vars */) = 0
exit(60)                                = ?
+++ exited with 60 +++
-------------------------------------------------------------------------------
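
For readers without an assembler at hand, the same layout can be sketched in Python with 'struct'. This is only an illustration under my own assumptions: the format strings are transcribed from [ref2], the helper 'build_elf' and the constant names are hypothetical, and I have not compared the result byte for byte with the nasm output here.

```python
import struct

BASE = 0x400000  # same load address as 'org' in the listing
CODE = bytes.fromhex("31C0" "B03C" "89C7" "0F05")  # fills ABIVERSION + PAD

EHDR_SIZE, PHDR_SIZE = 64, 56
FILE_SIZE = EHDR_SIZE + PHDR_SIZE


def build_elf() -> bytes:
    # e_ident: magic, 64-bit class, little endian, ELF version, SysV ABI,
    # followed by the 8 bytes of code instead of ABIVERSION + padding.
    e_ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + CODE
    ehdr = struct.pack(
        "<16sHHIQQQIHHHHHH",
        e_ident,
        2,          # e_type = executable
        0x3E,       # e_machine = x86-64
        1,          # e_version
        BASE + 8,   # e_entry = address of the code inside e_ident
        EHDR_SIZE,  # e_phoff: the program header follows directly
        0,          # e_shoff
        0,          # e_flags
        EHDR_SIZE,  # e_ehsize
        PHDR_SIZE,  # e_phentsize
        1,          # e_phnum
        0, 0, 0,    # e_shentsize, e_shnum, e_shstrndx
    )
    phdr = struct.pack(
        "<IIQQQQQQ",
        1,          # p_type = PT_LOAD
        0x05,       # p_flags = PF_X | PF_R
        0,          # p_offset
        BASE,       # p_vaddr
        BASE,       # p_paddr
        FILE_SIZE,  # p_filesz
        FILE_SIZE,  # p_memsz
        0x200000,   # p_align
    )
    return ehdr + phdr


image = build_elf()
print(len(image))  # 120
```

Writing 'image' to a file and marking it executable should give a binary with the same header fields as the listing, runnable only on x86-64 Linux.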

===[ Summary ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The binary is nominally 120 bytes long, but in practice it always occupies more than that. On disk it takes up one 'BLOCK-SIZE' unit (4096 bytes in my case, on ext4/xfs), and in memory it takes up one 'PAGE-SIZE' unit (also 4096 bytes on x86-64).
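
The rounding behind those numbers can be written down as a tiny Python sketch. The helper 'round_up' is hypothetical, and the block size is simply the value quoted above for my setup; other filesystems or mount options may allocate space differently.

```python
def round_up(size: int, unit: int) -> int:
    """Round size up to the next multiple of unit."""
    return -(-size // unit) * unit


FILE_SIZE = 120
BLOCK_SIZE = 4096  # assumption: the ext4/xfs block size quoted above
PAGE_SIZE = 4096   # page size on x86-64, as stated above

print(round_up(FILE_SIZE, BLOCK_SIZE), round_up(FILE_SIZE, PAGE_SIZE))  # 4096 4096
```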

===[ References ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[ref1] https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html
[ref2] https://www.man7.org/linux/man-pages/man5/elf.5.html
[ref3] https://refspecs.linuxbase.org/elf/elf.pdf