===[ bpftrace and kernel symbols ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When using kprobes, beware of symbolic kernel tracing!
Some programs may fail even though a function is clearly traceable:

    # bpftrace -e 'kprobe:load_elf_binary { printf ("hit\n") }'
    Attaching 1 probe...
    [WARN] libbpf: prog 'kprobe_load_elf_binary_1': BPF program load failed: Invalid argument
    [WARN] libbpf: prog 'kprobe_load_elf_binary_1': failed to load: -22
    cannot attach kprobe, Cannot assign requested address
    ERROR: Error attaching probe: kprobe:load_elf_binary

''bpftrace'' doesn't fail because ''load_elf_binary'' is untraceable; it fails
because the symbol has two different addresses:

    $ grep load_elf_binary /proc/kallsyms
    ffffffff813e2c80 t load_elf_binary
    ffffffff813e58d0 t load_elf_binary

A symbol with multiple addresses is not an anomaly. We can even see it
implemented twice in the debug binary:

    $ objdump -M intel-mnemonic -d /usr/lib/debug/boot/vmlinux-$(uname -r) \
        | grep -A7 '<load_elf_binary>:'
    ffffffff813e2c80 <load_elf_binary>:
    ffffffff813e2c80:  e8 1b 2d c9 ff    call ffffffff810759a0 <__fentry__>
    ffffffff813e2c85:  41 57             push r15
    ffffffff813e2c87:  41 56             push r14
    ffffffff813e2c89:  41 55             push r13
    ffffffff813e2c8b:  41 54             push r12
    ffffffff813e2c8d:  55                push rbp
    ffffffff813e2c8e:  48 89 fd          mov  rbp,rdi
    --
    ffffffff813e58d0 <load_elf_binary>:
    ffffffff813e58d0:  e8 cb 00 c9 ff    call ffffffff810759a0 <__fentry__>
    ffffffff813e58d5:  41 57             push r15
    ffffffff813e58d7:  41 56             push r14
    ffffffff813e58d9:  41 55             push r13
    ffffffff813e58db:  41 54             push r12
    ffffffff813e58dd:  55                push rbp
    ffffffff813e58de:  53                push rbx

And the code really differs. So, which one is correct? Well, it depends. Do we
want to trace the function for loading a 64-bit ELF binary or a 32-bit one?
Yes, one is the 64-bit ELF loader, and the other is for loading 32-bit ELF.

Now, why is that? Why do they both have the same name? Let's ignore the symbol
name collision for now and focus on a more interesting question: why are there
two ''load_elf_binary'' implementations?

If we look closely at the Linux kernel source code, ''load_elf_binary'' [ref1]
is defined only once -- in ''fs/binfmt_elf.c''.
So where does the second one come from?

Have you ever wondered how Linux on x86-64 executes 32-bit binaries? Where is
the 32-bit ELF loader implemented? As you've probably guessed, it's the same
''load_elf_binary'' function in ''fs/binfmt_elf.c''. But how?

If you've seen the function, you've probably noticed that the ELF structures
are oddly generic, like ''struct elf_phdr'' [ref2]. So how does the kernel
handle different address widths, for example between ''Elf64_Addr'' and
''Elf32_Addr'' [ref3]? The structures are processed at compile time, so there
can't be any on-the-fly detection of the ELF type.

===[ C templating ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Well, the ''fs/binfmt_elf.c'' file is, de facto, a C template. The native ELF
structures, like the aforementioned ''elf_phdr'', are defined by C macros in
''include/linux/elf.h'' [ref4]. But there's also a second file,
''fs/compat_binfmt_elf.c'', which redefines these macros and symbolic
constants [ref5], such as:

--------------------------[ fs/compat_binfmt_elf.c ]---------------------------
/*
 * 32-bit compatibility support for ELF format executables and core dumps.
 ...
 * This file is used in a 64-bit kernel that wants to support 32-bit ELF.
 * asm/elf.h is responsible for defining the compat_* and COMPAT_* macros
 * used below, with definitions appropriate for 32-bit ABI compatibility.
 *
 * We use macros to rename the ABI types and machine-dependent
 * functions used in binfmt_elf.c to compat versions.
 */
...
#undef elf_phdr
#define elf_phdr elf32_phdr
-------------------------------------------------------------------------------

And at the end of the file, it actually *INCLUDES* the ''fs/binfmt_elf.c''
file!

--------------------------[ fs/compat_binfmt_elf.c ]---------------------------
/*
 * We share all the actual code with the native (64-bit) version.
 */
#include "binfmt_elf.c"
-------------------------------------------------------------------------------

Behold!
This is one of the greater C hacks in real-world code: C templating! Cute,
right? (By the way, ''fs/binfmt_elf.c'' is not x86-specific; the same trick is
used on arm64, for example.)

===[ One symbol, multiple addresses ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The remaining question is: how does the kernel get away with one symbol name
pointing to two different addresses? Simple: when we look at all the functions
in the file, we see that they are declared as static. For example:

------------------------------[ fs/binfmt_elf.c ]------------------------------
static int load_elf_binary(struct linux_binprm *bprm);
-------------------------------------------------------------------------------

That makes it a local symbol. Such symbols with identical names can exist in
multiple object files because they are not globally visible. So there are
actually two independent ''load_elf_binary'' functions: one in
''fs/binfmt_elf.o'' and one in ''fs/compat_binfmt_elf.o''.

===[ Kprobes ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Back to kprobes. Even though ''bpftrace'' fails, not all tools do. For
instance, ''perf probe'' handles this correctly by creating kprobes for both
addresses:

    # perf probe load_elf_binary
    Added new events:
      probe:load_elf_binary (on load_elf_binary)
      probe:load_elf_binary (on load_elf_binary)

    You can now use it in all perf tools, such as:

        perf record -e probe:load_elf_binary -aR sleep 1

    # perf probe --list
      probe:load_elf_binary (on load_elf_binary@fs/binfmt_elf.c)
      probe:load_elf_binary (on load_elf_binary@fs/binfmt_elf.c)

The ''--list'' output is misleading about the source file, but the actual
kprobe attachment is correct:

    # cat /sys/kernel/debug/kprobes/list
    ffffffff813e2c80  k  load_elf_binary+0x0    [DISABLED][FTRACE]
    ffffffff813e58d0  k  load_elf_binary+0x0    [DISABLED][FTRACE]

Now, there's one thing I don't know: which address corresponds to the 32-bit
ELF loader, and which to the 64-bit one?
It likely depends on the object file link order. I suppose the native ELF
loader functions from ''binfmt_elf.o'' come first, so the first address is
probably for the 64-bit ELF. Fortunately, it's easy enough to test:

    # bpftrace -e 'k:$1 { printf ("hit\n") }' 0xffffffff813e2c80
    stdin:1:1-6: WARNING: 0xffffffff813e2c80 is not traceable (either non-existing, inlined, or marked as "notrace"); attaching to it will likely fail.
    Attaching 1 probe...
    hit
    hit
    hit

NOTE: The bpftrace warning is still misleading, even in the current version
(v0.22.1). It always prints this message when attaching to a raw address,
regardless of whether the region is actually traceable.

===[ OUTRO ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We looked specifically at ''load_elf_binary'', but there are a plethora of
symbols with multiple addresses. Just check ''/proc/kallsyms''. On my
6.1.137-amd64 kernel, there are 398 such symbols:

    $ awk '$2 == "t" || $2 == "T" {print $3}' /proc/kallsyms \
        | sort | uniq -c | grep -v '^ *1 ' | wc -l
    398

We have to be careful about what we are tracing. We might end up tracing the
wrong functions.

For completeness, here are some checks to determine if a function is (not)
traceable:

    grep -w FUNC /proc/kallsyms                                  # Must be here
    grep -w FUNC /sys/kernel/tracing/available_filter_functions  # Must be here
    grep -w FUNC /sys/kernel/debug/kprobes/blacklist             # Must NOT be here

Hack the planet!

===[ References ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> [ref1] https://elixir.bootlin.com/linux/v6.10.14/source/fs/binfmt_elf.c#L819
> [ref2] https://elixir.bootlin.com/linux/v6.10.14/source/fs/binfmt_elf.c#L825
> [ref3] https://www.man7.org/linux/man-pages/man5/elf.5.html
> [ref4] https://elixir.bootlin.com/linux/v6.10.14/source/include/linux/elf.h#L53
> [ref5] https://elixir.bootlin.com/linux/v6.10.14/source/fs/compat_binfmt_elf.c#L35
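
===[ Appendix: listing duplicate symbols ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The awk pipeline in the OUTRO only counts the ambiguous names. When the goal
is to attach a kprobe by raw address, it helps to see the addresses as well.
What follows is a minimal Python sketch of that idea, not a tool used in this
article: the function name ''duplicate_text_symbols'' and the "hypothetical_*"
sample entries are made up for illustration, and only the two
''load_elf_binary'' lines are taken from the ''/proc/kallsyms'' listing shown
earlier. It assumes the usual "address type name [module]" line layout.

```python
from collections import defaultdict


def duplicate_text_symbols(lines):
    """Return {name: [addresses]} for text symbols listed more than once.

    `lines` is any iterable of kallsyms-style lines. Only types 't' and 'T'
    are considered, mirroring the awk filter in the OUTRO.
    """
    addresses = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr, sym_type, name = fields[0], fields[1], fields[2]
        if sym_type in ("t", "T"):
            addresses[name].append(addr)
    return {name: addrs for name, addrs in addresses.items() if len(addrs) > 1}


# Sample input: the first two lines come from the article's listing,
# the "hypothetical_*" entries are invented for illustration only.
SAMPLE = [
    "ffffffff813e2c80 t load_elf_binary",
    "ffffffff813e58d0 t load_elf_binary",
    "ffffffff81000010 T hypothetical_unique_func",
    "ffffffff82000020 d hypothetical_data",
    "ffffffff82000030 d hypothetical_data",
]

if __name__ == "__main__":
    for name, addrs in sorted(duplicate_text_symbols(SAMPLE).items()):
        print(name, " ".join(addrs))
    # prints: load_elf_binary ffffffff813e2c80 ffffffff813e58d0
```

To run it against the live kernel, pass the open file instead of the sample,
e.g. ''duplicate_text_symbols(open("/proc/kallsyms"))''. Read the file as
root: unprivileged readers may see zeroed addresses, in which case the names
are still grouped correctly but the addresses are useless for attaching.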