===[ "Scripting" in C ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I often need to quickly test some functionality (Proof of Concept), e.g.
testing the behavior of a system, using syscall functionality that is not yet
implemented in user space, direct memory access, etc. The C programming
language is great for this because it is very low level and, at the same time,
high level enough, with lots of libraries. This gives me a lot of freedom and
direct access to assembly, and (if I don't use optimizations) it has more or
less deterministic output/behavior.

I use the ''gnu99'' standard mode of ''gcc'', which is C99 with GNU
extensions[ref1], because I want to write my PoC quickly, I don't care about
portability, and I don't want to waste time on nice-looking declarations, etc.
(i.e., I will use the fastest way possible for prototyping like this because it
will never be production code.) For that, I have one unholy trick that I have
used for a very long time.

TL;DR: [[#Complete C template]].

But we are well-bred hackers, and we *need* to understand why it works.
Firstly, we need to know how executable files are ... well, executed.

===[ Executing a Binary on Linux ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Linux, an executable file is processed in these steps:

 1. The kernel opens the file and reads the first several bytes (= the magic
    header).
 2. The kernel then compares the header to the known binary formats. There are
    roughly three types of executable files:
      - a binary with a known header (e.g. ''\x7FELF'' for an ELF binary),
      - a shebang header ''#!...'' (with a correct path to an interpreter), and
      - an unknown header.
 3. When the kernel encounters an unknown header, the ''execve(2)'' syscall
    fails with the error code ''ENOEXEC'' (Exec format error).

===[ Exploiting Shell Behavior ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Most modern shells (e.g., ''bash'', ''dash'', ''zsh'', etc.)
have an interesting behavior: if ''execve(2)'' fails with ''ENOEXEC'' (Exec
format error), the shell opens the file, does some sanity checks, and, if it
decides that the file could be a script, calls itself with the name of the
executable as the first argument. This means that we can create a script like
this:

    printf 'ls\n' > exec_me
    chmod 755 exec_me
    ./exec_me

and the execution will look like this:

    execve("./exec_me", ["./exec_me"], ...) = -1 ENOEXEC (Exec format error)
    ...
    execve("/bin/sh", ["/bin/sh", "./exec_me"], ...) = 0

It looks like we have a highly useful feature!

===[ Executable C File: C preprocessor ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The C preprocessor is the most abused feature of any C compiler, and we will
exploit it further. The C preprocessor language has its own conditionals:
''#if cond''. If the condition is not met, that part of the code is discarded
at the preprocessing phase. (A typical use is to check for
architecture-specific features.) But we can use it as follows:

----------------------------[ preprocessor_code.c ]----------------------------
#if 0
# This code will be discarded by the C preprocessor (because 0 will always be
# False), but it can be interpreted by bash without errors because '#' is a
# valid comment and 'exit 0' will end the script before the interpreter starts
# parsing the C code.
echo "I'm bash"; exit 0;
#endif
#include <stdio.h>

int main ()
{
    printf ("I'm C\n");
    return 0;
}
-------------------------------------------------------------------------------

If we compile ''preprocessor_code.c'', it should create a valid binary. Also,
if we set the executable flag on ''preprocessor_code.c'' and then execute it,
we should get a valid result:

    $ chmod 755 preprocessor_code.c
    $ ./preprocessor_code.c
    I'm bash
    $ gcc preprocessor_code.c -o preprocessor_code
    $ ./preprocessor_code
    I'm C

Pretty sweet, isn't it? But we still have one more trick up our sleeve.
===[ Executable C File: C99 comments ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The C99 standard introduced single-line "C++-style" comments using double
slashes. The slash character possesses a key feature that we can (and will)
exploit: it serves as the path delimiter character in Unix-like systems.
Furthermore, redundant slashes (e.g., ''////'') are disregarded, so on Linux
''//bin/true'' resolves to ''/bin/true''. As a result, we can invoke a dummy
program followed by our code for execution:

------------------------------[ comment_code.c ]-------------------------------
//bin/true ; echo "I'm bash"; exit 0;
#include <stdio.h>

int main ()
{
    printf ("I'm C\n");
    return 0;
}
-------------------------------------------------------------------------------

Let's put it to the test:

    $ chmod 755 comment_code.c
    $ ./comment_code.c
    I'm bash
    $ gcc comment_code.c -o comment_code
    $ ./comment_code
    I'm C

I prefer this approach as it takes up less visual space in the C file. However,
the main drawback is that the command must be a one-liner (which is not a
problem in the shell).

===[ Build Me Daddy ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Neato, right?! But what are the possibilities? Think about building a binary
and automatically executing it each time we run our source code file (and yes,
it sounds pretty wacky, but it's not all that different from Python scripts).
We could also incorporate logic to check if the source code file is newer than
the binary, triggering a build only when necessary. Furthermore, we could add
debugging features when a specific environment variable is set, such as
executing the binary using ''strace''.
Here's an example of such logic as a POSIX shell script:

-----------------------------[ build_and_exec.sh ]-----------------------------
bin="${0%.c}";
if [ "$0" -nt "$bin" ]; then
    gcc -Wall -Wno-long-long -Wno-variadic-macros -Wno-unused-variable \
        -Wno-unused-but-set-variable -std=gnu99 -ggdb3 -g3 -O0 "$0" -o "$bin" \
        || exit $?;
fi
if [ -n "$s" ]; then
    o="/tmp/strace.$(date +%F_%T)";
    echo "STRACE OUTPUT: $o";
    s="strace -f -o $o -yy -tt -s 512 ";
fi
exec $s "$bin" "$@";
exit $?;
-------------------------------------------------------------------------------

When we integrate this code at the beginning of a C file, we end up with an
executable source file that is recompiled only when the source code itself
changes.

===[ Two Files Template: Comment ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the complete code I use (as of 2023-10-10):

----------------------------[ template_comment.c ]-----------------------------
//bin/true; bin="${0%.c}"; if [ "$0" -nt "$bin" ]; then gcc -Wall -Wno-long-long -Wno-variadic-macros -Wno-unused-variable -Wno-unused-but-set-variable -I ~/include -std=gnu99 -ggdb3 -g3 -O0 "$0" -o "$bin" || exit $?; fi; if [ -n "$s" ]; then o="/tmp/strace.$(date +%F_%T)"; echo "STRACE OUTPUT: $o"; s="strace -f -o $o -yy -tt -s 512 "; fi; exec $s "$bin" "$@"; exit $?;
#include <hacking.h>

int main (int argc, char *argv[])
{
    return 0;
}
-------------------------------------------------------------------------------

I have this header file in my home directory:

----------------------------[ ~/include/hacking.h ]----------------------------
#ifndef _HACKING__H_
#define _HACKING__H_

#define _GNU_SOURCE
/* Headers that the helpers below rely on. */
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

typedef int8_t   s8;
typedef int16_t  s16;
typedef int32_t  s32;
typedef int64_t  s64;
typedef uint8_t  u8;
typedef uint16_t
    u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Timestamp of the previous p2() call; p2() prints the delta since then. */
static struct timeval tv_last_cmd;

#define p(s...) fprintf (stderr, s)
#define a(x) ((void) ((x) && (errx (1, "%s %d %s(): if (%s) -> [errno=%d] %s", \
    __FILE__, __LINE__, __func__, #x, errno, strerror (errno)), 0) )) // *NEGATED* assert()!
#define sx2i(s) strtoll ((const char *) (s), NULL, 16) // str hex num -> num
#define s2i(s) strtol ((const char *) (s), NULL, 10) // str decimal num -> num
#define hitme() do { int i; p ("Waiting for (PID=%d) ...", getpid()); \
    read (0, &i, 1); } while(0)
#define p2(s...) do { \
    int i; struct tm *lt; static char buf[80]; static struct timeval tv; \
    gettimeofday (&tv, NULL); lt = localtime (&(tv.tv_sec)); \
    strftime (buf, sizeof (buf), "%F %T", lt); \
    i = fprintf (stderr, "%s.%06ld (%ld.%ld)", buf, tv.tv_usec, \
                 tv.tv_sec - tv_last_cmd.tv_sec, \
                 tv.tv_usec - tv_last_cmd.tv_usec); \
    fprintf (stderr, "%*c", 50-i, ' '); fprintf (stderr, s); \
    tv_last_cmd = tv; \
} while (0)

void xs (char *buf, unsigned char *s, int n)
{
    int i, j = 0;
    //static char b[1024];
    for (i = 0; i < n; i++) {
        sprintf (buf+(i*3), "%02X ", s[i]);
    }
    if (i == 0) buf[i] = '\0';
    else buf[i*3 -1] = '\0';
}

#define px(s, n) (pxd ((u8 *) (s), (size_t) (n)))
void pxd (u8 *s, size_t n)
{
    int i, j = 0;
    char b1[16*3+1+1], b2[16+1+1];
    char *p1, *p2;
    long addr = 0;

    p ("-- %p --\n       0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F  01234567 89ABCDEF\n", s);
    while (j < n) {
        p1 = b1;
        p2 = b2;
        for (i = 0; i < 16 && j < n; i++, j++) {
            if (i == 8) {
                *p1 = *p2 = ' ';
                p1++; p2++;
            }
            p1 += sprintf (p1, "%02X " , s[j]);
            *p2 = s[j] >= ' ' && s[j] <= '~' ?
                  s[j] : '.';
            p2++;
        }
        if (i < 16) p1 += sprintf (p1, "%*c" , (16-i)*3 + (i<=8?1:0) , ' ');
        *p1 = *p2 = '\0';
        p ("%04lX: %s %s\n", addr, b1, b2);
        addr += i;
    }
    p ("\n");
}

#endif /* _HACKING__H_ */
-------------------------------------------------------------------------------

===[ One File Template: Preprocessor ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

--------------------------[ template_preprocessor.c ]--------------------------
#if 0
bin="${0%.c}";
if [ "$0" -nt "$bin" ]; then
    gcc -Wall -Wno-long-long -Wno-variadic-macros -Wno-unused-variable \
        -Wno-unused-but-set-variable -std=gnu99 -ggdb3 -g3 -O0 "$0" -o "$bin" \
        || exit $?;
fi
if [ -n "$s" ]; then
    o="/tmp/strace.$(date +%F_%T)";
    echo "STRACE OUTPUT: $o";
    s="strace -f -o $o -yy -tt -s 512 ";
fi
exec $s "$bin" "$@";
exit $?;
#endif
#define _GNU_SOURCE
/* Headers that the helpers below rely on. */
#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

typedef int8_t   s8;
typedef int16_t  s16;
typedef int32_t  s32;
typedef int64_t  s64;
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Timestamp of the previous p2() call; p2() prints the delta since then. */
static struct timeval tv_last_cmd;

#define p(s...) fprintf (stderr, s)
#define a(x) ((void) ((x) && (errx(1, "%s %d %s(): if (%s) -> [errno=%d] %s", \
    __FILE__, __LINE__, __func__, #x, errno, strerror(errno)), 0) )) // negated assert()
#define sx2i(s) strtoll ((const char *) (s), NULL, 16) // hex str -> num
#define s2i(s) strtol ((const char *) (s), NULL, 10) // decimal str -> num
#define hitme() do { int i; p ("Waiting for (PID=%d) ...", getpid()); \
    read (0, &i, 1); } while(0)
#define p2(s...) \
    do { \
    int i; struct tm *lt; static char buf[80]; static struct timeval tv; \
    gettimeofday (&tv, NULL); lt = localtime (&(tv.tv_sec)); \
    strftime (buf, sizeof (buf), "%F %T", lt); \
    i = fprintf (stderr, "%s.%06ld (%ld.%ld)", buf, tv.tv_usec, \
                 tv.tv_sec - tv_last_cmd.tv_sec, \
                 tv.tv_usec - tv_last_cmd.tv_usec); \
    fprintf (stderr, "%*c", 50-i, ' '); fprintf (stderr, s); \
    tv_last_cmd = tv; \
} while (0)

void px (u8 *s, int n)
{
    int i, j = 0;
    char b1[16*3+1+1], b2[16+1+1];
    char *p1, *p2;
    long addr = 0;

    p ("-- %p --\n       0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F  01234567 89ABCDEF\n", s);
    while (j < n) {
        p1 = b1;
        p2 = b2;
        for (i = 0; i < 16 && j < n; i++, j++) {
            if (i == 8) {
                *p1 = *p2 = ' ';
                p1++; p2++;
            }
            p1 += sprintf (p1, "%02X " , s[j]);
            *p2 = s[j] >= ' ' && s[j] <= '~' ? s[j] : '.';
            p2++;
        }
        if (i < 16) p1 += sprintf (p1, "%*c" , (16-i)*3 + (i<=8?1:0) , ' ');
        *p1 = *p2 = '\0';
        p ("%04lX: %s %s\n", addr, b1, b2);
        addr += i;
    }
    p ("\n");
}

int main (int argc, char *argv[])
{
    return 0;
}
-------------------------------------------------------------------------------

===[ Using the Templates ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

"Manual" usage:

    chmod 755 template.c
    cp -a template.c test1.c
    ./test1.c       # Build the code and execute the binary.
    s=1 ./test1.c   # Build the code and execute the binary under strace.

===[ C Functions and Macros for Fast Prototyping ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It's hard to overlook the macros and functions in the templates provided above.
These are constructs that I frequently use when hacking some code. I highly
recommend that you consider creating similar tools for yourself as well.

Here's a brief overview of the functions' purposes:

''a()'' -- like ''assert(3)'', but it fails when the condition is met. This is
great for replacing a full error check when some function fails:

    a ((fd = open (argv[1], O_RDONLY)) < 0); // Fail if open() returns value < 0

''p()'' -- it has the same semantics as ''printf(3)'', but it forces unbuffered
output to ''stderr''.
''p2()'' -- printf with a date and time. This is great for logging progress
over time.

''px()'' -- hexdump of a memory region.

    u8 buf[] = "aaaaaaaaaaaaaaaaa";
    px (buf, sizeof (buf));

    -- 0x7ffcb2a8fbd0 --
           0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F  01234567 89ABCDEF
    0000: 61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61  aaaaaaaa aaaaaaaa
    0010: 61 00                                             a.

''s2i(s)'' -- decimal string to number.
''sx2i(s)'' -- hex string to number.

    int n1 = s2i ("1337");    // => (long) 1337
    int n2 = sx2i ("0x1337"); // => (long long) 4919

''hitme()'' -- waits for the user to hit ''<Enter>''. This is very useful if I
need to attach, for example, ''strace'' at some point in the code.

    hitme(); // => Waiting for (PID=3063996) ...

===[ Vim FTW ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

My editor of choice is ''vim'', so I use vim templates for a new file:

----------------------------------[ .vimrc ]-----------------------------------
exec "autocmd BufNewFile *.c,*.cpp,*.cc 0r " . g:vim_path . "/skeletons/new.c"
-------------------------------------------------------------------------------

and ''~/.vim/skeletons/new.c'' is the template above.

===[ Instead of Conclusion ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: Why use one large, complex file instead of ''#include <hacking.h>''?

A: While a single large file may not be visually appealing, it offers an
advantage: it won't break when changes are made to the shared header file. For
instance, if the API of the ''p()'' function is altered, every older proof of
concept will cease to function. This is why a self-contained file is worth
considering. (I personally go back and forth between these two approaches, so
you'll need to decide for yourself. Try both approaches and determine which one
is more suitable for your environment.)

Q: Why don't we just use ''make'' or the built-in functionality inside our
editor for building binaries?

A: Yes, I use both.
I have keybindings in my ''vim'' for building with ''gcc'' and for invoking
''make'', but I also use this technique because I don't have to be inside an
editor to execute the code or compile it manually before running it. I just run
''./my_c_code.c'' as I would when calling, for example, a Python script.

===[ References ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

> [ref1] https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#C-Dialect-Options