
===[ "Scripting" in C ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I often need to quickly test some functionality (Proof of Concept), e.g. testing the behavior of a system, using syscall functionality that is not yet implemented in user space, direct memory access, etc. The C programming language is great for this because it is very low level and, at the same time, high level enough, with lots of libraries. This gives me a lot of freedom and direct access to assembly, and (if I don't use optimizations) it has more or less deterministic output/behavior.

I use 'gcc's 'gnu99' standard mode, which is C99 with GNU extensions[ref1], because I want to write my PoC quickly. I don't care about portability and I don't want to waste time on nice-looking declarations, etc. (i.e., I will use the fastest way possible for prototyping like this because it will never be production code.)

For that, I have one unholy trick that I have used for a very long time.

TL;DR: Complete C template.

But we are well-bred hackers, and we *need* to understand why it works. Firstly, we need to know how executable files are ... well, executed.

===[ Executing a binary on Linux ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On Linux, an executable file is processed in these steps:

1. The kernel opens the file and reads the first several bytes (= the magic header).
2. The kernel then compares the header to the known binary formats. There are roughly three types of executable files:
   - a binary with a known header (e.g. '\x7FELF' for an ELF binary),
   - a shebang header '#!...' (with correct path to an interpreter) and
   - an unknown header.
3. When the kernel encounters an unknown header, the 'execve(2)' syscall fails and returns the error code 'ENOEXEC' (Exec format error).

===[ Exploiting the bash behavior ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

'bash' and 'dash' have an interesting behavior: if 'execve(2)' fails with 'ENOEXEC' (Exec format error), then the shells will open the file, do some sanity checks, and if they decide that the file can be a script, they will call themselves with the name of the executable as the first argument. This means that I can create a script like this:

----------------------------------[ exec_me ]----------------------------------
printf 'ls\n' > exec_me
chmod 755 exec_me
./exec_me
-------------------------------------------------------------------------------

and the execution will look like this:

----------------------------------[ strace ]-----------------------------------
execve("./exec_me", ["./exec_me"], ...) = -1 ENOEXEC (Exec format error)
...
execve("/bin/sh", ["/bin/sh", "./exec_me"], ...) = 0
-------------------------------------------------------------------------------

It looks like a highly useful feature.

===[ How to create an executable C file ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The C preprocessor is the most abused feature of any C compiler, and we will exploit it further. The C preprocessor language has its own type of conditions: '#if cond'. If the condition is not met, that part of the code is discarded at the preprocessing phase. (A typical use is to check for architecture-specific features.) But we can use it as follows:

----------------------------[ preprocessor_code.c ]----------------------------
#if 0
# This code will be discarded by the C preprocessor (because 0 will always be
# False), but it can be interpreted by bash without errors because '#' is a
# valid comment and 'exit 0' will end the script before the interpreter starts
# parsing the C code.
echo "I'm bash";
exit 0;
#else
#include <stdio.h>
int main () { printf ("I'm C\n"); return 0; }
#endif
-------------------------------------------------------------------------------

If we compile 'preprocessor_code.c', it should create a valid binary. Also, if we set the executable flag on 'preprocessor_code.c' and then execute it, we should get a valid result:

-----------------------------------[ code ]------------------------------------
$ chmod 755 preprocessor_code.c
$ ./preprocessor_code.c
I'm bash

$ gcc preprocessor_code.c -o preprocessor_code
$ ./preprocessor_code
I'm C
-------------------------------------------------------------------------------

Now imagine that we can build a binary and execute it whenever we execute our source file (and yes, that sounds pretty weird). We can also check whether the source file is newer than the binary and rebuild only then. And we can add some debug features that are enabled when some environment variable is set, e.g. executing the binary under 'strace'.

===[ Complete C template ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the complete code I use (as of 2022-10-11):

--------------------------------[ template.c ]---------------------------------
#if 0
bin="${0%.c}"; if [ "$0" -nt "$bin" ]; then gcc -Wall -Wno-long-long -Wno-variadic-macros -Wno-unused-variable -Wno-unused-but-set-variable -std=gnu99 -ggdb3 -g3 -O0 "$0" -o "$bin" || exit $?; fi
if [ -n "$s" ]; then o="/tmp/strace.$(date +%F_%T)"; echo "STRACE OUTPUT: $o"; s="strace -f -o $o -yy -tt -s 512 "; fi
exec $s "$bin" "$@"; exit $?;
#else
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <limits.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <sys/un.h>
#include <errno.h>
#include <err.h>
#include <stddef.h>
#include <time.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <sched.h>
#include <signal.h>

typedef int8_t s8; typedef int16_t s16; typedef int32_t s32; typedef int64_t s64; typedef uint8_t u8; typedef uint16_t u16; typedef uint32_t u32; typedef uint64_t u64;
#define p(s...)  fprintf (stderr, s)
#define a(x)     ((void) ((x) && (errx(1, "%s %d %s(): if (%s)  -> [errno=%d] %s", __FILE__, __LINE__, __func__, #x, errno, strerror(errno)), 0) ))  // !assert()
#define sx2i(s)  strtoll ((const char *) (s), NULL, 16)     // hex str to int
#define s2i(s)   strtol  ((const char *) (s), NULL, 10)     // decimal str to int
#define hitme()  do { int i; p ("Waiting for <ENTER> (PID=%d) ...", getpid()); read (0, &i, 1); } while(0)
static struct timeval tv_last_cmd;   // time of the previous p2() call
#define p2(s...)  do { int i; struct tm *lt; static char buf[80]; static struct timeval tv; gettimeofday (&tv, NULL); lt = localtime (&(tv.tv_sec)); strftime (buf, sizeof (buf), "%F %T", lt); i = fprintf (stderr, "%s.%06ld (%ld.%ld)", buf, tv.tv_usec, tv.tv_sec - tv_last_cmd.tv_sec, tv.tv_usec - tv_last_cmd.tv_usec); fprintf (stderr, "%*c", 50-i, ' '); fprintf (stderr, s); tv_last_cmd = tv; } while (0)
void px (u8 *s, int n) { int i, j = 0; char b1[16*3+1+1], b2[16+1+1]; char *p1, *p2; long addr = 0; p ("-- %p --\n        0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F    01234567 89ABCDEF\n", s); while (j < n) { p1 = b1; p2 = b2; for (i = 0; i < 16 && j < n; i++, j++) { if (i == 8) { *p1 = *p2 = ' '; p1++; p2++; } p1 += sprintf (p1, "%02X " , s[j]); *p2 = s[j] >= ' ' && s[j] <= '~' ? s[j] : '.'; p2++; } if (i < 16) p1 += sprintf (p1, "%*c" , (16-i)*3 + (i<=8?1:0) , ' '); *p1 = *p2 = '\0'; p ("%04lX:  %s   %s\n", addr, b1, b2); addr += i; } p ("\n"); }


int main (int argc, char *argv[])
{
    //a (argc < 2);
    //int fd; a ((fd = open (argv[1], O_RDONLY)) < 0);
    //int pid; a ((pid = fork()) < 0);
    //char *line_buf = NULL; size_t line_buf_len = 0; while (getline (&line_buf, &line_buf_len, stdin) != -1) {



    return 0;
}
#endif
-------------------------------------------------------------------------------

===[ Using the template ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

"Manual" usage:

-----------------------------------[ setup ]-----------------------------------
chmod 755 template.c
cp -a template.c test1.c
-------------------------------------------------------------------------------

-----------------------------------[ usage ]-----------------------------------
./test1.c       # Build the code and execute the binary.
s=1 ./test1.c   # Build the code and execute the binary under `strace`.
-------------------------------------------------------------------------------

===[ C functions and macros for fast prototyping ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

'a()' -- like 'assert(3)', but it fails when the condition is met. This is great for replacing full error checks after calls that can fail:

-------------------------------[ not_assert.c ]--------------------------------
a ((fd = open (argv[1], O_RDONLY)) < 0);  // Fail if 'open()' returns value < 0
-------------------------------------------------------------------------------

'p()' -- it has the same semantics as 'printf(3)', but it writes to 'stderr', so the output is unbuffered.

'p2()' -- printf with a date and time. This is great for logging progress over time.

'px()' -- hexdump of a memory region.

-----------------------------------[ px.c ]------------------------------------
u8 buf[] = "aaaaaaaaaaaaaaaaa";
px (buf, sizeof (buf));

-- 0x7ffcb2a8fbd0 --
        0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F    01234567 89ABCDEF
0000:  61 61 61 61 61 61 61 61  61 61 61 61 61 61 61 61    aaaaaaaa aaaaaaaa
0010:  61 00                                               a.
-------------------------------------------------------------------------------

's2i(s)' -- string number to int.

'sx2i(s)' -- hex string to int.

-----------------------------------[ s2i.c ]-----------------------------------
int n1 = s2i ("1337");     // => (long) 1337
int n2 = sx2i ("0x1337");  // => (long long) 4919
-------------------------------------------------------------------------------

'hitme()' -- waiting for a user to hit '<enter>'. This is very useful if I need to attach, for example, 'strace' at a specific point in the code.

-----------------------------------[ code ]------------------------------------
hitme();    // => Waiting for <ENTER> (PID=3063996) ...
-------------------------------------------------------------------------------

===[ Vim ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

My editor of choice is 'vim', so I use vim templates for a new file:

----------------------------------[ .vimrc ]-----------------------------------
exec "autocmd BufNewFile *.c,*.cpp,*.cc 0r " . "~/.vim/skeletons/new.c"
-------------------------------------------------------------------------------

and '~/.vim/skeletons/new.c' is the template above.

===[ Discussion ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Q: Why don't we just use '#include "my_funcs.h"' instead of that big ugly C template?

A: I actually did that for a while, but it backfired on me when I made some breaking changes in some of the functions and later wanted to execute an older PoC. As you probably guessed, it failed! After that, I decided it would be better to always have one big but self-contained file.

Q: Why don't we just use 'make' or the built-in functionality of our editor for building binaries?

A: I use both. I have keybindings in 'vim' for building with 'gcc' and for invoking 'make', but I also use this technique because I don't have to be inside an editor to execute the code, or compile it manually before running it. I just run './my_c_code.c' as I would when calling, for example, a Python script.

===[ References ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[ref1] https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#C-Dialect-Options