
.-----------------------------------------------------------------------------.
|                    WebP Polyglot II: Executable Picture                     |
'-----------------------------------------------------------------------------'
updated: 2023-10-08


===[ How About A Runnable WebP? ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(NOTE: While nothing here is intentionally malicious, be a good hacker and run it in a secure environment, such as QEMU-KVM.)

This is part 2 of "WebP polyglots". The context for this article is in the first part here:

  https://research.h4x.cz/html/2023/2023-08-08--webp_polyglot_i-bootable_picture.html

We already know how the WebP format works and how to create an x86 bootable WebP image. But what else can we do with it? Say, have you ever wanted to run an image? Let's look at how we can execute a WebP image inside a Linux environment.

===[ RIFF Shell Script ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You might have noticed that the RIFF/WEBP header is mostly printable text. So, can we run it as a shell script? The short answer is yes, but there are lots of pitfalls. However, first things first: what can we exploit? Most shells have two properties we can take advantage of:

  1. Shell scripts don't have any special magic header (the shebang '#!' exists primarily for the 'execve(2)' syscall; see [ref1]). As such, shells will try to parse any file we throw at them.

  2. Even though shell scripts are supposed to be text files, most shells are willing to parse binary data (e.g., skipping zero bytes, and so on). Some shells have checks for binary files (see Bash's Binary Check), but they mostly have no problem crunching arbitrary data.

We will explore various strategies for creating a valid "RIFF/WebP" shell script compatible with the most popular shells in 2023 [ref2]:

  - bash (5.2.15)
  - zsh (5.9)
  - Debian /bin/sh (dash 0.5.12-2)
  - FreeBSD /bin/sh (freebsd 13.2)
  - FreeBSD /bin/csh (freebsd 13.2)

The main goal remains ensuring the WebP image is valid (at least for Chrome and Firefox).

===[ Hello, I'm Here-Document ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here-Document [ref3] is a built-in shell feature that enables us to feed a program with multi-line data terminated by a specified string. It might sound complicated, but it is not. Here is an example:

-----------------------------[ here-document.sh ]------------------------------
$ cat << __MY_TERMINATOR__
They're trashing our rights, man!
They're trashing the flow of data!
Trashing!
Trashing!
Hack the planet!
__MY_TERMINATOR__
-------------------------------------------------------------------------------

The 'cat' command will get the data on file descriptor 0 (stdin). '__MY_TERMINATOR__' can be any string, and when placed on its own line, it will terminate the input (e.g., '\n__MY_TERMINATOR__\n').

Note that a Here-Document interprets its data by default. That means variables are expanded, commands in backticks are executed, and so on. Here is an example:

---------------------------[ Here-Document (Eval) ]----------------------------
$ i=10

$ cat << _EOF
ls output: `ls`
i is $i
_EOF

ls output: webp.c
i is 10
-------------------------------------------------------------------------------

But when we place the terminator in quotes, evaluation will not happen:

----------------------[ Quoted Here-Document (No Eval) ]-----------------------
$ cat << "_EOF"
ls output: `ls`
i is $i
_EOF

ls output: `ls`
i is $i
-------------------------------------------------------------------------------

You probably see where this is going. Our goal is to inject a valid Here-Document redirection into the RIFF size. We would prefer the quoted terminator, but unfortunately, it is way harder than I originally thought. I initially thought that I was in control of both the RIFF chunk size and the VP8X chunk size and that I could easily inject '<<"' into the RIFF chunk size and '"\n' into the VP8X chunk size, like so:
  RIFF <<"WEBPVP8X"\n\0\0...
At the end of the file I would simply append the Here-Document terminator and my payload for a shell script:
  printf "\nWEBPVP8X\nls -la\n" >> out.webp
That actually worked ... but the WebP image was valid only in Chrome. Chrome is one of the few viewers that don't assume that the 'VP8X' size is '0x0a' bytes! For example, here is a snippet from the Firefox code:

---------------------[ media/libwebp/src/dec/webp_dec.c ]----------------------
static VP8StatusCode ParseVP8X(const uint8_t** const data,
                               size_t* const data_size,
                               int* const found_vp8x,
...
{
    /* From: media/libwebp/src/webp/format_constants.h
     *
     * #define VP8X_CHUNK_SIZE    10    // Size of a VP8X chunk
     */
    const uint32_t vp8x_size = CHUNK_HEADER_SIZE + VP8X_CHUNK_SIZE;  // <-- !!!
    ...
    if (!memcmp(*data, "VP8X", TAG_SIZE)) {
    ...
        *data += vp8x_size;                                          // <-- !!!
        *data_size -= vp8x_size;
        *found_vp8x = 1;
    }
    return VP8_STATUS_OK;
}
-------------------------------------------------------------------------------

When Firefox finds the 'VP8X' header, it forces the chunk size to '0x0a' bytes and advances the pointer to the next chunk accordingly (i.e., 'data + CHUNK_HEADER_SIZE + 0xa'). We could create a workaround by embedding an image into the 'VP8X' data, but that would be too fiddly. Instead, we can take another approach and craft the headers such that we won't need to use a quoted Here-Document. We will create a chain like this:
  RIFF << WEBPVP8X\n\0\0\0...EXIF****\nWEBPVP8X\nOUR CODE\nEXIF_PADDING...VP8L
Here is the expanded and more readable version of it:
RIFF << WEBPVP8X
\0\0\0..........EXIF****
WEBPVP8X
echo "Hack the planet!"
exit 0
...EXIF_PADDING...
VP8L...
In this approach, we must be careful not to have any special shell characters (like '$(', '`', and so on) up to the Here-Document terminator. The critical sections are:

  - VP8X chunk data (beware of the VP8X flags and canvas size),
  - EXIF chunk size.

When we use an unquoted terminator, a shell will try to interpret the content, and unfortunately, it might fail, for example, when there is an unmatched backtick '`' (sub-shell execution), and exit with failure! Not good hacksmanship, I tell you!

Anyway, we have to solve one other issue, and that is the overall RIFF size. We cannot just use ' << ' (spaces on both sides), as it translates to '20 3c 3c 20' in hex, which in turn translates to the little-endian number '0x203c3c20', which finally translates to 515.76 MiB in size! That's a lot of bytes for an image (even by today's standards). We could shrink it down by replacing the character at the end with the zero byte '\0'. When we do that, we get ' <<\0', which is '0x003c3c20', and that equals 3.76 MiB. Keep that in mind; we will manipulate the size later on.

The crucial question is: will shells correctly parse such a string? And the answer is ... well yes, but actually no:
  $ for i in bash dash zsh csh; do echo "== $i =="; "$i" ./out.webp; echo "-> $?"; done

  == bash ==
  ./out.webp: ./out.webp: cannot execute binary file
  -> 126

  == dash ==
  ./out.webp: 1: RIFF: not found
  Hack the planet!
  -> 0

  == zsh ==
  script-here_text.sh:1: command not found: RIFF
  Hack the planet!
  -> 0

  == csh ==
  RIFF: Command not found.
  Hack the planet!
  -> 0

  == sh (freebsd) ==
  out.webp: RIFF: not found
  Hack the planet!
Four out of five. Yeah, it could be counted as a success if the most popular shell, bash, was working! But it's not! The biggest problem is that the shells will terminate the script when they encounter an error in the evaluation of the Here-Document. It can still be used successfully, but we have to be careful. Before we look at another method, let's examine that pesky bash binary check.

===[ Bash's Binary Check ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Bash has a pretty annoying check. It searches for a zero byte '\0' on the first line. If it finds one, bash fails with:
  #                 V
  $ printf 'RIFF <<\0WEBPVP8X\n\0\0\0...\nWEBPVP8X\nls -lA\n' > out.webp
  $ bash out.webp
  out.webp: out.webp: cannot execute binary file
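
To predict whether a candidate file would trip this check without actually feeding it to bash, we can approximate the rule as described above: "is there a NUL byte on the first line?". The sketch below is only that approximation, not bash's real code, and the function name 'first_line_has_nul' is made up for this example.

```shell
# first_line_has_nul FILE
# Succeeds (exit status 0) if the first line of FILE contains a NUL byte.
# Approximation of the check described in the text, not bash's own code.
first_line_has_nul() {
    # Count the bytes of the first line with and without NUL bytes.
    # $(( ... )) strips the padding some wc implementations print.
    raw=$(( $(head -n 1 "$1" | wc -c) ))
    clean=$(( $(head -n 1 "$1" | tr -d '\000' | wc -c) ))
    [ "$raw" -ne "$clean" ]
}

tmp=$(mktemp)

# NUL on the first line: expected to be reported as binary.
printf 'RIFF <<\000WEBPVP8X\nls -lA\n' > "$tmp"
if first_line_has_nul "$tmp"; then echo "binary"; else echo "text"; fi

# NUL only on the second line: the first line is clean.
printf 'echo hi\n\000\n' > "$tmp"
if first_line_has_nul "$tmp"; then echo "binary"; else echo "text"; fi

rm -f "$tmp"
```

A NUL byte that appears only after the first new line does not count for this approximation, which is exactly the property the later techniques rely on.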
We can kinda circumvent it if we replace the NUL byte in the size with a non-zero byte (e.g. '\1'):
  #                 V                      V
  $ printf 'RIFF <<\1WEBPVP8X\n\0\0\0...\n\1WEBPVP8X\nls -lA\n' > out.webp
  $ bash out.webp
  out.webp: line 1: RIFF: command not found
  total 1
  -rw-r--r-- 1 f f  41 Sep 19 20:44 out.webp
It has two problems: firstly, the overall size will grow from 2.14 MiB to 18.14 MiB, but the bigger problem is that older versions of bash will fail to find the end of the here-document:
  $ bash out.webp
  out.webp: line 10: warning: here-document at line 1 delimited by end-of-file
  (wanted `WEBPVP8X')
(I don't know the exact version of bash where it started working, but since bash 5.0.3 in Debian 10 Buster, it works -- unlike prior versions in Stretch, Jessie, and Wheezy.) It's a bug in the bash parser where the bash code compares a line with the wrong string value. When we run 'gdb' and break at 'make_here_document', we can see that the line comparison is done against the wrong string:
  $ gdb /bin/bash
  (gdb) start ./out.webp
  (gdb) b *(make_here_document+246)
  (gdb) c
  (gdb) x/b 0x00005555556a7010
  0x5555556a7010: "\001\001WEBPVP8X"
Notice that the string in memory starts with '\1\1', but our file has only one '\1' at the start of the here document. And if you are asking if it starts working when we add the other '\1', the answer is yes:
  #                                        V
  $ printf 'RIFF <<\1WEBPVP8X\n\0\0\0...\n\1\1WEBPVP8X\nls -lA\n' > out.webp
  $ bash out.webp
  out.webp: line 1: RIFF: command not found
  total 1
  -rw-r--r-- 1 f f 42 Sep 19 21:00 out.webp
BUT! this works only for bash! Other shells expect only one '\1', of course! Using characters other than '\0' is problematic because it significantly increases the size:
  <<"\0   -> 0x00223c3c = 2.14 MiB       # We want this to work
  <<"\1   -> 0x01223c3c = 18.14 MiB
  <<"\2   -> 0x02223c3c = 34.14 MiB
  ...
  <<"!    -> 0x21223c3c = 530.14 MiB     # The first (non-space) printable char
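
The sizes in the table come from reading the four bytes after 'RIFF' as a little-endian 32-bit number. Here is a small sketch to redo the arithmetic for any candidate byte sequence. The helper names 'riff_size' and 'to_mib' are invented for this example, and the input is assumed to be a printf(1) format string without '%' characters that yields exactly four bytes.

```shell
# riff_size FORMAT
# FORMAT is a printf(1) format string producing the four size bytes in
# file order, e.g. ' <<\000'. Prints the little-endian 32-bit value.
riff_size() {
    # od prints every byte as an unsigned decimal number;
    # the unquoted expansion splits them into $1..$4 on purpose.
    set -- $(printf "$1" | od -An -v -tu1)
    [ "$#" -eq 4 ] || { echo "need exactly 4 bytes" >&2; return 1; }
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# to_mib BYTES
# Prints BYTES converted to MiB with two decimal places.
to_mib() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", n / 1048576 }'
}

for f in ' << ' ' <<\000' '<<"\000' '<<"\001' '<<"!'; do
    n=$(riff_size "$f")
    printf '%-10s %10d bytes  %s MiB\n' "$f" "$n" "$(to_mib "$n")"
done
```

The last byte of the sequence is the most significant one, which is why swapping '\0' for anything else adds at least 16 MiB.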
Or we can hit two targets with one arrow. If we bypass the check for binary files altogether, then we could have a NUL byte in the RIFF size. (Shell parsers typically ignore the leading '\0' bytes in the here-document terminator.) When we look at the condition in the 'shell.c' file and the 'open_shell_script' function, we can see very helpful commentary:
  /* Only do this with non-tty file descriptors we can seek on. */
  if (fd_is_tty == 0 && (lseek (fd, 0L, 1) != -1))
OK, that means that the condition is only valid if we give bash a file directly (i.e., 'bash script_file'). But the check doesn't apply if we "stream" the file. So, we can run the first command again, but with redirection (a pipe will also work):
  $ printf 'RIFF <<\0WEBPVP8X\n\0\0\0...\nWEBPVP8X\nls -lA\n' > out.webp
  $ bash < out.webp
  total 1
  -rwxr-xr-x  1 f f 40 Aug 16 08:41 out.webp
Hmmm, the question is: will this redirection work in the rest of the shells? Let's find out:
  $ for i in bash dash zsh csh; do echo "== $i =="; "$i" < ./out.webp; echo "-> $?"; done

  == bash ==
  bash: line 1: RIFF: command not found
  Hack the planet!
  -> 0

  == dash ==
  dash: 1: RIFF: not found
  Hack the planet!
  -> 0

  == zsh ==
  zsh: command not found: RIFF
  Hack the planet!
  -> 0

  == csh ==
  RIFF: Command not found.
  Hack the planet!
  -> 0

  == sh (freebsd) ==
  out.webp: RIFF: not found
  Hack the planet!
Yeah, it's ... well ... hm ... OH! WHY THE HACK NOT?! We have successfully bypassed the binary check! But there is another, more elegant solution.

===[ Comment The Situation ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here-Document is somewhat usable, but for me personally, it was a pretty spectacular failure. We cannot use the quoted Here-Document because it makes the WebP picture invalid in viewers that expect a fixed-size VP8X header. But we also cannot use an unquoted Here-Document because some shells, especially bash (see Bash's Binary Check), will fail to execute it.

Let's brainstorm for a sec. Is there a "goto" (or something similar)? No. The best we can do are conditions, but they are too long (e.g., 'if ! :; then'). We need something that can fit into 3 bytes or less. Is there a multiline comment? No. But! What about the single-line comment? We could systematically comment out problematic bytes by combining a new line '\n' and a hash '#'. Whitespace before the hash will ensure that the comment applies and, at the same time, give us a clearer way to work with a shell script. We will design the header as follows:
  RIFF\n##\0WEBPVP8X\n\0\0\0\0\n#\0\0\0....HACK....sh_code...VP8L...
Here is a more readable version of it:

-----------------------[ riff_shell_script-comment.sh ]------------------------
RIFF
##^@WEBPVP8X
^@^@^@^@
#.......HACK....
echo "Hack the planet!"
exit 0
VP8L...
-------------------------------------------------------------------------------

Note #1: '^@' is the visualization of the NUL byte '\0' in VIM. You can insert a NUL byte in insert mode by pressing 'Ctrl-V' and 'SPACE'. And yes, I regularly commit such "atrocities"! I'm a proud member of the Church of Unholy Hacks.

Note #2: We must be very careful with canvas sizes. Neither the width nor the height may have a "new line" byte ('0x0a') in them, as it would end the comment! For instance, a size of 10 is '0x0a', as are sizes in the range from 2560 to 2815 ('0x0a00 -- 0x0aff').

One downside of the script above is that we have one line full of NUL bytes. Some shells don't like it, but they will survive it. Unfortunately, we cannot do better with this approach because those NUL bytes represent the rest of the VP8X size, and as we saw in an earlier section (Hello, I'm Here-Document), we cannot change it if we want it to work in most viewers.

Let's test 'riff_shell_script-comment.sh':
  $ for i in bash dash zsh csh; do echo "== $i =="; "$i" ./out.webp; echo "-> $?"; done

  == bash ==
  ./out.webp: line 1: RIFF: command not found
  Hack the planet!
  -> 0

  == dash ==
  ./out.webp: 1: RIFF: not found
  Hack the planet!
  -> 0

  == zsh ==
  ./out.webp:1: command not found: RIFF
  ./out.webp:3: permission denied:
  Hack the planet!
  -> 0

  == csh ==
  RIFF: Command not found.
  Hack the planet!
  -> 0

  == sh (freebsd) ==
  out.webp: RIFF: not found
  Hack the planet!
We did it, folks! We have a runnable image! Such coolness cannot even be expressed in words. Now, the sky is the limit. We can do anything we can do with a shell script. For example, we could embed the whole OS (and not just an MBR) deep into the file, and the script could extract it and run it.

The only wrinkle on our beauty is the 'RIFF: Command not found.' error message. If we have control over the environment, we could fix it externally (e.g., by redirecting, defining the command, etc.), but I will take a cue from Tolkien himself: "Shut up!" [ref4]

That sounds really great, but aren't we still constrained by the size of our chunk? That means we have only limited space, but the previous paragraph said something like we can embed anything. Well, that's actually a good point. Yes, the basic code is constrained by it. However, we can use the quoted Here-Document to skip to the end of the WebP image data, and from there, we have "unlimited" space. Here is what I mean:

-----------------------[ riff_shell_script-comment.sh ]------------------------
RIFF
##^@WEBPVP8X
^@^@^@^@
#.......HACK....

echo "The first part of the script."

# Skip the rest of the RIFF/WebP
cat > /dev/null << "__EOF__"
...PADDING...
VP8L...image data
__EOF__

echo "The second part of the script (after end of the RIFF chunk)."
-------------------------------------------------------------------------------

And we are golden.
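
To see the skip trick in isolation, without a real image, here is a minimal sketch that builds a tiny script with some junk bytes hidden inside a quoted Here-Document and runs it. The junk line is only a stand-in for WebP image data, and I'm assuming it contains no NUL bytes and no line equal to the terminator. The name 'build_skip_demo' is made up for this example.

```shell
# build_skip_demo FILE
# Writes a small script whose middle part is junk hidden in a quoted
# Here-Document. The junk stands in for the image data (assumption:
# no NUL bytes and no line equal to the terminator).
build_skip_demo() {
    {
        printf 'echo "first part"\n'
        printf 'cat > /dev/null << "__EOF__"\n'
        # Control bytes, a command substitution, an unmatched backtick
        # and a high byte: none of it is evaluated by the shell.
        printf '\001\002 $(false) `unmatched \376\n'
        printf '__EOF__\n'
        printf 'echo "second part"\n'
    } > "$1"
}

demo=$(mktemp)
build_skip_demo "$demo"
sh "$demo"
rm -f "$demo"
```

Because the terminator is quoted, the shell passes the junk to 'cat' verbatim and carries on with the line after '__EOF__'.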

===[ Equality For Everyone Except For Me ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I had an ingenious idea that could solve the issue with the 'RIFF: not found' error (see Comment The Situation). The 'RIFF' string can be a valid variable name. What if we assign the rest of the headers to it? What I mean is something like this:

---------------------------[ script-shell-equal.sh ]---------------------------
RIFF='^@^@WEBPVP8X
^@^@^@..........HACK....'

echo "Hack the planet!"
-------------------------------------------------------------------------------

This works in most shells, but of course bash and csh must be special snowflakes. bash classically errors with:
  ./script-shell-equal.sh
  cannot execute binary file
because there is a NUL byte on the first line. Again, this can be easily fixed with a new line instead of the first NUL.

BUT there is a way bigger problem! The equal sign character is represented as '0x3D', which is an ODD number, therefore making the WebP image invalid! If bash wasn't fooling around with that binary check, it could be circumvented just by adding an even ASCII character before the equal sign, for instance like this:
  RIFF0='\0
Unfortunately, if we replace the second NUL byte with any other character, the size will grow to over 100 MiB. I'm documenting this technique because it still works in some shells, and even bash can be force-fed from stdin using this method:
  ( printf "RIFF0='\0WEBPVP8X\n\0\0\0\0\0\0\0\0\0\0\0\0\0HACK\0\0\0\0';"
    printf "echo Hack the planet!;\n"
  ) | bash
In this example, the RIFF chunk size is 2.45 MiB.
  "0='\0"  = 303d2700  =  0x00273d30  =  2571568  =  2.45 MiB
Maybe we'll get a better idea later on.
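
The parity problem is decided by the first character after 'RIFF', because the size is little-endian and that character becomes its least significant byte. A quick sketch to check candidate characters follows. The helper names 'size_low_byte' and 'size_is_even' are invented here, and the rule "an odd size is invalid" is taken from the text above, not re-verified against any decoder.

```shell
# size_low_byte CHAR
# Prints the byte value of the first character of CHAR, i.e. the least
# significant byte of the RIFF size if CHAR directly follows 'RIFF'.
size_low_byte() {
    # A leading quote tells printf(1) to print the character's code.
    printf '%d\n' "'$1"
}

# size_is_even CHAR
# Succeeds if that byte, and therefore the whole size, is even.
size_is_even() {
    [ $(( $(size_low_byte "$1") % 2 )) -eq 0 ]
}

for c in '=' '0' ' '; do
    if size_is_even "$c"; then verdict="even"; else verdict="odd"; fi
    printf "'%s' -> %d (%s)\n" "$c" "$(size_low_byte "$c")" "$verdict"
done
```

This matches the choices in the text: '=' alone is odd, while putting '0' or a space in front of it keeps the size even.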

===[ It's My Gem! ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Well, well, well, the technique described in the previous section can be semi-successfully used in Ruby! Why "semi"? Because Ruby doesn't like arbitrary bytes in a string. It especially has problems with bytes above '0x7f' because it tries to interpret them as UTF-8 characters. But unlike Perl or Python, Ruby doesn't care if a quoted value contains new lines! We can work with it:

---------------------------[ script-ruby-equal.rb ]----------------------------
RIFF0='^@WEBPVP8X
^@^@^@..........HACK....'
puts("Hack the planet!")
exit(0)
-------------------------------------------------------------------------------

Let's test it:
  $ printf 'RIFF0="\0WEBPVP8X\n\0\0\0\0\0\0\0\0\0\0\0\0\0HACK\x7f\0\0\0"\nputs("Hack the planet!")\nexit(0)\n\0' > script-ruby-equal.rb
  $ ruby script-ruby-equal.rb
  Hack the planet!
Do you remember the problem with bytes above '0x7f'? Well, they are not just a problem in the beginning but continue to be problematic later in the file (i.e., WebP binary data after our code):
  $ printf 'puts("Hack!")\n\xff' | ruby
  -:2: invalid multibyte char (UTF-8)
We could comment out the rest of the file with a Here-Document '<<' (or by similar means like '=begin' and '=end'), but there is a more elegant hack. Matz's Ruby Interpreter will stop its interpretation when it encounters a NUL byte outside of a string/comment! That means we can just add '\0' at the end of our code like so:
  $ printf 'puts("Hack!")\n\0\xff' | ruby
  Hack!
That's just fine and dandy, but we still have the original problem that the VP8X header must not contain bytes greater than '0x7f'. We could use the "comment" technique and use the reserved bytes in the VP8X header for commenting out the rest of the header. Alternatively, we could choose a size that will not trigger Ruby's UTF-8 encoder. I've chosen the latter option. Note that the canvas size stored in VP8X is decremented by one! For example, canvas size 1024 will be stored as 1023! That's a problem for such "rounded" numbers because:
  1024 = 0x0400
  1023 = 0x03ff
The number '0x03ff' contains 'ff', so Ruby will fail with the UTF-8 error. Therefore, we must use 1025 as our canvas size. I know, it's fiddly as hell but it's also super straightforward.
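
Since these byte-level constraints are easy to get wrong by hand, here is a small sketch that prints the three bytes a canvas dimension ends up as (value minus one, little-endian, 24 bits) and flags the two problem cases from this article: bytes above '0x7f' (Ruby) and the new line byte '0x0a' (the comment technique). The function names 'vp8x_dim_bytes' and 'vp8x_dim_ok' are made up for this example.

```shell
# vp8x_dim_bytes SIZE
# Prints the three bytes stored for a canvas width or height:
# (SIZE - 1) as a 24-bit little-endian value, in hex.
vp8x_dim_bytes() {
    v=$(( $1 - 1 ))
    printf '%02x %02x %02x\n' \
        $(( v & 255 )) $(( (v >> 8) & 255 )) $(( (v >> 16) & 255 ))
}

# vp8x_dim_ok SIZE
# Succeeds if none of the stored bytes is above 0x7f or equal to 0x0a.
vp8x_dim_ok() {
    for b in $(vp8x_dim_bytes "$1"); do
        n=$(( 0x$b ))
        if [ "$n" -gt 127 ] || [ "$n" -eq 10 ]; then
            return 1
        fi
    done
    return 0
}

for size in 1024 1025 11 2561; do
    if vp8x_dim_ok "$size"; then verdict="ok"; else verdict="problematic"; fi
    printf '%5d -> %s (%s)\n' "$size" "$(vp8x_dim_bytes "$size")" "$verdict"
done
```

Note that the check works on the stored value, so it is a size of 11 (stored as 10) that produces the '0x0a' byte.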

===[ Changeling ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since we have a nice programming environment at our disposal, let's create an image that changes its content when run with Ruby. The PoC will be simple. We will inject multiple VP8L (lossless) images into our WebP. We must rewrite all VP8L chunk IDs except one, as only one VP8L header is allowed in a valid image. Then we need to write a Ruby script that will seek to the positions of the VP8L headers and switch them (i.e., one will be a valid VP8L, and the other one(s) not).

We need some means to locate VP8L chunks, but I definitely don't want to create a new WebP parser in Ruby. Therefore, I modified the code so it adds the VP8L offsets as the 'IMG' variables at the beginning of the script. It's not pretty to work with strings in C, but it is more general than editing it manually.

The Ruby script that changes images:

---------------------------[ script-ruby-equal.rb ]----------------------------
f = File.new( $0, 'r+b' )

print( "Look at me now: " );
if f.pread( 1, IMG0 ) == "V"
  f.pwrite( 'U', IMG0 )
  f.pwrite( 'V', IMG1 )
  puts( "I'm Moon Moon! Let's HAAAAX!" );
else
  f.pwrite( 'V', IMG0 )
  f.pwrite( 'U', IMG1 )
  puts( "I Can Has 1337burger?" );
end

exit(0);
-------------------------------------------------------------------------------

Note: Remember that 'IMG0' and 'IMG1' will be inserted by the C program we've created [ref5].
  ./webp-polyglot -S 20200000 -s script-ruby-equal.rb -p -1 $'\';\n' \
      -C 203d2700 -c -o out.webp cat-1025x1025.webp dog-1025x1025.webp
And here is the result: Run me with Ruby. Download it and run it. (Don't forget to run it in a VM. It's not malicious, but mistakes might happen.)
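
If you want to double-check which image is currently active without a WebP parser, a naive byte search is enough for a quick look. The sketch below is not how the C program finds the offsets. It simply greps for the 'VP8L' and 'UP8L' tags ('U' being the replacement letter used by the Ruby script above), assumes GNU grep for the '-a', '-b' and '-o' options, and may report false positives if the same four bytes happen to occur inside image data. The name 'chunk_offsets' is made up.

```shell
# chunk_offsets FILE
# Prints the byte offset of every 'VP8L' or 'UP8L' tag found in FILE.
# Naive search (assumes GNU grep): matches inside image data are
# reported too.
chunk_offsets() {
    LC_ALL=C grep -abo '[UV]P8L' "$1" | cut -d: -f1
}

# Tiny fake file with one active and one disabled tag, for illustration.
fake=$(mktemp)
printf 'RIFFxxxxWEBPVP8Ldata\nUP8Lmore\n' > "$fake"
chunk_offsets "$fake"
rm -f "$fake"
```

On a real polyglot, the offsets that matter should be the ones the generator wrote into 'IMG0' and 'IMG1'.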

===[ O.k., We Can Go Home Now? ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Polyglot files are a fascinating topic. They clearly showcase the distilled and easy-to-understand ingenuity of hacking -- taking something that has a specific purpose and allowing it to piggyback something completely different. This principle forms the basis for more advanced techniques, such as hacking parsers (see the work of Sergey Bratus [ref6] on programming weird machines [ref7]) or smuggling malicious content through EDR/IDS radars [ref8] [ref9]. Ange Albertini has great resources on polyglot formats and binary files [ref10].

And remember: Hack The Planet !!1!

===[ References ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[ref1] https://research.h4x.cz/html/2022/2022-10-11--scripting_in_c.html
[ref2] https://research.h4x.cz/data/2023/most_popular_shells-2023.png
[ref3] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04
[ref4] https://www.youtube.com/watch?v=1-Uz0LMbWpI
       * Tolkien explains why the Fellowship didn't fly the Eagles to Mordor
[ref5] https://github.com/fandauchytil/webp-polyglot
[ref6] https://cs.dartmouth.edu/~sergey/
[ref7] https://www.cs.dartmouth.edu/~sergey/wm/
[ref8] https://decoded.avast.io/martinchlumecky/png-steganography/
[ref9] https://www.cse.chalmers.se/~andrei/ccs13.pdf
       * Polyglots: Crossing Origins by Crossing Formats, November 2013
       * DOI:10.1145/2508859.2516685
[ref10] https://github.com/angea

===[ APPENDIX A: csh's quoted here terminator ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

'csh' (C shell) still exists and is the default shell on FreeBSD. It can parse Here-Documents, and it even understands quoted terminators, but it also requires the quotes in the termination line:

---------------------------[ Here-Document (Eval) ]----------------------------
$ cat << _EOF
`ls`
_EOF

out.webp
-------------------------------------------------------------------------------

----------------------[ Quoted Here-Document (No Eval) ]-----------------------
$ cat << "_EOF"
`ls`
"_EOF"

`ls`
-------------------------------------------------------------------------------

In the case of 'csh', we need to inject quotes as follows:
  $ printf '\n"WEBPVP8X"\nls -la\n'  >> out.webp
  $ csh out.webp
  total 2196
  drwxr-xr-x   2 f     f           64 Aug 16 06:31 .
  drwxr-xr-x  19 root  wheel      512 Aug 16 06:27 ..
  -rwxr-xr-x   1 f     f      2243710 Aug 16 10:32 out.webp
Note that there is no need for '\0'.