CMU Bomb Lab with Radare2 — Secret Phase

Mark Higgins
9 min readJul 16, 2019

One…last..level. Let’s go!

Previous Posts:

Load the binary and analyze functions. Get a list of functions with afl. Notice sym.secret_phase? That sounds promising. Seek to the function, but before we start analyzing it, let’s find out how we get here.

In Visual mode, with your cursor on the first instruction of the sym.secret_phase function, press the x key to get data/code references to this address. Press Enter to select the only x-ref: sym.phase_defused.

This function gets called every time a level is completed, and each time, obj.num_input_strings is checked against a value of 6. If this returns true, a scanf function is called, which leads towards sym.secret_phase. Let’s reopen r2 in Debug mode with your answers.txt. Set a breakpoint inside sym.phase_defused and start debugging.

First you’ll notice the check against 6 now validates as true. This check ensures no one can access the secret phase without finishing the other levels. The next function we have to deal with is a scanf

0x004015f0      be19264000     mov esi, str.d__d__s 
0x004015f5 bf70386000 mov edi, 0x603870
0x004015fa e8f1f5ffff call sym.imp.__isoc99_sscanf;[1]

This function is expecting to read in two digits and a null terminated string from memory address 0x603870.

So what’s at the address?

:> ps @ 0x603870
0 0

That looks familiar. You might remember this as the answer for Phase 4, where we brute forced the solution so we could avoid recursive function calls. The only problem is, there’s no string following the digits.

 |   0x00401604      be22264000     mov esi, str.DrEvil
| 0x00401609 488d7c2410 lea rdi, [var_10h]
| 0x0040160e e825fdffff call sym.strings_not_equal;[2]
| 0x00401613 85c0 test eax, eax
,==< 0x00401615 751e jne 0x401635
|| 0x00401617 bff8244000 mov edi, str.Curses...
|| 0x0040161c e8eff4ffff call sym.imp.puts;[3]
|| 0x00401621 bf20254000 mov edi, str.But_finding...
|| 0x00401626 e8e5f4ffff call sym.imp.puts;[3]
|| 0x0040162b b800000000 mov eax, 0
|| 0x00401630 e80dfcffff call sym.secret_phase;[4]

Looking at the rest of the segment, it compares the result of sym.imp.__isoc99_sscanfwith the string DrEvil. If those match, it calls sym.secret_phase. We now know the string and where to enter it. Let’s go.

Secret Phase

Update your answers.txt, adding DrEvil to Phase 4:

Taylor Swift is pretty fly
stupid sexy Flanders
1 9001
0 0 DrEvil
ionefg
4 3 2 1 6 5

Run r2 in Debug mode with your answers.txt, and strap-in.

That’s underwhelming. It reads a line, calls sym.imp.strol on it to convert from string to a long integer, calls a sym.fun7, then we win. The sym.fun7 function is called with two arguments, a struct named obj.n1, and our stdin. Remember how we defined print structures in Phase 6? That’s going to to come up again. Also notice that in order to win, sym.fun7 has to return 2.

Let’s look at sym.fun7 ...oh no…it’s..

Recursive Functions and Binary Trees

sym.fun7 (int32_t arg1, int32_t arg2);
; arg int32_t arg1 @ rdi
; arg int32_t arg2 @ rsi
; CALL XREF from sym.secret_phase (0x401273)
0x00401204 4883ec08 sub rsp, 8
0x00401208 4885ff test rdi, rdi
,=< 0x0040120b 742b je 0x401238
| 0x0040120d 8b17 mov edx, dword [rdi]
| 0x0040120f 39f2 cmp edx, esi
,==< 0x00401211 7e0d jle 0x401220
|| 0x00401213 488b7f08 mov rdi, qword [rdi + 8]
|| 0x00401217 e8e8ffffff call sym.fun7
|| 0x0040121c 01c0 add eax, eax
,===< 0x0040121e eb1d jmp 0x40123d
|`--> 0x00401220 b800000000 mov eax, 0
| | 0x00401225 39f2 cmp edx, esi
|,==< 0x00401227 7414 je 0x40123d
||| 0x00401229 488b7f10 mov rdi, qword [rdi + 0x10]
||| 0x0040122d e8d2ffffff call sym.fun7
||| 0x00401232 8d440001 lea eax, [rax + rax + 1]
,====< 0x00401236 eb05 jmp 0x40123d
|||`-> 0x00401238 b8ffffffff mov eax, 0xffffffff
||| ; CODE XREFS from sym.fun7 (0x40121e, 0x401236)
```--> 0x0040123d 4883c408 add rsp, 8
0x00401241 c3 ret

Breaking it down, this function:

  • Tests if rdi (arg1) is 0. If True, itreturns with a value of 0xffffffff. We’ll assume that’s a bad thing.
  • Compares the value at rdi (arg1) and the value of rsi (arg2).
  • If False, sym.fun7 is called with the value at rdi+8 as arg1 and our stdin as arg2
  • If True, first eax is set to 0, then two checks are done. If rsi and rdi are equal, it jumps to the end and returns. If not equal, sym.fun7 is called again with arg1 set to the value at rdi+0x10, and our text as arg2.

We know that our text always stays in rsi during this function, so next we need to figure out arg1. When originally called, sym.fun7 was called with arg1 set to obj.n1. Use px to print the hexdump at that address.

:> px 128 @ obj.n1
- offset - | 0 1 2 3 4 5 6 7 8 9 A B C D E F|0123456789ABCDEF
0x006030f0 |2400 0000 0000 0000 1031 6000 0000 0000|$........1`.....
0x00603100 |3031 6000 0000 0000 0000 0000 0000 0000|01`.............
0x00603110 |0800 0000 0000 0000 9031 6000 0000 0000|.........1`.....
0x00603120 |5031 6000 0000 0000 0000 0000 0000 0000|P1`.............
0x00603130 |3200 0000 0000 0000 7031 6000 0000 0000|2.......p1`.....
0x00603140 |b031 6000 0000 0000 0000 0000 0000 0000|.1`.............
0x00603150 |1600 0000 0000 0000 7032 6000 0000 0000|........p2`.....
0x00603160 |3032 6000 0000 0000 0000 0000 0000 0000|02`.............

If you look closely there’s a pattern. Like in Phase 6’s linked lists, these are structures. Use pxa to use r2 to annotate the results:

[0x00401204]> pxa @ obj.n1
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
/obj.n1
0x006030f0 2400 0000 0000 0000 1031 6000 0000 0000 $........1`.....
0x00603100 3031 6000 0000 0000 0000 0000 0000 0000 01`.............
/obj.n21
0x00603110 0800 0000 0000 0000 9031 6000 0000 0000 .........1`.....
0x00603120 5031 6000 0000 0000 0000 0000 0000 0000 P1`.............
/obj.n22
0x00603130 3200 0000 0000 0000 7031 6000 0000 0000 2.......p1`.....
0x00603140 b031 6000 0000 0000 0000 0000 0000 0000 .1`.............
/obj.n32
0x00603150 1600 0000 0000 0000 7032 6000 0000 0000 ........p2`.....
0x00603160 3032 6000 0000 0000 0000 0000 0000 0000 02`.............

For each node, there’s one int and two pointers to addresses. When sym.fun7 is called recursively, our jle instruction determines whether it is called with the first address or second. Nodes probably look something like this:

typedef struct node {
int value;
struct node *l,*r;
};

We could use pf to print these structs like in Phase 6, but this is a good excuse to actually define them using td — (T)ype — Load Types from String. After defining a struct, you can use tp — (T)ype — Cast Data and (P)rint) — to cast data to that struct.

"td struct node {int value;struct node *l,*r;};"

Note: You must enclose the entire command with double quotes.

Now print obj.n1 with ts

tp node @ obj.n1

Before printing more nodes, we need to touch on Flags and loops. Flags are a way r2 tracks interesting things, be it symbols, strings, relocs, etc. It arranges them into a “Flag space” which you can run r2 commands against. To view Flag space, use fs.

[0x00401204]> fs
9 . classes
0 . functions
28 . imports
19 . regs
31 . relocs
36 . sections
10 . segments
39 . strings
101 * symbols
32 . symbols.sections

To search flagspace, first you must select the space to use with fs [flag space section], then f will print all items.

[0x00401204]> fs strings
[0x00401204]> f
0x004022b6 29 str.s:_Error:_Couldn_t_open__s
0x004022d3 26 str.Usage:__s___input_file
0x004022ed 30 str.That_s_number_2.__Keep_going

r2 has the ability to loop over command results and run commands against it, using @@. This comes in handy for many things — finding xrefs, string references, mapping APIs, etc. In this case, we’ll set the flag space to symbols, then use@@ to loop over results that match obj.n*

fs symbols
tp node @@ obj.n*

This wildcard search will return a few objects from other levels, but the majority will be formatted into our struct definition. Notice how some nodes have no pointers to other addresses. In a binary tree, that indicates there are no more nodes (leaves).

Last Stretch

We’ll choose a random node’s value and add that to our answers.txt at the end. Let’s go with 47. Load the binary with r2 and answers.txt in Debug mode, use dcu sym.secret_phase to break at our function, and step through it.

After calling sym.fun7 and entering the function, look at the values at rdi (arg1) and the value of rsi (arg2). Arg1 is set to 36, the value of obj.n1. After the jump, rsi and rdi are tested for equality, then we call sym.fun7 again, this time with rdi+0x10 and 47.

Keep stepping through until it rdi and rsi are equal, then pay attention to eax as you step through until reaching sym.secret_phase again. See how it’s incremented? That’s the return value that needs to equal 2 to win.

When you look at how the eax is manipulated, you’ll see the two function calls within sym.fun7 do different things to eax after they return. We’ll call the first case the left, the other the right.

After the left calls sym.fun7, it adds eax to itself:

0x0040121c 01c0 add eax, eax

When the right calls sym.fun7, it adds eax to itself, then adds 1

0x00401232 8d440001 lea eax, [rax + rax + 1]

So when we use 7 as our argument, we run into:

eax + eax + 1
eax + eax
eax + eax + 1

This gives us 5, which won’t do. We need 2. Try changing your answers.txt to search for a number that doesn’t match any node’s value (ie 56). You’ll see that sym.fun7 hits the error state we identified earlier as a node’s next address was NULL. As mentioned earlier, that indicates there are no more nodes beyond this one, meaning the lookup failed.

Since these function calls are being made recursively, the math being applied to eax is executed LIFO (Last In First Out), executing the last modifications to eax first. If we want eax equal to 2, we need

eax + eax + 1 = 1        # Right 0 + 0 + 1 = 1 

Then

eax + eax = 2          # Left   1 + 1 = 2

If will want right evaluated before left, we should go left, then right. It may be easier to visualize this.

Credit to Unlogic for the image

If you’re at all fuzzy on this, test out the correct answer in Debug mode and see for yourself. Then try hitting other nodes.

Aftermath

I hope that was fun. At the very least, you got a chance to play with r2 and see how it handles basic RE. Since we’ve spent so much time on this binary, we might as well use it for a few more things.

I’ll follow this post up with a few more, focusing on slick r2 features that can be applied to this binary. Nothing advanced, just ESIL, profiles, scripts, tips, etc. Stay tuned.

--

--