Fuzyll


Hopelessly passionate husband, engineer, hacker, gamer, artist, and tea addict.


HITCON 2015 Qualifiers: Risky

The qualifiers for HITCON's 2015 CTF were held this weekend and featured 36 hours of all the web, crypto, RE, pwnable, and weird BASH/Ruby trivia challenges you could hope for. As an added bonus, the finals were announced as a qualifying event for the DEFCON 24 CTF finals next year! Unfortunately, the announcement came way too late for me. I'd already scheduled something else for this weekend. As a result, I missed the vast majority of what appeared to be a fairly challenging competition.

I did, however, get a chance to take a crack at one challenge: Risky (file here). It was a 300-point reverse engineering challenge featuring a binary that couldn't be opened in IDA. That's right - we had to go back to basics on this one!

Reconaissance

Running file seemed innocent enough at first:

risky-13de1366628df39000749a782f69d894: ELF 64-bit LSB executable, version 1 (SYSV),
dynamically linked (uses shared libs), for GNU/Linux 2.6.32, stripped

...but, IDA couldn't recognize the architecture. So, I ran readelf --headers:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           <unknown>: 0xf3
  Version:                           0x1
  Entry point address:               0x800900
  Start of program headers:          64 (bytes into file)
  Start of section headers:          3904 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         24
  Section header string table index: 23

The e_machine member of the ELF header reported 243 (seen as 0xf3 above) as the machine type. This page suggested that would make the architecture "RISC-V". Some quick Googling turned up this developer site, complete with ISA manual and toolchain. Apparently "RISC-V" is a real thing?

I installed their toolchain on my Ubuntu 14.04 VM:

sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \
  libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf
git clone https://github.com/riscv/riscv-tools.git
cd riscv-tools/
git submodule update --init --recursive
sudo -i
mkdir ~/riscv
export RISCV=/root/riscv
./build.sh

This placed all of their tools into /root/riscv/bin once they were built. Their version of gcc didn't build properly for some reason, but I didn't need it for this challenge anyway.

Once that was all built, I ran their objdump to see what the binary was doing:

/root/riscv/riscv64-unknown-elf-objdump -D ~/risky-13de1366628df39000749a782f69d894 > ~/risky.log

I'm not going to reproduce the output here in full, but you can download the file from here if you're interested.

Entry Point

According to the e_entry member of the ELF header, our entry point is at 0x00800900. So, let's start there:

_start:
  800900:   auipc   gp,0x2
  800904:   addi    gp,gp,-696 # 802648 <stdin+0x7e8>
  800908:   mv  a5,a0
  80090c:   auipc   a0,0x0
  800910:   addi    a0,a0,-908 # 800580 <strlen@plt+0x10>
  800914:   ld  a1,0(sp)
  800918:   addi    a2,sp,8
  80091c:   andi    sp,sp,-16
  800920:   auipc   a3,0x0
  800924:   addi    a3,a3,280 # 800a38 <strlen@plt+0x4c8>
  800928:   auipc   a4,0x0
  80092c:   addi    a4,a4,416 # 800ac8 <strlen@plt+0x558>
  800930:   mv  a6,sp
  800934:   j   800540 <__libc_start_main@plt>            ; goto __libc_start_main

I'll be honest, I have very little idea what most of this is doing. If you've reverse-engineered Linux binaries before, though, this should look familiar. A given libc generally has one or more initialization routines that are called before main is called. _start appears to be calling one at 0x00800934 (__libc_start_main):

0000000000800540 <__libc_start_main@plt>:
  800540:   auipc   t3,0x2
  800544:   ld  t3,-1824(t3) # 801e20 <strlen@plt+0x18b0>
  800548:   jalr    t1,t3
  80054c:   nop

This is a Procedure Linkage Table entry. The PLT, in case you're unfamiliar, is used to call dynamically-linked code (via a Global Offset Table entry). Think of it like a redirect. In any case, we see above that it will jump to...

  801e20:   mv  s1,s4
  801e22:   ret

...a GOT entry (as we expected). When a dynamically-liked library is loaded, the addresses of the functions we want to call out of it are populated in the GOT (sometimes up-front, sometimes "lazily" during run-time). We can't know their locations at compile-time, so the compiler simply generates a call to corresponding PLT entry instead. Unfortunately, our challenge writer didn't think we needed a copy of libc, so we're sunk. We have no way of knowing where it's going to go next because we don't have the code.

Main Function

So, we'll have to rely on context clues. _start above references 4 other addresses: 0x00802648, 0x00800580, 0x00800a38, and 0x00800ac8. The first location doesn't appear to be in the binary. The fourth function is simply a ret. The third function is a bit more interesting, but looks fairly small. The second function looks far more juicy, so we'll start there:

main:
  800580:   lui a0,0x801
  800584:   addi    sp,sp,-176                  ; // creates a new stack frame
  800588:   addi    a0,a0,-1328 # 800ad0 <strlen@plt+0x560>     ; // references first string in .rodata
  80058c:   sd  ra,168(sp)                      ; // saves off the caller's register state
  800590:   sd  s0,160(sp)
  800594:   sd  s1,152(sp)
  800598:   sd  s2,144(sp)
  80059c:   sd  s3,136(sp)
  8005a0:   sd  s4,128(sp)
  8005a4:   sd  s5,120(sp)
  8005a8:   sd  s6,112(sp)
  8005ac:   sd  zero,8(sp)                      ; size_t *n = NULL // sp+8
  8005b0:   sd  zero,16(sp)                     ; char *buf = NULL // sp+16
  8005b4:   jal 800500 <puts@plt>               ; puts("The flag is protected by my RISKY machine. You have to give me the correct code:")
  8005b8:   ld  a2,-2024(gp)
  8005bc:   addi    a1,sp,8
  8005c0:   addi    a0,sp,16
  8005c4:   jal 800530 <getline@plt>            ; getline(buf, n, stream)
  8005c8:   ld  a5,16(sp)
  8005cc:   li  a3,45
  8005d0:   lbu a4,4(a5)
  8005d4:   beq a4,a3,800610 <strlen@plt+0xa0>  ; if (buf[4] == '-') { goto success }

The reason this looks more juicy is because it appears to actually be setting up a stack frame and calling functions. As you can see from my comments above, it appears to be printing a message and then getting input from the user. The ISA manual doesn't mention it, but the Linux conventions for general-purpose registers in RISC-V state that x4-x11 (referred to by objdump as a0-a7) are for passing arguments to functions. This is how I was able to identify what sp+8 and sp+16 are: They're passed to getline(), which is a standard C function.

You might be wondering how I knew what the string was at this point. I figured this out with objdump -s -j .rodata:

Contents of section .rodata:
 800ad0 54686520 666c6167 20697320 70726f74  The flag is prot
 800ae0 65637465 64206279 206d7920 5249534b  ected by my RISK
 800af0 59206d61 6368696e 652e2059 6f752068  Y machine. You h
 800b00 61766520 746f2067 69766520 6d652074  ave to give me t
 800b10 68652063 6f727265 63742063 6f64653a  he correct code:
 800b20 00000000 00000000 56657269 6679696e  ........Verifyin
 800b30 67000000 00000000 68697463 6f6e7b00  g.......hitcon{.
 800b40 0a47656e 65726174 696e6720 666c6167  .Generating flag
 800b50 00000000 00000000 0a25730a 00000000  .........%s.....
 800b60 53535373 73535353 73737353 53737353  SSSssSSSsssSSssS
 800b70 2e2e2e00 00000000                    ........

You don't even need the custom-compiled RISC-V objdump for it!

Failing Constraints

At the end of the block above, we can see a beq (branch if equal). Let's take a look real quick at what happens if the fourth byte of our input (4(a5) loaded into a4 as a byte at 0x008005d0) is not a dash (the 45 ASCII value loaded into a3 at 0x008005cc):

fail:
  8005d8:   lui a0,0x801
  8005dc:   addi    a0,a0,-1184 # 800b60 <strlen@plt+0x5f0>     ; // references string in .rodata
  8005e0:   jal 800500 <puts@plt>       ; puts("SSSssSSSsssSSssS...")
leave:
  8005e4:   ld  ra,168(sp)
  8005e8:   li  a0,0
  8005ec:   ld  s0,160(sp)              ; // restores the caller's register state
  8005f0:   ld  s1,152(sp)
  8005f4:   ld  s2,144(sp)
  8005f8:   ld  s3,136(sp)
  8005fc:   ld  s4,128(sp)
  800600:   ld  s5,120(sp)
  800604:   ld  s6,112(sp)
  800608:   addi    sp,sp,176           ; // restores the caller's stack frame
  80060c:   ret                         ; return 0

Here, we can see we'll get hissed at and then exit the function if we don't provide the correct input. Nothing fancy.

Passing Constraints

If we succeed, though, we'll head here:

success:
  800610:   lbu a3,9(a5)
  800614:   bne a3,a4,8005d8 <strlen@plt+0x68>      ; if (buf[9] != '-') { goto fail }
  800618:   lbu a4,14(a5)
  80061c:   bne a4,a3,8005d8 <strlen@plt+0x68>      ; if (buf[14] != '-') { goto fail }
  800620:   lbu a2,19(a5)
  800624:   bne a2,a4,8005d8 <strlen@plt+0x68>      ; if (buf[19] != '-') { goto fail }
  800628:   lbu a3,24(a5)
  80062c:   li  a4,10
  800630:   bne a3,a4,8005d8 <strlen@plt+0x68>      ; if (buf[24] != '\n') { goto fail }
  800634:   ld  a6,-2048(gp)
  800638:   addi    a0,a0,-1
  80063c:   mv  a3,a5
  800640:   sub a4,a3,a5
  800644:   bleu    a0,a4,80066c <strlen@plt+0xfc>  ; if (buf) { goto gotstr }
  800648:   lbu a4,0(a3)
  80064c:   addi    a3,a3,1
  800650:   addiw   a4,a4,-45
  800654:   andi    a4,a4,255
  800658:   srl a1,a6,a4
  80065c:   andi    a1,a1,1
  800660:   bltu    a2,a4,8005d8 <strlen@plt+0x68>
  800664:   bnez    a1,800640 <strlen@plt+0xd0>
  800668:   j   8005d8 <strlen@plt+0x68>            ; goto fail

This checks to make sure that our 9th (0x00800614), 14th (0x0080061c), and 19th (0x00800624) bytes are also a dash (by checking against the previously checked bytes). Our 24th (0x00800630) byte should be a newline (\n, or 0x0a). As a result, we know our input should be in the format: AAAA-BBBB-CCCC-DDDD-EEEE\n.

If we use the wrong format, we go to fail. If we use the correct format and pass all the checks, we'll head to gotstr. This part is pretty long, so we'll break it up a bit:

gotstr:
  80066c:   lw  s1,10(a5)
  800670:   lw  s5,15(a5)
  800674:   lw  s3,0(a5)
  800678:   lw  s2,5(a5)
  80067c:   lui a0,0x801
  800680:   addi    a0,a0,-1240 # 800b28 <strlen@plt+0x5b8>
  800684:   lw  s0,20(a5)
  800688:   jal 800520 <printf@plt>             ; printf("Verifying")
  80068c:   li  a0,0
  800690:   jal 800560 <fflush@plt>             ; fflush(0)

Here, we're loading the 5 segments of our input above into different registers. AAAA goes into s3, BBBB goes into s2, CCCC goes into s1, DDDD goes into s5, and EEEE goes into s0. We'll then print the string "Verifying" and flush the buffer.

Input Verification

Next, we'll cover the 7 verification steps. This is step 1:

  800694:   mulw    s6,s1,s5
  800698:   lui a4,0x181aa
  80069c:   addiw   a4,a4,-929
  8006a0:   mulw    a5,s3,s2
  8006a4:   addw    a5,a5,s6
  8006a8:   addw    a5,a5,s0
  8006ac:   bne a5,a4,8005d8 <strlen@plt+0x68>  ; if (step1 != 0x181a9c5f) { goto fail }

This is step 2:

  8006b0:   li  a0,1
  8006b4:   jal 800510 <sleep@plt>              ; sleep(1)
  8006b8:   li  a0,46
  8006bc:   jal 800550 <putchar@plt>            ; putchar('.')
  8006c0:   li  a0,0
  8006c4:   jal 800560 <fflush@plt>             ; fflush(0)
  8006c8:   mulw    a5,s3,s1
  8006cc:   addw    a4,s2,s0
  8006d0:   addw    a5,a5,a4
  8006d4:   lui a4,0x2dead
  8006d8:   addiw   a4,a4,-821
  8006dc:   bne a5,a4,8005d8 <strlen@plt+0x68>  ; if (step2 != 0x2deacccb) { goto fail }

This is step 3:

  8006e0:   li  a0,1
  8006e4:   jal 800510 <sleep@plt>              ; sleep(1)
  8006e8:   li  a0,46
  8006ec:   jal 800550 <putchar@plt>            ; putchar('.')
  8006f0:   li  a0,0
  8006f4:   jal 800560 <fflush@plt>             ; fflush(0)
  8006f8:   addw    a5,s3,s2
  8006fc:   addw    a5,a5,s1
  800700:   addw    a5,a5,s5
  800704:   lui a4,0x8e2f6
  800708:   addw    a5,a5,s0
  80070c:   addiw   a4,a4,1920
  800710:   bne a5,a4,8005d8 <strlen@plt+0x68>  ; if (step3 != 0x8e2f6780) { goto fail }

This is step 4:

  800714:   li  a0,1
  800718:   jal 800510 <sleep@plt>              ; sleep(1)
  80071c:   li  a0,46
  800720:   jal 800550 <putchar@plt>            ; putchar('.')
  800724:   addw    s4,s2,s1
  800728:   li  a0,0
  80072c:   jal 800560 <fflush@plt>             ; fflush(0)
  800730:   addw    s4,s4,s0
  800734:   addw    a4,s3,s5
  800738:   mulw    a4,a4,s4
  80073c:   lui a5,0xb3da8
  800740:   addiw   a5,a5,-1185
  800744:   bne a4,a5,8005d8 <strlen@plt+0x68>  ; if (step4 != 0xb3da7b5f) { goto fail }

This is step 5:

  800748:   li  a0,1
  80074c:   jal 800510 <sleep@plt>              ; sleep(1)
  800750:   li  a0,46
  800754:   jal 800550 <putchar@plt>            ; putchar('.')
  800758:   li  a0,0
  80075c:   jal 800560 <fflush@plt>             ; fflush(0)
  800760:   lui a5,0xe3b0d
  800764:   addiw   a5,a5,-529
  800768:   bne s4,a5,8005d8 <strlen@plt+0x68>  ; if (step5 != 0xe3b0cdef) { goto fail }

This is step 6:

  80076c:   li  a0,1
  800770:   jal 800510 <sleep@plt>              ; sleep(1)
  800774:   li  a0,46
  800778:   jal 800550 <putchar@plt>            ; putchar('.')
  80077c:   li  a0,0
  800780:   jal 800560 <fflush@plt>             ; fflush(0)
  800784:   mulw    a4,s3,s0
  800788:   lui a5,0x4978e
  80078c:   addiw   a5,a5,-1980
  800790:   bne a4,a5,8005d8 <strlen@plt+0x68>  ; if (step6 != 0x4978d844) { goto fail }

This is step 7:

  800794:   mulw    s4,s2,s1
  800798:   li  a0,1
  80079c:   jal 800510 <sleep@plt>              ; sleep(1)
  8007a0:   li  a0,46
  8007a4:   jal 800550 <putchar@plt>            ; putchar('.')
  8007a8:   li  a0,0
  8007ac:   jal 800560 <fflush@plt>             ; fflush(0)
  8007b0:   lui a5,0x9bcd3
  8007b4:   addiw   a5,a5,222
  8007b8:   bne s4,a5,8005d8 <strlen@plt+0x68>  ; if (step7 != 0x9bcd30de) { goto fail }

This is step 8:

  8007bc:   mulw    s4,s6,s4
  8007c0:   li  a0,1
  8007c4:   jal 800510 <sleep@plt>              ; sleep(1)
  8007c8:   li  a0,46
  8007cc:   jal 800550 <putchar@plt>            ; putchar('.')
  8007d0:   li  a0,0
  8007d4:   jal 800560 <fflush@plt>             ; fflush(0)
  8007d8:   lui a5,0x41c7a
  8007dc:   addiw   a5,a5,928
  8007e0:   mulw    s4,s4,s0
  8007e4:   bne s4,a5,8005d8 <strlen@plt+0x68>  ; if (step8 != 0x41c7a3a0) { goto fail }

...and this is the final, 9th step:

  8007e8:   li  a0,1
  8007ec:   jal 800510 <sleep@plt>              ; sleep(1)
  8007f0:   li  a0,46
  8007f4:   jal 800550 <putchar@plt>            ; putchar('.')
  8007f8:   li  a0,0
  8007fc:   jal 800560 <fflush@plt>             ; fflush(0)
  800800:   lui a5,0x313ac
  800804:   addiw   a5,a5,1924
  800808:   bne s6,a5,8005d8 <strlen@plt+0x68>  ; if (step9 != 0x313ac784) { goto fail }

In each step, we sleep for a second, print a dot to the screen, flush the buffer, perform some calculations on our input data, calculate a constant value, and then compare the two calculated values. I'll cover the calculations on our input data later, but the calculated values above come from two instructions:

  1. An lui, which loads a value into the upper 20 bits of a 32-bit register (clearing the bottom 12 bits)
  2. An addiw, which will add a 16-bit immediate value to a register.

All the immediate values, in this case, are 12-bit values. This is done because it doesn't appear RISC-V has a way to simply move a 32-bit value into a register (not an uncommon problem on architectures with fixed-length instructions).

I calculated all the constant values you see in my added comments above in an interactive python session:

>>> hex((0x181aa << 12) -929)    # step 1
'0x181a9c5f'
>>> hex((0x2dead << 12) -821)    # step 2
'0x2deacccb'
>>> hex((0x8e2f6 << 12) + 1920)  # step 3
'0x8e2f6780'
>>> hex((0xb3da8 << 12) - 1185)  # step 4
'0xb3da7b5f'
>>> hex((0xe3b0d << 12) - 529)   # step 5
'0xe3b0cdef'
>>> hex((0x4978e << 12) - 1980)  # step 6
'0x4978d844'
>>> hex((0x9bcd3 << 12) + 222)   # step 7
'0x9bcd30de'
>>> hex((0x41c7a << 12) + 928)   # step 8
'0x41c7a3a0'
>>> hex((0x313ac << 12) + 1924)  # step 9
'0x313ac784'

Preparing Constants

Once the program has finished validating our input data, it will calculate another 5 4-byte constants:

  80080c:   lui a5,0x2c281
  800810:   addiw   a5,a5,-721
  800814:   sw  a5,32(sp)           ; sp+32 = 0x2c280d2f
  800818:   lui a5,0x38053
  80081c:   addiw   a5,a5,1317
  800820:   sw  a5,36(sp)           ; sp+36 = 0x38053525
  800824:   lui a5,0x6b5c3
  800828:   addiw   a5,a5,-1500
  80082c:   sw  a5,40(sp)           ; sp+40 = 0x6b5c2a24
  800830:   lui a5,0x27542
  800834:   addiw   a5,a5,1832
  800838:   sw  a5,44(sp)           ; sp+44 = 0x27542728
  80083c:   lui a5,0x29755
  800840:   addiw   a5,a5,1839
  800844:   sw  a5,48(sp)           ; sp+48 = 0x2975572f

The same lui and addiw pattern from above can be seen here. Again, I calculated these values in python:

>>> hex((0x2c281 << 12) - 721)   # constant 1
'0x2c280d2f'
>>> hex((0x38053 << 12) + 1317)  # constant 2
'0x38053525'
>>> hex((0x6b5c3 << 12) - 1500)  # constant 3
'0x6b5c2a24'
>>> hex((0x27542 << 12) + 1832)  # constant 4
'0x27542728'
>>> hex((0x29755 << 12) + 1839)  # constant 5
'0x2975572f'

Generating the Flag

Once it has calculated this new 20-byte constant value, it will prepare to generate the flag for us:

  800848:   lui a5,0x801
  80084c:   ld  a5,-1224(a5) # 800b38 <strlen@plt+0x5c8>     ; // reference to "hitcon{" in .rodata
  800850:   lui a0,0x801
  800854:   addi    a0,a0,-1216 # 800b40 <strlen@plt+0x5d0>  ; // reference to "Generating flag" in .rodata
  800858:   sw  s1,64(sp)
  80085c:   sw  s0,72(sp)
  800860:   sd  a5,80(sp)                   ; char *flag = "hitcon{" // sp+80
  800864:   sw  s3,56(sp)
  800868:   sw  s2,60(sp)
  80086c:   sw  s5,68(sp)
  800870:   sw  zero,28(sp)
  800874:   jal 800520 <printf@plt>         ; printf("Generating flag")
  800878:   li  a0,0
  80087c:   jal 800560 <fflush@plt>         ; fflush(0)

Here, we can see it placing the string "hitcon{" (which most (all?) of the flags in this competition started with) at sp+80. It then prints "Generating flag" and flushes the buffer.

Next, let's look at how it generates the flag:

  800880:   li  s0,0                        ; int i = 0
  800884:   li  s1,20                       ; int max = 20
genflag:
  800888:   addi    a3,sp,32                ; // store sp+32 in a3
  80088c:   add a4,a3,s0                    ; // add i to a3, store in a4
  800890:   addi    a3,sp,56                ; // store sp+56 in a3
  800894:   add a5,a3,s0                    ; // add i to a3, store in a5
  800898:   lw  a4,0(a4)                    ; // get 4 bytes out of a4, store in a4
  80089c:   lw  a5,0(a5)                    ; // get 4 bytes out of a5, store in a5
  8008a0:   li  a0,1
  8008a4:   addi    s0,s0,4                 ; // increment our index (s0) by 4
  8008a8:   xor a5,a4,a5                    ; // xor a5 and a4 together
  8008ac:   sw  a5,24(sp)                   ; // store into sp+24
  8008b0:   jal 800510 <sleep@plt>          ; sleep(1)
  8008b4:   li  a0,46
  8008b8:   jal 800550 <putchar@plt>        ; putchar('.')
  8008bc:   li  a0,0
  8008c0:   jal 800560 <fflush@plt>         ; fflush(0)
  8008c4:   addi    a0,sp,80
  8008c8:   jal 800570 <strlen@plt>         ; int len = strlen(flag)
  8008cc:   addi    a5,sp,80
  8008d0:   addi    a1,sp,24
  8008d4:   add a0,a5,a0
  8008d8:   jal 8004f0 <stpcpy@plt>         ; stpcpy(flag+len, sp+24)
  8008dc:   bne s0,s1,800888 <strlen@plt+0x318>  ; if (i != max) { goto genflag }
  8008e0:   li  a5,125
  8008e4:   sb  a5,0(a0)                    ; // add '}' to the end of our flag
  8008e8:   sb  zero,1(a0)                  ; // null-terminate our flag
  8008ec:   lui a0,0x801
  8008f0:   addi    a1,sp,80
  8008f4:   addi    a0,a0,-1192 # 800b58 <strlen@plt+0x5e8>
  8008f8:   jal 800520 <printf@plt>         ; printf("\n%s\n", flag);
  8008fc:   j   8005e4 <strlen@plt+0x74>    ; goto leave

The genflag loop above (from 0x00800888 to 0x008008dc) takes 4 bytes of our input (a5, 0x0080089c) and 4 bytes of the generated constants (a4, 0x00800898) and xors them together (at 0x008008a8). It will do this 5 times (from 0 until 20, by increments of 4) and sleep, print, and flush each time. Once this loop has terminated, it will get the current length of the string with a strlen (which will be 7, the length of "hitcon{" from before), increment the pointer to the end of that, and then stpcpy the result of the xors after it. It will then terminate it with a "}" and a NULL. Then, it will print the value out to us.

This is our flag!

All Together Now

Now that we know what the program is doing, we need to determine all the correct initial values. Due to its ease of use (and its popularity in CTF circles), I chose z3py to do my solving. Z3 is an SMT solver. If you've never heard of SMT solving before, it's complicated logic and math to quickly solve problems regarding satisfiability (whether a value exists to satisfy a particular equation or set of constraints). It's a really handy tool to have and I highly recommend spending some time with it if you enjoy reverse engineering.

Anyway, here's my solution's source code:

import z3
import sys
import struct

# create our solver
s = z3.Solver()

# add our input variables as 32-bit "bit vectors"
s3 = z3.BitVec("s3", 32)  # set at 0x00800674
s2 = z3.BitVec("s2", 32)  # set at 0x00800678
s1 = z3.BitVec("s1", 32)  # set at 0x0080066c
s5 = z3.BitVec("s5", 32)  # set at 0x00800670
s0 = z3.BitVec("s0", 32)  # set at 0x00800684

# add constraints from the calculations between 0x00800688 and 0x00800808
s4 = z3.BitVec("s4", 32)
s6 = z3.BitVec("s6", 32)
s.add(s6 == s1 * s5)
s.add(0x181a9c5f == (s2 * s3) + s6 + s0)
s.add(0x2deacccb == (s1 * s3) + (s0 + s2))
s.add(0x8e2f6780 == s3 + s2 + s1 + s5 + s0)
s.add(s4 == s2 + s1 + s0)
s.add(0xb3da7b5f == s4 * (s3 + s5))
s.add(0xe3b0cdef == s4)
s.add(0x4978d844 == s3 * s0)
s.add(0x9bcd30de == s1 * s2)
s.add(0x41c7a3a0 == s1 * s2 * s6 * s0)
s.add(0x313ac784 == s6)

# check and print our model
if s.check() != z3.sat:
    print "ARE YOU NOT SATISFIED?!"
    sys.exit(0)
print s.model()

# calculate our flag
sp32 = s.model()[s3].as_long() ^ 0x2c280d2f  # set at 0x00800814
sp36 = s.model()[s2].as_long() ^ 0x38053525  # set at 0x00800820
sp40 = s.model()[s1].as_long() ^ 0x6b5c2a24  # set at 0x0080082c
sp44 = s.model()[s5].as_long() ^ 0x27542728  # set at 0x00800838
sp48 = s.model()[s0].as_long() ^ 0x2975572f  # set at 0x00800844

print "hitcon{" + struct.pack("IIIII", sp32, sp36, sp40, sp44, sp48) + "}"

First, we declare our input variables. These are the "AAAA", "BBBB", "CCCC", "DDDD", and "EEEE" values from above. Then, we translate all the calculations from the 9 verification steps into constraints. I've used register names here, so you can look back at the assembly above and verify my work. Then, we'll check our model to see if it's satisfied. If it is, we'll print it out so we can see our original input values.

Once our model is satisfied, we need to perform the XORs from the flag generation step. And, once we're done with that, we need to tack the other flag bits on the front and end before printing it out. Here's what it looks like when we run it:

[s2 = 1295338573,
 s5 = 1362445638,
 s3 = 1497977931,
 s1 = 1379355478,
 s0 = 1145321036,
 s6 = 825935748,
 s4 = 3820015087]
hitcon{dYauhy0urak9nbavca1m}

Success! Our flag, as shown above, was: hitcon{dYauhy0urak9nbavca1m}.