# Control Flow Unflattening

## Target

Recently I analyzed a RASP solution called Approov. Although it has some novel detection techniques, overall it’s not that interesting. Instead, I will focus on the obfuscation used in its native library: control flow flattening. You can find a host app with a little googling :)

## CFF

Control flow flattening (CFF), in a nutshell, is a technique that obfuscates the code flow by rearranging basic blocks (and creating new ones) into the cases of a switch statement. Visualizing it is more straightforward:

From a dynamic analysis perspective, the execution flow is the same as in the first case: the dispatcher (gray block) executes the blocks in the order of normal execution. Static analysis is harder, though, because the reverse engineer has to follow the state variable after each “case” executes, and sometimes the state variable itself is also obfuscated.
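To make the idea concrete, here is a minimal hand-made sketch of flattening in Python. It is not taken from the sample: the function, the block layout, and the reuse of state constants in the 1010106 range are my own assumptions for illustration.

```python
def original(n):
    """Unflattened version: sum of 0..n-1 with a plain loop."""
    total = 0
    i = 0
    while i < n:
        total += i
        i += 1
    return total


def flattened(n):
    """Same logic, hand-flattened: every basic block becomes a 'case'."""
    state = 1010106              # initial state value (arbitrary constant)
    total = i = 0
    while True:                  # the dispatcher
        if state == 1010106:     # entry block
            total, i = 0, 0
            state = 1010107
        elif state == 1010107:   # loop condition: two possible next states
            state = 1010108 if i < n else 1010109
        elif state == 1010108:   # loop body
            total += i
            i += 1
            state = 1010107
        elif state == 1010109:   # exit block
            return total


print(original(5), flattened(5))
```

Both functions compute the same result, but in the flattened one the order of the blocks is only visible by following `state` from case to case.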

After seeing the CFF’ed binary I looked up earlier research on CFF. I tried to work with stadeo, then found eShard’s d810. Stadeo uses miasm to lift the binary and then analyzes the CFF, whereas d810 works on Hex-Rays microcode (the intermediate representation of the IDA decompiler). d810 can track variables backward, emulate microcode, and has very easy-to-extend config/rule sets for opaque predicates, etc. So I stuck with d810. Make sure to read their blog posts about d810 [1][2].

## Microcode

d810 hooks IDA’s decompilation process and modifies the microcode. The building blocks of microcode are microinstructions (minsn_t), micro-operands (mop_t), and basic blocks (mblock_t), which together make up the microcode block array (mba). Before execution is handed back to IDA, the final mba needs to pass verification.

Oh, boi.. Before realizing that all the microcode internal errors (INTERR) are documented in hexrays_sdk/verifier/verify.cpp, I tried to find the meaning of every INTERR by dumping microcode. I used d810’s dump_microcode_for_debug function for the dumps, which helped a lot with debugging. To debug d810 (or any IDA Pro plugin) I followed this blog post

## Unflatten

To unflatten, we need to find the state variable values in the dispatcher fathers. A dispatcher father is a block that jumps directly into the dispatcher. If we know the state variable’s value in a dispatcher father, we can calculate which block the dispatcher will pick next and replace the dispatcher father’s successor with that block.

Let’s take an example from the sample. Each switch case body, as well as the initial state variable assignment, is a dispatcher father.

The switch starts with the initial state value 1010106. Knowing that when the state is 1010106 (X) the dispatcher (Y) goes into code block (Z): v1 = (qword_470e8 == 0) + 1010107, we can jump directly into block (Z) from the assignment of 1010106 (X). In other words, instead of X->Y->Z we will have X->Z.
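The X->Y->Z to X->Z rewrite can be sketched on a toy CFG. This is only a model of the idea, not d810 code: the dictionaries, the `bypass_dispatcher` helper, and the extra blocks A and B with their state values are hypothetical.

```python
DISPATCHER = "Y"

# Hypothetical dispatcher table: state value -> case block it selects.
dispatch_table = {1010106: "Z", 1010107: "A", 1010108: "B"}

# Toy CFG as successor lists. X, Z and A are dispatcher fathers.
cfg = {
    "X": ["Y"],
    "Y": ["Z", "A", "B"],
    "Z": ["Y"],
    "A": ["Y"],
    "B": [],
}

# State value known to be constant at the end of a father.
# Z is missing on purpose: its state is not a single constant.
father_state = {"X": 1010106, "A": 1010108}


def bypass_dispatcher(cfg, father_state, dispatch_table, dispatcher):
    """Return a copy of cfg where fathers with a known state skip the dispatcher."""
    patched = {blk: list(succs) for blk, succs in cfg.items()}
    for father, state in father_state.items():
        target = dispatch_table[state]
        patched[father] = [target if s == dispatcher else s for s in patched[father]]
    return patched


patched = bypass_dispatcher(cfg, father_state, dispatch_table, DISPATCHER)
print(patched["X"], patched["A"], patched["Z"])
```

Fathers whose state is not a single constant (Z here) still point at the dispatcher, which is exactly the case that needs extra work.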

In d810 there is a rule for CFF with a switch table, configured in default_unflattening_switch_case.json. Unfortunately, it does not work for this case. But why? Because of v1 = (qword_470e8 == 0) + 1010107: a single block can produce two different state values.
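A quick worked check of that expression, assuming C semantics where a comparison yields 0 or 1. The function name and the probe inputs are mine, only the expression comes from the decompiled sample.

```python
BASE = 1010107


def next_state(qword_470e8: int) -> int:
    """Model of: v1 = (qword_470e8 == 0) + 1010107."""
    return int(qword_470e8 == 0) + BASE


# Probe with a zero and a few non-zero values of the global.
possible_states = sorted({next_state(v) for v in (0, 1, 0x1234)})
print(possible_states)
```

So the block ends with either 1010107 (global is non-zero) or 1010108 (global is zero), and a solver that expects one constant per dispatcher father gives up.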

Actually, the d810 authors tried to solve this problem. If a block can have more than one state variable value (think of two if branches with different state assignments), they “duplicate any block which appears in (at least) 2 paths with (at least) 2 different predecessors.” But there is no such predecessor in this case. So how did I solve it? By creating a jz condition with two new blocks.

Let’s look at the microcode representation of these blocks to make this easier to understand. To show the microcode I used a plugin called lucid

• Take the comparison part of block 5
• Turn it into a jz condition on that comparison
• Create two new blocks, each assigning one of the calculated state values
• Fix the inbound/outbound edges (predecessors and successors) of the blocks
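The steps above can be sketched on a toy block model. This is not Hex-Rays microcode and not the code I added to d810: the `Block` class, the tuple "instructions", and the `set_state_cmp` opcode are hypothetical stand-ins for mblock_t and minsn_t.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    insns: list                                  # toy instructions as tuples
    succs: list = field(default_factory=list)
    preds: list = field(default_factory=list)


def split_state_assignment(blocks, name, dispatcher):
    """Rewrite `state = (var == 0) + base` into a jz plus two constant blocks."""
    blk = blocks[name]
    op, var, base = blk.insns[-1]
    if op != "set_state_cmp":
        raise ValueError("block does not end with a comparison-based state assignment")
    # var == 0 -> state = base + 1, otherwise state = base
    taken = Block(f"{name}_eq", [("set_state", base + 1)], [dispatcher], [name])
    fallthrough = Block(f"{name}_ne", [("set_state", base)], [dispatcher], [name])
    # jz var, 0, target: jump to `taken` when var == 0, else fall through
    blk.insns[-1] = ("jz", var, 0, taken.name)
    blk.succs = [taken.name, fallthrough.name]
    blocks[taken.name] = taken
    blocks[fallthrough.name] = fallthrough
    # fix the inbound edges of the dispatcher
    disp = blocks[dispatcher]
    disp.preds = [p for p in disp.preds if p != name] + [taken.name, fallthrough.name]
    return taken, fallthrough


blocks = {
    "blk5": Block("blk5", [("set_state_cmp", "qword_470e8", 1010107)], ["disp"], []),
    "disp": Block("disp", [], [], ["blk5"]),
}
split_state_assignment(blocks, "blk5", "disp")
print(blocks["blk5"].succs, blocks["disp"].preds)
```

After the split, the two new blocks are the dispatcher fathers, and each one carries a single constant state value.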

After finding the dispatcher fathers, I used d810’s mop tracker history feature to collect the history of every father, and fixed the instructions that can produce two state values before running d810’s generic unflattening.
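As a rough illustration of what backward tracking gives you, here is a toy tracker that walks predecessors from a dispatcher father and collects the state values that can reach it. It does not use d810’s tracker API: the `preds`/`assigns` dictionaries and the function name are assumptions for this sketch.

```python
def track_state_backwards(preds, assigns, start):
    """Collect every state value that can reach the end of block `start`."""
    found, seen, stack = set(), set(), [start]
    while stack:
        blk = stack.pop()
        if blk in seen:
            continue
        seen.add(blk)
        values = assigns.get(blk)
        if values:
            # An assignment overrides anything set earlier on this path.
            found.update(values)
            continue
        stack.extend(preds.get(blk, []))
    return found


# Toy CFG: predecessor lists and the state values each block may assign.
preds = {"father": ["a", "b"], "a": ["c"], "b": [], "c": []}
assigns = {"c": {1010106}, "b": {1010107, 1010108}}

history = track_state_backwards(preds, assigns, "father")
needs_fix = sorted(blk for blk, vals in assigns.items() if len(vals) > 1)
print(sorted(history), needs_fix)
```

Blocks that end up in `needs_fix` are the ones whose assignment has to be split before the generic unflattening can treat every father as a constant.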

After these changes, every dispatcher father has a constant state variable. From there we can use d810’s CFF solver and continue.