Dancing With Shellcodes: Cracking the latest version of Guloader

File metadata

Hash: d55259bcf47af7e645ab7b003aa2cd4071cb36c6

Sample metadata in Pestudio

Getting into the shellcode

In its initial state, Guloader is wrapped in a Visual Basic (VB) layer. To get past it, we first reach the entry point and then set a breakpoint on VirtualAlloc. Next, we click Run 12 times (the VB wrapper calls VirtualAlloc several times, but we only care about the 12th call).

12th VirtualAlloc
Call the shellcode
Take the jump

The shellcode

After taking the initial jump, we see three different functions. For our unpacking tutorial, we can skip them and go straight to the JMP 602766, located at the end.

Take the jump
Step into
Anti VM function

Anti-Analysis 1: Anti-VM

To our surprise, when we try to step over the CALL to function 6031A9, we encounter a message box.


Why did it happen?

Without us noticing, the shellcode had pushed 8 pre-computed hashes onto the stack before the call, in the following order:
push 0xB314751D
push 0xA7C53F01
push 0x7F21185B
push 0x3E17ADE6
push 0xF21FD920
push 0x27AA3188
push 0xDFCB8F12
push 0x2D9CC76C
The function 6031A9 uses these hashes in the following manner:
1) The function uses the API call ZwQueryVirtualMemory (the native API counterpart of VirtualQuery) to scan the process’s memory.
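
The text above does not say which hash algorithm the shellcode uses or how the scanned memory is compared against the pushed values, so the following is only a rough sketch of hash-based string matching during a memory scan. The djb2-style hash, the function names, and the sample buffer are all assumptions made for illustration. The real scan walks regions reported by ZwQueryVirtualMemory rather than a Python bytes object.

```python
import re

# Runs of at least 4 printable ASCII characters are treated as candidate strings.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def djb2(data: bytes) -> int:
    """djb2-style 32-bit hash (assumed for illustration, not confirmed for Guloader)."""
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


def find_blacklisted_strings(memory: bytes, blacklist_hashes: set) -> list:
    """Return the printable strings in `memory` whose hash is in `blacklist_hashes`."""
    hits = []
    for match in PRINTABLE_RUN.finditer(memory):
        candidate = match.group()
        if djb2(candidate) in blacklist_hashes:
            hits.append(candidate)
    return hits


# Hypothetical example: the string "vmtoolsd" is a made-up stand-in for a VM artifact.
sample_memory = b"\x00\x01vmtoolsd\x00\xffhello\x00"
sample_blacklist = {djb2(b"vmtoolsd")}
matches = find_blacklisted_strings(sample_memory, sample_blacklist)
```

Under this model, replacing the pushed hashes with values that match nothing makes the scan come back empty, which is the idea behind editing the hashes on the stack.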

How do we overcome this anti-VM technique?

There are three different approaches we can take:
1) Change the pre-computed hashes on the stack before the call to 6031A9.
2) Fill the CALL instruction with no-operation (NOP) bytes.
3) Change the control flow by setting the EIP register to the address of the next instruction (after the CALL to 6031A9).
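
As a minimal sketch of the NOP approach, the snippet below overwrites a CALL with NOP bytes (0x90) in a copy of the code. It assumes the CALL is the 5-byte near form (opcode E8 followed by a 32-bit relative offset). The buffer contents and the offset are made up for illustration. In a debugger, the same result is achieved by assembling NOPs over the CALL line.

```python
NOP = 0x90
CALL_REL32 = 0xE8
CALL_REL32_LEN = 5  # 1 opcode byte + 4-byte relative offset


def nop_out_call(code: bytes, offset: int) -> bytes:
    """Return a copy of `code` with the 5-byte CALL at `offset` replaced by NOPs."""
    if offset < 0 or offset + CALL_REL32_LEN > len(code):
        raise ValueError("CALL does not fit inside the buffer")
    if code[offset] != CALL_REL32:
        raise ValueError("no E8 CALL opcode at the given offset")
    patched = bytearray(code)
    patched[offset:offset + CALL_REL32_LEN] = bytes([NOP]) * CALL_REL32_LEN
    return bytes(patched)


# Hypothetical bytes: push eax / call <rel32> / pop ebx
original = bytes.fromhex("50e8112233445b")
patched = nop_out_call(original, 1)
```

The patch keeps the instruction length unchanged, so the bytes that follow the CALL stay at the same addresses.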

Changing the hashes on the stack

Anti-Analysis 2: Time checks & CPUID

If we try to step over this function, we get stuck and cannot move forward.

Anti-Analysis function

Why did it happen?

Inside the function 601F28, there is another routine that combines two anti-analysis mechanisms: time checks using RDTSC (Read Time-Stamp Counter) and anti-VM checks using CPUID.
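
The exact checks inside this routine are not spelled out here, so the sketch below only models the decision logic such checks commonly follow. It assumes a comparison of two RDTSC readings against a threshold and a test of the "hypervisor present" bit (CPUID leaf 1, ECX bit 31). The function names and threshold values are hypothetical, and Python cannot execute RDTSC or CPUID itself, so the register values are passed in as plain integers.

```python
TSC_MASK = (1 << 64) - 1   # RDTSC returns a 64-bit counter in EDX:EAX
HYPERVISOR_BIT = 1 << 31   # CPUID leaf 1, ECX bit 31 ("hypervisor present")


def hypervisor_bit_set(cpuid_leaf1_ecx: int) -> bool:
    """True if the ECX value returned by CPUID leaf 1 reports a hypervisor."""
    return bool(cpuid_leaf1_ecx & HYPERVISOR_BIT)


def timing_looks_suspicious(tsc_before: int, tsc_after: int, threshold: int) -> bool:
    """True if the cycles elapsed between two RDTSC readings exceed `threshold`.

    Single-stepping or a VM exit between the two readings inflates the delta.
    The threshold is a hypothetical value chosen by whoever writes the check.
    """
    elapsed = (tsc_after - tsc_before) & TSC_MASK
    return elapsed > threshold
```

A loop that keeps repeating until the timing looks "clean" would explain why stepping over the function appears to hang, although that is an inference rather than something shown above.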

Anti-Analysis function

How do we overcome this anti-analysis?

As with the first anti-VM check, we can change the control flow through the EIP register, or fill the CALL to 601F28 with NOPs.

NOP the function
Step Into
Resolving API Calls
Take the jump
Step Into

Anti-Analysis 3: Anti-VM\Anti-Sandbox

After we step over the call to EnumWindows, we see the line: cmp eax,c.
With this line, the shellcode checks whether there are at least 12 (C in hexadecimal) windows on the machine. If not, the process is terminated using the previously mentioned API call, TerminateProcess.
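
The check can be sketched as follows. Here `enum_windows` is a hypothetical stand-in for EnumWindows that invokes a callback once per top-level window, and the list of handles is made up. The only values taken from the text are the threshold 0xC and the "at least" comparison.

```python
MIN_WINDOWS = 0xC  # 12, the constant compared against EAX


def enum_windows(handles, callback) -> bool:
    """Stand-in for EnumWindows: call `callback` per window until it returns False."""
    for hwnd in handles:
        if not callback(hwnd):
            return False
    return True


def has_enough_windows(handles) -> bool:
    """Count the enumerated windows and apply the 'at least 12' test."""
    count = 0

    def on_window(hwnd) -> bool:
        nonlocal count
        count += 1
        return True  # keep enumerating

    enum_windows(handles, on_window)
    return count >= MIN_WINDOWS
```

A bare sandbox with only a handful of windows fails this test, while a normal desktop session usually passes it.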

Check for at least 12 windows

How do we overcome this anti-sandbox?

If necessary, flip the flag that controls the JGE jump. In my case, the check passed without any changes.

Getting into the Anti-breakpoint function

Anti-Analysis 4: Anti breakpoints

When we step into this function, we observe an interesting anti-debugging technique. In its first lines, the shellcode resolves the address of DbgBreakPoint and stores it at esp+18. It then resolves DbgUiRemoteBreakin as well and patches both functions.

Getting DbgBreakPoint
Getting DbgUiRemoteBreakin
Patching DbgBreakPoint
Patching DbgUiRemoteBreakin
Before and after patch

How do we overcome this anti-breakpoint?

The best way is to bypass the function responsible for this anti-analysis mechanism, which is 6034F4. Either the NOP or the control-flow solution works here.

NOP Anti-Analysis function

Anti-Analysis 5: Anti-VM

Next, we see the function 602038. If we step over it, we see the string “C:\Program Files\qqa\qqa.exe”. This is because 602038 checks whether the QEMU guest agent is present on the machine. This is another anti-VM feature of Guloader.

QEMU guest agent

Anti-Analysis 6: NtSetInformationThread

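The shellcode calls NtSetInformationThread with ThreadHideFromDebugger (0x11) as the second argument. Once this is set, the debugger no longer receives events from the thread, which in this case causes the process to crash if it is running under a debugger.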

NtSetInformationThread Anti-Analysis

How do we overcome this anti-debugger technique?

ScyllaHide covers this technique. Alternatively, we can change the control flow or insert NOPs.

Take the jump
Step into
MsiEnumProductsA and MsiGetProductInfo
Change the flag

Shellcode main function

Once we take the jump, we reach one of the most important functions in the shellcode. It is built mainly around two other functions.
The first is the already mentioned 602F54, which resolves API calls. The second is 603B93, which executes them (with a few exceptions). 603B93 is the main execution function, where the most important API calls are executed.
Both functions are used multiple times during the final stages of the shellcode. Set a breakpoint on 603B93 and step into it.

Two important functions
Execution function architecture

Anti-analysis 7: Hardware breakpoints

The debug registers (DR) are read from the following offsets, which match the layout of the x86 CONTEXT structure (note that DR4 and DR5 are not stored there, so the last two slots hold DR6 and DR7):
[eax+4] = DR0
[eax+8] = DR1
[eax+C] = DR2
[eax+10] = DR3
[eax+14] = DR6
[eax+18] = DR7
The shellcode compares each of these values to 0. If any of them is not 0, a hardware breakpoint is set. In this case, the shellcode jumps using the JNE 603C97 and the process is terminated.
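
A minimal sketch of this check, assuming EAX points to an x86 CONTEXT structure: the offsets follow that structure's layout, where 0x14 and 0x18 hold Dr6 and Dr7. The buffers and function names are hypothetical, and the shellcode itself does this with a series of CMP and JNE instructions rather than a loop.

```python
import struct

# Offsets of the debug registers inside the x86 CONTEXT structure.
DEBUG_REGISTER_OFFSETS = {
    "Dr0": 0x04,
    "Dr1": 0x08,
    "Dr2": 0x0C,
    "Dr3": 0x10,
    "Dr6": 0x14,
    "Dr7": 0x18,
}
CONTEXT_PREFIX_LEN = 0x1C  # ContextFlags plus the six debug registers


def nonzero_debug_registers(context: bytes) -> dict:
    """Return the debug registers in `context` that hold a non-zero value."""
    found = {}
    for name, offset in DEBUG_REGISTER_OFFSETS.items():
        (value,) = struct.unpack_from("<I", context, offset)
        if value != 0:
            found[name] = value
    return found


def hardware_breakpoint_detected(context: bytes) -> bool:
    """Mirror of the shellcode's test: any non-zero debug register counts as a hit."""
    return bool(nonzero_debug_registers(context))


# Hypothetical contexts: one clean, one with a breakpoint address in Dr0.
clean_context = bytes(CONTEXT_PREFIX_LEN)
dirty_context = bytearray(CONTEXT_PREFIX_LEN)
struct.pack_into("<I", dirty_context, 0x04, 0x00401000)
```

This is also why tools that zero the debug registers in the returned context defeat the check: every comparison then sees 0.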

Hardware breakpoint example

How do we overcome this technique?

If you set a hardware breakpoint, you can change the zero flag (set ZF to 1) so that the JNE jump is not taken.
The easiest solution is to use the ScyllaHide plugin.

Anti-analysis 8: Software breakpoints

In this technique, the shellcode takes the address of the API function about to be executed from the EAX register, moves one byte from it into BL (the low byte of EBX), and inspects whether a software breakpoint has been placed on it.
If a software breakpoint is set, that byte holds one of the breakpoint opcodes (for example, 0xCC, which encodes INT 3, the standard software breakpoint instruction).
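
The idea can be sketched as a check on the first bytes of the target function. Only 0xCC is named in the text. The two-byte INT 3 encoding (CD 03) and UD2 (0F 0B) are included here as an assumption about which other opcodes such a check might cover, and the sample byte strings are made up.

```python
INT3 = b"\xCC"           # one-byte INT 3, the usual software breakpoint
INT3_LONG = b"\xCD\x03"  # two-byte encoding of INT 3 (assumed to be checked too)
UD2 = b"\x0F\x0B"        # undefined-instruction opcode (assumed to be checked too)

BREAKPOINT_OPCODES = (INT3, INT3_LONG, UD2)


def starts_with_breakpoint(code: bytes) -> bool:
    """True if the first bytes of `code` match one of the breakpoint opcodes."""
    return code.startswith(BREAKPOINT_OPCODES)


# Hypothetical function prologues: one untouched, one with a breakpoint on it.
untouched = b"\x8B\xFF\x55\x8B\xEC"  # mov edi, edi / push ebp / mov ebp, esp
breakpointed = b"\xCC\xFF\x55\x8B\xEC"
```

A debugger places a software breakpoint by overwriting the first byte of the instruction with 0xCC, which is exactly what this comparison picks up.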

Software breakpoint example
Software breakpoint example

How do we overcome this technique?

Change ZF to 0, or replace the instruction with a NOP. As mentioned before, the easiest solution is the ScyllaHide plugin.

Creating process
RegAsm in suspend state
Write the second shellcode
Observing the second shellcode

Wrapping up the first shellcode

After the first shellcode creates the RegAsm process and injects a second shellcode into it, it calls NtResumeThread to start the second shellcode inside the RegAsm process memory.

Debugging the second shellcode

When we start to debug the second shellcode, we notice, to our surprise, that it starts the same way as the first one. In fact, it is almost the same shellcode. This resemblance lets us bypass all the anti-analysis mechanisms we already saw in the first shellcode.

Differences from the first shellcode

After we reach the main function we saw in the first shellcode, we will set the same breakpoints. Then, as we click Run and step over functions, we start to see indications of additional capabilities that we have not seen in the first shellcode.

Observing the C2
Observing the C2


When we sum up the entire architecture of Guloader, we observe several stages and key features:
1) The malware initially comes wrapped in a VB layer.
2) After the VB part ends, the entire malware activity is executed by a shellcode.
3) The shellcode contains multiple anti-analysis mechanisms, some of which cannot be bypassed without manual intervention.
4) The shellcode creates the process RegAsm and injects a second shellcode into it using a unique variation of Process Hollowing.
5) The second shellcode downloads further malware.

Guloader architecture


In this article, I covered the entire process of the Guloader malware and presented several anti-analysis mechanisms from this shellcode-based downloader.





