Dissecting and automating Hancitor’s config extraction

  1. Verify that the IOCs are related to Hancitor.
  1. Statically using CyberChef.
  2. Statically using a Python script (I’ll work with PyCharm).
Hancitor sample metadata in PEstudio

Configs in malware and how to hunt them

Before we start, we need to understand how config extraction works in general, and in Hancitor in particular. To do so, we'll use the Ghidra decompiler and the IDA disassembler (it doesn’t really matter which tool we use; I chose both because they are free and accessible to everyone).

  1. An allocation followed by a memory copy, after which the copied data is manipulated
  2. Usage of CryptoAPI or custom encryption

Understanding how Hancitor config extraction works

In Hancitor’s case, we see two chunks of data almost right at the beginning of the data section, which immediately raise our suspicion:

  1. DAT_10007264 - a larger chunk of data
DAT_10005010 and DAT_10007264 in IDA & Ghidra
CryptoAPI function
  1. The length of the config will be 0x200.
Understanding the parameters
  1. The pointer for the key that will be used in CryptDecrypt will be generated by the function CryptDeriveKey.
  2. To match a key to an algorithm, CryptDeriveKey also receives in its second argument ALG_ID the algorithm identifier, which according to Microsoft documentation: “An ALG_ID structure that identifies the symmetric encryption algorithm for which the key is to be generated”.
    In our case, we can see that the Algid is equal to 0x6801, which according to Microsoft documentation stands for CALG_RC4.
CryptDeriveKey Algid in IDA
Algid CALG_RC4 in Microsoft documentation
  1. RC4 is a symmetric stream cipher, which means that in order to decrypt the config we need the initial key that the malware authors created (not the same as the session key generated by CryptDeriveKey).
  1. In its third argument, it gets the length of that key.
  2. In the first argument, it gets the handle of the hash object that accumulates the hashed key (this handle is later passed to the CryptDeriveKey function).
CryptHashData in Microsoft documentation
Understanding the parameters
CryptCreateHash Algid in IDA
Algid CALG_SHA1 in Microsoft documentation
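
The identifiers seen in the disassembly can be double-checked with a small lookup table. The sketch below is only an illustration: `ALG_IDS` and `alg_name` are hypothetical names, and the table holds just a handful of values from Microsoft's ALG_ID documentation.

```python
# A few ALG_ID values as listed in Microsoft's ALG_ID documentation.
# The table is deliberately incomplete; ALG_IDS and alg_name are
# illustrative names, not part of any library.
ALG_IDS = {
    0x6801: "CALG_RC4",
    0x8003: "CALG_MD5",
    0x8004: "CALG_SHA1",
}


def alg_name(alg_id):
    """Return the symbolic name for an ALG_ID, or a hex fallback."""
    return ALG_IDS.get(alg_id, "unknown (0x%04x)" % alg_id)


print(alg_name(0x6801))
print(alg_name(0x8004))
```

Looking up the constant passed to CryptDeriveKey and CryptCreateHash this way is a quick sanity check before trusting a decompiler's enum labels.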

Recap of Hancitor config extraction mechanism

In the first part, we tried to understand how the Hancitor config decryption mechanism works. Our analysis led us to the following findings:

  1. The decryption is done with the CryptoAPI.
  2. The config is encrypted with the RC4 algorithm.
  3. The initial key for the config is the data array named DAT_10005010, and its size is 8 bytes.
  4. The initial key (DAT_10005010) will be hashed with SHA1.
  5. The final session key will be the first 5 bytes of the SHA1 hash.
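
The findings above can be reproduced outside of Windows. The following is a minimal sketch, not the malware's own code: `derive_session_key` and `rc4` are hypothetical helper names, RC4 is implemented by hand so that nothing beyond the standard library is required, and the 8-byte key in the usage lines is a made-up placeholder rather than a value from a real sample.

```python
import hashlib


def derive_session_key(initial_key, key_len=5):
    """SHA1-hash the initial key and keep the first key_len bytes."""
    return hashlib.sha1(initial_key).digest()[:key_len]


def rc4(key, data):
    """Plain RC4; encryption and decryption are the same operation."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Placeholder values for illustration only (not taken from a sample).
initial_key = b"\x01\x02\x03\x04\x05\x06\x07\x08"
session_key = derive_session_key(initial_key)
encrypted = rc4(session_key, b"example config")
print(rc4(session_key, encrypted))
```

Because RC4 is symmetric, running the same function twice with the same session key returns the original bytes, which is why a single routine is enough for decrypting the config.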

First extraction method - Dynamically using x32dbg

Usually, the dynamic approach is the quickest, and Hancitor's case is no different. First, we can see that this module has two randomly named export functions, so it makes sense to run the sample through one of them.

Hancitor export functions in PE-bear
Executing Hancitor using rundll32
Before CryptDecrypt
After CryptDecrypt

Second extraction method - Statically using Cyberchef

In some cases, malware researchers prefer to extract the IOCs statically. This approach reduces the risk of human error that can lead to the malware communicating with its C2. In our case, it will also allow us to really put our hypothesis about the decryption mechanism to the test.

Extracting the initial key
Initial key represented in HEX
Initial key and encrypted config in HEX
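
If copying bytes out of a hex view by hand feels error-prone, the two hex strings can also be produced programmatically and pasted into CyberChef. This is a sketch under assumptions: `to_cyberchef_hex` is a hypothetical helper, the 16-byte offset and 8-byte key length are the ones used in this article's Python script, and the sample bytes are synthetic.

```python
def to_cyberchef_hex(data_section, offset=16, key_len=8):
    """Split raw .data bytes into space-separated hex for key and config."""
    blob = data_section[offset:]
    key_hex = " ".join("%02x" % b for b in blob[:key_len])
    config_hex = " ".join("%02x" % b for b in blob[key_len:])
    return key_hex, config_hex


# Synthetic stand-in for a .data section: 16 padding bytes, 8 key bytes, 3 data bytes.
fake_section = bytes(16) + bytes(range(1, 9)) + b"\xaa\xbb\xcc"
key_hex, config_hex = to_cyberchef_hex(fake_section)
print(key_hex)
print(config_hex)
```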

Getting the session key

To get the session key, all we need to do is to follow our hypothesis from part one. Let’s do it again step by step in Cyberchef:

Manually creating the session key
Decrypting the config

Third extraction method - Statically using Python script

The last approach is to extract the config using a Python script. The biggest advantage of this method is that once we have our script, it can basically work on any other Hancitor sample (as long as the malware keeps using the current config decryption mechanism).

  1. Extract the data section, because the config and its key are stored in it.
  2. Extract the encrypted config and its key from the already extracted data section.
  3. Hash the key with SHA1 and then keep only the first 5 bytes as the final key.
  4. Decrypt the encrypted RC4 config with the final key.
  5. Display the config.
  1. pefile - a multi-platform Python module to parse and work with Portable Executable (PE) files.
  2. hashlib - an interface for hashing messages.
  3. arc4 - a small and insanely fast ARCFOUR (RC4) cipher implementation for Python.
import binascii
import arc4
import pefile
import hashlib

Extracting the data section

Now, in order to work on any sample, we need its path, so we’ll first get the sample’s path as input.

filepath = input('please write the file path: ')

pe = pefile.PE("path of the sample")
for section in pe.sections:
    # pefile exposes section names as null-padded bytes
    if b".data" in section.Name:
        rawdata = section.get_data()

def extractDataSection(path):
    pe = pefile.PE(path)
    for section in pe.sections:
        if b".data" in section.Name:
            return section.get_data()

rawdata = extractDataSection(filepath)
# skip the first 16 bytes of the data section
keyPlusData = rawdata[16:]
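
One detail worth keeping in mind: under Python 3, pefile returns section names as bytes padded with null characters up to 8 bytes, so a substring check can also match names such as `.data1`. A slightly stricter lookup is sketched below; `find_section` is a hypothetical helper, and the stand-in section objects only exist to keep the example runnable without a sample.

```python
from types import SimpleNamespace


def find_section(sections, name):
    """Return the data of the section whose name matches exactly, else None."""
    for section in sections:
        # Strip the null padding before comparing
        if section.Name.rstrip(b"\x00") == name:
            return section.get_data()
    return None


# Stand-ins that mimic the two attributes used from pefile section objects.
sections = [
    SimpleNamespace(Name=b".text\x00\x00\x00", get_data=lambda: b"code"),
    SimpleNamespace(Name=b".data1\x00\x00", get_data=lambda: b"other"),
    SimpleNamespace(Name=b".data\x00\x00\x00", get_data=lambda: b"config"),
]
print(find_section(sections, b".data"))
```

With a real sample, the equivalent call would be `find_section(pefile.PE(path).sections, b".data")`.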

Getting the key and encrypted config

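As we already know, the key length is 8 bytes. Therefore, to extract it we need to take only the first 8 bytes of the keyPlusData variable (the data section without its first 16 bytes); everything after those 8 bytes is the encrypted config.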
This can be done with the following command:

key = keyPlusData[:8]
encryptedConfig = keyPlusData[8:]

Hashing the key

Now that we have the initial key, we need to hash it using the SHA1 algorithm and keep only the first 5 bytes. With the help of the hashlib module, we can do so with the following commands:

hashedKey = hashlib.sha1(key).hexdigest()
finalkey = hashedKey[:10]
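
Note that `hexdigest()` returns a hexadecimal string, so 5 bytes correspond to its first 10 characters. An equivalent way to get the same 5 bytes, shown here as a sketch with a placeholder key, is to slice the raw digest directly.

```python
import binascii
import hashlib

key = b"\x01\x02\x03\x04\x05\x06\x07\x08"  # placeholder, not a real sample key

# Route used in this article: 10 hex characters, converted back to bytes later
from_hex = binascii.unhexlify(hashlib.sha1(key).hexdigest()[:10])

# Alternative: slice the raw 20-byte digest
from_raw = hashlib.sha1(key).digest()[:5]

print(from_hex == from_raw)
```

Either form works; the hex route simply needs the `binascii.unhexlify` step before the key is handed to the RC4 cipher.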

Decrypting the config

First, we need to create a new function that gets the final session key and the encrypted data as arguments. Next, with the help of the arc4 module, we’ll use the key to decrypt the encrypted content.
To do so, use the following commands:

cipher = arc4.ARC4(key)
decrypted_content = cipher.decrypt(encryptedConfig)
final_config = decrypted_content[:150]
print(final_config)

def rc4_decryption(key, encryptedConfig):
    cipher = arc4.ARC4(key)
    decrypted_content = cipher.decrypt(encryptedConfig)
    final_config = decrypted_content[:150]
    print(final_config)

import binascii
import arc4
import pefile
import hashlib


def rc4_decryption(key, encryptedConfig):
    cipher = arc4.ARC4(key)
    decrypted_content = cipher.decrypt(encryptedConfig)
    final_config = decrypted_content[:150]
    print(final_config)


def extractDataSection(path):
    pe = pefile.PE(path)
    for section in pe.sections:
        # pefile exposes section names as null-padded bytes
        if b".data" in section.Name:
            return section.get_data()


def main():
    #getting the file's path
    filepath = input('please write the file path: ')

    #call to data extraction function
    rawdata = extractDataSection(filepath)

    #remove the first 16 bytes of the extracted data section
    keyPlusData = rawdata[16:]

    #extracting the key
    key = keyPlusData[:8]

    #extracting the encrypted config
    encryptedConfig = keyPlusData[8:]

    #hashing the key with SHA1
    hashedKey = hashlib.sha1(key).hexdigest()

    #getting only the first 5 bytes (10 hex characters) from the hashed key
    finalkey = hashedKey[:10]

    #call for decryption function
    rc4_decryption(binascii.unhexlify(finalkey), encryptedConfig)


if __name__ == '__main__':
    main()
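
The script prints the first 150 bytes of the decrypted buffer as-is. To turn that buffer into a list of readable indicators, one option is to pull out the printable ASCII runs. This is a sketch and not part of the original script: `printable_strings` is a hypothetical helper, the minimum length of 4 is an arbitrary choice, and the buffer below is made-up data with a documentation-style domain rather than a real Hancitor config.

```python
import re


def printable_strings(buffer, min_len=4):
    """Return printable ASCII runs of at least min_len characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, buffer)]


# Made-up buffer for illustration only.
fake_config = b"build01\x00\x00http://example.com/a.php\x00\x01\x02ab\x00"
for s in printable_strings(fake_config):
    print(s)
```

The same helper could be applied to `decrypted_content` inside `rc4_decryption` instead of printing a fixed-size slice.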

Conclusion

In this article/tutorial, I presented the theory behind malware configs and, in particular, the config extraction mechanism of the Hancitor malware.
After learning the theory, we discussed and implemented three approaches to get the final config.
