Highway to Conti: Analysis of Bazarloader
As we look back to summarize the year 2021 we observe that the biggest threat in the cybersecurity landscape is still ransomware. A large number of ransomware incidents have occurred around the world, extorting hundreds of millions overall from victims across the globe.
As the sun went down on some past major players in the ransomware ecosystem (such as REvil), the sun definitely shone on others, specifically the most prolific group in 2021: Conti.
The list of Conti’s victims is long and varied, with some high-profile names such as the recent incidents at the Bank of Indonesia and Delta Electronics.
Although each case has its own story to tell, it is reported that multiple attacks that ended with Conti ransomware started with, or involved, the BazarBackdoor or BazarLoader malware.
In this article, I will present an analysis of the BazarLoader malware, its defensive measures to hinder security researchers, and other important core functionalities.
BazarLoader was first observed and reported in April 2020 and is believed to be developed by a group called ITG23, also known as the TrickBot gang.
The loader itself is known to be distributed by phishing campaigns that use ISO files and multiple LoLbins such as Powershell and Mshta for deployment, eventually involving Rundll32 or Regsvr32.
SHA1 hash: 94114c925eff56b33aed465fce335906f31ae1b5
Similar to much of the malware that comes from the e-crime scene, Bazarloader comes packed inside an initial dropper. The dropper itself is a 64-bit .dll file with a high entropy of over 6.8.
When we open the dropper in IDA, we immediately notice a large olive-colored area in the navigation bar. In many cases (especially with packed malware) this can be a big clue for obfuscated content yet to be decrypted.
As we investigate the navigation bar, we see two interesting code blobs right at the beginning of the .rdata section. The first one is quite small, but the second is very big, with a size of 156256 bytes. For convenience, we’ll convert them both to byte arrays. To do so, do the following:
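The entropy figure quoted above is a Shannon entropy measured in bits per byte, where values close to 8 suggest compressed or encrypted content. The following is a minimal sketch of how such a value can be computed. The function name and the demo inputs are illustrative assumptions and are not taken from the sample or from any specific tool.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # Sum of -p * log2(p) over every byte value that appears in the data
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Illustrative inputs only (not bytes from the dropper)
uniform = bytes(range(256)) * 16   # every byte value equally frequent
repetitive = b"A" * 4096           # a single repeated byte value

if __name__ == "__main__":
    print(f"uniform:    {shannon_entropy(uniform):.2f}")
    print(f"repetitive: {shannon_entropy(repetitive):.2f}")
```

To measure a real file, read it in binary mode and pass its bytes to the function. Per-section values are usually more informative than a single value for the whole file.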
1) Right-click on the code blob name
2) Click Byte
3) Right-click on the code blob again
4) Click Array
5) For convenience, change the name of these blobs
Next, we’ll want to inspect where these bytes are being used by following their cross-references. By tracing the usage of the big blob, we can see that it is passed to a function named “sub_1800F110”, which also receives the value 0x26260. This is 156256 in decimal, which, as we know, is the exact size of the big blob.
The objectives of this function are:
- Allocate a new buffer using VirtualAlloc
- Use the small blob
- Partially decrypt the buffer into the newly allocated memory
- Once the function is finished, it will return the allocated buffer to a higher function named “sub_18000FC10”.
At runtime, the decrypted data will look like this:
Next, the partially decrypted buffer will be sent to another function called “sub_1800015D0”. The objectives of this function are:
- Perform further decryption using XOR loop with a designated key
- Allocate new memory
- Perform another decryption which will result in the final bazarloader payload, and copy it into the new buffer
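The XOR loop mentioned in the steps above can be sketched as follows. This is a generic repeating-key XOR under stated assumptions: the key, the key length, and the exact way the dropper walks the buffer are not reproduced here, and `xor_decrypt` and `DEMO_KEY` are hypothetical names.

```python
from itertools import cycle


def xor_decrypt(buf: bytes, key: bytes) -> bytes:
    """XOR every byte of `buf` with a repeating `key` (hypothetical helper)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(buf, cycle(key)))


# Demo values only; the real key lives inside the dropper
DEMO_KEY = b"\x13\x37"
plaintext = b"MZ\x90\x00"
ciphertext = xor_decrypt(plaintext, DEMO_KEY)

if __name__ == "__main__":
    # XOR is symmetric, so applying the same key twice restores the input
    print(xor_decrypt(ciphertext, DEMO_KEY))
```

Because XOR is its own inverse, the same routine can be used to re-encrypt a buffer, which is handy when validating a key recovered from the debugger.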
In the end, after these two iterations of data manipulation, the two phases will look like this:
Note: If we want to skip this static route, the easy way to unpack this dropper manually is:
- Set a breakpoint on VirtualAlloc
- On the second instance, set a hardware breakpoint on the allocated buffer
- Hit Run until you see the final clean payload
The unpacked file (Bazarloader payload) is a 64-bit DLL file with a much lower entropy of 3.96 compared to its dropper. In addition, the malware has 8 export functions. However, all of them are empty except the function “EproyAklW”.
We also notice that the internal name of the malware is “l_dll_rndll_eaw_64_p2_g8_v221_11_01_22_logs_no.dll”.
Also, the malware’s import table is empty, which indicates that the API calls will be resolved dynamically by some mechanism. In terms of size, Bazarloader is a small to mid-size malware.
My investigation will be separated into two parts:
- Bazarloader defenses: Any method the malware used to slow down researchers and how to overcome them.
- Bazarloader operative mechanism: Basically how the malware works.
BazarLoader defenses 1: API Hashing
Right as we enter the export function “EproyAklW”, we observe the first defense mechanism of Bazarloader: its dynamic API hashing resolving function (which in our case is called sub_18000AC7C).
For those who are not familiar with the term API hashing:
“API hashing is simply an arbitrary function/algorithm, that calculates a hash value for a given text string.”
In simple words, the function receives a hash as input and eventually outputs a pointer to an API call. Usually, we’ll then see this pointer being called as a function.
To confirm our hypothesis, we can always debug the function dynamically and step over it. Once we do, we’ll notice two things:
- The register RAX will hold the address of the resolved API call, in this case RtlExitUserProcess (the ntdll function to which ExitProcess is forwarded).
- Three instructions later, RAX is invoked via call, which means RtlExitUserProcess will be executed.
As can be assumed, the main advantages of this technique are:
- The malware is more stealthy because as we said, its import table is empty, thus making the analysis more challenging and slow.
- This also creates some challenges for automated security products that rely on these API calls to be present in order to determine the file’s nature.
This technique is very common in the malware world and can be found in other malware such as Emotet, Qbot, Trickbot, Conti ransomware, Lockbit, and so on.
Small Tip: An API hashing function ultimately has to return the address of the requested API call, so in many cases it will use the Process Environment Block (PEB) for the actual resolving. Searching the code for usage of the PEB is a good way to sniff out these resolving functions.
For security researchers, the major issue with API hashing functions is that they are executed many times, basically each time the malware wants to use a specific function. In Bazarloader’s case, we can see “sub_18000AC7C” being used 232 times. Obviously, going to each call and resolving it dynamically is time-consuming, and this process needs to be scaled.
In order to speed things up, we’ll use one of my favorite tools, and a GO-TO when it comes to API hashing: HashDB.
The HashDB plugin is a community-sourced library of hashing algorithms used in malware. The plugin allows reverse engineers to test specific hashes against the algorithms that HashDB has.
Once you have HashDB installed, do the following:
- Right-click on the hash
- Click on HashDB Hunt Algorithm
After a couple of seconds, we get a popup that tells us that the algorithm found is “rol7_xor”. Click OK.
Now, do the following:
- Right-click again on the hash
- Choose HashDB Lookup
Then, we’ll get another popup that will tell us that the hash is translated to the API call ExitProcess, similarly to what we saw during our dynamic analysis.
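To make the idea behind the lookup more concrete, here is a minimal sketch of a rotate-left-by-7 and XOR hash together with a small reverse lookup table. It assumes the generic scheme implied by the algorithm name and may not match the sample's exact implementation (for example, a seed or an extra XOR constant may be involved). The candidate API list and all function names are illustrative.

```python
def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def rol7_xor_hash(name: bytes) -> int:
    """Generic ROL-7/XOR hash: rotate the accumulator, then XOR in each byte."""
    h = 0
    for byte in name:
        h = rol32(h, 7) ^ byte
    return h


def build_lookup(api_names):
    """Map hash -> API name so that hashes seen in the binary can be reversed."""
    return {rol7_xor_hash(n.encode("ascii")): n for n in api_names}


# Illustrative candidate list; a real database covers whole export tables
CANDIDATES = ["ExitProcess", "VirtualAlloc", "LoadLibraryA", "GetProcAddress"]
LOOKUP = build_lookup(CANDIDATES)


def resolve(hash_value: int):
    """Return the API name for a hash, or None if it is not in the table."""
    return LOOKUP.get(hash_value)


if __name__ == "__main__":
    h = rol7_xor_hash(b"ExitProcess")
    print(hex(h), resolve(h))
```

This is essentially what a hash database does at scale: precompute the hash of every known export name once, then answer lookups from the table.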
Once the hash has been resolved, an Enum will be created. This Enum should be applied to all of the hashes. To do so, do the following:
- Right-click on the function name
- Click Set call type
- Change the type of the third argument to be the Enum name
BazarLoader defenses 2: Stack strings (sort of)
Usually, malware authors like to hide indicative or important strings in embedded obfuscated code blobs inside the PE itself. A good example is Qbot, which stores strings related to commands, process names, and network activity inside a code blob.
However, when we inspect this Bazarloader sample, we do not find any suspicious code blobs that could indicate hidden obfuscated data.
The reason for that is that Bazarloader stores those strings in multiple small hashes that are combined during runtime and XORed with a different key.
This behavior happens hundreds of times during the malware's operation. A good way to track these occurrences is to use the FLARE capa plugin.
In order to decrypt them statically, all you need to do is the following:
1. Click on the hash
2. Press Shift + E
3. Copy the first 4 bytes to CyberChef
4. Do it for each hash and merge them
5. Add From Hex to the recipe
6. Add XOR to the recipe
7. For the XOR key, take the last 4 bytes (copied in the same way as the hashes)
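The same steps can be scripted instead of done by hand in CyberChef. The sketch below assumes that each chunk is a little-endian 32-bit value and that every chunk is XORed with the same 32-bit key; both are assumptions for illustration. `DEMO_KEY` and the helper names are hypothetical, and the demo chunks are generated by the script rather than taken from the sample.

```python
import struct


def decrypt_stack_string(dwords, key: int) -> str:
    """XOR each 32-bit chunk with `key`, merge the chunks, cut at the first NUL."""
    out = bytearray()
    for d in dwords:
        out += struct.pack("<I", (d ^ key) & 0xFFFFFFFF)
    return bytes(out).split(b"\x00", 1)[0].decode("ascii", errors="replace")


def encrypt_for_demo(text: str, key: int):
    """Build demo chunks the same way, so the example is self-checking."""
    raw = text.encode("ascii") + b"\x00"
    raw += b"\x00" * (-len(raw) % 4)  # pad to a multiple of 4 bytes
    return [d ^ key for (d,) in struct.iter_unpack("<I", raw)]


# Hypothetical key and string, not values extracted from Bazarloader
DEMO_KEY = 0x5A5A5A5A
demo_chunks = encrypt_for_demo("kernel32.dll", DEMO_KEY)

if __name__ == "__main__":
    print(decrypt_stack_string(demo_chunks, DEMO_KEY))
```

In practice the chunk values and the key would be copied from the disassembly for each string, so a helper like this mainly saves time when the pattern repeats hundreds of times.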
BazarLoader operative mechanisms
As mentioned, this section will be about anything related to the malware activity itself and commands.
First, like many other malware families, Bazarloader will check connectivity by trying to access multiple legitimate domains.
Some of these domains are traditional for malware connectivity checks, like google.com. However, some of them are more interesting, such as the White House website.
Next, we observe an indicative command that instructs the malware to “download and run backdoor”, which could potentially be the BazarBackdoor.
As for network capabilities, the malware has two ways to operate:
- Use hardcoded IP addresses \ Emercoin domains
- Use a generated Emercoin domain
If the malware chooses the hardcoded way, it will first use the following hardcoded IP addresses and Emercoin domains:
Next, it will go to a function that uses WinINet functions to communicate externally.
This network function returns the status code of the network operation, which it obtains using the API HttpQueryInfo. In other words, if the request succeeds, the function returns 200.
Next, the caller function checks whether the status code is indeed 200. If so, it calls a function that contains the core code injection function.
Second method: DGA
As mentioned, Bazarloader has an option to generate an Emercoin domain. In this option, Bazarloader will use its domain generation algorithm (DGA) capabilities to generate a random .bazar domain (a TLD related to Emercoin).
After generating the name, the malware will add the string “.bazar” to it as a suffix.
Then, the malware is able to communicate externally using the WINSOCK functions. Unlike the WinINet functions, most of which are resolved directly by the API hashing function, in the case of the WINSOCK functions the malware will:
- Decrypt their names
- Use the API hashing function to resolve GetProcAddress
- Resolve the requested function using GetProcAddress
Then, the malware will use these functions to communicate.
The malware attempts to inject itself into one of the following processes:
Then, it will go to a function that iterates through the running processes using the API calls CreateToolhelp32Snapshot and Process32First\Process32Next. Once the target is found, it will retrieve the process ID.
Eventually, the process ID of the chosen process will be sent to another function that will deal with the code injection itself.
In the code injection function, we can see the injection technique itself which appears to be Process Hollowing.
First, a process is created with the creation flags value 0x8000014. This number is actually a combination of the following flags:
- 0x08000000: CREATE_NO_WINDOW
- 0x00000010: CREATE_NEW_CONSOLE
- 0x00000004: CREATE_SUSPENDED
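The breakdown above can be reproduced with a small bitmask decoder. The table below lists only a handful of documented Windows process creation flags and is not exhaustive; `decode_creation_flags` is a hypothetical helper name.

```python
# A subset of the Windows process creation flags (not a complete list)
CREATION_FLAGS = {
    0x00000001: "DEBUG_PROCESS",
    0x00000002: "DEBUG_ONLY_THIS_PROCESS",
    0x00000004: "CREATE_SUSPENDED",
    0x00000008: "DETACHED_PROCESS",
    0x00000010: "CREATE_NEW_CONSOLE",
    0x00000200: "CREATE_NEW_PROCESS_GROUP",
    0x00000400: "CREATE_UNICODE_ENVIRONMENT",
    0x08000000: "CREATE_NO_WINDOW",
}


def decode_creation_flags(value: int):
    """Return the names of all known flags set in `value`, lowest bit first."""
    names = [name for bit, name in sorted(CREATION_FLAGS.items()) if value & bit]
    known = sum(bit for bit in CREATION_FLAGS if value & bit)
    leftover = value & ~known
    if leftover:
        # Bits not covered by the table above are reported as raw hex
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    print(decode_creation_flags(0x08000014))
```

For a Process Hollowing flow, CREATE_SUSPENDED is the flag that matters most, since the target's main thread must stay suspended while its memory is replaced.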
Next, new virtual memory will be allocated in the remote process, followed by the traditional API calls we would expect to see in the Process Hollowing technique.
Looking for security products
More of Bazarloader's activity is related to security products. Bazarloader will use the API calls CreateToolhelp32Snapshot and Process32First to create a snapshot of the running processes and iterate over them in search of processes related to AV products.
Also, the malware will use the traditional stack strings to search for the following processes:
- Norton Security
- nsWscSvc: Windows Security service
- ISSRV: Microsoft network real-time inspection service
In addition, as already seen from the first image, Bazarloader will search for the names of the following security products:
Bazarloader has a designated function that creates a process in order to run specific commands. In most cases, this function is used when a cmd command that manipulates the registry is executed.
Bazarloader will use the cmd process in order to set up persistence under the traditional Microsoft\Windows\CurrentVersion\Run path.
Additional commands include:
cmd /c choice /c /y /d y /t 10
cmd /c choice /c /y /d y /t 10 & start
cmd /c echo
cmd.exe /c reg.exe query HKCU\Software\
cmd.exe /c reg.exe query HKCU\Software\ /t REG_BINARY /d
cmd.exe /c reg.exe query HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths
Right after the WinINet network function ends, another function that is related to the malware’s cryptography starts. The function also shares an argument with the network function.
The cryptography function consists of multiple functions that each perform several tasks. In order to not get into each one in detail, I will only demonstrate their important activities.
One of the functions is responsible for resolving the Crypt32.dll and Bcrypt.dll modules in the following way:
- Decrypt the names of the modules
- Use the dynamic API resolving function to resolve LoadLibrary
- Execute LoadLibrary with the decrypted module name as its argument
- Assign the handle for the DLL to an IDA variable for later usage
Then, it will do the same for the functions themselves, but with GetProcAddress. When it comes solely to Bcrypt, 14 different functions will be resolved and assigned to variables.
After the resolving part ends, Bazarloader will use the functions to start its cryptography session. The algorithm used is RSA, which can be seen as plain text in the pszAlgId parameter of the function BCryptOpenAlgorithmProvider.
The malware then continues creating the session using the rest of the functions, including importing the key used to decrypt data with BCryptImportKeyPair. Eventually, it will return to the base caller function (dubbed above as “start_crypt”).
In the end, the malware will use the function BCryptDecrypt to decrypt the requested data.
Designated strings and MD5 activity
One of the interesting activities of Bazarloader is its use of specifically designated strings as arguments in other parts of its activity.
These strings are manipulated several times before being used, thus making the process of observing their usage a little bit tricky. In order to show the general idea, I will demonstrate only one case.
Small Tip: In order to track the activity dynamically and align it with static analysis addresses, disabling the ASLR (with tools such as CFF Explorer) can be handy.
The entire activity occurs in one function that decrypts hardcoded strings using the aforementioned stack-strings and XOR decryption method. However, as can be seen, before it starts decrypting the strings, a different function named “sub_180005600” is called.
The objective of this function is relatively simple: creating an additional key for further decryption activities. In general, this happens in the following way:
1. First, using the two API calls SHGetSpecialFolderPathA and GetFileAttributesExA, along with additional functions, the malware generates some sort of digit array.
2. The digit array is passed as one of the arguments to another function that deals with MD5 hashing.
If we inspect this dynamically, we can see that after the MD5Final function runs, the digits disappear and an MD5 hash is produced.
After the MD5 hash is created, the function returns to the caller function and the events continue as follows:
- The hardcoded stack strings will be decrypted and combined into one string
- The combined string and the md5 hash key will be sent to another function named “sub_180004410” that will deal with further manipulation on the string.
Inside sub_180004410, the string will go through more manipulations, one of which is a loop that XORs the combined string with the key (the MD5 hash).
Next, several other manipulations will occur on the xored output until eventually a new string is generated.
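The key derivation and XOR step described above can be approximated with the sketch below. It only illustrates the general idea: whether the malware uses the raw 16-byte digest or its hex form as the key is an assumption here (the raw digest is used), the demo digits and string are made up, and the additional manipulations applied afterwards are not reproduced.

```python
import hashlib
from itertools import cycle


def derive_key(digits: bytes) -> bytes:
    """Hash the digit array with MD5 and return the raw 16-byte digest."""
    return hashlib.md5(digits).digest()


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR `data` with a repeating `key`, as in the loop described above."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Made-up inputs for illustration only
demo_digits = b"0123456789"
demo_string = b"combined-designated-string"

demo_key = derive_key(demo_digits)
demo_output = xor_with_key(demo_string, demo_key)

if __name__ == "__main__":
    print(demo_key.hex())
    print(demo_output.hex())
```

Since the digit array is built from local folder and file attributes, the resulting key, and therefore the final strings, can differ between machines, which is worth keeping in mind when comparing values across analysis environments.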
Eventually, this process will happen to every string in the list of these designated strings, and as said, they will be used as an argument in further malware activity.
For example, the string from the image above will be used as the Mutex name when the malware executes CreateMutexA.