
The goal of this post is to document SolarMarker malware as seen between May 2022 and September 2022. This malware is also known under other names (Jupyter Infostealer, YellowCockatoo, Polazert). If you are interested in earlier forms of the malware, check out my previous blog posts.
The TLDR on SolarMarker is that it has been a fairly sneaky Infostealer from July 2020 to the present day. The backdoor is currently being written to disk but other modules, like the infostealer components, are not written to disk and are only executed in memory.
The goal of this post is primarily to document the form of the malware and document the general bloat of the backdoor being observed. Detection techniques are not discussed.
The First Stage Payload
The first stage payload has consistently been 250MB+ in size. The current payload during this period is a .NET executable. This differs from previously observed forms: from July 2020 to November 2021, SolarMarker payloads were created as installation programs which ran a script upon execution. The installation builders included InnoSetup and various MSI builders. However, since February 2022, the first stage has been a .NET executable.
From February 2022 until May 2022, the .NET executable loaded System.Management.Automation.DLL, which gave it the ability to execute PowerShell scripts without starting a PowerShell process. However, it was essentially executing a PowerShell script.
In May 2022, the first stage SolarMarker executable replaced the need for System.Management.Automation.DLL by using native .NET functions. This new form of the payload is the focus of this blogpost.
In both the first and second stage (the dropper and the backdoor, respectively), the internal names of the executable and the methods within the binary are randomly generated. For example, the entrypoint for one recent sample was Algolagnist.Liparomphalus.Underboil. In this blogpost, I won’t be renaming the methods as we discuss them, partly because I don’t perceive it as necessary, but also because the method names are pretty funny sometimes (for example, porkpies and semipractical).

Features Common to First and Second Stage
With the shift to .NET for the dropper, the developer has begun to use the same obfuscation and evasion methods for both binaries. This reduces the developer’s workload and also reduces ours. As a result, as we talk about obfuscation tactics, it is safe to assume both binaries share the obfuscation unless specified otherwise.
High Overview
Over the life of the malware, the developer has used several different obfuscation methods. The current form of the malware has begun to increase in size gradually. The increase in size comes from a few sources: (1) meaningless words being written to console; (2) unused methods and multiple implementations of the same methods; (3) meaningless content in the methods themselves.
The goal of the 1st stage is to establish persistence and decrypt the 2nd stage payloads. The payloads include the decoy PDF editor and the backdoor. The backdoor is currently stored in an unencrypted form in the user’s AppData\Local\Temp directory. For the past several months, the backdoor has used a Windows 10-like icon. Persistence is set using a RUN key located at HKCU\Software\Microsoft\Windows\CurrentVersion\Run which will execute the backdoor at user login.

The backdoor requires a command line parameter. This parameter is saved in the registry RUN key but can cause difficulty in automated analysis and sandboxes: without the command parameter, the primary functions of the backdoor will not execute. If submitted to a sandbox or analyzed manually, the analyst will need to supply the command line parameter.
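For analysts who need to recover that parameter, the following is a minimal Python sketch of splitting a RUN key value into the executable path and its argument. The sample value, file name, and parameter are hypothetical, and whether the real entry quotes the path is an assumption; the function only handles the two simple layouts described in its comments.

```python
from typing import Tuple


def split_run_command(value: str) -> Tuple[str, str]:
    """Split a RUN key value into (executable path, remaining arguments).

    Handles a quoted path ("C:\\path\\file.exe" ARG) and an unquoted path
    without spaces (C:\\path\\file.exe ARG). An unquoted path containing
    spaces is ambiguous and is not handled here.
    """
    value = value.strip()
    if value.startswith('"'):
        end = value.find('"', 1)
        if end == -1:
            # No closing quote: treat everything as the path.
            return value[1:], ""
        return value[1:end], value[end + 1:].strip()
    path, _, args = value.partition(" ")
    return path, args.strip()


# Hypothetical RUN key value, for illustration only.
sample = r'"C:\Users\user\AppData\Local\Temp\backdoor.exe" EXAMPLEPARAM'
exe_path, parameter = split_run_command(sample)
print(exe_path)
print(parameter)
```

The recovered parameter can then be passed on the command line when running the backdoor in a sandbox or debugger.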
Size over time
The following is a graph using Flourish depicting 126 backdoor samples and their size over time. (Note: I reviewed the outliers to confirm that yes, the latest backdoor dated 2022-09-16 really is 2,258 KB; the backdoor on 2022-05-26 really was 40 KB; and so forth.) An interactive version is also available.

I don’t fully understand why the binary is increasing in size. It is not adding functionality to the backdoor. What benefits does it bring? I have some ideas, but I would be happy to hear from anyone with ideas or concrete evidence based on detection techniques.
Low Overview
Here are examples of the three methods of increasing file size mentioned above:
1. Meaningless words being written to console.
The binary literally just prints junk to the console. However, the presence of the text increases the size of the executable and introduces additional randomness which may (or may not) thwart automated detection.
Console.WriteLine("Unawkwardly prepossess tweesh elchee foamless teamman negator");
In the most recent sample, writes to the console have become very common and have begun to appear inside the decoding methods.

Queet method which decodes text.
2. Unused methods and multiple implementations of the same methods.
The largest amount of meaningless information is introduced through the inclusion of unused methods. In the image below, internal methods of the binary are listed in yellow on the left and the method Algolagnist is the only method invoked.

In addition to unused methods, some methods are implemented multiple times. The most obvious of these methods are the ones used for decoding .NET method calls (which will be discussed more later). To illustrate, the below image shows the methods Semipractical, Abcessed, Thanatophidia, and Racemously. Each of these performs essentially the same function: they take an input and build a string by shifting the value by a hardcoded quantity.
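As a rough illustration of that pattern, here is a minimal Python sketch of two duplicate decoders that differ only in their hardcoded shift. The function names, the shift constants, and the direction of the shift (subtraction when decoding) are all assumptions for illustration; they are not taken from a sample.

```python
def make_shift_decoder(shift: int):
    """Build a decoder that subtracts a fixed quantity from each character."""
    def decode(encoded: str) -> str:
        return "".join(chr(ord(ch) - shift) for ch in encoded)
    return decode


def shift_encode(plain: str, shift: int) -> str:
    """Inverse operation, used here only to produce test input."""
    return "".join(chr(ord(ch) + shift) for ch in plain)


# Two "different" methods that do the same job with different constants.
# Both constants are hypothetical.
decoder_a = make_shift_decoder(0x4E00)
decoder_b = make_shift_decoder(0x5A00)

encoded_a = shift_encode("Console.WriteLine", 0x4E00)
encoded_b = shift_encode("Console.WriteLine", 0x5A00)

# The encoded forms differ, but both decode to the same text.
print(encoded_a != encoded_b)
print(decoder_a(encoded_a))
print(decoder_b(encoded_b))
```

Shifting ASCII text by a large constant lands the characters in the CJK range, which would be consistent with the encoded strings looking like Chinese text.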

3. Meaningless unused content in methods.
The following method, Countrypeople, is another variant identical to Semipractical. It takes two arguments: the first is a string and the second is an integer. The Countrypeople method then takes a substring of the string, starting at the index specified by the integer. In all cases, the substring will begin after the end of the randomly generated words. This instance decodes the Chinese Unicode characters to “LoadWithPartialNameHack”.
Countrypeople("Palanquined reslay resink talcochlorite aramu dissolvability opiner quaere irrepressible infixing thereuntil staphylococci yon benefactive tosspot leased reabridge cornbird continentally敖荒瘟直瀞節視窱杖瘟襁視節瘟練朗瘟缾睊戴瘟盛絛", 186)
The text before the encoded portion is discarded and isn’t important, but it adds a layer of obfuscation for manual analysis and additional bloat to the binary.
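The junk words in the Countrypeople call above add up to exactly 186 characters, which matches the integer argument. The Python sketch below checks that and shows the substring step in isolation. The helper name and the placeholder payload are hypothetical, and the shift-decoding step that follows in the real method is left out.

```python
# Junk words copied from the Countrypeople call shown above.
JUNK_WORDS = [
    "Palanquined", "reslay", "resink", "talcochlorite", "aramu",
    "dissolvability", "opiner", "quaere", "irrepressible", "infixing",
    "thereuntil", "staphylococci", "yon", "benefactive", "tosspot",
    "leased", "reabridge", "cornbird", "continentally",
]
JUNK_PREFIX = " ".join(JUNK_WORDS)


def strip_junk(argument: str, start: int) -> str:
    """Return only the part of the argument that the decoder uses."""
    return argument[start:]


# Placeholder standing in for the encoded characters (hypothetical).
PLACEHOLDER_PAYLOAD = "<encoded-characters>"
argument = JUNK_PREFIX + PLACEHOLDER_PAYLOAD

print(len(JUNK_PREFIX))            # length of the discarded junk text
print(strip_junk(argument, 186))   # what is left for decoding
```

When triaging a sample, comparing the integer argument against the length of the readable prefix is a quick way to confirm that a call follows this pattern.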
While working on this post, new versions of the malware have come out and methods like Countrypeople are being reworked. For example, while Countrypeople was 10 lines of code, the latest version of the backdoor has a function named BodySurfed which has been bloated to 1450 lines but plays the same role in the binary.

Countrypeople and a bloated version named BodySurfed.
Conclusion
This post was largely to document my observations on the bloat being added to the SolarMarker backdoor. I believe the bloat is being added in an attempt to remain evasive, but I’m not observing any significant change in detection rates.
Resources
For samples, check out the following tags at MalwareBazaar: Jupyter, YellowCockatoo, Polazert, SolarMarker. I have personally verified any samples I have uploaded. Due to the size of the first stage payloads, they have to be zipped before being uploaded to MalwareBazaar.
For pulling down a lot of samples quickly, check out BazaarShopper: my Python script using MalwareBazaar’s API. I have additional scripts I may release soon.
Finally, if you like my art, or would just like random art to show up in your Twitter feed, check out my art Twitter account SquiblyArt.