"Inspector Gadget: Automated Extraction of Proprietary Gadgets from Malware Binaries"

Friday, March 12. 2010
When analyzing malware samples, a human analyst is typically interested in understanding or recovering a specific algorithm of the given sample. In the case of Conficker, for example, she might be interested in extracting the domain generation algorithm so that she can determine which domains the malware uses now and will use in the future. For spam bots, she might be interested in how the malware downloads spam templates, decodes them, and then generates the actual spam messages. For bots in general, she might be interested in understanding how binary updates are downloaded, decoded, and then executed.

In each case, the binary itself encodes the algorithm, but understanding it by hand is cumbersome and hard work. Thus it would be useful to have a tool that enables a malware analyst to automatically extract from a given binary sample the relevant algorithm related to a specific task. In a paper that will be presented at the 31st IEEE Symposium on Security & Privacy we introduce Inspector Gadget, a tool that implements exactly this. A gadget encapsulates all code related to a specific task and can be executed in a stand-alone fashion. A gadget player can take a gadget and replay it, for example to determine which domains are currently used by Conficker, or to download and decode an update for a bot binary. Furthermore, we introduce an approach to revert gadgets based on an enhanced brute-force algorithm: this is useful to understand the effects of malware in detail, and in certain cases we can also revert obfuscation algorithms, for example to understand what data has been exfiltrated by a given sample. The full paper has all the details and describes Inspector Gadget in more depth. And if you are interested in the topic, you should also read the paper by Caballero et al. on BCR (paper title is "Binary Code Extraction and Interface Identification for Security Applications").
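To give a feeling for what reverting a gadget by brute force can look like, here is a minimal Python sketch. It is not the algorithm from the paper: toy_encode is a made-up stand-in for an extracted encoding gadget, and invert_bytewise assumes that output byte i depends only on input bytes 0..i, so each position can be recovered with at most 256 forward executions of the gadget.

```python
from typing import Callable, Optional


def toy_encode(data: bytes, key: int = 0x5A) -> bytes:
    """Hypothetical stand-in for an extracted gadget: a chained XOR/add encoder."""
    out = bytearray()
    state = key
    for b in data:
        c = ((b ^ state) + 7) & 0xFF
        out.append(c)
        state = c  # the next output byte depends on the previous output byte
    return bytes(out)


def invert_bytewise(encode: Callable[[bytes], bytes], target: bytes) -> Optional[bytes]:
    """Find an input whose encoding equals target, one byte at a time.

    Assumption: output byte i depends only on input bytes 0..i. Under that
    assumption we only run the forward gadget, never analyze its code.
    """
    recovered = bytearray()
    for i in range(len(target)):
        for candidate in range(256):
            trial = bytes(recovered) + bytes([candidate])
            if encode(trial)[i] == target[i]:
                recovered.append(candidate)
                break
        else:
            # No candidate reproduces this output byte: the assumption is violated.
            return None
    return bytes(recovered)


if __name__ == "__main__":
    observed = toy_encode(b"exfiltrated data")
    print(invert_bytewise(toy_encode, observed))
```

The point of the sketch is only that the forward gadget is treated as a black box: the encoder is replayed with candidate inputs until the observed output is reproduced. Real encoders may mix input bytes in ways that break the per-byte assumption, in which case the search space has to be reduced by other means.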

Abstract:
Unfortunately, malicious software is still an unsolved problem and a major threat on the Internet. An important component in the fight against malicious software is the analysis of malware samples: Only if an analyst understands the behavior of a given sample, she can design appropriate countermeasures. Manual approaches are frequently used to analyze certain key algorithms, such as downloading of encoded updates, or generating new DNS domains for command and control purposes.
In this paper, we present a novel approach to automatically extract, from a given binary executable, the algorithm related to a certain activity of the sample. We isolate and extract these instructions and generate a so-called gadget, i.e., a stand-alone component that encapsulates a specific behavior. We make sure that a gadget can autonomously perform a specific task by including all relevant code and data into the gadget such that it can be executed in a self-contained fashion.
Gadgets are useful entities in analyzing malicious software: In particular, they are valuable for practitioners, as understanding a certain activity that is embedded in a binary sample (e.g., the update function) is still largely a manual and complex task. Our evaluation with several real-world samples demonstrates that our approach is versatile and useful in practice.

The full paper is available at http://www.iseclab.org/papers/ieee_sp10_inspector_gadget.pdf and will be presented in May at the 31st IEEE Symposium on Security & Privacy. The paper was joint work with Clemens Kolbitsch, Christopher Kruegel, and Engin Kirda - all members of the International Secure Systems Lab.

Waledac Infection Check

Tuesday, March 2. 2010
Ben Stock has implemented a web service to check a given IP address for infection with Waledac, similar to the Conficker Eye Chart. The idea is that we are currently tracking Waledac as part of the take-down effort and thus we have a pretty good overview of the individual bots within the botnet. Therefore we are in a position to determine if we have seen a given IP address in the recent past as a bot, which indicates that this IP address might be related to a Waledac infection. Of course, effects like NAT or DHCP need to be taken into account: if an IP address is not listed, this does not necessarily mean that you are not infected.
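The lookup idea can be sketched in a few lines of Python. This is not the code behind the web service: the sightings dictionary, the function name check_ip, and the seven-day window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict


def check_ip(
    sightings: Dict[str, datetime],
    ip: str,
    now: datetime,
    window: timedelta = timedelta(days=7),  # assumed value, not from the service
) -> str:
    """Report whether an IP address was recently observed as a bot.

    sightings maps an IP address to the last time the tracker saw it.
    """
    last_seen = sightings.get(ip)
    if last_seen is not None and now - last_seen <= window:
        # NAT and DHCP mean this points at the address, not at one specific machine.
        return "possibly infected"
    # Absence from the list is not proof that a machine is clean.
    return "not seen recently"


if __name__ == "__main__":
    now = datetime(2010, 3, 2, 12, 0)
    sightings = {"192.0.2.10": datetime(2010, 3, 1, 8, 30)}
    print(check_ip(sightings, "192.0.2.10", now))
    print(check_ip(sightings, "192.0.2.11", now))
```

The two return values are deliberately cautious: a hit only says that the address was seen as a bot recently, and a miss says nothing definite at all.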

The check is available at http://mwanalysis.org/waledac/; feedback is welcome!

Waledac Takedown Successful

Thursday, February 25. 2010
A few weeks ago, I blogged about our paper "Walowdac – Analysis of a Peer-to-Peer Botnet". The paper provides an overview of the Waledac botnet and its specific aspects compared to Storm Worm and similar peer-to-peer botnets. The paper also contains some measurement results for the botnet like the typical number of online bots and similar statistics.

In the last couple of days, the situation changed a bit: we worked on an active takedown of the botnet together with experts from Microsoft, Shadowserver, the University of Mannheim, University of Bonn, University of Washington, Symantec and others. The operation is known within Microsoft as "Operation b49" and involved domain takedowns and additional technical countermeasures. Microsoft also did some fantastic work on the legal side; the complaint filed by Microsoft ("Microsoft Corporation v. John Does 1-27, et al.") is available online. As a result, the communication infrastructure of Waledac has been disrupted to a certain extent and the botmaster can effectively no longer send commands to the bots. The Waledac Tracker by sudosecure.net also shows a nice decline in the number of bots for the last few days. Note, however, that the infected machines are still up and running, so some clean-up on that side is still necessary...

You can read more about the story in a blog post by Microsoft: "Cracking Down on Botnets". And I will update the blog with new information once we start to analyze the collected data...

Data Set For Malware Clustering/Classification

Friday, January 29. 2010
About one month ago I blogged about our research on malware clustering and classification. We have now also released the full data set from our experiments, such that other people can reproduce the results and compare our approach to theirs. You can find all information at http://pi1.informatik.uni-mannheim.de/malheur/, together with a description of the different data.

Quick overview of the data:
Our reference data set is extracted from our large database of malware binaries maintained at CWSandbox. The malware binaries have been collected over a period of three years from a variety of sources. From the overall database, we select binaries which have been assigned to a known class of malware by the majority of six independent anti-virus products. We append the overall anti-virus label to the filename of each report. Although anti-virus labels suffer from inconsistency, we expect the selection using different scanners to be reasonably consistent and accurate. To compensate for the skewed distribution of classes, we discard classes with fewer than 20 samples and restrict the maximum contribution of each class to 300 binaries. The selected malware binaries are then executed and monitored using CWSandbox, resulting in a total of 3,133 behavior reports in MIST format.
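The selection procedure for the reference data set can be sketched roughly as follows in Python. This is not the script we used: the function names and the input layout (one list of six scanner labels per sample, with None for "no known class") are made up for illustration, and "majority" is read here as a strict majority, i.e., at least four of six scanners agreeing.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def majority_label(labels: List[Optional[str]]) -> Optional[str]:
    """Return the label assigned by a strict majority of scanners, else None."""
    counts = Counter(label for label in labels if label is not None)
    if not counts:
        return None
    label, votes = counts.most_common(1)[0]
    return label if votes > len(labels) / 2 else None


def select_reference_set(
    scans: Dict[str, List[Optional[str]]],
    min_size: int = 20,
    max_size: int = 300,
) -> Dict[str, List[str]]:
    """Group samples by majority label, drop small classes, cap large ones."""
    by_class: Dict[str, List[str]] = defaultdict(list)
    for sample, labels in scans.items():
        label = majority_label(labels)
        if label is not None:
            by_class[label].append(sample)
    return {
        label: samples[:max_size]  # keeps the first max_size samples in input order
        for label, samples in by_class.items()
        if len(samples) >= min_size
    }
```

Which 300 binaries of an oversized class are kept is not specified in the description above; the sketch simply keeps the first ones in input order, and a random sample would be an equally plausible reading.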

The application data set consists of seven chunks of malware binaries obtained from the anti-malware vendor Sunbelt Software. The binaries correspond to malware collected during seven consecutive days in August 2009 and originate from a variety of sources. Sunbelt Software uses these very samples to create and update signatures for their VIPRE anti-malware product as well as for their security data feed ThreatTrack. The complete test data set consists of 33,698 behavior reports in MIST format.

The full technical report is available at http://honeyblog.org/junkyard/paper/malheur-TR-2009.pdf.

Update: I changed the wording of the description above to use the correct terms.

Call for Papers: LEET'10

Monday, January 25. 2010
The submission deadline for the 3rd USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET '10) is quickly approaching. Please submit your work by Thursday, February 25, 2010, 11:59 p.m. PST. The full call for papers is available at http://www.usenix.org/events/leet10/cfp/; an overview follows:
Topics
Now in its third year, LEET continues to provide a unique forum for the discussion of threats to the confidentiality of our data, the integrity of digital transactions, and the dependability of the technologies we increasingly rely on. We encourage submissions of papers that focus on the malicious activities themselves (e.g., reconnaissance, exploitation, privilege escalation, rootkit installation, attack), our responses as defenders (e.g., prevention, detection, and mitigation), or the social, political, and economic goals driving these malicious activities and the legal and ethical codes guiding our defensive responses.

Overview
Information technology (IT) adds $2 trillion annually to the US economy alone. While these technologies have enabled significant global economic growth, they have become rich targets for malicious activity. The US Federal Bureau of Investigation (FBI) indicated that cyber crime reached an all-time high in 2008; cyber crime now ranks as the FBI's third highest priority, behind such dramatic threats as counter-terrorism and counter-espionage. Much of this malicious activity is driven by economic incentives, but recently we have seen the emergence of highly visible, politically motivated attacks. While the motivations for malicious behavior and the technical mechanisms that enable them remain rich areas of research, it is clear that today our global society is faced with a wide range of cyber criminal activities: spam, phishing, denial of service, click fraud, etc.

Workshop Format
LEET aims to be a true workshop, with the twin goals of fostering the development of preliminary work and helping to unify the broad community of researchers and practitioners who focus on worms, bots, spam, spyware, phishing, DDoS, and the ever-increasing palette of large-scale Internet-based threats. Intriguing preliminary results and thought-provoking ideas will be strongly favored; papers will be selected for their potential to stimulate discussion in the workshop. Each author will have 15 minutes to present his or her work, followed by 15 minutes of discussion with the workshop participants.