Data Set For Malware Clustering/Classification

Friday, January 29. 2010
About one month ago I blogged about our research on malware clustering and classification. We have now also released the full data set from our experiments, such that other people can reproduce the results and compare our approach to theirs. You can find all information at http://pi1.informatik.uni-mannheim.de/malheur/, together with a description of the different data.

Quick overview of the data:
Our reference data set is extracted from our large database of malware binaries maintained at CWSandbox. The malware binaries have been collected over a period of three years from a variety of sources. From the overall database, we select binaries which have been assigned to a known class of malware by the majority of six independent anti-virus products. We append the overall anti-virus label to the filename of each report. Although anti-virus labels suffer from inconsistency, we expect the selection using different scanners to be reasonable consistent and accurate. To compensate for the skewed distribution of classes, we discard classes with less than 20 samples and restrict the maximum contribution of each class to 300 binaries. The selected malware binaries are then executed and monitored using CWSandbox, resulting in a total of 3.133 behavior reports in MIST format.

The application data set consists of seven chunks of malware binaries obtained from the anti-malware vendor Sunbelt Software. The binaries correspond to malware collected during seven consecutive days in August 2009 and originate from a variety of sources. Sunbelt Software uses these very samples to create and update signatures for their VIPRE anti-malware product as well as for their security data feed ThreatTrack. The complete test data set consists of 33.698 behavior reports in MIST format.

The full technical report is available at http://honeyblog.org/junkyard/paper/malheur-TR-2009.pdf.

Update: I changed the terms within the description to use the correct description.

Call for Papers: LEET'10

Monday, January 25. 2010
admin
The submissions deadline for the 3rd USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET '10) is quickly approaching. Please submit your work by Thursday, February 25, 2010, 11:59 p.m. PST. The full call for papers is available at http://www.usenix.org/events/leet10/cfp/, see an overview below:
Topics
Now in its third year, LEET continues to provide a unique forum for the discussion of threats to the confidentiality of our data, the integrity of digital transactions, and the dependability of the technologies we increasingly rely on. We encourage submissions of papers that focus on the malicious activities themselves (e.g., reconnaissance, exploitation, privilege escalation, rootkit installation, attack), our responses as defenders (e.g., prevention, detection, and mitigation), or the social, political, and economic goals driving these malicious activities and the legal and ethical codes guiding our defensive responses.

Overview
Information technology (IT) adds $2 trillion annually to the US economy alone. While these technologies have enabled significant global economic growth, they have become rich targets for malicious activity. The US Federal Bureau of Investigation (FBI) indicated that cyber crime reached an all-time high in 2008; cyber crime now ranks as the FBI's third highest priority, behind such dramatic threats as counter-terrorism and counter-espionage. Much of this malicious activity is driven by economic incentives, but recently we have seen the emergence of highly visible, politically motivated attacks. While the motivations for malicious behavior and the technical mechanisms that enable them remain rich areas of research, it is clear that today our global society is faced with a wide range of cyber criminal activities: spam, phishing, denial of service, click fraud, etc.

Workshop Format
LEET aims to be a true workshop, with the twin goals of fostering the development of preliminary work and helping to unify the broad community of researchers and practitioners who focus on worms, bots, spam, spyware, phishing, DDoS, and the ever-increasing palette of large-scale Internet-based threats. Intriguing preliminary results and thought-provoking ideas will be strongly favored; papers will be selected for their potential to stimulate discussion in the workshop. Each author will have 15 minutes to present his or her work, followed by 15 minutes of discussion with the workshop participants.

"Studying Aspects of the Underground Economy"

Wednesday, January 20. 2010
Today I gave a talk at the International Computer Science Institute (ICSI) that focussed on some of the research I did in the past year. The slides are now available.

Abstract:
With the growing digital economy, it comes as no surprise that criminal activities in digital business have lead to a digital underground economy. Because it is such a fast-moving field, tracking and understanding this underground economy is difficult and most information in this area is vague. In this talk, we discuss several approaches to study the structure of these underground markets. In particular, we present a method with which it is possible to directly analyze the amount of data harvested through keylogger-based attacks in a highly automated fashion. Based on real-world data, we can get a glimpse into the digital underground economy. However, many open questions remain that will be discussed in the last part of the talk.

You can get the slides at http:///honeyblog.org/junkyard/presentations/10_underground-economy_ICSI.pdf.

Call for Papers: WEIS'10

Monday, January 18. 2010
admin
I am happy to serve on the program committee of the 9th Workshop on the Economics of Information Security (WEIS). The Call for Papers is now available. WEIS will take place on June 7-8, 2010 at Harvard University, Cambridge, MA, USA

Important dates are:
  • Submissions due: February 22, 2010
  • Notification of acceptance: April 2, 2010
  • Workshop: June 7-8, 2010

Information security continues to grow in importance, as threats proliferate, privacy erodes, and attackers find new sources of value. Yet the security of information systems depends on more than just technology. Good security requires an understanding of the incentives and tradeoffs inherent to the behavior of systems and organizations. As society’s dependence on information technology has deepened, policy makers, including the President of the United States, have taken notice. Now more than ever, careful research is needed to accurately characterize threats and countermeasures, in both the public and private sectors.

The Workshop on the Economics of Information Security (WEIS) is the leading forum for interdisciplinary scholarship on information security, combining expertise from the fields of economics, social science, business, law, policy and computer science. Prior workshops have explored the role of incentives between attackers and defenders, identified market failures dogging Internet security, and assessed investments in cyber-defense. This workshop will build on past efforts using empirical and analytic tools to not only understand threats, but also strengthen security through novel evaluations of available solutions. How should information risk be modeled given the constraints of rare incidence and high interdependence? How do individuals’ and organizations’ perceptions of privacy and security color their decision making? How can we move towards a more secure information infrastructure and code base while accounting for the incentives of stakeholders?

The full Call for Papers is available at http://weis2010.econinfosec.org/cfp.html.

Walowdac – Analysis of a Peer-to-Peer Botnet

Sunday, January 3. 2010
One of the most interesting botnets of 2009 was Waledac: the botnet implements a peer-to-peer-based communication channel and it can be seen as the successor of Storm Worm, since it implemented many similar ideas (e.g., a very similar language for spam templates was used). The researchers from Trend Micro had published an analysis of the botnet and we also examined the botnet. The result is a paper entitled "Walowdac - Analysis of a Peer-to-Peer Botnet": instead of passively observing the network, we implemented an active infiltration component. We emulate the protocol of a bot and are able to observe the inner communication aspects of the network. As a result, we obtain an in-depth overview of the botnet that enables us to study different aspects of the network, e.g., efficiency of the spam campaigns or number of active bots. As a small peak of the results, the following pictures shows the number of active bots in different countries on a specific day in August 2009. We can for example observe diurnal patterns and clearly see the effects of timezones on the size of the botnet:


Abstract:
A botnet is a network of compromised machines under the control of an attacker. Botnets are the driving force behind several misuses on the Internet, for example spam mails or automated identity theft. In this paper, we study the most prevalent peer-to-peer botnet in 2009: Waledac. We present our infiltration of the Waledac botnet, which can be seen as the successor of the Storm Worm botnet. To achieve this we implemented a clone of the Waledac bot named Walowdac. It implements the communication features of Waledac but does not cause any harm, i.e., no spam emails are sent and no other commands are executed. With the help of this tool we observed a minimum daily population of 55,000 Waledac bots and a total of roughly 390,000 infected machines throughout the world. Furthermore, we gathered internal information about the success rates of spam campaigns and newly introduced features like the theft of credentials from victim machines.

The paper was joint work with Ben Stock, Jan Göbel, Markus Engelberth, and Felix C. Freiling. The full paper is available at http://honeyblog.org/junkyard/paper/waledac-ec2nd09.pdf and it was published at EC2ND 2009.