WOOT'08 and HotSec'08

Tuesday, July 29. 2008
Besides USENIX Security, two interesting workshops also take place this week: the 2nd USENIX Workshop on Offensive Technologies (WOOT '08) and the 3rd USENIX Workshop on Hot Topics in Security (HotSec '08). Both workshops have a strong program, and the proceedings are well worth reading! My favorite paper picks:

The full papers will be available a few days after the workshops have taken place.

USENIX Security'08

Monday, July 28. 2008
This week, the 17th USENIX Security Symposium takes place in San Jose, CA. Unfortunately, I cannot attend this year :-( But there are many interesting papers you should check out, for example:

The full papers will be available a few days after the conference has taken place. A really good conference this year with an exciting program! Looking forward to attending next year :-)

DIMVA'08 Slides

Tuesday, July 22. 2008
A quick follow-up to our DIMVA'08 paper on "Learning and Classification of Malware Behavior": the slides from Konrad's talk are now available and provide a quick overview of the topic.

In the near future, we will integrate the results of this paper into the web interface of cwsandbox.org - stay tuned :)

Fast-Flux Data

Wednesday, July 16. 2008
Back in February, we published a paper on fast-flux service networks at NDSS'08. The basic idea behind fast-flux networks is a fast change in the mapping between a domain name and the corresponding IP addresses. The attackers use this mechanism to build a proxy-network on top of compromised machines to maintain a robust hosting infrastructure for their services. For more information on this topic, see the paper by the Honeynet Project or our NDSS paper.
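To make the "fast change in the mapping" a bit more concrete, here is a minimal sketch that summarises how much the set of IP addresses for one domain changes across repeated lookups. It is only an illustration, not the metric used in the NDSS paper: the function name `flux_summary`, the input shape (one set of IPs per lookup, in chronological order), and the example addresses (taken from the reserved documentation ranges) are all assumptions made for this example.

```python
from typing import Dict, List, Set


def flux_summary(snapshots: List[Set[str]]) -> Dict[str, int]:
    """Summarise how much the A-record set of one domain changes over time.

    snapshots: one set of IP addresses per lookup, in chronological order
    (hypothetical input format, chosen for this sketch).
    """
    if not snapshots:
        return {"lookups": 0, "distinct_ips": 0, "changes": 0}

    # All IP addresses ever returned for the domain.
    distinct = set().union(*snapshots)

    # Number of consecutive lookups whose answer sets differ.
    changes = sum(1 for prev, cur in zip(snapshots, snapshots[1:]) if prev != cur)

    return {
        "lookups": len(snapshots),
        "distinct_ips": len(distinct),
        "changes": changes,
    }


# A domain whose mapping never changes.
stable = [{"192.0.2.1"}, {"192.0.2.1"}, {"192.0.2.1"}]

# A domain whose answers rotate through many addresses.
fluxy = [
    {"198.51.100.1", "198.51.100.2"},
    {"198.51.100.3", "203.0.113.4"},
    {"203.0.113.5", "198.51.100.1"},
]

print(flux_summary(stable))
print(flux_summary(fluxy))
```

A high number of distinct IPs and frequent changes only hint at fast-flux; legitimate services such as content distribution networks can look similar on these two counters alone.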

To foster research in this area, the data collected during our study is available for research purposes. Up to now, quite a few people have mailed me and asked for the data. To make this process a bit more scalable and to minimize the amount of work needed on my side, we decided to simply publish all the data so that everyone can download the raw data and use it for whatever purpose. Today, I uploaded a tarball which contains a summary of the fast-flux data collected over a period of several weeks. The tarball contains a potpourri of different measurements and has a total size of 7.3 MB. It contains about 55K raw dig lookup files and has an unpacked size of about 220 MB. The archive contains the following data:
  • storm-qavoter.com.log: dig lookups for a domain used by the Storm Worm botnet, which uses fast-flux techniques

  • asprox-damnec-hydra.log: dig lookups for the Asprox/Damnec botnet, which also uses fast-flux techniques

  • lookups-ff: dig lookups for fast-flux domains, confirmed manually

  • lookups-spam: dig lookups for various domains found in spam e-mails

  • lookups-benign: dig lookups for (probably) benign domains, most of them collected via dmoz or Alexa

  • lookups-ndss: part of the domains used for the NDSS paper

  • lookups-ndss-ff: suspected fast-flux domains from the NDSS paper

So if you are interested in this area and want to learn more about it, just download the archive (7.3 MB) and play with the files :)
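As a starting point for playing with the files, the following sketch extracts A records from dig output. It assumes the lookup files contain the standard textual output of dig (answer lines of the form `name TTL IN A address`); I have not described the exact file layout above, so treat that as an assumption and adapt the pattern if needed. The helper name `parse_a_records` and the sample text are made up for this example.

```python
import re
from typing import List, Optional, Tuple

# Matches a resource record line such as:
#   example.com.    300    IN    A    192.0.2.10
A_RECORD = re.compile(r"^(\S+)\s+(\d+)\s+IN\s+A\s+(\d{1,3}(?:\.\d{1,3}){3})$")


def parse_a_records(dig_output: str, name: Optional[str] = None) -> List[Tuple[str, int, str]]:
    """Return (domain, ttl, ip) tuples for every A record in the dig output.

    If name is given, only records for that domain are returned. Without it,
    A records from the additional section (e.g. name servers) are included.
    """
    wanted = name.rstrip(".").lower() if name else None
    records = []
    for line in dig_output.splitlines():
        line = line.strip()
        # Skip blank lines and dig comment/header lines.
        if not line or line.startswith(";"):
            continue
        match = A_RECORD.match(line)
        if not match:
            continue
        owner, ttl, ip = match.groups()
        owner = owner.rstrip(".").lower()
        if wanted is None or owner == wanted:
            records.append((owner, int(ttl), ip))
    return records


# Hypothetical dig output using reserved documentation addresses.
SAMPLE = "\n".join([
    "; <<>> DiG <<>> example.com A",
    ";; QUESTION SECTION:",
    ";example.com.\t\t\tIN\tA",
    "",
    ";; ANSWER SECTION:",
    "example.com.\t\t300\tIN\tA\t192.0.2.10",
    "example.com.\t\t300\tIN\tA\t198.51.100.7",
    "",
    ";; AUTHORITY SECTION:",
    "example.com.\t\t3600\tIN\tNS\tns1.example.com.",
    "",
    ";; ADDITIONAL SECTION:",
    "ns1.example.com.\t3600\tIN\tA\t192.0.2.53",
])

for record in parse_a_records(SAMPLE, name="example.com"):
    print(record)
```

Collecting the IP sets per lookup and comparing them over time, together with the TTL values, is one simple way to get a first impression of how a domain behaves.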

DIMVA'08: "Learning and Classification of Malware Behavior"

Thursday, July 10. 2008
Today and tomorrow, DIMVA'08 takes place in Paris. DIMVA'08 is the Fifth Conference on Detection of Intrusions and Malware & Vulnerability Assessment and is organized by the special interest group SIDAR of the German Informatics Society (GI).

Our paper entitled "Learning and Classification of Malware Behavior" is joint work with Konrad Rieck, Carsten Willems, Patrick Düssel, Pavel Laskov, and Felix Freiling. The paper deals with malware classification, i.e., how to automatically learn malware families using labels. We use (noisy) labels from an anti-virus product and then apply machine learning algorithms to classify malware based on execution traces generated with the help of CWSandbox. In an experiment with over 3,000 previously undetected malware binaries, our system correctly predicted almost 70% of labels assigned by an anti-virus scanner four weeks later. Our method also detects unknown behavior, so that malware families not present in the learning corpus are correctly identified as unknown. The analysis of prominent features inferred by our discriminative models has shown interesting similarities between malware families; in particular, we have discovered that Doomber and Gobot worms derive from the same origin, with Doomber being an extension of Gobot - all in an automated way.
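To illustrate the general idea of learning families from labeled behavior and rejecting samples that match no known family, here is a toy sketch. It is deliberately not the method from the paper: it swaps in a simple nearest-centroid classifier with cosine similarity and a fixed rejection threshold. The behavior tokens, the family names `famA` and `famB`, and the threshold of 0.5 are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Vector = Dict[str, float]


def normalise(vec: Dict[str, float]) -> Vector:
    """Scale a sparse vector to unit length (empty if it is all zeros)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {k: w / norm for k, w in vec.items()} if norm else {}


def vectorize(tokens: Iterable[str]) -> Vector:
    """Bag-of-words vector over behavior tokens, scaled to unit length."""
    return normalise(Counter(tokens))


def train(labelled: Iterable[Tuple[List[str], str]]) -> Dict[str, Vector]:
    """Build one unit-length centroid per family label."""
    sums = defaultdict(Counter)
    for tokens, label in labelled:
        for token, weight in vectorize(tokens).items():
            sums[label][token] += weight
    return {label: normalise(vec) for label, vec in sums.items()}


def classify(tokens: Iterable[str], centroids: Dict[str, Vector],
             threshold: float = 0.5) -> str:
    """Return the most similar family, or "unknown" below the threshold."""
    vec = vectorize(tokens)
    best_label, best_score = "unknown", 0.0
    for label, centroid in centroids.items():
        # Cosine similarity, since both vectors have unit length.
        score = sum(w * centroid.get(k, 0.0) for k, w in vec.items())
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else "unknown"


# Hypothetical training reports with (noisy) family labels.
TRAINING = [
    (["irc_connect", "open_port", "create_file"], "famA"),
    (["irc_connect", "open_port", "registry_write"], "famA"),
    (["send_mail", "send_mail", "create_file"], "famB"),
    (["send_mail", "dns_query", "create_file"], "famB"),
]

centroids = train(TRAINING)
print(classify(["irc_connect", "open_port"], centroids))
print(classify(["send_mail", "create_file"], centroids))
print(classify(["delete_backup", "encrypt_file"], centroids))
```

The rejection threshold is what lets a sample from a family outside the training corpus fall through as "unknown" instead of being forced into the closest known family.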

Abstract:
Malicious software in form of Internet worms, computer viruses, and Trojan horses poses a major threat to the security of networked systems. The diversity and amount of its variants severely undermine the effectiveness of classical signature-based detection. Yet variants of malware families share typical behavioral patterns reflecting its origin and purpose. We aim to exploit these shared patterns for classification of malware and propose a method for learning and discrimination of malware behavior. Our method proceeds in three stages: (a) behavior of collected malware is monitored in a sandbox environment, (b) based on a corpus of malware labeled by an anti-virus scanner a malware behavior classifier is trained using learning techniques and (c) discriminative features of the behavior models are ranked for explanation of classification decisions. Experiments with different heterogeneous test data collected over several months using honeypots demonstrate the effectiveness of our method, especially in detecting novel instances of malware families previously not recognized by commercial anti-virus software.

The full paper is now available.