A Crawler-based Study of Spyware on the Web

Tuesday, January 31. 2006
An interesting paper at NDSS'06 in February examines the amount of spyware on the World Wide Web. In "A Crawler-based Study of Spyware on the Web", the authors describe their results from crawling the web for malicious content. The basic idea is to crawl the web and then analyze all captured binaries with the help of a VM and Ad-Aware. They also examined web pages whose content exploits browser vulnerabilities. The abstract gives some more detail on the amount of malware found:

Abstract:
Malicious spyware poses a significant threat to desktop security and integrity. This paper examines that threat from an Internet perspective. Using a crawler, we performed a large-scale, longitudinal study of the Web, sampling both executables and conventional Web pages for malicious objects. Our results show the extent of spyware content. For example, in a May 2005 crawl of 18 million URLs, we found spyware in 13.4% of the 21,200 executables we identified. At the same time, we found scripted “drive-by download” attacks in 5.9% of the Web pages we processed. Our analysis quantifies the density of spyware, the types of threats, and the most dangerous Web zones in which spyware is likely to be encountered. We also show the frequency with which specific spyware programs were found in the content we crawled. Finally, we measured changes in the density of spyware over time; e.g., our October 2005 crawl saw a substantial reduction in the presence of drive-by download attacks, compared with those we detected in May.

Unfortunately, they do not explain why their results drop in October compared to May. It would also be interesting to carry out such an analysis at a larger scale, perhaps in cooperation with a search engine like Google ("A Statistical Review of 1 Billion Web Pages")...
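
To make the "crawl, then analyze the captured binaries" idea a bit more concrete, here is a minimal sketch of just the sorting step: given one fetched page, split its links into candidate executables (to be downloaded and scanned in a VM) and ordinary pages (to be crawled further). This is not the authors' crawler. The extension list, the class name `LinkCollector`, and the function name `split_links` are assumptions made for illustration, and the paper may well identify executables differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: these extensions mark a link as a candidate executable.
EXECUTABLE_EXTENSIONS = (".exe", ".msi", ".cab")


class LinkCollector(HTMLParser):
    """Collects the absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def split_links(base_url, html):
    """Return (executable_urls, page_urls) for one HTML document."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    executables, pages = [], []
    for url in collector.links:
        path = urlparse(url).path.lower()
        if path.endswith(EXECUTABLE_EXTENSIONS):
            executables.append(url)   # candidates for download + scanning
        else:
            pages.append(url)         # candidates for further crawling
    return executables, pages
```

In a full pipeline, the first list would be fetched and handed to the scanning VM, while the second list would go back onto the crawl queue.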

Distribution of Filesize

Monday, January 30. 2006
The following picture shows the distribution of filesize in kilobytes for about 14,000 unique malware samples I have collected during the last few months. Uniqueness is defined in this context as "unique md5sum".

Distribution of filesize


As you can see, there are several spikes, mainly around 190 KB, 45 KB, and 10 KB. The picture only shows file sizes between 0 and 250 KB. nepenthes also captured some rather large bots (> 1 MB) - I wonder how long it takes to infect a computer connected via a modem line with such a large bot...
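
For anyone who wants to produce a similar plot from their own collection, the following is a rough sketch of the two steps involved: deduplicating samples by md5sum and binning the file sizes. It is not the script used for the picture above. The bin width of 5 KB and all function names are assumptions. The last helper gives a best-case answer to the modem question: at a nominal 56 kbit/s, ignoring protocol overhead, a 1 MB bot needs roughly 150 seconds, and real modem throughput would be lower.

```python
import hashlib
from collections import Counter
from pathlib import Path

BIN_KB = 5  # assumption: histogram bin width in kilobytes


def unique_samples(directory):
    """Map md5 hex digest -> file size in bytes, one entry per unique digest."""
    sizes = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        digest = hashlib.md5(data).hexdigest()
        # Keep only the first file seen for each digest.
        sizes.setdefault(digest, len(data))
    return sizes


def size_histogram(sizes_in_bytes, bin_kb=BIN_KB, max_kb=250):
    """Count samples per bin; each key is the lower edge of the bin in KB."""
    histogram = Counter()
    for size in sizes_in_bytes:
        kb = size / 1024
        if kb < max_kb:  # the plot only covers 0 to 250 KB
            histogram[int(kb // bin_kb) * bin_kb] += 1
    return histogram


def transfer_seconds(size_bytes, link_bits_per_second=56_000):
    """Best-case transfer time over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second
```

Calling `size_histogram(unique_samples("samples/").values())` on a directory of captured binaries (the directory name is just a placeholder) yields the counts per 5 KB bin, which can then be fed to any plotting tool.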

If you are interested in samples, please contact me at thorsten [dot] holz [at] gmail.com

Blog.Worm

Thursday, January 26. 2006

Blog.Worm

Slides From 17th TF-CSIRT/FIRST Meeting

Tuesday, January 24. 2006
You can now download the slides from my talk about the German Honeynet Project at the 17th TF-CSIRT and FIRST joint event.

Effective Malware Collection with Honeypots

Saturday, January 21. 2006
(Sorry folks, the article itself is in German...)

For the 13th DFN-CERT Workshop "Sicherheit in vernetzten Systemen" (Security in Networked Systems), there is an article that describes collecting malware with the help of mwcollect.

Abstract:
A large share of today's autonomously spreading malware infects further victims via already known vulnerabilities in network services that can be exploited automatically. In addition, more and more bots are appearing that are based on the same source code family but are often packed with different and partly modified packers. It is therefore important to be able to collect such malware automatically, in order to effectively create new signatures for virus scanners or to study the behavior of botnets.

Because these vulnerabilities are known, patterns for them can be created reactively, and a daemon can be implemented that simulates vulnerable services to autonomously spreading malware. It is not necessary to reproduce these services completely and correctly; a simplified emulation of the services is sufficient.

Such a daemon is provided by mwcollect, a project developed by the Honeynet Project since March 2005.

The full article is available as effektives-sammeln-von-malware.pdf.
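
The "simplified emulation" idea from the abstract can be illustrated with a very small sketch: a TCP listener that does not implement any real service, but simply reads what a connecting host sends and checks it against byte patterns of known exploits. This is not mwcollect's code. The pattern table, the port number, and the names `classify`, `FakeServiceHandler` and `serve` are all hypothetical placeholders. A real collector would go on to extract the download instructions from the matched payload and fetch the binary.

```python
import socketserver

# Hypothetical patterns; a real collector would use signatures derived
# from actual exploits against known vulnerabilities.
KNOWN_PATTERNS = {
    "example-exploit-a": b"\x90\x90\x90\x90",
    "example-exploit-b": b"EXPLOIT-B",
}


def classify(payload):
    """Return the name of the first known pattern found in payload, else None."""
    for name, pattern in KNOWN_PATTERNS.items():
        if pattern in payload:
            return name
    return None


class FakeServiceHandler(socketserver.BaseRequestHandler):
    """Pretends to be a vulnerable service: reads one chunk and classifies it."""

    def handle(self):
        self.request.settimeout(5.0)
        try:
            payload = self.request.recv(4096)
        except OSError:  # includes timeouts
            return
        match = classify(payload)
        print(f"{self.client_address[0]}: {len(payload)} bytes, pattern={match}")


def serve(host="127.0.0.1", port=4450):
    """Run the fake service until interrupted (port number is arbitrary)."""
    with socketserver.TCPServer((host, port), FakeServiceHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener on localhost. Because only the matching step needs to be right, the emulated "service" can stay this simple, which is the point made in the abstract.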