Technical Report: "Abusing Social Networks for Automated User Profiling"

Wednesday, March 17. 2010
We recently published a technical report on another project related to social networks. The paper is entitled "Abusing Social Networks for Automated User Profiling", and in it we focus on automatically collecting information about users based on the information available in different networks.

Imagine that you have a profile on Facebook, on LinkedIn, and on MySpace. Perhaps you do not want to link these profiles directly, for example because you want to maintain a more serious profile on LinkedIn while keeping more relaxed ones on MySpace and Facebook. You therefore use different pseudonyms/names on the different profiles and expect that the information cannot be correlated. However, there is a problem with that assumption: during registration on the different networks, you used the same e-mail address. A social network typically enables a user to search for e-mail addresses in order to find friends (a convenient feature; after all, you want to network with your friends). An attacker can thus search each network for a given e-mail address, scrape the profile related to that address, and then correlate the information found on the different networks. In the end, the attacker can enrich a given e-mail address with information collected across several social networks.
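The correlation step described above can be sketched in a few lines. This is a toy illustration, not the tooling from the paper: the "networks" are plain dictionaries standing in for the friend-finder lookups, and all profile data is made up.

```python
# Mock friend-finder directories: e-mail address -> public profile fields.
NETWORKS = {
    "facebook": {"alice@example.com": {"name": "Ali C.", "interests": ["music"]}},
    "linkedin": {"alice@example.com": {"name": "Alice Carter", "employer": "ACME"}},
    "myspace":  {"bob@example.com": {"name": "bobby", "age": 23}},
}

def correlate(email):
    """Query every network for the address and collect the public profiles."""
    merged = {}
    for network, directory in NETWORKS.items():
        profile = directory.get(email)
        if profile is not None:
            merged[network] = profile
    return merged

# The same address links the "serious" LinkedIn identity to the
# pseudonymous Facebook one, even though the display names differ.
profile = correlate("alice@example.com")
```

Even though "Ali C." and "Alice Carter" look unrelated, the shared e-mail address ties both profiles to the same person.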

An attacker can search not only for one e-mail address at a time, but typically for hundreds or even thousands. And he can do this not just once, but thousands of times per day. For example, we were able to check about 10 million e-mail addresses on Facebook per day. A spammer could use this "feature" to verify e-mail addresses, using Facebook as an oracle to determine whether or not a given address is valid. Furthermore, the correlation aspect is of course also a privacy problem, since an attacker can find "hidden" information and correlate information across different networks.
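The "oracle" abuse boils down to batching candidate addresses and keeping only those that resolve to a profile. A minimal sketch, with `query_network` as a made-up stand-in for a real friend-finder endpoint:

```python
# Addresses that "have a profile" in this toy example.
REGISTERED = {"alice@example.com", "carol@example.com"}

def query_network(batch):
    """Pretend friend-finder lookup: returns the subset of a batch
    of addresses that belong to registered users."""
    return {addr for addr in batch if addr in REGISTERED}

def verify(addresses, batch_size=1000):
    """Feed candidate addresses to the oracle in batches and
    collect the ones confirmed as valid accounts."""
    valid = set()
    for i in range(0, len(addresses), batch_size):
        valid |= query_network(addresses[i:i + batch_size])
    return valid

candidates = ["alice@example.com", "mallory@example.com", "carol@example.com"]
confirmed = verify(candidates)  # only the registered addresses come back
```

A spammer running this against a real friend-finder interface gets a cleaned list of live addresses without sending a single e-mail.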

We have contacted different social networks. Facebook and XING have already addressed the problem - thanks a lot!

Recently, social networks such as Facebook have experienced a huge surge in popularity. The amount of personal information stored in these sites calls for appropriate security precautions to protect this data.
In this paper, we describe how we are able to take advantage of a common weakness, namely the fact that an attacker can query the social network for registered e-mail addresses on a large scale. Starting with a list of about 10.4 million email addresses, we were able to automatically identify more than 1.2 million user profiles associated with these addresses. By crawling these profiles, we collect publicly available personal information about each user, which we use for automated profiling (i.e., to enrich the information available from each user).
Finally, we propose a number of mitigation techniques to protect the user’s privacy. We have contacted the most popular providers, who acknowledged the threat and are currently implementing our countermeasures. Facebook and XING in particular have recently fixed the problem.

The technical report is available at and it was joint work with Marco Balduzzi, Christian Platzer, Engin Kirda, Davide Balzarotti, and Christopher Kruegel.

Twitter Spamdetector Service

Tuesday, March 16. 2010
At the International Secure Systems Lab, we have developed a couple of services like Anubis, Wepawet, or FIRE. Lately, we have worked on a mechanism to detect spammers on Twitter, a popular microblogging service. We have developed several heuristics to detect spamming profiles, and have already reported thousands of these profiles to Twitter, who then shut them down. Now we have created a profile through which users can flag spammers on Twitter: the flagged accounts are added to our database, allowing us to detect profiles from campaigns we had not observed before.

The profile is @spamdetector, and the messages it accepts are of the format
"@spamdetector @spamaccount"

Whenever you see a suspicious account, you can simply send us a notification and our system will check whether this account is likely a spammer. This helps us improve our heuristics, and we can help Twitter shut down suspicious profiles, leading to a better service.
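Parsing reports in the format above is straightforward. A hypothetical sketch (the real service's implementation is not described here):

```python
import re

# Matches "@spamdetector @someaccount" and captures the flagged account.
MENTION_RE = re.compile(r"^@spamdetector\s+@(\w+)\s*$")

def parse_report(tweet_text):
    """Return the flagged account name, or None if the tweet does not
    follow the "@spamdetector @spamaccount" format."""
    match = MENTION_RE.match(tweet_text.strip())
    return match.group(1) if match else None
```

Flagged names extracted this way could then be queued for the heuristic checks before anything is reported to Twitter.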

This work was carried out by Gianluca Stringhini, a PhD student at the University of California, Santa Barbara, working as a research assistant at the Computer Security Lab. And you can find my tweets at @thorstenholz.

"Inspector Gadget: Automated Extraction of Proprietary Gadgets from Malware Binaries"

Friday, March 12. 2010
When analyzing malware samples, a human analyst is typically interested in understanding or recovering a specific algorithm of the given sample. In the case of Conficker, for example, she might be interested in extracting the domain generation algorithm so that she can understand which domains the malware uses now and will use in the future. For spam bots, she might be interested in how the malware downloads spam templates, decodes them, and then generates the actual spam messages. For bots in general, she might be interested in understanding how binary updates are downloaded, decoded, and then executed.
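To make the domain-generation example concrete, here is a deliberately simplified, made-up DGA. This is NOT Conficker's actual algorithm; it only illustrates the general idea that the current date deterministically seeds a list of rendezvous domains, so an analyst who recovers the algorithm can predict future domains.

```python
import hashlib
from datetime import date

def generate_domains(day, count=5, tlds=(".com", ".net", ".org")):
    """Derive pseudo-random domain names from a date (illustrative only)."""
    domains = []
    for i in range(count):
        # Hash the date plus a counter to get deterministic "randomness".
        digest = hashlib.md5(f"{day.isoformat()}-{i}".encode()).hexdigest()
        # Map hex digits to lowercase letters to form the domain label.
        label = "".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:10])
        domains.append(label + tlds[i % len(tlds)])
    return domains

todays = generate_domains(date(2010, 3, 12))
```

Because the output depends only on the date, running the extracted algorithm for future dates immediately yields the domains the botmaster will use, which is exactly what makes recovering such code so valuable.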

In each case, the binary itself encodes the algorithm, but understanding all of it is cumbersome, hard work. It would thus be useful to have a tool that enables a malware analyst to automatically extract the relevant algorithm for a specific task from a given binary sample. In a paper that will be presented at the 31st IEEE Symposium on Security & Privacy, we introduce Inspector Gadget, a tool that implements exactly this. A gadget encapsulates all code related to a specific task and can be executed in a stand-alone fashion. A gadget player can take a gadget and replay it, for example to determine which domains are currently used by Conficker, or to download and decode an update for a bot binary. Furthermore, we introduce an approach to revert gadgets based on an enhanced brute-force algorithm: this is useful for understanding the effects of malware in detail, and we can (in certain cases) also revert obfuscation algorithms, i.e., understand what data has been exfiltrated by a given sample. The full paper has all the details and describes Inspector Gadget in more depth. If you are interested in the topic, you should also read the paper by Caballero et al. on BCR ("Binary Code Extraction and Interface Identification for Security Applications").

Unfortunately, malicious software is still an unsolved problem and a major threat on the Internet. An important component in the fight against malicious software is the analysis of malware samples: only if an analyst understands the behavior of a given sample can she design appropriate countermeasures. Manual approaches are frequently used to analyze certain key algorithms, such as the downloading of encoded updates, or the generation of new DNS domains for command and control purposes.
In this paper, we present a novel approach to automatically extract, from a given binary executable, the algorithm related to a certain activity of the sample. We isolate and extract these instructions and generate a so-called gadget, i.e., a stand-alone component that encapsulates a specific behavior. We make sure that a gadget can autonomously perform a specific task by including all relevant code and data into the gadget such that it can be executed in a self-contained fashion.
Gadgets are useful entities in analyzing malicious software: In particular, they are valuable for practitioners, as understanding a certain activity that is embedded in a binary sample (e.g., the update function) is still largely a manual and complex task. Our evaluation with several real-world samples demonstrates that our approach is versatile and useful in practice.

The full paper is available at and will be presented in May at the 31st IEEE Symposium on Security & Privacy. The paper was joint work with Clemens Kolbitsch, Christopher Kruegel, and Engin Kirda - all members of the International Secure Systems Lab.

Waledac Infection Check

Tuesday, March 2. 2010
Ben Stock has implemented a web service to check a given IP address for infection with Waledac, similar to the Conficker Eye Chart. We are currently tracking Waledac as part of the take-down effort and thus have a pretty good overview of the individual bots within the botnet. We are therefore in a position to determine whether we have seen a given IP address as a bot in the recent past, which indicates that the address might be related to a Waledac infection. Of course, effects like NAT or DHCP need to be taken into account: if an IP address is not listed, this does not necessarily mean that you are not infected.
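The core of such a check can be sketched as a lookup against the tracker's sightings, keyed by IP address. Everything here is illustrative: the data, the function names, and the 7-day window are assumptions, not the real service's parameters.

```python
from datetime import datetime, timedelta

# Hypothetical tracker data: IP address -> last time it was seen
# behaving as a Waledac bot.
SEEN_BOTS = {
    "192.0.2.10": datetime(2010, 3, 1, 12, 0),
}

def recently_seen(ip, now, window=timedelta(days=7)):
    """True if the IP was observed as a bot within the window.
    A negative answer is NOT proof of health: NAT can hide infected
    hosts behind a clean address, and DHCP churn moves bots around."""
    last = SEEN_BOTS.get(ip)
    return last is not None and now - last <= window

now = datetime(2010, 3, 2)
flagged = recently_seen("192.0.2.10", now)      # seen yesterday: flagged
clean = recently_seen("198.51.100.1", now)      # not listed: inconclusive
```

The time window matters: without it, a host that was cleaned weeks ago (or an address reassigned by DHCP) would be flagged forever.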

The check is available at, feedback is welcome!