Imagine that you have a profile on Facebook, on LinkedIn, and on MySpace. Perhaps you do not want these profiles to be directly linked, for example because you want a more serious profile on LinkedIn and a more relaxed one on MySpace and Facebook. You therefore use different pseudonyms or names on the different profiles and expect that the information cannot be correlated. However, there is a problem with that assumption: when registering on the different networks, you used the same e-mail address. A social network typically lets a user search by e-mail address in order to find friends (a convenient feature; after all, you want to network with your friends). An attacker can thus search each network for a given e-mail address, scrape the profile related to that address, and then correlate the information found on the different networks. In the end, the attacker can enrich a given e-mail address with information collected from different social networks.
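To make the correlation step concrete, here is a minimal sketch in Python. It is not the tooling from the report: the records are hand-written stand-ins for profiles that were already found through an e-mail lookup, no network access takes place, and the function name `correlate_by_email` and all field names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records standing in for profiles found via e-mail lookup.
# All names, addresses, and fields are invented for this illustration.
found_profiles = [
    {"network": "Facebook", "email": "alice@example.com",
     "name": "Ally", "hometown": "Vienna"},
    {"network": "LinkedIn", "email": "Alice@Example.com",
     "name": "Alice Example", "employer": "Example Corp"},
    {"network": "MySpace", "email": "bob@example.com", "name": "bobby"},
]


def correlate_by_email(profiles):
    """Group profile fields by normalized e-mail address, then by network."""
    merged = defaultdict(dict)
    for profile in profiles:
        # The shared e-mail address is the join key across networks.
        email = profile["email"].strip().lower()
        fields = {key: value for key, value in profile.items()
                  if key not in ("network", "email")}
        merged[email][profile["network"]] = fields
    return dict(merged)


combined = correlate_by_email(found_profiles)
# Networks on which the same address was found, despite different names.
print(sorted(combined["alice@example.com"]))
```

The point of the sketch is that the pseudonyms play no role at all: once the e-mail address is the key, the "serious" and the "relaxed" profile end up in the same record.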
An attacker is not limited to one e-mail address per search: a single query can typically cover hundreds or even thousands of addresses, and such queries can be repeated thousands of times per day. For example, we were able to check about 10 million e-mail addresses on Facebook per day. A spammer could use this "feature" to verify e-mail addresses, using Facebook as an oracle to determine whether a given e-mail address is valid. Furthermore, the correlation aspect is also a privacy problem, since an attacker can find "hidden" information and correlate information across different networks.
We have contacted different social networks. Facebook and XING have already addressed the problem - thanks a lot!
Recently, social networks such as Facebook have experienced a huge surge in popularity. The amount of personal information stored on these sites calls for appropriate security precautions to protect this data.
In this paper, we describe how we are able to take advantage of a common weakness, namely the fact that an attacker can query the social network for registered e-mail addresses on a large scale. Starting with a list of about 10.4 million e-mail addresses, we were able to automatically identify more than 1.2 million user profiles associated with these addresses. By crawling these profiles, we collect publicly available personal information about each user, which we use for automated profiling (i.e., to enrich the information available about each user).
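As a quick back-of-the-envelope check on these figures, the short Python snippet below computes the share of tested addresses that led to a profile. Both inputs are the rounded numbers quoted above ("about 10.4 million" and "more than 1.2 million"), so the result is only an approximate lower bound; the variable names are ours.

```python
# Rounded figures as quoted in the text; the result is approximate.
addresses_checked = 10_400_000   # "about 10.4 million" e-mail addresses
profiles_found = 1_200_000       # "more than 1.2 million" profiles

hit_rate = profiles_found / addresses_checked
print(f"Share of addresses matching a profile: {hit_rate:.1%}")
```

In other words, roughly one in nine of the tested addresses could be tied to at least one profile.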
Finally, we propose a number of mitigation techniques to protect the user’s privacy. We have contacted the most popular providers, who acknowledged the threat and are currently implementing our countermeasures. Facebook and XING in particular have recently fixed the problem.
The technical report is available at http://www.iseclab.org/papers/socialabuse-TR.pdf. It is joint work with Marco Balduzzi, Christian Platzer, Engin Kirda, Davide Balzarotti, and Christopher Kruegel.