Can Your Twitter Posts Disclose Your Address?
In order to raise awareness about just how much privacy people may be giving up when they use social media, researchers at MIT and
It must be noted that Twitter's location-reporting service is 'off' by default, but many Twitter users choose to activate it. Describing the study, which is part of a broader project at MIT's Internet Policy Research Initiative, lead researcher Ilaria Liccardi said, "Many people have this idea that only machine-learning techniques can discover interesting patterns in location data. And they feel secure that not everyone has the technical knowledge to do that."
Liccardi added, "With this study, what we wanted to show is that when you send location data as a secondary piece of information, it is extremely simple for people with very little technical knowledge to find out where you work or live."
For the study, the researchers used real tweets from Twitter users in Boston. The users consented to the use of their data and confirmed their home and work addresses, their commuting routes, and the locations of various leisure destinations from which they sent tweets.
According to the researchers, "The time and location data associated with the tweets were then presented to a group of 45 study participants, who were asked to try to deduce whether the tweets had originated at the Twitter users' homes, their workplaces, leisure destinations, or locations along their commutes."
The study presented the data in three different forms:
· The first was a static map
· The second was an animated version of the same map, in which the pins appeared on-screen in chronological order
· The third was a table listing geographical coordinates, street names, and times of day
The researchers also varied the volume of data that the participants were asked to consider: one day's, three days', or five days' worth. To avoid bias, there was no overlap between data sets of different sizes.
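To illustrate how little machinery such a deduction might need, here is a minimal Python sketch of a time-of-day rule applied to rows like those in the tabular view. It is not the method used in the study, where human participants inspected the data by eye. The sample records, the street names, the hour thresholds, and the function name guess_home_and_work are all assumptions made up for this example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records in the spirit of the tabular view: (timestamp, street).
# Every value here is invented for illustration; none comes from the study.
TWEETS = [
    ("2016-03-07T06:45:00", "Elm St"),
    ("2016-03-07T10:15:00", "Main St"),
    ("2016-03-07T14:30:00", "Main St"),
    ("2016-03-07T18:20:00", "Park Ave"),
    ("2016-03-07T22:40:00", "Elm St"),
    ("2016-03-08T11:05:00", "Main St"),
    ("2016-03-08T23:10:00", "Elm St"),
]


def guess_home_and_work(tweets):
    """Guess (home, work) streets from tweet times using a naive rule.

    Assumed rule, not taken from the study: tweets sent between 21:00 and
    07:00 count as votes for "home"; tweets sent on weekdays between 09:00
    and 17:00 count as votes for "work". Everything else is ignored.
    """
    home_votes = Counter()
    work_votes = Counter()
    for stamp, street in tweets:
        sent = datetime.fromisoformat(stamp)
        if sent.hour >= 21 or sent.hour < 7:
            home_votes[street] += 1
        elif sent.weekday() < 5 and 9 <= sent.hour < 17:
            work_votes[street] += 1
    # Pick the street with the most votes in each bucket, if any.
    home = home_votes.most_common(1)[0][0] if home_votes else None
    work = work_votes.most_common(1)[0][0] if work_votes else None
    return home, work


HOME_GUESS, WORK_GUESS = guess_home_and_work(TWEETS)
print("Likely home:", HOME_GUESS)
print("Likely work:", WORK_GUESS)
```

On this invented sample the rule picks Elm St as the likely home and Main St as the likely workplace. A real schedule (night shifts, working from home) would defeat such a crude rule, which is one reason more days of data can be expected to help.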
The study revealed that "participants fared better with map-based representations, correctly identifying Twitter users' homes roughly 65% of the time and their workplaces at closer to 70%. Even the tabular representation was informative, however, with accuracy rates of just under 50% for homes and a surprisingly high 70% for workplaces."
The study also found that participants fared better with five days' worth of data than with three days' or one day's. Across all three representations, participants with five days' worth of data could correctly identify workplaces, for example, with more than 85% accuracy.
This research paper "puts two significant bricks in the wall of our privacy understanding. First, her survey shows how people can learn sensitive information from seemingly innocuous facts, and, second, people will easily share information they believe is innocuous," said Latanya Sweeney, professor of government and technology in residence at
The study was presented at the Association for Computing Machinery's Conference on Human Factors in Computing Systems.