Summary of “‘Data is a fingerprint’: why you aren’t as anonymous as you think online”

In August 2016, the Australian government released an “anonymised” data set comprising the medical billing records, including every prescription and surgery, of 2.9 million people.
“It’s convenient to pretend it’s hard to re-identify people, but it’s easy. The kinds of things we did are the kinds of things that any first year data science student could do,” said Vanessa Teague, one of the University of Melbourne researchers to reveal the flaws in the open health data.
“The point is that data that may look anonymous is not necessarily anonymous,” the privacy researcher Latanya Sweeney said in testimony to a Department of Homeland Security privacy committee.
More recently, Yves-Alexandre de Montjoye, a computational privacy researcher, showed how the vast majority of the population can be identified from the behavioural patterns revealed by location data from mobile phones.
“Location data is a fingerprint. It’s a piece of information that’s likely to exist across a broad range of data sets and could potentially be used as a global identifier,” de Montjoye said.
Even if location data doesn’t reveal an individual’s identity, it can still put groups of people at risk, she explained.
De Montjoye and others have shown time and time again that it’s simply not possible to anonymise unit record level data – data relating to individuals – no matter how stripped down that data is.
“There are firms that specialise in combining data about us from different sources to create virtual dossiers and applying data mining to influence us in various ways.”

The original article.