Big data, the huge data sets that are often made publicly available to
anyone who wants to analyze them, is supposed to be anonymous. The idea
is to leave out key pieces of information about the people involved,
such as names or home addresses, and leave only the larger trends.
But such specifics are not needed to reveal exactly who you are,
according to researchers who were able to identify "anonymous"
participants in a big data set 90% of the time.
The study, published in the journal Science, reports that researchers
were able to identify "anonymous" shoppers in a big data set from
credit card metadata: seemingly vague details such as the type of venue
(a gym, for example) or the amount spent on a purchase.
The team went through three months of credit card data covering
1.1 million people shopping at 10,000 shops. The records came from a
"major bank" in an undisclosed country.
All of the "sensitive information," such as name, credit card number and
the time of the purchase, was taken out of the equation, but the
shoppers' activities had unique qualities nonetheless. The research team
was able to accurately identify a shopper 90% of the time by using just
four pieces of data on customer location, coupled with some other
information about the shoppers.
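
To make the idea of "unique qualities" concrete, here is a minimal sketch of how one might measure how often a handful of known purchases singles out one person. It is not the researchers' code: the records, the names `records`, `matches` and `unicity`, and the exhaustive counting approach are all assumptions made for illustration.

```python
from itertools import combinations

# Toy "anonymized" records: pseudonym -> set of (shop, day) points.
# Every value here is invented for illustration.
records = {
    "u1": {("gym", 1), ("bakery", 2), ("cafe", 3), ("cinema", 4)},
    "u2": {("gym", 1), ("bakery", 2), ("cafe", 5), ("bookshop", 6)},
    "u3": {("gym", 1), ("grocer", 2), ("cafe", 3), ("cinema", 7)},
}

def matches(points, records):
    """Return the pseudonyms whose trace contains every known point."""
    return [uid for uid, trace in records.items() if points <= trace]

def unicity(records, k):
    """Fraction of k-point subsets of a trace that fit exactly one person."""
    unique = 0
    total = 0
    for trace in records.values():
        # Sort so the subsets are enumerated in a stable order.
        for subset in combinations(sorted(trace), k):
            total += 1
            if len(matches(set(subset), records)) == 1:
                unique += 1
    return unique / total

for k in (1, 2, 3):
    print(k, round(unicity(records, k), 2))
```

In this toy data a single point is often shared by several people, but the fraction of identifying subsets grows with each extra point, and any three points are enough to single out one person.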
A location-stamped tweet, for example, could be cross-referenced with
the metadata to directly identify a shopper. It's what the researchers
call a "correlation attack": learning personal details about someone by
correlating seemingly innocuous data with outside information.
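
The matching step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the method used in the study: the data, the names `anonymized` and `public_posts`, and the `reidentify` helper are all hypothetical.

```python
# Hypothetical anonymized purchases: pseudonym -> set of (venue, day).
anonymized = {
    "card_A": {("gym", "Mon"), ("cafe", "Tue"), ("cinema", "Fri")},
    "card_B": {("gym", "Mon"), ("bakery", "Wed"), ("cinema", "Fri")},
}

# Hypothetical outside information, such as location-stamped public
# posts that carry a real name.
public_posts = {
    "alice": {("gym", "Mon"), ("cafe", "Tue")},
    "bob": {("gym", "Mon")},
}

def reidentify(anonymized, outside):
    """Link a name to a pseudonym only when exactly one record fits."""
    links = {}
    for name, seen in outside.items():
        # A record is a candidate if it contains every observed point.
        candidates = [pid for pid, trace in anonymized.items()
                      if seen <= trace]
        links[name] = candidates[0] if len(candidates) == 1 else None
    return links

links = reidentify(anonymized, public_posts)
print(links)
```

Here two public sightings are enough to tie "alice" to a single card, while "bob" stays ambiguous because one gym visit fits both records.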
It's a chilling concept: the digital footprint we leave, no matter how
vague, could be traced back to us in a very specific way.
Posted by : Gizmeon