Avast is harvesting users’ browser histories on the pretext that the data has been ‘de-identified,’ thus protecting your privacy. But the data, which is being sold to third parties, can be linked back to people’s real identities, exposing every click and search they’ve made.
Your antivirus should protect you, but what if it’s handing over your browser history to a major marketing company?
Relax. That’s what Avast told the public after its browser extensions were found harvesting users’ data to supply to marketers. Last month, the antivirus company tried to justify the practice by claiming the collected web histories were stripped of users’ personal details before being handed off.
“The data is fully de-identified and aggregated and cannot be used to personally identify or target you,” Avast told users, who opt in to the data sharing. In return, your privacy is preserved, Avast gets paid, and online marketers get a trove of “aggregate” consumer data to help them sell more products.
There’s just one problem: What should be a giant chunk of anonymized web history data can actually be picked apart and linked back to individual Avast users, according to a joint investigation by PCMag and Motherboard.
How ‘De-Identification’ Can Fail
The Avast division charged with selling the data is Jumpshot, a company subsidiary that’s been offering access to user traffic from 100 million devices, including PCs and phones. In return, clients—from big brands to e-commerce providers—can learn what consumers are buying and where, whether it be from a Google or Amazon search, an ad from a news article, or a post on Instagram.
The data collected is so granular that clients can view the individual clicks users make during their browsing sessions, with timestamps down to the millisecond. And while the collected data is never linked to a person’s name, email or IP address, each user history is nevertheless assigned to an identifier called the device ID, which will persist unless the user uninstalls the Avast antivirus product.
For instance, a single click can theoretically look like this:
Device ID: abc123x
Date: 2019/12/01
Hour Minute Second: 12:03:05
Domain: Amazon.com
Product: Apple iPad Pro 10.5 – 2017 Model – 256GB, Rose Gold
Behavior: Add to Cart
At first glance, the click looks harmless. You can’t pin it to an exact user. That is, unless you’re Amazon.com, which could easily figure out which Amazon user bought an iPad Pro at 12:03:05 on Dec. 1, 2019. Suddenly, device ID abc123x is a known user. And whatever else Jumpshot has on abc123x’s activity, from other e-commerce purchases to Google searches, is no longer anonymous.
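To make the linkage concrete, here is a minimal Python sketch of how a retailer might match such a click against its own records. It is an illustration only, not Jumpshot’s or Amazon’s actual process: the record layouts, the `link_devices` and `history_for` helpers, the account name `customer-42`, and the sample search click are all hypothetical. It assumes the retailer logs the same product, action, and second-level timestamp for each of its own accounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    """One 'de-identified' clickstream record (hypothetical layout)."""
    device_id: str
    timestamp: str  # "YYYY/MM/DD HH:MM:SS"
    domain: str
    detail: str     # product name or search query
    behavior: str


@dataclass(frozen=True)
class RetailerEvent:
    """One row from a retailer's own first-party log (hypothetical layout)."""
    account: str
    timestamp: str
    product: str
    action: str


def link_devices(clicks, retailer_events, domain):
    """Map device IDs to accounts when a click matches exactly one event."""
    index = {}
    for ev in retailer_events:
        key = (ev.timestamp, ev.product, ev.action)
        index.setdefault(key, []).append(ev.account)

    linked = {}
    for c in clicks:
        if c.domain != domain:
            continue
        accounts = index.get((c.timestamp, c.detail, c.behavior), [])
        if len(accounts) == 1:  # skip ambiguous matches
            linked[c.device_id] = accounts[0]
    return linked


def history_for(clicks, device_id):
    """Everything else in the feed that shares the same device ID."""
    return [c for c in clicks if c.device_id == device_id]


IPAD = "Apple iPad Pro 10.5 - 2017 Model - 256GB, Rose Gold"

# Sample data: the first click mirrors the example above; the rest is invented.
clicks = [
    Click("abc123x", "2019/12/01 12:03:05", "Amazon.com", IPAD, "Add to Cart"),
    Click("abc123x", "2019/12/01 12:10:41", "Google.com", "example query", "Search"),
    Click("zzz999q", "2019/12/01 12:03:05", "Google.com", "other query", "Search"),
]
retailer_log = [
    RetailerEvent("customer-42", "2019/12/01 12:03:05", IPAD, "Add to Cart"),
    RetailerEvent("customer-77", "2019/12/01 12:04:00", IPAD, "Add to Cart"),
]

linked = link_devices(clicks, retailer_log, "Amazon.com")
for device_id, account in linked.items():
    print(device_id, "->", account, "with", len(history_for(clicks, device_id)), "clicks")
```

In this sketch, a single exact match is enough to attach the device ID to an account, and every other click carrying that ID comes along with it. Finer timestamps make ambiguous matches less likely, which is why millisecond precision matters.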
PCMag and Motherboard learned about the details surrounding the data collection from a source familiar with Jumpshot’s products. And privacy experts we spoke to agreed that the timestamps, persistent device IDs, and collected URLs could be analyzed together to expose someone’s identity.
“Most of the threats posed by de-anonymization—where you are identifying people—comes from the ability to merge the information with other data,” said Gunes Acar, a privacy researcher who studies online tracking.
He points out that major companies such as Amazon, Google, and branded retailers and marketing firms can amass entire activity logs on their users. With Jumpshot’s data, the companies have another way to trace users’ digital footprints across the internet.
“Maybe the (Jumpshot) data itself is not identifying people,” Acar said. “Maybe it’s just a list of hashed user IDs and some URLs. But it can always be combined with other data from other marketers, other advertisers, who can basically arrive at the real identity.”
The ‘All Clicks Feed’
According to internal documents, Jumpshot offers a variety of products that serve up collected browser data in different ways. For example, one product focuses on searches that people are making, including keywords used and results that were clicked.
We viewed a snapshot of the collected data, and saw logs featuring queries on mundane, everyday topics. But there were also sensitive searches for porn—including underage sex—information no one would want tied to them.
Other Jumpshot products are designed to track which videos users are watching on YouTube, Facebook, and Instagram. Another revolves around analyzing a select e-commerce domain to help marketers understand how users are reaching it.
But for one particular client, Jumpshot appears to have offered access to everything. In December 2018, Omnicom Media Group, a major marketing provider, signed a contract to receive what’s called the “All Clicks Feed,” or every click Jumpshot is collecting from Avast users. Normally, the All Clicks Feed is sold without device IDs “to protect against triangulation of PII (Personally Identifiable Information),” says Jumpshot’s product handbook. But when it comes to Omnicom, Jumpshot is delivering the product with device IDs attached to each click, according to the contract.
In addition, the contract calls for Jumpshot to supply the URL string for each site visited, the referring URL, and timestamps down to the millisecond, along with the suspected age and gender of the user, which can be inferred based on what sites the person is visiting.
It’s unclear why Omnicom wants the data. The company did not respond to our questions. But the contract raises the disturbing prospect that Omnicom can unravel Jumpshot’s data to identify individual users.
Although Omnicom itself doesn’t own a major internet platform, the Jumpshot data is being sent to a subsidiary called Annalect, which is offering technology solutions to help companies merge their own customer information with third-party data. The three-year contract went into effect in January 2019, and gives Omnicom access to the daily click-stream data on 14 markets, including the US, India, and the UK. In return, Jumpshot gets paid $6.5 million.
Who else might have access to Jumpshot’s data remains unclear. The company’s website says it’s worked with other brands, including IBM, Microsoft, and Google. However, Microsoft said it has no current relationship with Jumpshot. IBM, on the other hand, has “no record” of being a client of either Avast or Jumpshot. Google did not respond to a request for comment.
Other clients mentioned in Jumpshot’s marketing cover consumer product companies Unilever, Nestle Purina, and Kimberly-Clark, in addition to TurboTax provider Intuit. Also named are market research and consulting firms McKinsey & Company and GfK, which declined to comment on its partnership with Jumpshot. Attempts to confirm other customer relationships were largely met with no responses. But documents we obtained show the Jumpshot data possibly going to venture capital firms.
‘It’s Almost Impossible to De-Identify Data’
Wladimir Palant is the security researcher who initially sparked last month’s public scrutiny of Avast’s data-collection policies. In October, he noticed something odd with the antivirus company’s browser extensions: They were logging every website visited alongside a user ID and sending the information to Avast.
The findings prompted him to call out the extensions as spyware. In response, Google and Mozilla temporarily removed them until Avast implemented new privacy protections. Still, Palant has been trying to understand what Avast means when it says it “de-identifies” and “aggregates” users’ browser histories when the antivirus company has refrained from publicly revealing the exact technical process.