Data harvested without consent and before forms are submitted in many cases, researchers claim
Email addresses typed into online forms are often handed over to web trackers before being submitted and without user consent, a systematic study by computer scientists has discovered.
Email addresses – or identifiers derived from them – are apparently being used by data brokers and advertisers for cross-site and cross-platform identification of computer users.
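The article does not say how such identifiers are derived. As a minimal sketch, assuming a tracker hashes a normalized address with SHA-256 (the function name `email_identifier` and the sample addresses are hypothetical), the snippet below shows why a derived value can still link one person across unrelated sites:

```python
import hashlib


def email_identifier(email: str) -> str:
    """Return a SHA-256 hex digest of a trimmed, lower-cased email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two unrelated sites that see the same address compute the same identifier,
# so the value works as a cross-site key even though it is not the raw email.
id_from_shop = email_identifier("Alice@Example.com ")
id_from_news = email_identifier("alice@example.com")
print(id_from_shop == id_from_news)
```

Because the hash is deterministic, replacing the address with its digest does not stop records from different sites being joined on that digest.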
As part of an investigation into how data from online forms is used for tracking, a team of four computer scientists measured the extent of email and password collection prior to form submission by analyzing the top 100,000 websites.
The researchers – Asuman Senol from KU Leuven in Belgium, Mathias Humbert (University of Lausanne, Switzerland), and Gunes Acar and Frederik Zuiderveen Borgesius (both from Radboud University, the Netherlands) – compared results from two vantage points, in the US and EU, as well as between mobile and desktop browsers.
Tracking domains
The team found that email addresses were exfiltrated to tracking domains before form submission and without giving consent on 1,844 websites in the EU crawl and 2,950 websites in the US crawl.
In the majority of cases, data was sent to well-known tracker domains, but the researchers also identified 41 tracker domains that were not listed on popular blocklists.
Profiling for ad-serving purposes is not the only concern.
The researchers also identified seemingly inadvertent password collection on 52 websites by third-party session replay scripts.
A research paper based on the study is due to be presented at the upcoming USENIX Security ’22 conference.
Web users typically enter their email addresses into online forms for reasons including signing up to a service or subscribing to a newsletter.
The research shows that any data entered into such forms may end up in the hands of data brokers – sometimes even in cases where individuals have second thoughts about signing up to something and don’t hit ‘send’.
The researchers used an online crawler to systematically examine what happens when users close their session before submitting data entered on forms.
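A rough sketch of the kind of check such a crawler could run is shown below. It is not the researchers' code: the function names, the request format, and the `.example` hosts are all hypothetical, and the set of encodings is an assumption. The idea is to type a known address into a form without submitting it, capture the outgoing requests, and search third-party traffic for the address or a value derived from it.

```python
import base64
import hashlib
from urllib.parse import quote, unquote


def candidate_encodings(email: str) -> set[str]:
    """Build the forms in which an address might appear in a request (assumed set)."""
    raw = email.encode("utf-8")
    return {
        email,
        quote(email, safe=""),                  # percent-encoded
        base64.b64encode(raw).decode("ascii"),  # Base64
        hashlib.md5(raw).hexdigest(),
        hashlib.sha1(raw).hexdigest(),
        hashlib.sha256(raw).hexdigest(),
    }


def find_leaks(email: str, requests: list[dict], first_party: str) -> list[str]:
    """Return third-party hosts whose URL or body contains the address in any form."""
    needles = candidate_encodings(email)
    leaks = []
    for req in requests:
        host = req["host"]
        if host == first_party or host.endswith("." + first_party):
            continue  # requests to the visited site itself are not counted
        haystack = req["url"] + " " + req.get("body", "")
        haystack += " " + unquote(haystack)  # also search the decoded text
        if any(needle in haystack for needle in needles):
            leaks.append(host)
    return leaks


# Hypothetical traffic captured after typing the address but before submitting.
email = "jane@example.org"
digest = hashlib.sha256(email.encode("utf-8")).hexdigest()
captured = [
    {"host": "shop.example", "url": "https://shop.example/signup?e=" + email},
    {"host": "tracker.example", "url": "https://tracker.example/c?e=" + digest},
    {"host": "cdn.example", "url": "https://cdn.example/lib.js"},
    {"host": "collector.example", "url": "https://collector.example/in",
     "body": "email=jane%40example.org"},
]
print(find_leaks(email, captured, first_party="shop.example"))
# prints ['tracker.example', 'collector.example']
```

A real measurement would need a far larger set of encodings and a proper notion of "same site"; the sketch only illustrates the search step.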
GDPR violation concerns
Although not a lawyer, Gunes Acar, one of the four main researchers on the project, told The Daily Swig that the behavior of some of the websites may be in violation of stricter data privacy regulations such as the European Union’s General Data Protection Regulation (GDPR).
“Surreptitious email exfiltration for tracking purposes may breach some GDPR principles such as transparency, purpose limitation, and legal basis, but we cannot say for certain whether individual websites violate the GDPR (or other laws) without looking at the specifics,” Acar explained.
Almost half of the websites contacted responded to the researchers’ GDPR-related requests.
“Some websites said they didn’t know that their visitors’ emails were collected by third parties, and they fixed the issue,” Acar said. “That was the most positive outcome.
“Other websites informed us how they were using the emails collected through this behavior,” Acar added.
Countermeasures
Privacy-conscious internet users might well recoil from the revelations, as summarized in a blog post featuring screen captures and videos.
Fortunately, some countermeasures are already available.
Acar explained: “Adblockers (e.g. uBlock Origin) and privacy-focused browsers (Brave, DuckDuckGo) block requests to tracker and advertising domains, and hence may prevent this type of data collection. Only blocking the cookies wouldn’t provide any protection.
“Email relay services may be used to avoid giving the same email address to different online and offline businesses. Apple, DuckDuckGo, and Mozilla offer such services, which can be used to generate alias addresses,” Acar concluded.
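To illustrate the first countermeasure, here is a simplified sketch of domain-based request blocking. It is not how uBlock Origin or any particular browser is implemented: the blocklist entries and the function name `is_blocked` are hypothetical, and real blockers use much richer filter rules.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entries; real lists contain many thousands of domains.
BLOCKLIST = {"tracker.example", "ads.example"}


def is_blocked(url: str, blocklist: set[str] = BLOCKLIST) -> bool:
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check "a.b.c", then "b.c", then "c" against the blocklist.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


print(is_blocked("https://pixel.tracker.example/collect?e=abc123"))  # True
print(is_blocked("https://shop.example/signup"))                     # False
print(is_blocked("https://nottracker.example/"))                     # False
```

A blocked request is never sent, so the typed address never leaves the browser. Blocking cookies alone does not help here, because the address travels in the request URL or body rather than in a cookie.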
The researchers have developed a proof-of-concept browser add-on, LeakInspector, which informs users when their email addresses and passwords are scraped from forms, in addition to blocking “leaky” requests to tracker domains.
“Unfortunately, the add-on is not available on Chrome Web Store, because it relies on APIs that Google disallows in Manifest v3,” Acar said. “We are working on publishing the add-on to Firefox’s add-on repository.”