Google and Mozilla unveil plans to bake HTML sanitization into their browsers

Tech giants take aim at client-side scripting attacks

Engineers at Google, Mozilla, and security firm Cure53 have come together to develop an application programming interface (API) that provides a systematic solution to HTML sanitization.

The API, which will be integrated into future versions of the Mozilla Firefox and Google Chrome browsers, will enable web developers to sanitize HTML input strings and prevent cross-site scripting (XSS) attacks without the need for third-party libraries.

Preventing XSS attacks

Web applications are often required to handle dynamic HTML content in the browser, such as when rendering user-generated content or dealing with templates.

Client-side HTML processing often opens security holes, which malicious actors exploit to stage XSS attacks, steal user information, or send web commands on users’ behalf.

“Historically, the web has been confronted with XSS issues ever since the inception of JavaScript,” Frederik Braun, security engineer at Mozilla, told The Daily Swig.

“The web has an increase in browser capabilities with new APIs and can thus be added to the attacker’s toolbox.”

Sanitization libraries

Many developers rely on open-source JavaScript libraries such as DOMPurify to prevent XSS attacks. DOMPurify takes an HTML string as input and sanitizes it by escaping or removing potentially dangerous parts.
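To make the distinction concrete, the sketch below shows the bluntest defense: escaping every markup character so that input is displayed as text rather than parsed as HTML. It is an illustration only and is not how DOMPurify works internally; the helper name escapeHtml is hypothetical. A sanitizer goes further by keeping harmless markup, such as bold text or links, while dropping the dangerous parts.

```typescript
// Illustrative only: this is NOT DOMPurify's implementation.
// Escaping turns every markup character into an entity, so the browser
// renders the input as text instead of parsing it as HTML.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// escapeHtml is a hypothetical helper name, not part of any library.
function escapeHtml(input: string): string {
  return input.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// A classic XSS payload: the event handler would run if parsed as HTML.
const payload = '<img src=x onerror="alert(1)">';
console.log(escapeHtml(payload));
// prints: &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

Escaping is safe but destroys all formatting, which is why applications that must preserve user-supplied markup turn to a sanitizer instead.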

DOMPurify has become a very popular library among front-end JavaScript developers. But keeping the library in sync with changes to web standards and different versions of browsers presents challenges.

“The issue with parsing HTML is that it is [a] living standard and thus a quickly moving target,” Braun said. “To ensure that the HTML sanitizer works correctly on new input, it needs to keep up with this standard. The failure to do so can be catastrophic and lead to sanitizer bypasses.”

The Mozilla engineer added that HTML sanitizers must also solve the challenge of matching the exact parsing behavior of each browser.

Attacking from all angles

Michał Bentkowski, a security researcher with extensive experience of hunting XSS bugs, told The Daily Swig that JavaScript-based sanitizers present other hurdles as well.

“They increase the bundle size. I’ve met developers that were skeptical of including a sanitizer because they make their JS code much larger,” Bentkowski said.

“The sanitizers also need to deal with certain oddities of JS/DOM. For instance, they need to have some kind of protection against DOM clobbering attacks because they would be easily bypassable otherwise.”

‘POSIX’, another web security researcher, raised concerns about the fragmented state of sanitization libraries.

“Each of the libraries for [XSS mitigation] has been developed separately by different vendors,” he told The Daily Swig. “Even safe libraries can become vulnerable if there’s not continuous maintenance.”

Native HTML sanitization 

The HTML Sanitizer API bakes XSS mitigation functionality directly into the browser. Developers can instantiate and use the API’s sanitizer class without the need to import external libraries.
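Because the API was still unstable at the time of writing, any code against it should feature-detect rather than assume support. The sketch below is one way that might look, under stated assumptions: the method name setHTML comes from later drafts of the proposal and should be checked against current browser documentation, and HtmlTarget and renderUntrusted are hypothetical names introduced here for illustration.

```typescript
// Minimal shape of the element we write into. `setHTML` is optional because
// only browsers that ship a native sanitizer would provide it; the method
// name is an assumption based on later drafts of the proposal.
interface HtmlTarget {
  textContent: string | null;
  setHTML?: (html: string) => void;
}

// renderUntrusted is a hypothetical helper, not part of the API.
// It returns which path was taken so callers can log or test it.
function renderUntrusted(target: HtmlTarget, html: string): "sanitized" | "text" {
  if (typeof target.setHTML === "function") {
    // Native path: the browser parses and sanitizes the string itself.
    target.setHTML(html);
    return "sanitized";
  }
  // Fallback: never parse the string as HTML; display it as plain text.
  target.textContent = html;
  return "text";
}

// Stand-in for an element in a browser without the API, so the sketch
// also runs outside a browser.
const legacyEl: HtmlTarget = { textContent: null };
console.log(renderUntrusted(legacyEl, "<b>hi</b>")); // prints: text
```

In a page, the target would be a real element rather than a stand-in object. The plain-text fallback is deliberately conservative; a project that needs formatted output in older browsers would more likely fall back to a maintained library such as DOMPurify.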

“This moves the responsibility for correct parsing into a piece of software that is already getting frequent security updates and has proven successful in doing it timely,” Braun said.

Browsers already have baked-in sanitizers for clipboard data, Bentkowski said, so reusing the code to expand native sanitization capabilities would make perfect sense.

“In general, I believe that it is a right call to implement the sanitizer directly in the browser,” Bentkowski said.

“The message to developers that need to make use of a sanitizer is going to be simple: ‘Just use the built-in sanitizer; you don't need to include any external libraries!’ This should also raise awareness about the need for sanitization to average developers.”

POSIX added that “we don’t have to take time for thinking about which library to use anymore. And as [HTML Sanitizer API] is dependent on the browser, we can expect it to be managed more carefully than before, like react immediately to changes in browser functionalities”. But the researcher also warned against putting excessive trust in a single API.

Testing the API

According to its specification document, the HTML Sanitizer API will not prevent mutation-based XSS (mXSS) attacks or attacks that exploit server-side vulnerabilities. With extra configuration, however, it can prevent DOM clobbering and XSS attacks based on script gadgets.

An experimental implementation of the API has been integrated into Firefox Nightly and Chrome Canary, the test versions of the browsers. A test page for the API is also available online.

Braun said the API is ready for anyone who wants to critically analyze and test its security properties. But its methods and internal behavior are still unstable, so developers should not build applications against it yet, he added.

“This specification is incubating and thus in W3C’s Web Platform Incubator Community Group (WICG), but I’m personally hopeful it will be adopted by a W3C working group before this year ends,” Braun said.

Bentkowski, who has tested the API, said it has some issues but he believes these are “just teething troubles” that will be fixed soon.

“I hope that built-in sanitizer will reduce the amount of cross-site scripting issues in applications that deal with WYSIWYG editors or Markdown processing because these seem to be the main beneficiaries of the new standard,” he added.
