HTML-to-PDF converters open to denial-of-service, SSRF, directory traversal attacks

Infosec intern assailed eight open source libraries in 11 different ways

Five popular open source libraries used to convert HTML files to PDF documents are vulnerable to server-side request forgery (SSRF), directory traversal, and denial-of-service (DoS) attacks.

Discovered by an intern with just four months’ infosec industry experience, the findings emerged from an ambitious project that tested eight libraries – each written in a different language – against 11 hypotheses.

“Possibly the most interesting question of all” was whether the libraries would allow JavaScript code execution during the file conversion process, said Eduardo Müller, now security analyst at Tempest Security Intelligence after concluding his internship at the firm’s Brazil office, in a blog post.

This is because the candidates that affirmed the hypothesis would then be testable for further JavaScript-based attacks, added Müller, who also presented his findings at OWASP Latam@Home 2020.

Not only did all five viable libraries – Node-HTML-PDF, Go-wkHTML, DinkToPDF, wkHTML, and PDFKit – allow JavaScript execution, but all did so by default.

More egregiously still, JavaScript execution could only be disabled on wkHTML.
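For readers who want to see what switching the feature off can look like, here is a minimal sketch. It assumes the article's "wkHTML" is the wkhtmltopdf command-line tool and relies on that tool's `--disable-javascript` and `--disable-local-file-access` options. The helper names and the 30-second timeout are hypothetical and not taken from Müller's research.

```python
import subprocess
from typing import List


def build_command(html_path: str, pdf_path: str) -> List[str]:
    """Build a wkhtmltopdf command line with active content switched off."""
    return [
        "wkhtmltopdf",
        "--disable-javascript",         # do not run scripts embedded in the HTML
        "--disable-local-file-access",  # do not let the page read local files
        html_path,
        pdf_path,
    ]


def convert(html_path: str, pdf_path: str, timeout_seconds: int = 30) -> None:
    """Run the conversion; the timeout (an assumed value) bounds runaway renders."""
    subprocess.run(
        build_command(html_path, pdf_path),
        check=True,               # raise if wkhtmltopdf exits with an error
        timeout=timeout_seconds,  # raise TimeoutExpired if the render hangs
    )
```

Building the argument list separately from running it keeps the flags easy to review and avoids passing untrusted input through a shell.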

100% hit rate

The quintet of libraries open to JavaScript execution all fell foul of SSRF attacks, redirecting the hypothetical user to an external domain lurking within the HTML file following the conversion process.

Müller, 24, also generated PDFs containing the server’s passwd file, which he exfiltrated in all five instances by using the XMLHttpRequest object, but he failed to replicate the achievement when using the iframe, object, and portal tags.

While Müller said the use of an infinite loop successfully interrupted JavaScript execution and enabled DoS attacks, he expressed “doubts” about these findings, which were confirmed, to some degree, by his colleagues.

In none of the libraries vulnerable to JavaScript-driven SSRF could Müller detect a mechanism to cap requests, which would mitigate SSRF and DoS attacks.

The three libraries that did block some of the 200 requests he directed against the server to test this hypothesis were the same trio that did not allow JavaScript execution: Flying Saucer, WeasyPrint, and DomPDF.

The researcher also praised DomPDF for being the only library to provide security recommendations in its documentation, which also references known vulnerabilities.

Attack scenarios

There is a “plethora of attack scenarios” for exploiting these flaws, Müller tells The Daily Swig. “It really depends on the target’s application/infrastructure.

“A pretty straightforward example is an arbitrary file read, where an adversary could obtain configuration files, PIIs, or even [an] application’s source code. Our experience here at Tempest shows that by having this type of information an attacker usually ends up having a remote code execution.

“A more limited scenario (not less dangerous though) may happen in a cloud environment such as in Amazon AWS,” he adds. “By exploiting a SSRF vulnerability it would be possible to access the AWS EC2 Instance Metadata service, generate temporary tokens, and eventually access other services in AWS.”
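One way an application might guard against the metadata scenario Müller describes is to vet every URL before the converter is allowed to fetch it. The sketch below is illustrative only: the function names and the scheme allow-list are assumptions, and it inspects IP literals without resolving hostnames, so a real deployment would also need to check the addresses a hostname resolves to.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: only plain web URLs are ever legitimate inputs for the converter.
ALLOWED_SCHEMES = {"http", "https"}


def is_internal_address(host: str) -> bool:
    """Return True if host is an IP literal in an internal or special-purpose range."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Hostnames are NOT resolved in this sketch.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


def is_fetch_allowed(url: str) -> bool:
    """Reject non-web schemes (such as file:) and internal IP targets."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname
    if not host:
        return False
    return not is_internal_address(host)


if __name__ == "__main__":
    for candidate in (
        "http://169.254.169.254/latest/meta-data/",  # EC2 metadata endpoint
        "file:///etc/passwd",
        "https://example.com/logo.png",
    ):
        print(candidate, is_fetch_allowed(candidate))
```

The link-local check is what catches 169.254.169.254, the address of the EC2 Instance Metadata service.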

He expressed surprise that HTML-PDF conversion could yield so many flaws, and was “astonished” at “how easy it was to read arbitrary files.

“Because of its impacts, it represents a great risk to the application,” he said.

Security recommendations

Although the research didn’t surface fixable security vulnerabilities per se, Müller says vendors could still avoid having certain features activated by default, and better document the potential risks of activating them.

They could also “create features that allow the developers to distinguish between data and code”, although “this would require a major redesign and probably break API backward compatibility”.

As well as applying software updates, the researcher recommends that users disable JavaScript execution where possible when it isn’t required, use a sandbox to restrict permissions, and validate and filter HTML data before converting it into PDF form.
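The "validate and filter" step could, for instance, take the form of a pre-conversion check that rejects HTML containing active content. What follows is a rough sketch using only Python's standard library. The tag list, the URL prefixes, and the names `ActiveContentFinder` and `find_active_content` are assumptions made for illustration. A deny-list like this is unlikely to be complete, so a maintained HTML sanitizer would be the safer choice in production.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Assumed deny-lists, chosen to mirror the vectors mentioned in the article.
RISKY_TAGS = {"script", "iframe", "object", "embed", "portal"}
RISKY_URL_PREFIXES = ("javascript:", "file:")


class ActiveContentFinder(HTMLParser):
    """Collect a description of every risky tag or attribute encountered."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.findings: List[str] = []

    def handle_starttag(
        self, tag: str, attrs: List[Tuple[str, Optional[str]]]
    ) -> None:
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in RISKY_TAGS:
            self.findings.append(f"<{tag}> element")
        for name, value in attrs:
            if name.startswith("on"):
                # Inline event handlers such as onerror or onload.
                self.findings.append(f"{name} handler on <{tag}>")
            elif value and value.strip().lower().startswith(RISKY_URL_PREFIXES):
                self.findings.append(f"{name} URL on <{tag}>")


def find_active_content(html: str) -> List[str]:
    """Return the findings for html; an empty list means nothing was flagged."""
    finder = ActiveContentFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

An application could refuse to convert any document for which `find_active_content` returns a non-empty list, rather than trying to repair the markup.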

Müller thanked colleagues who helped him with the project. “They really made the difference,” he says.

Other projects to emerge recently from Tempest’s internship program include the development of Burp Suite extensions that automate failure detection in HTML development, and that detect Cypher injection in applications that use Neo4j databases.
