Trellix automates tackling open source vulnerabilities at scale

More than 61,000 vulnerabilities patched and counting


Trellix has patched over 61,000 open source projects against a severe Python bug with the help of an automated tool that dramatically accelerated the process.

Last year, the Trellix Advanced Research Center team stumbled upon a 15-year-old vulnerability embedded in Python’s tarfile module. Tracked as CVE-2007-4559, the vulnerability is described as a path traversal issue leading to “user-assisted remote attackers” being able to overwrite arbitrary files via “a .. (dot dot) sequence in filenames in a TAR archive”.
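To make the path traversal concrete, here is a minimal Python sketch of the kind of check that blocks a “..” entry before extraction. It is an illustration under stated assumptions, not Trellix’s patch: the helper names `is_within_directory` and `safe_extract` are hypothetical, and the check covers only member names, not symlinks or hard links inside the archive.

```python
import os
import tarfile


def is_within_directory(directory: str, target: str) -> bool:
    """Return True if target resolves to a location inside directory."""
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extract(archive: tarfile.TarFile, dest: str = ".") -> None:
    """Hypothetical helper: extract only if every member stays inside dest."""
    for member in archive.getmembers():
        member_path = os.path.join(dest, member.name)
        # A name such as "../evil.txt" resolves outside dest and is rejected.
        if not is_within_directory(dest, member_path):
            raise ValueError(f"blocked path traversal attempt: {member.name}")
    archive.extractall(dest)
```

A caller would open the archive with `tarfile.open(...)` and pass it to `safe_extract` in place of calling `extractall` directly.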


According to Trellix researcher Douglas McKee, while the security flaw was reported back in 2007, it was “left unchecked” and never resolved. As a result, the vulnerability has been unwittingly included in approximately 350,000 open source projects – and is considered “prevalent” in many closed source projects.

However, as documented in a blog post dated January 23, Trellix has been working with GitHub to contain the flaw – a tall order when so many projects are vulnerable.

“The vulnerable tarfile module is included in the base Python package [...] it is also, without a direct fix from Python, firmly embedded in the supply chain of many projects,” the cybersecurity company says.

Led by Kasimir Schulz and Charles McFarland, the months-long project centered on the automatic patching of open source repositories containing vulnerable code.

Bulk pull request tactic

The inspiration apparently came from Jonathan Leitschuh’s DEFCON 2022 presentation, which discussed using automated bulk pull request generation as a scalable methodology for fixing open source vulnerabilities.

Trellix and GitHub split the process into two phases. Both were automated and only needed to be run, with quality control – and acceptance – left up to project owners.

The first step was to develop the patch itself. Trellix obtained a list of repositories and files containing the keyword “import tarfile” and then cloned and scanned each repository using Creosote.
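The scanning step can be approximated with a short Python sketch. This is a simplified stand-in, not Creosote’s actual detection logic, which the article does not describe: the function names and the `RISKY_METHODS` set are assumptions, and flagging every `.extract()` or `.extractall()` call in a file that imports tarfile will produce false positives that a real scanner would need to filter out.

```python
import ast
from pathlib import Path

# Assumption: method names worth flagging for manual or automated review.
RISKY_METHODS = {"extract", "extractall"}


def find_risky_calls(source: str) -> list[int]:
    """Return line numbers of .extract()/.extractall() calls in source."""
    if "import tarfile" not in source:
        return []
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []  # skip files that are not valid Python for this interpreter
    lines = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in RISKY_METHODS
        ):
            lines.append(node.lineno)
    return sorted(lines)


def scan_repository(root: str) -> dict[str, list[int]]:
    """Map each flagged .py file under root to the flagged line numbers."""
    findings = {}
    for path in Path(root).rglob("*.py"):
        try:
            source = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue
        hits = find_risky_calls(source)
        if hits:
            findings[str(path)] = hits
    return findings
```

Calling `scan_repository` on a cloned repository’s root directory returns a dictionary of candidate files, which a patching step could then work through.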

“If a repository was determined to contain the vulnerability, we patched the file and created a local patch diff containing the patched file so users can easily compare the two files, the original file, and some metadata about the repository,” McKee explained.

In the pull request phase, the cybersecurity team created repository forks, cloned them, and replaced the original file with the patched version only if the original file had not changed since it was scanned. This check was implemented so that the patched replacement would not overwrite recent additions to the project’s code.
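The “only if unchanged” check could look something like the Python sketch below. The article does not say how Trellix detected changes, so comparing a SHA-256 digest recorded at scan time is an assumption, as are the helper names `sha256_of` and `apply_patch_if_unchanged`.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def apply_patch_if_unchanged(target: Path, patched: Path, original_sha256: str) -> bool:
    """Overwrite target with patched only if target still matches the scanned original.

    original_sha256 is the digest recorded when the file was first scanned
    (an assumed piece of metadata, not something the article confirms).
    """
    if sha256_of(target) != original_sha256:
        return False  # the project changed the file; leave it alone
    shutil.copyfile(patched, target)
    return True
```

Returning `False` rather than raising lets a bulk run skip the repository and move on, leaving the file for a fresh scan later.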

Lastly, the file was committed, a pull request was generated, and a message was sent explaining the fork and asking the owner to accept or reject the changes.

Scale up

Speaking to The Daily Swig, Kasimir Schulz, vulnerability researcher at Trellix’s Advanced Research Center, said Creosote and the patcher can together perform repository scans, detect the bug, and apply a patch in a matter of seconds – whereas even the most skilled developer would take minutes to do the same without the tool’s help.

“While this difference doesn’t matter much for a handful of repositories, it can be felt quickly as the scale increases,” Schulz noted.

Through GitHub, the Trellix team has patched 61,895 open source projects to date.

Schulz said that recent discussions at ShmooCon have created “new momentum” for the vulnerability to be patched in Python itself, and there may even be “the possibility of a financial reward being offered in return for a fix”.

Schulz concluded: “Software and supply chains keep getting more complex. There are more people and companies building tons of different software. Because of this, trying to reduce the attack surface is a losing battle. Instead, we should focus on auditing our own supply chains through automated tooling, securing the attack surface rather than wasting time in a battle that can’t be won.”
