Novel supply chain attack detected in the wild just days after security researcher disclosed the technique

Dependency confusion attack mounted via PyPi repo exposes flawed Python package installer behavior

UPDATED The default behavior of pip, the Python package installer, leaves the software development process vulnerable to ‘dependency confusion’ attacks, a software vendor has discovered.

Use of the novel supply chain attack technique has been detected in the wild only a week after it was disclosed by its architect.

Pip’s insecure behavior highlights a “major problem in the way code is being shared and reused through node package manager [NPM], PyPi, and other online repositories”, says Henri Terho, chief R&D evangelist at Qentinel, in a blog post.

Infiltrating the build process

The attack came to light on February 16 when a developer at the automated software testing specialist reported the mysterious failure of a build pipeline when fetching internal libraries.

The company then traced the problem to suspicious packages in the Python Package Index (PyPi) repository.

With the help of Python’s security team, these packages were blocklisted the next day (February 17) in order to prevent them from infiltrating any more builds.

Qentinel also registered for itself “the domains that the packages were supposedly registered from”, to prevent the rogue libraries’ creator from abusing them to spoof emails.

Dependency confusion: A growing threat

As reported by The Daily Swig last week, security researcher Alex Birsan fashioned and successfully deployed the ‘dependency confusion’ technique against more than 35 organizations, including Apple, Microsoft, and PayPal.

“As described by Alex, dependency confusion attack exploits misconfigured build scripts and one-off mistakes of developers to pull the malicious library from the public repository and not the actual library from a private one,” says Terho.

“The publicly released package then contains malicious code which phones home and even allows for remote code execution.”

The attack surface for such attacks is enormous, given how routinely private and public dependencies are pulled into applications’ source code.

“These package indexes are constantly growing in their role as an attack vector against companies,” Terho tells The Daily Swig, noting the progression from ‘typosquatting’ techniques to “more advanced attack vectors”.

And as “more software is being developed on top of shared libraries – for example a quick web project I did only had about 100 lines of own code but over 10 libraries from NPM – this leads to growing risk for everyone,” he adds.

“I think the community now has to start taking a more proactive stance on ‘LibSec’, [the] security of imported public libraries, and there are now even companies operating on that space, such as Snyk.”

Wake-up call

As with Birsan’s NPM packages, the PyPi packages thankfully seemed to contain no malicious code – giving the software development ecosystem a salutary wake-up call about the threat.

The packages “were empty placeholder libraries”, found Qentinel.

“For us it was lucky,” says Terho. “Seems like it was a trial run by somebody, but I can easily see an automated script scanning for private packages and deploying public malicious packages to these ecosystems as a way to access a lot of companies who are not as awake on the infra security side.”

Qentinel identified three rogue libraries, created by an unknown PyPi account, that were being used by four of its products: Qentinel Pace, QWeb, QVision, and QMobile.

Since pip defaults to fetching libraries from PyPi, “those external libraries were fetched, but not the actual libraries from our private repositories,” explains Terho.

“The newly created public repositories did not contain our source code, so the dependencies failed in build.”

Insecure by default

Pip’s insecure behavior centers on the --extra-index-url parameter. With it, pip looks for the library in both the specified index and the public one and, if it finds more than one version, installs the package with the highest version number.

All the PyPi attacker had to do was upload a library with a very high version number.
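
The selection rule described above can be illustrated with a minimal sketch. This is not pip’s actual resolver: it assumes plain dotted numeric versions, and the index names and version lists are hypothetical. It only shows why the highest version number wins when candidates from a private and a public index are pooled.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def pick_candidate(indexes: Dict[str, List[str]]) -> Tuple[str, str]:
    """Return (index_name, version) for the highest version across all indexes."""
    candidates = [
        (parse_version(version), index_name, version)
        for index_name, versions in indexes.items()
        for version in versions
    ]
    if not candidates:
        raise LookupError("package not found in any index")
    # Every index is treated equally; only the version number decides.
    _, index_name, version = max(candidates)
    return index_name, version


# Hypothetical data: versions of one internal package name seen in each index.
indexes = {
    "private": ["1.0.0", "1.2.0"],
    "public": ["99.0.0"],  # attacker-uploaded placeholder with an inflated version
}
print(pick_candidate(indexes))
```

In this sketch the public placeholder is chosen even though the private index holds the genuine library, because nothing in the rule gives the private index priority.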

“This problem has to be solved at the build pipeline level in updating” the “default behavior of pip and other tools,” says Terho.

In the meantime, developers can mitigate the problem by using only --index-url to specify pip’s custom repository address, so that the package is retrieved from the custom, rather than the public, repository.
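
As a rough sketch of that mitigation, a build script might assemble its pip invocation so that only a single index is ever passed. The package name and repository URL below are placeholders, not Qentinel’s, and the helper function is hypothetical.

```python
import sys
from typing import List


def pip_install_command(package: str, index_url: str) -> List[str]:
    """Build a pip command that consults one index only (no --extra-index-url)."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", index_url,  # replaces the default public index
        package,
    ]


# Placeholder values for illustration only.
cmd = pip_install_command("internal-lib", "https://pypi.internal.example/simple/")
print(" ".join(cmd))
# To actually run it in a pipeline: subprocess.run(cmd, check=True)
```

One consequence of this choice is that the custom repository then has to serve, or mirror, any public packages the build also needs, since pip will no longer look anywhere else.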

Terho advises “clients to purge all caches in their build pipelines which might contain the fake repositories and check that their build scripts are configured correctly.”
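
One way to start checking build scripts is to search them for the risky option. The sketch below is an assumption about what such a check could look like, not a tool Qentinel describes: it flags lines that use --extra-index-url on the command line, the extra-index-url setting in a pip configuration file, or the PIP_EXTRA_INDEX_URL environment variable. The sample script text is hypothetical.

```python
import re
from typing import List, Tuple

# Matches the CLI flag, the pip config key, and the environment variable.
EXTRA_INDEX = re.compile(
    r"(--extra-index-url|extra-index-url\s*=|PIP_EXTRA_INDEX_URL)"
)


def find_extra_index_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) for each non-comment line using an extra index."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        if EXTRA_INDEX.search(stripped):
            hits.append((number, stripped))
    return hits


# Hypothetical build script excerpt.
sample = """\
# install internal dependencies
pip install --extra-index-url https://pypi.internal.example/simple/ internal-lib
pip install --index-url https://pypi.internal.example/simple/ internal-lib
"""
for number, line in find_extra_index_lines(sample):
    print(f"line {number}: {line}")
```

A text search like this will miss options assembled dynamically at build time, so it is a first pass rather than a guarantee.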

He also recommends that anyone who updated or installed Pace Connect between February 15 and 17 reinstall the packages.

This article was updated on February 22 with additional comments from Henri Terho of Qentinel.
