Attacks could put ‘millions of users’ at risk

Open source ecosystem ripe for dependency confusion attacks, research finds

UPDATED The most popular open source repositories are rife with misconfigurations that leave countless downstream applications at risk from dependency confusion attacks, security researchers have discovered.

Of the 1,000 organizations whose GitHub accounts were selected for analysis based on star rating and activity levels, more than one in five – 212 – had at least one dependency confusion-related misconfiguration in their codebase.

“If any of their projects get affected, there’s a high probability that millions of users will be at risk,” said Somdev Sangwan, security researcher at RedHunt Labs, in a blog post.

Spate of attacks

Dependency confusion attacks slip malicious components into the open source ecosystem by publishing a malicious public package under the exact same name as a privately used dependency, so that the public package overrides the private one.
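The override can be illustrated with a minimal Python sketch. It assumes a naive resolver that consults a private and a public index and simply installs the highest version number it finds under a given name, which is how some package manager configurations are understood to behave. The package name `acme-utils`, the index contents, and the `resolve` helper are hypothetical and exist only for illustration.

```python
# Sketch of a naive resolver (an assumption, not any real package manager's code):
# it merges candidates from every configured index and picks the highest version.

def parse_version(version):
    """Turn a dotted version string such as '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def resolve(name, indexes):
    """Return (index_name, version) for the highest version of `name`, or None."""
    candidates = []
    for index_name, packages in indexes.items():
        for version in packages.get(name, []):
            candidates.append((parse_version(version), index_name, version))
    if not candidates:
        return None
    _, index_name, version = max(candidates)
    return index_name, version


# Hypothetical index contents: the attacker has registered the internal name
# on the public index with an inflated version number.
indexes = {
    "private": {"acme-utils": ["1.2.0", "1.3.0"]},
    "public": {"acme-utils": ["99.0.0"], "requests": ["2.31.0"]},
}

print(resolve("acme-utils", indexes))  # ('public', '99.0.0')
```

In this sketch the build silently pulls the attacker's copy, because nothing ties the name `acme-utils` to the private index.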

There has been a spate of these supply chain attacks since the technique’s architect, security researcher Alex Birsan, revealed in February that he had successfully compromised ecosystems maintained by Apple, Microsoft, and PayPal with ‘dummy’ packages.

Only a week later, suspicious packages bearing the name of legitimate libraries were found in the Python Package Index (PyPI) repository, and soon after PyPI and NPM Registry maintainers removed thousands of malicious packages smuggled into repositories using the same ‘substitution’ technique.

Ripe for hijacking

Of 38,691 individual repositories scanned by RedHunt Labs, 20,220 contained files used to store dependencies.

Defying the researchers’ expectations, the most common issue to surface among these was packages with ‘unreachable’, and therefore hijackable, sources. These comprised 169 repositories that had installed packages from expired domains, and 126 that contained packages owned by non-existent GitHub or GitLab profiles.

The researchers also found numerous instances of publicly unavailable packages, meaning they had been deleted or mistyped, or were private packages that could be registered publicly.

Almost 10% of organizations (93) were using at least one package that didn’t exist on a public package index, the researchers found.
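The two categories described above can be sketched as a simple check over a package.json file. This is not RedHunt Labs' tooling: the `classify_dependencies` helper, the set of known public names, and the set of live hosts are all assumptions supplied by the caller, and the sketch only covers the expired-domain case of ‘unreachable’ sources, not missing GitHub or GitLab profiles.

```python
import json
from urllib.parse import urlparse


def classify_dependencies(package_json_text, public_names, live_hosts):
    """Flag dependencies that look hijackable.

    public_names: names assumed to exist on the public registry.
    live_hosts:   hostnames assumed to still resolve to their original owner.
    Both sets are hypothetical inputs; a real scan would query the registry and DNS.
    """
    manifest = json.loads(package_json_text)
    deps = {}
    for section in ("dependencies", "devDependencies"):
        deps.update(manifest.get(section, {}))

    findings = {"unavailable": [], "unreachable": []}
    for name, spec in sorted(deps.items()):
        host = urlparse(spec).hostname  # None for plain version ranges like "^1.3.0"
        if host is not None:
            # Dependency is fetched from a URL; flag it if the host is not known to be live.
            if host not in live_hosts:
                findings["unreachable"].append(name)
        elif name not in public_names:
            # Dependency is expected on the registry, but nobody has registered the name.
            findings["unavailable"].append(name)
    return findings


# Hypothetical manifest for illustration only.
example = """
{
  "dependencies": {
    "left-pad": "^1.3.0",
    "acme-internal": "1.0.0",
    "old-lib": "https://expired.example/old-lib.tgz"
  }
}
"""

findings = classify_dependencies(example, {"left-pad"}, {"github.com"})
print(findings)  # {'unavailable': ['acme-internal'], 'unreachable': ['old-lib']}
```

Either finding gives an attacker something to claim: an unregistered package name in the first case, a lapsed domain in the second.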

Flip the Script

JavaScript projects tend to have the most dependencies, with 12,212 of 17,496 JavaScript repositories containing the telltale package.json files used by JavaScript package managers NPM and Yarn.

Along with Golang dependencies, JavaScript dependencies are also the most likely to come from unreachable sources.

JavaScript repos collectively contained 345 such packages, along with 72 publicly unavailable packages.

The corresponding numbers for the 4,198 Golang/Go repos, 2,052 of which used a go.mod dependency management file, were 69 packages from unreachable sources and zero unavailable packages.

Of 8,614 Python repositories, only 2,906 appeared to use dependencies; these contained 40 unavailable packages and seven packages from unreachable sources.

Ruby repos didn’t contain a single unavailable package and used just seven packages from sources that were not publicly reachable, despite having a significant number of dependencies. Of 4,538 Ruby repositories, 3,044 used a ‘Gemfile’ to declare their dependencies.

Of 33 PHP repositories with the equivalent composer.json file, one contained an unavailable package and two contained packages from unreachable sources.

Exploiting the trust model

“It was surprising to see how such large organizations that are very security conscious could have also been affected by simply exploiting the trust model (supply chain),” Somdev Sangwan of RedHunt Labs told The Daily Swig.

“Some dependencies managers have responded responsibly by patching the problematic components. However, it’s up to end-users to implement them securely, as humans are the weakest link after all.

“While the industry has already learnt about these attacks, it still keeps coming back in various forms,” continued the researcher.

“Python’s mismatch of PyPI and import names is such an example. It is not inherently a bug on Python’s end but simply a common human error.

“Having said that, it’s time to reassess our trust in resources maintained by third parties.”

Major software vendors have been scrambling to contribute to a defense-in-depth response to the dependency confusion threat in the wake of Alex Birsan’s research.

In February, Google set out proposals for creating “well-defined, verifiable official versions” of ‘critical’ software. The proposals received pushback and prompted alternative, supposedly lower-friction suggestions from Firefox CTO Eric Rescorla, such as fine-grained sandboxing and a dependency tagging system.

Sonatype, meanwhile, launched a ‘dependency/namespace confusion checker’ tool that detects the signatures of dependency confusion attacks.

Microsoft and GitHub, along with RedHunt Labs itself, have also set out ways to mitigate the threat.

The Daily Swig has also contacted GitHub and the NPM Registry for further comment. We will update this article if and when we hear back.

This article was updated on May 25 with comments from Somdev Sangwan of RedHunt Labs.
