Two packages lay undiscovered for 10 months
UPDATED The Python Package Index (PyPI) has removed malware-deploying and data-stealing packages that were collectively downloaded thousands of times.
The trio of malicious packages duped unsuspecting users by typosquatting the names of legitimate packages.
For the two packages that exfiltrated data from compromised systems, download counts may also have been inflated by the way their authors falsely burnished the packages’ credibility.
“Both of these packages included their source code URL as an existing popular library, so anyone browsing to the package in PyPI or analyzing how popular the library was would see a large number of GitHub stars and forks – indicating a good reputation,” said Andrew Scott, product manager at Palo Alto and maintainer of Python security project Ochrona Security, in a Medium blog post.
Uploaded by the same user, the two packages – ‘dpp-client’ and ‘dpp-client1234’ – appeared to target users of Apache Mesos, which is used to manage computer clusters.
They were uploaded onto PyPI in February 2021, after which dpp-client was downloaded more than 10,000 times, including more than 600 downloads in the last month alone.
Scott thanked the Python security team for removing the packages promptly on December 13, the same day he notified them.
A third package, the Trojan-smuggling ‘aws-login0tool’, notched roughly 600 downloads between its appearance on PyPI on December 1 and its removal on December 10, when PyPI admins were alerted.
“I believe that aws-login0tool was meant to confuse users of a tool called aws-login-tool which no longer exists on PyPI, but is on some older mirrors,” Scott told The Daily Swig.
“The dpp-client packages I have to assume are maybe [intended to imitate] an internal component of some kind of data processing pipeline tool, but I haven’t been able to confirm that.”
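As a rough illustration of how near-identical names such as aws-login0tool and aws-login-tool might be caught automatically, the sketch below compares a candidate name against a list of known package names using Python's standard difflib module. The function names, the 0.9 threshold, and the sample list are assumptions made for illustration; none of it is drawn from Scott's tooling or from PyPI's own checks.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two package names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_typosquats(candidate, known_names, threshold=0.9):
    """Return known names that the candidate closely resembles but does not equal.

    The threshold is a hypothetical starting point, not a tuned value.
    """
    return [
        name
        for name in known_names
        if name.lower() != candidate.lower()
        and name_similarity(candidate, name) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical list of names to protect; a real check would need a far larger set.
    known = ["aws-login-tool", "requests", "numpy"]
    print(likely_typosquats("aws-login0tool", known))
```

A pure string-similarity check like this will produce false positives on legitimately similar names, so any hits would still need manual review.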
All three packages were flagged as potentially malicious because their code contained the string ‘import urllib.request’, “since this is commonly used to exfiltrate data or download malicious files”, said Scott.
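A string search of this kind can be approximated in a few lines of Python. The sketch below walks a directory of already-extracted package sources and lists the .py files that import urllib.request. The regular expression and the function name flag_files are assumptions for illustration, not Scott's actual search terms, and a match only marks a file for manual review, since plenty of legitimate packages use the same import.

```python
import re
from pathlib import Path

# Matches "import urllib.request" and "from urllib.request import ..."
# at the start of a line (leading whitespace allowed).
# This pattern is an illustrative assumption, not the researcher's own.
NETWORK_IMPORT = re.compile(
    r"^\s*(?:import\s+urllib\.request\b|from\s+urllib\.request\s+import\b)",
    re.MULTILINE,
)


def flag_files(root):
    """Return a sorted list of .py files under root that match the pattern."""
    flagged = []
    for path in Path(root).rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            # Unreadable files are skipped rather than aborting the scan.
            continue
        if NETWORK_IMPORT.search(text):
            flagged.append(path)
    return sorted(flagged)


if __name__ == "__main__":
    # "extracted" is a hypothetical directory of unpacked packages.
    for hit in flag_files("extracted"):
        print(hit)
```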
The data-stealing pair gathered environment variables and file listings, apparently seeking Apache Mesos-related files, and relayed them “to an unknown web service”.
The aws-login0tool package performed a standard install before fetching an .exe file “from a nondescript domain” and attempting to execute it. The file is a known Windows Trojan.
“It’s hard to know what the impact of these would be,” said Scott. “The Trojan package would only be limited to the malware capabilities and the data extraction will really depend on your environment – but I could definitely see this being able to harvest things like AWS creds and other API keys. I’m less sure what Mesos-specific information is stored in the targeted directories.”
The findings emerged from Scott’s static analysis of around 200,000 PyPI packages – approaching two-thirds of the total – which he downloaded with the mirroring tool Bandersnatch.
He extracted the packages by creating “a pretty simple Python script to recursively iterate through Bandersnatch’s somewhat complicated folder structure then decompressed and extracted each sdist, egg, or wheel out to a flat directory.
“Once extracted I ran a number of string and regex searches using grep, then manually reviewed the results,” said Scott.
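The extraction step Scott describes could look something like the following minimal sketch, which walks a mirror directory recursively and unpacks each sdist, egg, or wheel into its own folder under a single flat output directory. Scott's script has not been published in this article, so the function name extract_all, the list of file suffixes, and the error handling are all assumptions. Because the archives may be hostile, the sketch uses tarfile's "data" extraction filter where the running Python version provides it.

```python
import tarfile
import zipfile
from pathlib import Path

# Wheels and eggs are zip archives; sdists are usually tarballs or zips.
ZIP_SUFFIXES = (".whl", ".egg", ".zip")
TAR_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2")


def extract_all(mirror_root, out_dir):
    """Unpack every archive under mirror_root into out_dir/<archive name>/."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" filter when this Python version supports it.
    tar_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    # sorted() lists the archives up front, before anything new is written.
    for archive in sorted(Path(mirror_root).rglob("*")):
        if not archive.is_file():
            continue
        name = archive.name.lower()
        target = out / archive.name
        try:
            if name.endswith(ZIP_SUFFIXES):
                with zipfile.ZipFile(archive) as zf:
                    zf.extractall(target)
            elif name.endswith(TAR_SUFFIXES):
                with tarfile.open(archive) as tf:
                    tf.extractall(target, **tar_kwargs)
            else:
                continue
        except (zipfile.BadZipFile, tarfile.TarError, OSError):
            # Corrupt or unsafe archives are skipped, not fatal.
            continue
        extracted.append(target)
    return extracted
```

Unpacking untrusted packages is best done in a disposable virtual machine or container, and nothing extracted should be installed or executed.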
This technique also uncovered a minor vulnerability in an open source package developed by a commercial vendor.
Scott said Ochrona, an open source software composition analysis tool, can help developers if they’re using a mirror or want to ascertain whether the packages are present in their project.
He also intends to update and refine his package analysis and will later publish additional findings.
This article was updated with additional comment from Andrew Scott on December 14.