When pip install Becomes the Attack Vector: Inside the PyPI Supply-Chain Breach

What Happened, How It Worked, and How to Defend Against It

Software supply-chain attacks don’t usually look dramatic.
There’s no pop-up, no crash, no obvious warning. Most of the time, everything looks normal, and that’s exactly why they work.

The PyPI attack that targeted the PyTorch ecosystem is a perfect example. It didn’t break cryptography, exploit kernels, or compromise source repositories. Instead, it quietly abused trust, automation, and default behaviors most developers never think twice about.

This article breaks the incident down end-to-end and then goes deep on prevention, mitigation, and practical analysis, including how to safely audit setup.py, where most of the damage actually happens.


What the attack actually was

This incident was a software supply-chain attack targeting developers who use PyTorch via the Python packaging ecosystem.

Key facts up front:

  • PyTorch’s official source code was not compromised
  • No GitHub repositories were hacked
  • The attack lived entirely in malicious third-party packages
  • Distribution happened through the Python Package Index (PyPI)

The attackers uploaded packages that looked legitimate, relied on normal pip install behavior, and executed malicious code during installation.

No exploit required.
No vulnerability scanner would scream.
Everything behaved “normally”.


Why PyTorch and ML environments were ideal targets

Machine-learning environments are unusually valuable and unusually soft.

They often include:

  • Cloud GPU instances
  • Expensive compute credits
  • CI/CD pipelines
  • Research data
  • Proprietary models
  • Long-lived cloud credentials in environment variables

At the same time:

  • Dependency trees are large
  • Nightly and experimental builds are common
  • Security hardening is often secondary to speed
  • Developers copy installation commands from blogs, issues, and notebooks

From an attacker’s perspective, this is a high-return environment with low friction.


How the attack worked

The attackers used two main techniques.

1. Typosquatting

They published packages with names that were almost identical to real PyTorch-related packages:

  • small spelling changes
  • extra hyphens or underscores
  • names resembling internal or nightly components

A quick glance wouldn’t reveal the difference.
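
A quick glance can be backed up by a mechanical check. The sketch below is one possible way to flag names that are close to, but not the same as, packages you actually depend on. The allow-list, the function name near_misses, and the 0.8 similarity cutoff are all assumptions chosen for illustration, not a vetted detection rule.

```python
import difflib
import re

# Hypothetical allow-list; replace it with the packages your project really uses.
KNOWN_PACKAGES = {"torch", "torchvision", "torchaudio", "numpy"}


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def near_misses(requested: str, known=KNOWN_PACKAGES, cutoff: float = 0.8):
    """Return known names that `requested` resembles without matching exactly."""
    wanted = normalize(requested)
    known_norm = {normalize(k) for k in known}
    if wanted in known_norm:
        return []  # exact match after normalization: nothing suspicious
    # difflib scores string similarity between 0 and 1; keep only close hits.
    return difflib.get_close_matches(wanted, sorted(known_norm), n=3, cutoff=cutoff)


if __name__ == "__main__":
    for candidate in ["torch", "torchvison", "requests"]:
        print(candidate, "->", near_misses(candidate))
```

An empty list means either an exact match or nothing similar. A non-empty list means the requested name looks like a known package but is not that package, which is worth a manual check before installing.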


2. Dependency confusion

Some builds referenced internal package names that were never meant to exist publicly.

Attackers:

  1. Discovered those names
  2. Uploaded packages with the same names to PyPI
  3. Assigned higher version numbers
  4. Let pip resolve the dependency automatically

Because pip, when given more than one index (for example via --extra-index-url), treats them as a single pool and picks the highest matching version, the malicious public package won.
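
To see why an inflated version number is enough, here is a deliberately simplified model of that selection step. It is not pip’s real resolver code; the index names, the version strings, and the helper functions are hypothetical and only illustrate the "highest version wins, regardless of source" behavior.

```python
def parse_version(text: str) -> tuple:
    """Turn a simple dotted version like '2.10.0' into (2, 10, 0) for comparison."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(candidates):
    """Pick the highest version across all indexes, ignoring where it came from.

    `candidates` is a list of (index_name, version_string) pairs. This is a
    simplified stand-in for a resolver that merges several indexes into one
    candidate pool.
    """
    return max(candidates, key=lambda item: parse_version(item[1]))


# Hypothetical data: the private build is the one the team intended to use.
candidates = [
    ("private-index", "2.0.0"),
    ("pypi", "99.0.0"),  # attacker-uploaded package with an inflated version
]
winner = pick_candidate(candidates)
print(winner)
```

In this toy model the public package wins purely on its version number. Nothing about the name or the index marks it as the wrong choice.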


Where the malicious code lived

The most important detail in this entire story:

Installing a Python package executes code.

The malicious logic was placed in:

  • setup.py
  • custom install hooks
  • occasionally top-level __init__.py (which runs on first import rather than at install time)

That means:

  • You didn’t have to import the package
  • You didn’t have to run your application
  • Running pip install was enough

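This is not a bug. It’s how Python packaging works: building a source distribution runs its setup.py, and pip does that automatically whenever it installs from source instead of a prebuilt wheel.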


What the malicious code actually did

The malware was small, quiet, and deliberate.

Step 1: Execute silently during install

The code ran automatically as soon as installation started and was designed to:

  • avoid errors
  • avoid output
  • let installation succeed

Failed installs attract attention. Successful ones don’t.


Step 2: Collect system information

The code gathered basic context:

  • OS type
  • username
  • execution path
  • Python version
  • whether it was running in CI or cloud

This helped the attacker understand the environment.


Step 3: Steal environment variables

This was the real payload.

The malware scanned environment variables for:

  • cloud credentials (AWS, GCP, Azure)
  • CI tokens
  • GitHub/GitLab tokens
  • database passwords
  • API keys

In CI and ML workflows, secrets are often exposed this way by design.
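
The same idea can be turned around defensively: before running an install step, check what an install-time script would be able to see. This is a minimal sketch; the name patterns are an assumption and far from exhaustive, and the function reports variable names only, never values.

```python
import os
import re

# Name fragments that commonly indicate a secret. This list is an assumption,
# not an exhaustive or authoritative catalogue.
SENSITIVE_NAME = re.compile(
    r"(TOKEN|SECRET|PASSWORD|PASSWD|API_?KEY|ACCESS_KEY|CREDENTIAL)",
    re.IGNORECASE,
)


def exposed_secret_names(environ=None):
    """Return the names (never the values) of variables that look like secrets."""
    env = os.environ if environ is None else environ
    return sorted(name for name in env if SENSITIVE_NAME.search(name))


if __name__ == "__main__":
    for name in exposed_secret_names():
        print("visible to any install-time code:", name)
```

Anything this prints is readable by every setup.py that runs in the same process environment.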


Step 4: Decide whether to act

If nothing valuable was found, the malware sometimes did nothing at all.

This reduced noise and helped it stay invisible.


Step 5: Exfiltrate quietly

If secrets were found:

  • data was encoded
  • sent over HTTPS
  • sent once
  • no retries
  • no output

From the developer’s perspective:

“pip install worked fine.”

From the attacker’s perspective:

“We just got cloud credentials.”


Step 6: Exit cleanly

No persistence.
No backdoor.
No dropped files.

That’s why many victims never knew they were compromised.


Why this was hard to detect

Traditional security tooling struggles with attacks like this because:

  • Code runs during installation
  • Static scanners often don’t inspect setup.py
  • There’s no persistence to detect later
  • Network traffic looks normal
  • Package names look legitimate

This is supply-chain abuse, not classic malware.


Prevention: how to avoid installing malicious packages

Prevention matters more than cleanup.

1. Control where packages come from

  • Use private mirrors or internal registries
  • Block direct internet installs in CI
  • Allow-list approved packages

This alone stops most dependency-confusion attacks.


2. Pin dependency versions

Never allow floating versions in CI or production.

Bad:

torch>=2.0

Good:

torch==2.1.0

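Dependency-confusion attacks rely on a higher version number being picked up automatically. An exact pin removes that opening, although it does not by itself verify which index the file came from.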
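
A small check in CI can catch floating versions before they reach an install step. The sketch below assumes a plain requirements.txt style file; the function name and the regular expression are illustrative and do not cover every requirement syntax (URLs, editable installs, and compound specifiers are simply reported as unpinned). For stronger guarantees, pip also offers a hash-checking mode (--require-hashes), which this sketch does not replace.

```python
import re

# Matches "name==version", with optional extras, e.g. "torch==2.1.0".
# Wildcards such as "2.*" are intentionally not accepted as a pin.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[A-Za-z0-9.!+_-]+(\s.*|;.*)?$"
)


def unpinned_requirements(text: str):
    """Return requirement lines that are not pinned to one exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    example = "torch>=2.0\ntorch==2.1.0\nnumpy  # latest\n"
    print(unpinned_requirements(example))
```

Failing the build when this list is non-empty is one way to enforce the rule above.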


3. Be suspicious of install commands

If the command didn’t come from official documentation:

  • slow down
  • verify the package name
  • check the publisher

Speed is the enemy of supply-chain security.


4. Treat CI as hostile terrain

CI systems are prime targets.

Defensive steps:

  • minimize secrets
  • rotate credentials frequently
  • restrict outbound network access
  • log DNS and HTTPS traffic during builds
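
One way to apply "minimize secrets" to the install step itself is to run pip with a stripped-down environment, so that install-time code has nothing to steal. This is a sketch under assumptions: the set of variables to keep is a guess for a POSIX-style runner and will need adjusting (Windows, for example, needs more), and the function names are hypothetical.

```python
import os
import subprocess
import sys

# Allow-list of variables the install step is assumed to need.
# Everything else, including every token and key, is dropped.
SAFE_VARS = {"PATH", "HOME", "LANG", "TMPDIR", "VIRTUAL_ENV"}


def scrubbed_env(environ=None, keep=SAFE_VARS):
    """Return a copy of the environment containing only allow-listed names."""
    env = os.environ if environ is None else environ
    return {name: value for name, value in env.items() if name in keep}


def install_with_clean_env(requirements_file: str) -> int:
    """Run `pip install -r <file>` in a child process that cannot see secrets."""
    cmd = [sys.executable, "-m", "pip", "install", "-r", requirements_file]
    return subprocess.run(cmd, env=scrubbed_env(), check=False).returncode
```

An allow-list is used instead of a deny-list on purpose: a new secret added to the pipeline later is excluded by default.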


Mitigation: limiting damage when prevention fails

Assume something will eventually slip through.

1. Monitor outbound traffic

Most malicious packages need to exfiltrate data.

Unexpected outbound HTTPS during installs is a strong signal.


2. Rotate secrets immediately

If you suspect a malicious install:

  • rotate all exposed credentials
  • assume environment variables were stolen
  • don’t wait for proof


3. Use scanners, but don’t trust them blindly

Dependency scanners help, but:

  • many ignore install-time code
  • simple malware looks “legitimate”

They are necessary, not sufficient.


How to safely analyze a suspicious Python package

Never analyze suspicious packages on your main machine.

1. Use an isolated environment

  • VM or container
  • no secrets
  • no cloud access
  • no SSH keys
  • no shared directories

Treat it like malware analysis.


2. Download, don’t install

Do not run pip install.

Instead:

  • download the archive directly from the index (pip download can itself run build code for source distributions)
  • extract it manually
  • inspect files offline

Installing executes attacker code.
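
Once the archive is on disk inside the isolated environment, its contents can be read without unpacking or running anything. The following is a minimal sketch using the standard tarfile module; the function name and the default file name are assumptions, and it handles tar-based source archives only (wheels and .zip sdists would need zipfile instead).

```python
import tarfile
from pathlib import PurePosixPath


def read_files_named(archive_path: str, filename: str = "setup.py") -> dict:
    """Return {member_path: text} for every regular file called `filename`.

    The archive is only read; nothing is extracted to disk and nothing
    is executed.
    """
    found = {}
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            # Skip directories, symlinks and device entries.
            if member.isfile() and PurePosixPath(member.name).name == filename:
                handle = tar.extractfile(member)
                if handle is not None:
                    found[member.name] = handle.read().decode("utf-8", errors="replace")
    return found
```

Reading members in memory also sidesteps path-traversal tricks in member names, since no file is ever written.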


How to audit setup.py properly

This is where most supply-chain attacks hide.

1. What normal setup.py looks like

Legitimate scripts usually:

  • define metadata
  • list dependencies
  • maybe compile extensions

They generally have no reason to:

  • access the network
  • read environment variables
  • execute shell commands
  • decode or execute hidden payloads


2. Red flags to look for

Be suspicious if you see:

  • HTTP or socket usage
  • environment variable scraping
  • exec() or eval()
  • base64 or compressed blobs
  • dynamically generated code
  • custom install hooks doing “extra work”

If the installer is doing more than installing, ask why.
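
Several of these red flags can be spotted mechanically by parsing setup.py into a syntax tree, which reads the code without running it. The sketch below is a heuristic under assumptions: the module and function lists are illustrative, obfuscated code can evade it, and legitimate build scripts will sometimes trigger it. Treat hits as prompts for manual review, not as verdicts.

```python
import ast

# Illustrative lists; extend them for your own review process.
SUSPICIOUS_CALLS = {"exec", "eval", "compile", "__import__"}
SUSPICIOUS_MODULES = {
    "socket", "urllib", "http", "requests",
    "subprocess", "base64", "zlib", "ctypes",
}


def red_flags(source: str):
    """Return sorted (line, description) pairs for patterns worth a manual look."""
    findings = []
    # ast.parse only builds a syntax tree; the source is never executed.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"imports {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"imports from {node.module}"))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"calls {node.func.id}()"))
        elif isinstance(node, ast.Attribute) and node.attr == "environ":
            findings.append((node.lineno, "accesses .environ"))
    return sorted(findings)
```

Fed the text of a setup.py pulled from an archive, this returns line numbers to start reading from. An empty result does not prove the file is safe.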


Typical malicious install flow

  1. Malicious package uploaded to PyPI
  2. Developer or CI installs dependency
  3. setup.py executes automatically
  4. Secrets are collected
  5. Data is exfiltrated
  6. Installation completes normally


Response and ecosystem impact

Once discovered:

  • malicious packages were removed
  • dependency references were audited
  • internal naming practices were hardened
  • the Python Software Foundation reviewed registry safeguards

However, stolen credentials can’t be recalled.
Some damage is permanent and invisible.


The most important takeaway

The key lesson is simple but uncomfortable:

Installing a dependency is executing untrusted code.

Once teams internalize that:

  • they audit installers
  • they isolate builds
  • they lock down CI
  • they stop trusting names alone

That mindset shift matters more than any single tool.


Aegiron
