
How to Protect Against Pip Supply Chain Attacks

Problem

When I ran pip install requests in my project, I got this output in my terminal:

user@host:~/project$ pip install requests
Collecting requests
Downloading requests-2.31.0-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 234.2 kB/s eta 0:00:00
Collecting charset-normalizer<4,>=2
Downloading charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
[Previous line repeated 2 more times]
Downloading urllib3-2.1.0-py3-none-any.whl (123 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.9/123.9 kB 1.1 MB/s eta 0:00:00
Installing collected packages: idna, urllib3, charset-normalizer, certifi, requests
Successfully installed certifi-2024.6.2 charset-normalizer-3.3.2 idna-3.7 requests-2.31.0 urllib3-2.1.0

But I realized something dangerous: that one command downloaded and installed five packages, four of which I never asked for. What if one of those transitive dependencies was malicious?

Environment

  • Python 3.11
  • pip 23.2.1
  • Ubuntu 22.04
  • Virtual environment activated

What happened?

I was working on a Django project and needed to make HTTP requests. I ran the standard pip install requests command and got what looked like a successful installation. But then I read a Reddit discussion claiming that 56% of malicious pip packages execute code without waiting for an import.

Here’s my initial setup:

main.py
import requests

def get_data(url):
    response = requests.get(url)
    return response.json()

print(get_data("https://api.example.com/data"))

And my requirements:

requirements.txt
requests

The problem is that transitive dependencies can contain malicious code. When pip install requests runs, it also installs:

  • urllib3
  • certifi
  • charset-normalizer
  • idna

Any of these could contain malicious code that executes during installation (for example, from setup.py when pip builds a source distribution), not just when I import them.
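To see how far a single requirement reaches, the installed metadata can be walked from Python. This is a minimal sketch, assuming the packages are already installed in the active environment; the function names are my own, and it deliberately over-approximates by ignoring environment markers other than extras.

```python
import re
from importlib import metadata


def direct_requirements(name, get_requires=metadata.requires):
    """Return the names of the non-optional dependencies of an installed package."""
    try:
        specs = get_requires(name) or []
    except metadata.PackageNotFoundError:
        return []  # not installed in this environment
    names = []
    for spec in specs:
        if "extra ==" in spec or "extra==" in spec:
            continue  # optional extras are not installed by default
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec.strip())
        if match:
            names.append(match.group(0).lower())
    return names


def transitive_closure(root, get_requires=metadata.requires):
    """Collect every package reachable from root, breadth-first."""
    seen, queue = set(), [root.lower()]
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.add(current)
        queue.extend(direct_requirements(current, get_requires))
    return sorted(seen)


if __name__ == "__main__":
    # Prints requests plus everything it pulls in, if requests is installed.
    print(transitive_closure("requests"))
```

Every name in that list is code that ends up in the environment, whether or not it appears in requirements.txt.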

How to solve it?

I tried to use requirements.txt alone:

Terminal window
pip freeze > requirements.txt

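This pins the exact version of every installed package, including the transitive ones, but it records no hashes, so pip cannot verify that a file it downloads later is the same one I originally installed. So I looked for a better approach.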

First, I tried using pip-tools:

Terminal window
pip install pip-tools

Then I created a requirements.in file:

requirements.in
requests

Next, I compiled it to create locked requirements:

Terminal window
pip-compile --generate-hashes requirements.in -o requirements.lock

This generated a requirements.lock file with exact pinned versions and a hash for each one:

requirements.lock
#
# This file is autogenerated by pip-compile with Python 3.11
# To update, run:
#
#    pip-compile --generate-hashes requirements.in -o requirements.lock
#
certifi==2024.6.2 \
    --hash=sha256:...
charset-normalizer==3.3.2 \
    --hash=sha256:...
idna==3.7 \
    --hash=sha256:...
requests==2.31.0 \
    --hash=sha256:...
urllib3==2.1.0 \
    --hash=sha256:...

Now I install using the locked file:

Terminal window
pip install --require-hashes -r requirements.lock

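But I realized this is only part of the protection. Hash checking guarantees that pip installs exactly the files I locked, but it cannot tell me whether those files were already malicious when I locked them, and the packages are still downloaded from PyPI during installation.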
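What a pinned hash buys you can be shown in a few lines of Python. This is a rough sketch of the idea, not pip's actual implementation: it computes the SHA-256 of a downloaded file and compares it with the sha256:... values from the lock file. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, allowed_hashes):
    """True if the file's digest equals one of the pinned 'sha256:<hex>' values."""
    actual = "sha256:" + sha256_of(path)
    return actual in set(allowed_hashes)
```

If the file on the index is replaced after the lock file was generated, the digest changes and the check fails, which is the situation in which pip's hash-checking mode refuses to install.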

So I added network-level controls. I tried setting up a pip configuration file:

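~/.config/pip/pip.conf
[global]
index-url = https://pypi.org/simple
disable-pip-version-check = true

This makes sure pip uses only PyPI as its index. Note that trusted-host should not be set here: that option tells pip to accept a host even without valid HTTPS, which weakens security, and pypi.org does not need it. I wanted more control, so I added environment isolation: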
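A pip configuration file is easy to get subtly wrong, so it can help to check it automatically. The following is a small sketch, assuming the file uses the usual INI layout; the function name and the three checks are my own choices, not an official pip feature.

```python
import configparser


def risky_pip_settings(config_text):
    """Return human-readable warnings for settings that weaken pip's defaults."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    warnings = []
    for section in parser.sections():
        for key, value in parser.items(section):
            entries = value.split()
            if key == "trusted-host":
                # trusted-host makes pip accept the host without valid HTTPS.
                warnings.append(
                    f"[{section}] trusted-host skips HTTPS validation for: {' '.join(entries)}"
                )
            if key == "extra-index-url":
                # A second index opens the door to dependency confusion.
                warnings.append(
                    f"[{section}] extra-index-url adds another index: {' '.join(entries)}"
                )
            if key in ("index-url", "extra-index-url"):
                for url in entries:
                    if url.startswith("http://"):
                        warnings.append(f"[{section}] {key} uses plain HTTP: {url}")
    return warnings
```

Calling risky_pip_settings(open(path).read()) on a config file returns an empty list when none of the three patterns is present.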

Terminal window
# Create virtual environment
python -m venv myenv
source myenv/bin/activate
# Install from PyPI only, verify hashes, and refuse to build from source
pip install --index-url https://pypi.org/simple --require-hashes --only-binary=:all: -r requirements.lock

Now check what was installed:

Terminal window
pip list
Package            Version
------------------ --------
certifi            2024.6.2
charset-normalizer 3.3.2
idna               3.7
requests           2.31.0
urllib3            2.1.0

You can see that exactly the pinned versions were installed, with no unknown packages.
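Checking the pip list output by eye does not scale, so the same comparison can be scripted. This is a minimal sketch, assuming a pip-compile style lock file with name==version lines; the helper names and the ignore list are hypothetical, and tools installed on purpose (such as pip-tools itself) would need to be added to that list.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name the way package indexes compare names."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(lock_text):
    """Extract {normalized_name: version} from 'name==version' lines."""
    pins = {}
    for line in lock_text.splitlines():
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)==([^\s\\;]+)", line)
        if match:
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def installed_versions():
    """Map every installed distribution to its version."""
    return {
        normalize(d.metadata["Name"]): d.version
        for d in metadata.distributions()
        if d.metadata["Name"]
    }


def unexpected_packages(pins, installed, ignore=("pip", "setuptools", "wheel")):
    """Return installed packages that are unpinned or at a different version."""
    return {
        name: version
        for name, version in installed.items()
        if name not in ignore and pins.get(name) != version
    }


if __name__ == "__main__":
    try:
        with open("requirements.lock") as fh:
            print(unexpected_packages(parse_pins(fh.read()), installed_versions()))
    except FileNotFoundError:
        print("requirements.lock not found in the current directory")
```

An empty dictionary means the environment contains nothing beyond what the lock file pins.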

The reason

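I think the key reasons for the vulnerability are:

  1. Transitive dependency risk: When you install requests, pip resolves and installs all of its dependencies, and each one is another potential point of attack

  2. Installation-time execution: When pip has to build a package from a source distribution, it runs that package's build code (for example setup.py), so malicious code can execute during installation, not just on import

  3. PyPI is not a guarantee of safety: PyPI removes malicious packages once they are reported, but a malicious or compromised release can be downloaded before it is taken down

  4. Default pip behavior: Unless the requirements file contains hashes (or --require-hashes is passed), pip does not check downloads against hashes you pinned in advance, so it installs whatever the index serves for the requested version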
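The installation-time point depends on the kind of file pip receives: a wheel (.whl) is unpacked without running the project's code, while a source archive has to be built first. As a rough illustration, assuming you have a list of downloaded file names (for example from pip download), this sketch flags the ones that would need a build; the function names and the example file name are hypothetical.

```python
def needs_source_build(filename):
    """True for anything that is not a wheel, e.g. .tar.gz or .zip source archives."""
    return not filename.lower().endswith(".whl")


def files_needing_build(filenames):
    """Return the subset of files whose installation would run build code."""
    return [name for name in filenames if needs_source_build(name)]


if __name__ == "__main__":
    downloads = [
        "requests-2.31.0-py3-none-any.whl",
        "example_pkg-1.0.tar.gz",  # hypothetical source distribution
    ]
    print(files_needing_build(downloads))
```

Passing --only-binary=:all: to pip install makes pip refuse such files instead of building them. Wheels can still contain malicious code that runs on import, so this narrows the risk rather than removing it.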

Additional protections

I added more layers to my defense:

  1. Private package repository: For internal packages, I set up a private PyPI server, served over HTTPS so that --trusted-host is not needed
Terminal window
pip install --index-url https://private-pypi.internal/simple -r requirements.lock
  2. Regular dependency audits: I run these commands to check for vulnerabilities
Terminal window
pip install safety
safety check -r requirements.lock
pip install pip-audit
pip-audit -r requirements.lock
  3. Container isolation: I run my Python applications in Docker containers with limited network access
Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.lock .
RUN pip install --no-cache-dir -r requirements.lock
COPY . .
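  4. Run-time protection: I run untrusted code in a separate, unprivileged process
safe_import.py
import os
import pwd
import subprocess
import sys
from tempfile import NamedTemporaryFile

def safe_execute_code(code_string):
    # Run the code in a separate process as the unprivileged "nobody" user.
    # This reduces privileges but is not a full sandbox. Switching users
    # requires the caller to be root (POSIX only, Python 3.9+).
    nobody = pwd.getpwnam("nobody")
    with NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write(code_string)
        temp_file = f.name
    # Temp files are created owner-only; let the unprivileged user read it
    os.chmod(temp_file, 0o644)
    try:
        result = subprocess.run(
            [sys.executable, temp_file],
            capture_output=True,
            timeout=30,
            user=nobody.pw_uid,   # Run as unprivileged user
            group=nobody.pw_gid,  # ...and drop the group as well
        )
        return result.stdout, result.stderr
    finally:
        os.unlink(temp_file)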

Summary

In this post, I showed how to protect against pip supply chain attacks with multiple defense layers. The key point is that no single protection is enough: you need defense in depth with dependency pinning, hash verification, network controls, isolation, and regular audits.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.