How to Protect Against Pip Supply Chain Attacks
Problem
When I ran pip install requests in my project, I got this output in my terminal:
user@host:~/project$ pip install requests
Collecting requests
  Downloading requests-2.31.0-py3-none-any.whl (62 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 234.2 kB/s eta 0:00:00
Collecting charset-normalizer<4,>=2
  Downloading charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
[Previous line repeated 2 more times]
  Downloading urllib3-2.1.0-py3-none-any.whl (123 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.9/123.9 kB 1.1 MB/s eta 0:00:00
Installing collected packages: idna, urllib3, charset-normalizer, certifi, requests
Successfully installed certifi-2024.6.2 charset-normalizer-3.3.2 idna-3.7 requests-2.31.0 urllib3-2.1.0
But I realized something dangerous: my project just downloaded and installed 5 packages for a single dependency. What if one of those transitive dependencies was malicious?
Environment
- Python 3.11
- pip 23.2.1
- Ubuntu 22.04
- Virtual environment activated
What happened?
I was working on a Django project and needed to make HTTP requests. I ran the standard pip install requests command and got what looked like a successful installation. But then I read a Reddit discussion claiming that 56% of malicious pip packages execute code at install time, without waiting to be imported.
Here’s my initial setup:
import requests

def get_data(url):
    response = requests.get(url)
    return response.json()

print(get_data("https://api.example.com/data"))
And my requirements:
requests
The problem is that transitive dependencies can contain malicious code. When pip install requests runs, it also installs:
- urllib3
- certifi
- charset-normalizer
- idna
Any of these could potentially contain malicious code that executes during installation, not just when I import them.
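To see how far the dependency tree actually reaches, the metadata of the installed packages can be walked with the standard library. This is a minimal sketch and not part of my original setup: the helper names requirement_name and dependency_tree are my own, it only sees packages already installed in the current environment, and it skips dependencies guarded by extras or environment markers for simplicity.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the bare project name from a string such as 'idna<4,>=2.5'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1).lower() if match else None


def dependency_tree(package, seen=None):
    """Return the set of dependency names reachable from `package`."""
    seen = set() if seen is None else seen
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return seen  # not installed here, so there is nothing to walk
    for requirement in requirements:
        if ";" in requirement:
            continue  # simplification: ignore extras and marker-guarded entries
        name = requirement_name(requirement)
        if name and name not in seen:
            seen.add(name)
            dependency_tree(name, seen)
    return seen


if __name__ == "__main__":
    for name in sorted(dependency_tree("requests")):
        print(name)
```

Every name this prints is code that gets a chance to run on my machine, which is why the list is worth reviewing before the first install rather than after.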
How to solve it?
I tried to use requirements.txt alone:
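pip freeze > requirements.txt
This pins every installed package, transitive dependencies included, to an exact version. But it records no hashes, so pip cannot tell whether a file it downloads later is the same one I installed today. So I looked for a better approach.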
First, I installed pip-tools:
pip install pip-tools
Then I created a requirements.in file:
requests
Next, I compiled it to create locked requirements:
pip-compile --generate-hashes requirements.in -o requirements.lock
This generated a requirements.lock file with exact pinned versions and hashes (the --generate-hashes flag is what adds the --hash lines):
#
# This file is autogenerated by pip-compile with Python 3.11
# To update, run:
#
#    pip-compile requirements.in
#
certifi==2024.6.2 \
    --hash=sha256:...
charset-normalizer==3.3.2 \
    --hash=sha256:...
idna==3.7 \
    --hash=sha256:...
requests==2.31.0 \
    --hash=sha256:...
urllib3==2.1.0 \
    --hash=sha256:...
Now I install using the locked file:
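pip install -r requirements.lock
Because every entry carries a hash, pip switches to hash-checking mode and rejects any file that does not match. But I realized this still doesn’t fully protect against supply chain attacks. The hashes guarantee that I get the same files every time, not that those files are safe, and the packages are still downloaded from PyPI during installation.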
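Since the protection depends on every entry being both pinned and hashed, it can help to check the lock file before installing from it. The following is a rough sketch under my own assumptions: unverified_entries is a hypothetical helper, it expects the pip-compile layout shown above (backslash-continued --hash lines), and it is not a full requirements-file parser.

```python
import re


def unverified_entries(lock_text):
    """Return names of requirements that lack an exact pin or a sha256 hash."""
    # Join backslash-continued lines so each requirement is one logical line.
    logical = lock_text.replace("\\\n", " ")
    problems = []
    for line in logical.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "-")):
            continue  # skip blanks, comments, and option lines like --index-url
        name = re.split(r"[=<>!~\s\[;]", line, maxsplit=1)[0]
        pinned = re.search(r"==\s*[^\s\\]+", line) is not None
        hashed = "--hash=sha256:" in line
        if not (pinned and hashed):
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Illustrative input only; the hash value is a placeholder, not a real digest.
    sample = (
        "certifi==2024.6.2 \\\n"
        "    --hash=sha256:abc\n"
        "    # via requests\n"
        "idna>=3.7\n"
        "urllib3==2.1.0\n"
    )
    print(unverified_entries(sample))
```

A check like this could run in CI so that a hand-edited, unpinned line never reaches an install step.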
So I added network-level controls. I tried setting up a pip configuration file:
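[global]
index-url = https://pypi.org/simple
disable-pip-version-check = true
This makes pip resolve packages from PyPI only. I left trusted-host = pypi.org out on purpose: that option tells pip to accept a host even without valid HTTPS, which weakens verification instead of strengthening it. I still wanted more control, so I added environment isolation: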
# Create virtual environment
python -m venv myenv
source myenv/bin/activate

# Install packages from the pinned index only
pip install --index-url https://pypi.org/simple -r requirements.lock
Now test again:
pip list
Package            Version
------------------ --------
certifi            2024.6.2
charset-normalizer 3.3.2
idna               3.7
requests           2.31.0
urllib3            2.1.0
You can see that exactly the pinned versions were installed, with no unknown packages.
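Reading pip list by eye works for five packages but not for fifty, so the same comparison can be scripted. This is a sketch under my own assumptions: parse_pins, find_drift, and installed_versions are hypothetical helper names, the lock file is assumed to use name==version lines as shown above, and installer tooling such as pip and setuptools is ignored by default.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name so 'Charset_Normalizer' equals 'charset-normalizer'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(lock_text):
    """Map normalized package names to the versions pinned with '=='."""
    pattern = r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([^\s\\]+)"
    return {
        normalize(m.group(1)): m.group(2)
        for m in re.finditer(pattern, lock_text, re.MULTILINE)
    }


def find_drift(pins, installed, ignore=("pip", "setuptools", "wheel")):
    """Return (unexpected, mismatched) package names."""
    unexpected = sorted(set(installed) - set(pins) - set(ignore))
    mismatched = sorted(
        name for name in pins
        if name in installed and installed[name] != pins[name]
    )
    return unexpected, mismatched


def installed_versions():
    """Collect {name: version} for every distribution in this environment."""
    return {
        normalize(dist.metadata["Name"]): dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }


if __name__ == "__main__":
    with open("requirements.lock", encoding="utf-8") as handle:
        pins = parse_pins(handle.read())
    unexpected, mismatched = find_drift(pins, installed_versions())
    print("unexpected:", unexpected)
    print("mismatched:", mismatched)
```

Two empty lists mean the environment contains exactly what the lock file describes.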
The reason
I think the key reasons for the vulnerability are:
- Transitive dependency risk: When you install requests, pip resolves and installs all of its dependencies, creating multiple potential attack points
- Installation-time execution: Malicious packages can execute code during the install process, not just when imported. Source distributions in particular run their build script (for example setup.py) on your machine at install time
- PyPI is not safe: While PyPI has security checks, packages can still be compromised before moderation
- Default pip behavior: By default, pip installs whatever the index serves for a matching version and only enforces pinned hashes when hash-checking mode is enabled
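Several of these points map onto pip options that already exist: --require-hashes forces hash-checking mode, --only-binary :all: refuses source distributions (so no build script runs at install time), --no-deps stops pip from pulling in anything the lock file does not list, and --index-url fixes the index. The sketch below only assembles such a command; locked_install_command is a hypothetical helper name, and --only-binary :all: will fail for any dependency that publishes no wheel.

```python
import subprocess
import sys


def locked_install_command(lock_file="requirements.lock",
                           index_url="https://pypi.org/simple"):
    """Build a pip command that installs only what the lock file lists."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", index_url,   # single, explicit index
        "--require-hashes",         # every requirement must match a pinned hash
        "--only-binary", ":all:",   # wheels only, no sdist build scripts
        "--no-deps",                # do not resolve anything outside the lock file
        "-r", lock_file,
    ]


if __name__ == "__main__":
    command = locked_install_command()
    print(" ".join(command))
    # Uncomment to actually run the install:
    # subprocess.run(command, check=True)
```

Building the command in one place keeps the flags consistent between a developer machine, CI, and the Docker image.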
Additional protections
I added more layers to my defense:
- Private package repository: For internal packages, I set up a private PyPI server
pip install --index-url http://private-pypi.local/simple --trusted-host private-pypi.local -r requirements.lock
The --trusted-host flag is needed here only because this server speaks plain HTTP, and its host must match the one in the index URL. Serving the index over HTTPS would be safer and would make the flag unnecessary.
- Regular dependency audits: I run these commands to check for vulnerabilities
pip install safety
safety check -r requirements.lock

pip install pip-audit
pip-audit -r requirements.lock
- Container isolation: I run my Python applications in Docker containers with limited network access
FROM python:3.11-slim

WORKDIR /app
COPY requirements.lock .
RUN pip install --no-cache-dir -r requirements.lock
COPY . .
- Run-time protection: I use sandboxing for untrusted code
import os
import subprocess
import sys
from tempfile import NamedTemporaryFile

def safe_execute_code(code_string):
    # Write the code to a temporary file so it can run in a separate process
    with NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write(code_string)
        temp_file = f.name
    # The temp file is created owner-only; make it readable for the unprivileged user
    os.chmod(temp_file, 0o644)

    try:
        result = subprocess.run(
            [sys.executable, temp_file],
            capture_output=True,
            timeout=30,
            user='nobody',  # Run as unprivileged user (Python 3.9+, POSIX, parent must be root)
        )
        return result.stdout, result.stderr
    finally:
        os.unlink(temp_file)
Summary
In this post, I showed how to protect against pip supply chain attacks with multiple defense layers. The key point is that no single protection is enough - you need defense in depth with dependency pinning, network controls, isolation, and regular audits.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.