BD005: ReproducibleBuilds
Overview
| Property | Value |
|---|---|
| ID | BD005 |
| Name | ReproducibleBuilds |
| Group | build |
| Severity | NOTE |
Description
Verifies that package builds are reproducible by building twice and comparing checksums of the resulting artifacts.
Reproducible builds are important because:
- They enable verification that the source code matches distributed binaries
- They help detect supply chain attacks or compromised build systems
- They support auditability and trust in software distribution
- They are increasingly required by security-conscious organizations
What it checks
The check builds the package twice with a fixed SOURCE_DATE_EPOCH and compares:
- PASSED: Both builds produce identical wheel and sdist files (matching checksums)
- FAILED: Artifacts differ between builds (reports which files don’t match)
- NOT_APPLICABLE: No pyproject.toml found, or the build package is not installed
How it works
- Sets SOURCE_DATE_EPOCH=1577836800 (2020-01-01 00:00:00 UTC) for timestamp normalization
- Runs python -m build in a temporary directory
- Runs python -m build again in a different temporary directory
- Computes SHA256 checksums of all artifacts
- Compares checksums between builds
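The steps above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the check's actual implementation: the helper names (`sha256_of`, `checksums`, `mismatched`, `build_once`, `check_reproducible`) are hypothetical, and the sketch assumes the `build` package is installed in the current interpreter.

```python
import hashlib
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Fixed epoch from the steps above: 2020-01-01 00:00:00 UTC.
SOURCE_DATE_EPOCH = "1577836800"


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums(dist_dir: Path) -> dict[str, str]:
    """Map each artifact file name in dist_dir to its SHA256 digest."""
    return {p.name: sha256_of(p) for p in sorted(dist_dir.iterdir()) if p.is_file()}


def mismatched(first: dict[str, str], second: dict[str, str]) -> list[str]:
    """List file names that are missing from one build or differ in checksum."""
    names = sorted(set(first) | set(second))
    return [n for n in names if first.get(n) != second.get(n)]


def build_once(project: Path, out_dir: Path) -> dict[str, str]:
    """Run `python -m build` with SOURCE_DATE_EPOCH set and checksum the output."""
    env = dict(os.environ, SOURCE_DATE_EPOCH=SOURCE_DATE_EPOCH)
    subprocess.run(
        [sys.executable, "-m", "build", "--outdir", str(out_dir), str(project)],
        check=True,
        env=env,
    )
    return checksums(out_dir)


def check_reproducible(project: Path) -> list[str]:
    """Build twice in separate temporary directories; return differing files."""
    with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
        return mismatched(build_once(project, Path(d1)), build_once(project, Path(d2)))
```

An empty list from `check_reproducible(Path("."))` corresponds to PASSED; a non-empty list names the artifacts that would be reported under FAILED. Checksums are keyed by file name rather than full path, so the differing output directories do not register as a mismatch.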
How to fix
Common causes of non-reproducible builds
- Embedded timestamps: Files contain build-time timestamps
- Random ordering: File order in archives varies between builds
- Build path inclusion: Absolute paths embedded in artifacts
- Non-deterministic code generation: Generated files differ between runs
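To make the first two causes concrete, here is a small sketch of writing a zip archive so that neither timestamps nor entry order can vary between runs. The function name `deterministic_zip` is hypothetical, and this is not how any particular build backend writes wheels; it only illustrates the technique of fixing the timestamp and sorting the entries.

```python
import io
import os
import time
import zipfile


def deterministic_zip(files: dict[str, bytes]) -> bytes:
    """Write files into a zip with sorted entries and one fixed timestamp."""
    # Fall back to 1980-01-01 UTC, the earliest date the ZIP format can store.
    epoch = int(os.environ.get("SOURCE_DATE_EPOCH", "315532800"))
    date_time = time.gmtime(epoch)[:6]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sorted names remove any dependence on dict or filesystem order.
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Fixed permissions so the local umask cannot leak in.
            info.external_attr = 0o644 << 16
            archive.writestr(info, files[name])
    return buffer.getvalue()
```

Calling the function twice with the same contents, in any insertion order, yields byte-identical archives. A SOURCE_DATE_EPOCH earlier than 1980 would be rejected by `zipfile.ZipInfo`, so real tooling typically clamps the value.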
Honor SOURCE_DATE_EPOCH
Ensure your build process respects SOURCE_DATE_EPOCH:
# In setup.py or build scripts
import os
import time
if "SOURCE_DATE_EPOCH" in os.environ:
    build_time = int(os.environ["SOURCE_DATE_EPOCH"])
else:
    build_time = int(time.time())
Use modern build backends
Modern build backends like hatchling and flit support reproducible builds:
# pyproject.toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Configure setuptools for reproducibility
# pyproject.toml
[tool.setuptools]
include-package-data = true
[tool.setuptools.packages.find]
where = ["src"]
Avoid embedding build paths
# Bad: embeds absolute path
__file_path__ = __file__
# Good: use relative paths or omit
from pathlib import Path
PACKAGE_DIR = Path(__file__).parent
Sort file lists
When generating file lists, ensure consistent ordering:
# Bad: non-deterministic order
files = list(directory.glob("*.py"))
# Good: sorted for reproducibility
files = sorted(directory.glob("*.py"))
Check for timestamp issues
Inspect your wheel contents for timestamp variations:
# Build twice and compare
python -m build --wheel -o dist1/
python -m build --wheel -o dist2/
# Compare zip contents
unzip -l dist1/*.whl > wheel1.txt
unzip -l dist2/*.whl > wheel2.txt
diff wheel1.txt wheel2.txt
Why NOTE severity?
This check is a NOTE because:
- Not all projects require reproducible builds
- Achieving reproducibility can require significant effort
- Some build backends don’t fully support reproducibility
- It’s an aspirational best practice rather than a hard requirement
Configuration
Set timeout
For complex packages with slow builds:
[tool.pycmdcheck.checks.BD005]
timeout = 600  # 10 minutes (default: 300 seconds)
Skip this check
[tool.pycmdcheck]
skip = ["BD005"]
CLI
pycmdcheck --skip BD005
Best practices
- Use modern build backends: Prefer hatchling, flit, or poetry-core
- Test locally: Build twice and compare before CI
- Pin build dependencies: Ensure consistent build environment
- Document requirements: Note any reproducibility limitations
- Use lockfiles: Pin exact versions of build dependencies
Verify reproducibility in CI
# .github/workflows/reproducible.yml
jobs:
  reproducible:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install build
      - run: |
          SOURCE_DATE_EPOCH=1577836800 python -m build -o dist1/
          SOURCE_DATE_EPOCH=1577836800 python -m build -o dist2/
          # Run sha256sum inside each directory so the listed file names
          # do not include the differing dist1/ and dist2/ prefixes
          (cd dist1 && sha256sum *) > sums1.txt
          (cd dist2 && sha256sum *) > sums2.txt
          diff sums1.txt sums2.txt
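When a job like this reports differing checksums, a per-entry comparison of the two wheels can help locate the cause. The following is a rough sketch using only the standard library; the function names `entry_table` and `diff_entries` are hypothetical and not part of pycmdcheck.

```python
import zipfile


def entry_table(archive_path) -> dict[str, tuple]:
    """Map each member name to (timestamp, CRC-32, size) from the zip directory."""
    with zipfile.ZipFile(archive_path) as archive:
        return {
            info.filename: (info.date_time, info.CRC, info.file_size)
            for info in archive.infolist()
        }


def diff_entries(first_path, second_path) -> dict[str, tuple]:
    """Return members whose metadata differs, with both versions (None if absent)."""
    first, second = entry_table(first_path), entry_table(second_path)
    return {
        name: (first.get(name), second.get(name))
        for name in sorted(set(first) | set(second))
        if first.get(name) != second.get(name)
    }
```

A member whose timestamp differs while CRC and size match points to an embedded timestamp; a differing CRC points to changed file contents. This sketch does not detect entry ordering differences, which would need a separate comparison of `namelist()` output.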