Baseline Management
Baselines allow you to acknowledge existing issues while preventing new ones from being introduced. This is essential for gradually improving code quality without blocking development on legacy issues.
What Are Baselines For?
Baselines solve the “big bang” problem: when you adopt a new linter or quality tool, you often face hundreds of existing issues. You have three options:
- Fix everything first - Blocks all development until cleanup is complete
- Disable the checks - Loses the benefit of the tool entirely
- Use a baseline - Track existing issues, fail only on new ones
Baselines enable option 3: you can adopt pycmdcheck immediately, baseline your current issues, and ensure no new issues are introduced while you gradually address the backlog.
Quick Start
Generate a Baseline
Run pycmdcheck and save current failures to a baseline file:
pycmdcheck check --generate-baseline baseline.json
Output:
Baseline generated: baseline.json (5 check(s))
Check Against Baseline
Use the baseline to skip known issues:
pycmdcheck check --baseline baseline.json
Now only new failures will cause a non-zero exit code. Baselined checks are skipped.
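The effect on the exit code can be pictured with a small sketch. It only models the behavior described above and is not pycmdcheck's implementation; the helper names `new_failures` and `exit_code`, and the use of plain lists of check IDs, are assumptions made for illustration.

```python
def new_failures(failed_ids, baselined_ids):
    """Return failing check IDs that are not covered by the baseline."""
    return sorted(set(failed_ids) - set(baselined_ids))


def exit_code(failed_ids, baselined_ids):
    """Non-zero only when at least one failure is not baselined."""
    return 1 if new_failures(failed_ids, baselined_ids) else 0


# Hypothetical check IDs, reused from the examples in this page.
baselined = ["ST001", "MT003"]
print(exit_code(["ST001", "MT003"], baselined))  # 0: only known issues fail
print(exit_code(["ST001", "ST002"], baselined))  # 1: ST002 is not baselined
```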
Baseline File Format
The baseline is a JSON file with a simple structure:
{
  "ignored": [
    {
      "check_id": "ST001",
      "reason": "Baselined from current failures"
    },
    {
      "check_id": "MT003",
      "reason": "Baselined from current failures"
    },
    {
      "check_id": "CQ002",
      "reason": "Need to add py.typed - tracked in #123",
      "file": "src/mypackage/__init__.py"
    }
  ]
}
Entry Fields
| Field | Required | Description |
|---|---|---|
| check_id | Yes | The check ID to ignore (e.g., “ST001”) |
| reason | No | Explanation of why the check is baselined |
| file | No | Specific file path for location-specific ignores |
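The field rules in the table can be checked mechanically before a baseline is committed. The following is a minimal sketch using only the standard library; `validate_baseline` is a hypothetical helper, not part of pycmdcheck, and treating unknown fields as problems is a choice made for this sketch rather than documented tool behavior.

```python
import json

REQUIRED = {"check_id"}
OPTIONAL = {"reason", "file"}


def validate_baseline(text):
    """Return a list of problems found in baseline JSON text (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    entries = data.get("ignored") if isinstance(data, dict) else None
    if not isinstance(entries, list):
        return ['top-level object must contain an "ignored" list']
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: not an object")
            continue
        missing = REQUIRED - set(entry)
        unknown = set(entry) - REQUIRED - OPTIONAL
        if missing:
            problems.append(f"entry {i}: missing {sorted(missing)}")
        if unknown:
            # Assumption of this sketch: extra keys are reported, not ignored.
            problems.append(f"entry {i}: unknown fields {sorted(unknown)}")
    return problems


print(validate_baseline('{"ignored": [{"check_id": "ST001"}]}'))
print(validate_baseline('{"ignored": [{"reason": "no id here"}]}'))
```

An empty list means the structure matches the table; each string in a non-empty list names the offending entry by its position in `ignored`.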
Manual Editing
You can manually edit the baseline file to:
- Add reasons for each entry
- Remove entries when issues are fixed
- Add file-specific ignores
{
  "ignored": [
    {
      "check_id": "MT003",
      "reason": "Waiting for upstream fix - issue #456"
    },
    {
      "check_id": "CQ002",
      "reason": "Will add type hints in v2.0",
      "file": "src/legacy/old_module.py"
    }
  ]
}
Workflow Examples
Initial Adoption
When first adopting pycmdcheck on an existing project:
# 1. Run initial check to see all issues
pycmdcheck check
# 2. Generate baseline from current state
pycmdcheck check --generate-baseline .pycmdcheck-baseline.json
# 3. Commit the baseline
git add .pycmdcheck-baseline.json
git commit -m "Add pycmdcheck baseline for existing issues"
# 4. Add to CI (see CI Integration below)
Gradual Improvement
As you fix issues, update the baseline:
# 1. Fix an issue (e.g., add README.md)
echo "# My Package" > README.md
# 2. Re-run check - ST001 now passes
pycmdcheck check --baseline .pycmdcheck-baseline.json
# 3. Regenerate baseline to remove fixed issue
pycmdcheck check --generate-baseline .pycmdcheck-baseline.json
# 4. Commit the updated baseline
git add .pycmdcheck-baseline.json
git commit -m "chore: remove ST001 from baseline (README added)"
Preventing New Issues
With a baseline in place, new issues will fail:
# Someone removes the LICENSE file
rm LICENSE
# Check fails on the NEW issue (ST002)
pycmdcheck check --baseline .pycmdcheck-baseline.json
# Exit code: 1 - ST002 is not baselined!
CI Integration
GitHub Actions
name: Package Quality
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install pycmdcheck
        run: pip install pycmdcheck
      - name: Check package (with baseline)
        run: pycmdcheck check --baseline .pycmdcheck-baseline.json
GitLab CI
pycmdcheck:
  stage: lint
  script:
    - pip install pycmdcheck
    - pycmdcheck check --baseline .pycmdcheck-baseline.json
Baseline + Profile
Combine baselines with profiles:
# Use strict profile but respect baseline
pycmdcheck check --profile strict --baseline baseline.json
Managing Baselines
Viewing Baseline Contents
The baseline is plain JSON, so you can view and edit it with any text editor or JSON tool:
# View baseline
cat baseline.json | python -m json.tool
# Count baselined issues
cat baseline.json | jq '.ignored | length'
# List baselined check IDs
cat baseline.json | jq -r '.ignored[].check_id'
Validating Baseline
If a baselined check now passes, the baseline still works (the check is simply skipped). To find “stale” baseline entries that are no longer needed:
# Run without baseline
pycmdcheck check --format json > current.json
# Compare with baseline
# Entries in baseline but not in current failures are stale
Splitting Baselines
For large projects, you might want separate baselines:
# Generate separate baselines per group
pycmdcheck check --only-group structure --generate-baseline baseline-structure.json
pycmdcheck check --only-group metadata --generate-baseline baseline-metadata.json
# Use in CI (check one at a time)
pycmdcheck check --only-group structure --baseline baseline-structure.json
pycmdcheck check --only-group metadata --baseline baseline-metadata.json
Best Practices
1. Document Reasons
Always add meaningful reasons to baseline entries:
{
  "ignored": [
    {
      "check_id": "MT003",
      "reason": "Blocked by upstream issue python/mypy#12345"
    },
    {
      "check_id": "CQ002",
      "reason": "Adding type hints in milestone 2.1 - see ROADMAP.md"
    }
  ]
}
2. Track Baseline Size
Monitor your baseline size over time. It should shrink as you fix issues:
# In CI, report baseline size
echo "Baselined issues: $(cat baseline.json | jq '.ignored | length')"
3. Set Reduction Goals
Create issues or milestones to reduce baseline size:
- “Reduce pycmdcheck baseline from 15 to 10 issues by Q2”
- “Clear all structure group issues by v2.0”
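A reduction goal can also be enforced mechanically by comparing the number of baseline entries against an agreed target. The sketch below assumes the `ignored` list layout shown earlier; the helper names `baseline_size` and `over_budget`, and the idea of a numeric budget, are illustrative and not a pycmdcheck feature.

```python
import json


def baseline_size(baseline_text):
    """Count the entries in the "ignored" list of a baseline document."""
    return len(json.loads(baseline_text).get("ignored", []))


def over_budget(baseline_text, max_entries):
    """True when the baseline holds more entries than the agreed target."""
    return baseline_size(baseline_text) > max_entries


example = '{"ignored": [{"check_id": "ST001"}, {"check_id": "MT003"}]}'
print(baseline_size(example))    # 2
print(over_budget(example, 10))  # False
print(over_budget(example, 1))   # True
```

In CI, a script built on this could exit non-zero when `over_budget` returns `True`, so the target is lowered deliberately in a reviewed commit rather than drifting.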
4. Review Baseline in PRs
Make baseline changes visible in code review. Consider adding a CI check that comments on baseline changes:
# Example: Warn if baseline grows
# HEAD~1 must exist in the checkout; actions/checkout fetches only one
# commit by default, so set fetch-depth: 2 (or more) on that step.
- name: Check baseline size
  run: |
    OLD_SIZE=$(git show HEAD~1:baseline.json | jq '.ignored | length')
    NEW_SIZE=$(cat baseline.json | jq '.ignored | length')
    if [ "$NEW_SIZE" -gt "$OLD_SIZE" ]; then
      echo "::warning::Baseline grew from $OLD_SIZE to $NEW_SIZE issues"
    fi
5. Don’t Baseline Everything
Resist the temptation to baseline every failure. Some issues should block CI:
- Security issues (security group)
- Missing required metadata (metadata group)
- Build-breaking issues
Consider using profiles to separate must-fix from nice-to-have:
# Critical checks without baseline
pycmdcheck check --profile release
# All checks with baseline
pycmdcheck check --baseline baseline.json
Python API
For programmatic baseline management:
from pathlib import Path
from pycmdcheck.baseline import Baseline, BaselineEntry
# Create a baseline
baseline = Baseline()
baseline.add("ST001", reason="Will add README in v2.0")
baseline.add("MT003", reason="Metadata issue tracked in #123")
# Save to file
baseline.save(Path("baseline.json"))
# Load from file
loaded = Baseline.load(Path("baseline.json"))
# Check if a check is ignored
if loaded.is_ignored("ST001"):
    print("ST001 is baselined")
# Create from check results
from pycmdcheck import check
results = check("./my-package")
baseline = Baseline.from_check_results(
    results.all_results,
    reason="Initial baseline - 2024-01-15"
)
baseline.save(Path("baseline.json"))
Troubleshooting
Baseline Not Working
If checks still fail despite being in the baseline:
- Verify the check ID matches exactly (case-sensitive)
- Ensure the baseline file path is correct
- Check for JSON syntax errors in the baseline file
# Validate JSON syntax
python -m json.tool baseline.json > /dev/null && echo "Valid JSON"
Stale Baseline Entries
If you want to clean up entries for checks that now pass:
# Regenerate baseline from current failures only
pycmdcheck check --generate-baseline baseline.json
Note: This will lose your custom reasons. Consider manually removing fixed entries instead.
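Removing fixed entries by hand can also be scripted so that custom reasons survive. The sketch below is one possible approach, not a pycmdcheck feature: `prune_stale` is a hypothetical helper, and the list of currently failing check IDs is assumed to be collected separately (for example from the JSON output of a run without a baseline, whose schema is not described here).

```python
import json


def prune_stale(baseline, failing_ids):
    """Keep only entries whose check still fails; other fields are untouched."""
    failing = set(failing_ids)
    kept = [e for e in baseline.get("ignored", []) if e.get("check_id") in failing]
    return {**baseline, "ignored": kept}


baseline = json.loads("""
{
  "ignored": [
    {"check_id": "ST001", "reason": "Will add README in v2.0"},
    {"check_id": "MT003", "reason": "Waiting for upstream fix - issue #456"}
  ]
}
""")

# Assumed input: ST001 now passes, MT003 still fails.
pruned = prune_stale(baseline, ["MT003"])
print(json.dumps(pruned, indent=2))
```

This version matches on `check_id` only, so a file-specific entry is kept whenever its check still fails anywhere in the project; matching on `file` as well would require per-file failure data.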