Baseline Management

Track and ignore known issues while preventing new ones

Baselines allow you to acknowledge existing issues while preventing new ones from being introduced. This is essential for gradually improving code quality without blocking development on legacy issues.

What Are Baselines For?

Baselines solve the “big bang” problem: when you adopt a new linter or quality tool, you often face hundreds of existing issues. You have three options:

1. Fix everything first - Blocks all development until cleanup is complete
2. Disable the checks - Loses the benefit of the tool entirely
3. Use a baseline - Track existing issues, fail only on new ones

Baselines enable option 3: you can adopt pycmdcheck immediately, baseline your current issues, and ensure no new issues are introduced while you gradually address the backlog.

Quick Start

Generate a Baseline

Run pycmdcheck and save current failures to a baseline file:

pycmdcheck check --generate-baseline baseline.json

Output:

Baseline generated: baseline.json (5 check(s))

Check Against Baseline

Use the baseline to skip known issues:

pycmdcheck check --baseline baseline.json

Now only new failures will cause a non-zero exit code. Baselined checks are skipped.
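
One way to picture this behaviour is as a set difference between the checks that fail and the check IDs listed in the baseline. The sketch below is a mental model only, not pycmdcheck's implementation: `new_failures` is a hypothetical helper, it assumes the baseline has been parsed into a dict with an `ignored` list of entries keyed by `check_id` (the file format described in the next section), and it ignores file-specific entries.

```python
def new_failures(failing_ids, baseline):
    """Return the failing check IDs that the baseline does not cover."""
    baselined = {entry["check_id"] for entry in baseline.get("ignored", [])}
    return sorted(set(failing_ids) - baselined)


# In-memory stand-in for a parsed baseline file.
baseline = {"ignored": [{"check_id": "ST001"}, {"check_id": "MT003"}]}

# ST001 is baselined and ST002 is not, so only ST002 counts as a new failure.
remaining = new_failures(["ST001", "ST002"], baseline)
exit_code = 1 if remaining else 0
print(remaining, exit_code)
```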

Baseline File Format

The baseline is a JSON file with a simple structure:

{
  "ignored": [
    {
      "check_id": "ST001",
      "reason": "Baselined from current failures"
    },
    {
      "check_id": "MT003",
      "reason": "Baselined from current failures"
    },
    {
      "check_id": "CQ002",
      "reason": "Need to add py.typed - tracked in #123",
      "file": "src/mypackage/__init__.py"
    }
  ]
}

Entry Fields

Field	Required	Description
`check_id`	Yes	The check ID to ignore (e.g., “ST001”)
`reason`	No	Explanation of why the check is baselined
`file`	No	Specific file path for location-specific ignores
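
The field rules in the table can be checked mechanically before a hand-edited baseline is committed. The following is a minimal sketch using only the fields listed above; `validate_baseline` is a hypothetical helper, not part of pycmdcheck, and it assumes that `reason` and `file` are strings when present, as they are in the example file.

```python
def validate_baseline(data):
    """Return a list of problems found in a parsed baseline document."""
    entries = data.get("ignored") if isinstance(data, dict) else None
    if not isinstance(entries, list):
        return ['top-level "ignored" must be a list']

    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: must be an object")
            continue
        # check_id is the only required field.
        check_id = entry.get("check_id")
        if not isinstance(check_id, str) or not check_id:
            problems.append(f"entry {index}: missing required check_id")
        # reason and file are optional, but assumed to be strings if given.
        for optional in ("reason", "file"):
            if optional in entry and not isinstance(entry[optional], str):
                problems.append(f"entry {index}: {optional} must be a string")
    return problems
```

An empty list means the document matches the table; each string otherwise names the offending entry by its position in `ignored`.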

Manual Editing

You can manually edit the baseline file to:

Add reasons for each entry
Remove entries when issues are fixed
Add file-specific ignores

{
  "ignored": [
    {
      "check_id": "MT003",
      "reason": "Waiting for upstream fix - issue #456"
    },
    {
      "check_id": "CQ002",
      "reason": "Will add type hints in v2.0",
      "file": "src/legacy/old_module.py"
    }
  ]
}
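
Removing an entry by hand is easy to get wrong in a long file. A minimal sketch of doing it with the standard `json` module follows, assuming the file format shown above; `remove_entry` and `remove_entry_in_file` are hypothetical helpers, not part of pycmdcheck.

```python
import json
from pathlib import Path


def remove_entry(baseline, check_id, file=None):
    """Return a copy of the baseline without the entry for check_id.

    With file=None only entries that have no "file" field are removed;
    pass a path to remove the matching file-specific entry instead.
    """
    kept = [
        entry
        for entry in baseline.get("ignored", [])
        if not (entry.get("check_id") == check_id and entry.get("file") == file)
    ]
    return {**baseline, "ignored": kept}


def remove_entry_in_file(path, check_id, file=None):
    """Rewrite the baseline at path, leaving all other entries untouched."""
    path = Path(path)
    baseline = json.loads(path.read_text(encoding="utf-8"))
    updated = remove_entry(baseline, check_id, file)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

For example, `remove_entry_in_file("baseline.json", "MT003")` would drop the `MT003` entry and keep the reasons on every other entry.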

Workflow Examples

Initial Adoption

When first adopting pycmdcheck on an existing project:

# 1. Run initial check to see all issues
pycmdcheck check

# 2. Generate baseline from current state
pycmdcheck check --generate-baseline .pycmdcheck-baseline.json

# 3. Commit the baseline
git add .pycmdcheck-baseline.json
git commit -m "Add pycmdcheck baseline for existing issues"

# 4. Add to CI (see CI Integration below)

Gradual Improvement

As you fix issues, update the baseline:

# 1. Fix an issue (e.g., add README.md)
echo "# My Package" > README.md

# 2. Re-run check - ST001 stays skipped while it is still in the baseline
pycmdcheck check --baseline .pycmdcheck-baseline.json

# 3. Regenerate baseline to remove fixed issue (this discards custom reasons)
pycmdcheck check --generate-baseline .pycmdcheck-baseline.json

# 4. Commit the updated baseline
git add .pycmdcheck-baseline.json
git commit -m "chore: remove ST001 from baseline (README added)"

Preventing New Issues

With a baseline in place, new issues will fail:

# Someone removes the LICENSE file
rm LICENSE

# Check fails on the NEW issue (ST002)
pycmdcheck check --baseline .pycmdcheck-baseline.json
# Exit code: 1 - ST002 is not baselined!

CI Integration

GitHub Actions

name: Package Quality

on: [push, pull_request]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install pycmdcheck
        run: pip install pycmdcheck

      - name: Check package (with baseline)
        run: pycmdcheck check --baseline .pycmdcheck-baseline.json

GitLab CI

pycmdcheck:
  stage: lint
  script:
    - pip install pycmdcheck
    - pycmdcheck check --baseline .pycmdcheck-baseline.json

Baseline + Profile

Combine baselines with profiles:

# Use strict profile but respect baseline
pycmdcheck check --profile strict --baseline baseline.json

Managing Baselines

Viewing Baseline Contents

The baseline is plain JSON, so you can view and edit it with any text editor or JSON tool:

# View baseline
python -m json.tool baseline.json

# Count baselined issues
jq '.ignored | length' baseline.json

# List baselined check IDs
jq -r '.ignored[].check_id' baseline.json

Validating Baseline

A baseline entry for a check that now passes does no harm to the run, because the check is simply skipped, but it also means a later regression of that check would go unnoticed. To find “stale” baseline entries that are no longer needed:

# Run without baseline
pycmdcheck check --format json > current.json

# Compare with baseline
# Entries in baseline but not in current failures are stale
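
The comparison described in the comments above can be sketched in Python. Because the layout of `current.json` is not described here, the sketch takes the failing check IDs as a plain list and leaves extracting them from the JSON report to you; `find_stale` is a hypothetical helper, not part of pycmdcheck, and it matches on `check_id` only.

```python
def find_stale(baseline, failing_ids):
    """Return baseline entries whose check is not among the current failures."""
    failing = set(failing_ids)
    return [
        entry
        for entry in baseline.get("ignored", [])
        if entry.get("check_id") not in failing
    ]


# In-memory stand-in for a parsed baseline file.
baseline = {
    "ignored": [
        {"check_id": "ST001", "reason": "Baselined from current failures"},
        {"check_id": "MT003", "reason": "Waiting for upstream fix - issue #456"},
    ]
}

# Suppose only MT003 still fails: the ST001 entry is then stale.
for entry in find_stale(baseline, ["MT003"]):
    print(f"stale: {entry['check_id']} ({entry.get('reason', 'no reason')})")
```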

Splitting Baselines

For large projects, you might want separate baselines:

# Generate separate baselines per group
pycmdcheck check --only-group structure --generate-baseline baseline-structure.json
pycmdcheck check --only-group metadata --generate-baseline baseline-metadata.json

# Use in CI (check one at a time)
pycmdcheck check --only-group structure --baseline baseline-structure.json
pycmdcheck check --only-group metadata --baseline baseline-metadata.json

Best Practices

1. Document Reasons

Always add meaningful reasons to baseline entries:

{
  "ignored": [
    {
      "check_id": "MT003",
      "reason": "Blocked by upstream issue python/mypy#12345"
    },
    {
      "check_id": "CQ002",
      "reason": "Adding type hints in milestone 2.1 - see ROADMAP.md"
    }
  ]
}

2. Track Baseline Size

Monitor your baseline size over time. It should shrink as you fix issues:

# In CI, report baseline size
echo "Baselined issues: $(jq '.ignored | length' baseline.json)"

3. Set Reduction Goals

Create issues or milestones to reduce baseline size:

“Reduce pycmdcheck baseline from 15 to 10 issues by Q2”
“Clear all structure group issues by v2.0”

4. Review Baseline in PRs

Make baseline changes visible in code review. Consider adding a CI step that warns when the baseline grows:

# Example: Warn if baseline grows
# Note: HEAD~1 must exist, so check out with fetch-depth: 2 or deeper
- name: Check baseline size
  run: |
    OLD_SIZE=$(git show HEAD~1:baseline.json | jq '.ignored | length')
    NEW_SIZE=$(jq '.ignored | length' baseline.json)
    if [ "$NEW_SIZE" -gt "$OLD_SIZE" ]; then
      echo "::warning::Baseline grew from $OLD_SIZE to $NEW_SIZE issues"
    fi
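
A size comparison misses the case where one entry is removed and a different one is added in the same change. As a complement, the sketch below lists exactly which entries were added and removed between two parsed baselines. It assumes the file format shown earlier and identifies an entry by its `check_id` and optional `file`; `diff_baselines` is a hypothetical helper, not part of pycmdcheck.

```python
def entry_key(entry):
    """Identify an entry by check ID and optional file path."""
    return (entry.get("check_id") or "", entry.get("file") or "")


def diff_baselines(old, new):
    """Return (added, removed) entry keys between two parsed baselines."""
    old_keys = {entry_key(e) for e in old.get("ignored", [])}
    new_keys = {entry_key(e) for e in new.get("ignored", [])}
    return sorted(new_keys - old_keys), sorted(old_keys - new_keys)


old = {"ignored": [{"check_id": "ST001"}, {"check_id": "MT003"}]}
new = {"ignored": [{"check_id": "MT003"}, {"check_id": "CQ002", "file": "src/legacy/old_module.py"}]}

added, removed = diff_baselines(old, new)
for check_id, file in added:
    print(f"added:   {check_id} {file}".rstrip())
for check_id, file in removed:
    print(f"removed: {check_id} {file}".rstrip())
```

Here both baselines contain two entries, so a size check would stay silent even though `CQ002` was newly baselined.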

5. Don’t Baseline Everything

Resist the temptation to baseline every failure. Some issues should block CI:

Security issues (security group)
Missing required metadata (metadata group)
Build-breaking issues

Consider using profiles to separate must-fix from nice-to-have:

# Critical checks without baseline
pycmdcheck check --profile release

# All checks with baseline
pycmdcheck check --baseline baseline.json

Python API

For programmatic baseline management:

from pathlib import Path

from pycmdcheck import check
from pycmdcheck.baseline import Baseline, BaselineEntry

# Create a baseline
baseline = Baseline()
baseline.add("ST001", reason="Will add README in v2.0")
baseline.add("MT003", reason="Metadata issue tracked in #123")

# Save to file
baseline.save(Path("baseline.json"))

# Load from file
loaded = Baseline.load(Path("baseline.json"))

# Check if a check is ignored
if loaded.is_ignored("ST001"):
    print("ST001 is baselined")

# Create from check results
results = check("./my-package")
baseline = Baseline.from_check_results(
    results.all_results,
    reason="Initial baseline - 2024-01-15"
)
baseline.save(Path("baseline.json"))

Troubleshooting

Baseline Not Working

If checks still fail despite being in the baseline:

Verify the check ID matches exactly (case-sensitive)
Ensure the baseline file path is correct
Check for JSON syntax errors in the baseline file

# Validate JSON syntax
python -m json.tool baseline.json > /dev/null && echo "Valid JSON"

Stale Baseline Entries

If you want to clean up entries for checks that now pass:

# Regenerate baseline from current failures only
pycmdcheck check --generate-baseline baseline.json

Note: Regenerating overwrites the file, so any custom reasons you added are lost. To keep them, remove the fixed entries by hand instead.
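
If you prefer to regenerate, one option is to write the fresh baseline to a separate file and copy the old reasons across before replacing the original. The sketch below shows that merge step on parsed JSON. It assumes the file format shown earlier and matches entries on `check_id` only, so file-specific entries that share a check ID also share a reason; `carry_over_reasons` is a hypothetical helper, not part of pycmdcheck.

```python
def carry_over_reasons(old, regenerated):
    """Copy reasons from the old baseline onto matching regenerated entries."""
    old_reasons = {
        entry["check_id"]: entry["reason"]
        for entry in old.get("ignored", [])
        if "check_id" in entry and "reason" in entry
    }
    merged = []
    for entry in regenerated.get("ignored", []):
        entry = dict(entry)  # copy so the input is not modified
        if entry.get("check_id") in old_reasons:
            entry["reason"] = old_reasons[entry["check_id"]]
        merged.append(entry)
    return {**regenerated, "ignored": merged}


old = {"ignored": [
    {"check_id": "ST001", "reason": "Will add README in v2.0"},
    {"check_id": "MT003", "reason": "Waiting for upstream fix - issue #456"},
]}
# ST001 has been fixed, and CQ002 is a failure that was not baselined before.
regenerated = {"ignored": [
    {"check_id": "MT003", "reason": "Baselined from current failures"},
    {"check_id": "CQ002", "reason": "Baselined from current failures"},
]}

merged = carry_over_reasons(old, regenerated)
for entry in merged["ignored"]:
    print(entry["check_id"], "-", entry["reason"])
```

Entries that are new in the regenerated file keep their generated reason, which makes them easy to spot in review.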