Comparing Runs

Detect regressions by comparing check results between runs

The --diff option allows you to compare current check results against a previous run. This is invaluable for detecting regressions in CI/CD pipelines and tracking improvements over time.

Quick Start

# Save results from one run
pycmdcheck check --format json -o previous.json

# Later, compare against that run
pycmdcheck check --diff previous.json

The diff output shows:

  • New failures: Checks that now fail but passed before
  • Fixed issues: Checks that now pass but failed before
  • Unchanged: Checks whose status is the same in both runs (still passing or still failing)

Use Cases

Detecting Regressions in CI

The primary use case is catching regressions in pull requests:

# In CI: Compare PR against main branch
# 1. Checkout main and save baseline results
git checkout main
pycmdcheck check --format json -o main-results.json

# 2. Checkout PR and compare
git checkout pr-branch
pycmdcheck check --diff main-results.json

If new failures are introduced, the output highlights them clearly.

Tracking Progress Over Time

Store results periodically to track improvement:

# Weekly snapshot
pycmdcheck check --format json -o "checks-$(date +%Y%m%d).json"

# Compare against last week
pycmdcheck check --diff checks-20240108.json
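If you accumulate dated snapshots like the above, a small helper can locate the newest one to pass to --diff, since the checks-YYYYMMDD.json names sort chronologically. This is a hypothetical convenience script, not part of pycmdcheck:

```python
from pathlib import Path


def latest_snapshot(directory="."):
    """Return the newest checks-YYYYMMDD.json file, or None if there are none.

    Lexicographic sort works here because YYYYMMDD dates sort chronologically.
    """
    snapshots = sorted(Path(directory).glob("checks-*.json"))
    return snapshots[-1] if snapshots else None
```

The result can then be fed to the CLI, e.g. `pycmdcheck check --diff "$(python -c '...')"` or from a wrapper script.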

Validating Fixes

After fixing issues, confirm they’re actually resolved:

# Before fix
pycmdcheck check --format json -o before-fix.json

# Make fixes...

# Verify fix worked
pycmdcheck check --diff before-fix.json

Understanding Diff Output

Console Output

With the default rich output format (example abridged to one group; the MT checks in the diff summary come from another group in the full run):

Structure (3 checks)
  [PASS] ST001 HasReadme - README file exists
  [PASS] ST002 HasLicense - LICENSE file exists
  [FAIL] ST003 HasPyproject - pyproject.toml exists

Summary: 2 passed, 1 failed

Diff Summary:
  New failures (1): ST003
  Fixed (2): MT001, MT002

JSON Output

When using --format json, the diff information is included in the output:

{
  "summary": {
    "total": 15,
    "passed": 12,
    "failed": 3
  },
  "diff": {
    "new_failures": ["ST003"],
    "fixed_issues": ["MT001", "MT002"],
    "unchanged_failures": ["CQ001"],
    "unchanged_passes": ["ST001", "ST002"]
  }
}

CI Integration

GitHub Actions

name: Package Quality

on:
  pull_request:
    branches: [main]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history so we can check out main below

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install pycmdcheck
        run: pip install pycmdcheck

      # Get baseline from main branch
      - name: Checkout main and get baseline
        run: |
          git checkout main
          pycmdcheck check --format json -o /tmp/main-results.json
          git checkout -

      # Compare PR against main
      - name: Check for regressions
        run: pycmdcheck check --diff /tmp/main-results.json

GitHub Actions with Artifacts

For more sophisticated tracking, store results as artifacts:

name: Package Quality

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install pycmdcheck
        run: pip install pycmdcheck

      # Download previous results (if exists)
      - name: Download previous results
        uses: dawidd6/action-download-artifact@v2
        with:
          workflow: ${{ github.workflow }}
          name: pycmdcheck-results
          path: /tmp/previous
          if_no_artifact_found: warn
        continue-on-error: true

      # Run checks with diff
      - name: Run checks
        run: |
          if [ -f /tmp/previous/results.json ]; then
            pycmdcheck check --diff /tmp/previous/results.json --format json -o results.json
          else
            pycmdcheck check --format json -o results.json
          fi

      # Upload results for next run
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: pycmdcheck-results
          path: results.json
          retention-days: 30

GitLab CI

stages:
  - quality

pycmdcheck:
  stage: quality
  script:
    - pip install pycmdcheck
    # Try to get previous results from artifacts
    - |
      if [ -f previous-results.json ]; then
        pycmdcheck check --diff previous-results.json --format json -o results.json
      else
        pycmdcheck check --format json -o results.json
      fi
  artifacts:
    paths:
      - results.json
    expire_in: 1 week
  cache:
    key: pycmdcheck-results
    paths:
      - previous-results.json
  after_script:
    - cp results.json previous-results.json

Fail on New Issues Only

To fail CI only when new issues are introduced (not for existing ones):

- name: Check for new issues
  run: |
    pycmdcheck check --diff previous.json --format json -o current.json

    # Check if there are new failures
    NEW_FAILURES=$(jq '.diff.new_failures | length' current.json)
    if [ "$NEW_FAILURES" -gt 0 ]; then
      echo "New failures detected!"
      jq '.diff.new_failures' current.json
      exit 1
    fi

Combining with Other Features

Diff + Baseline

Use baselines for known issues and diff for regression detection:

# Baseline ignores known issues, diff catches new ones
pycmdcheck check --baseline baseline.json --diff previous.json

Diff + Profile

Use a specific profile for comparisons:

# Compare strict profile results
pycmdcheck check --profile strict --format json -o strict-main.json
# ... later ...
pycmdcheck check --profile strict --diff strict-main.json

Diff + Watch Mode

Watch mode doesn’t support diff directly, but you can use it in your workflow:

# 1. Save current state
pycmdcheck check --format json -o before.json

# 2. Use watch mode during development
pycmdcheck check --watch

# 3. After changes, compare
pycmdcheck check --diff before.json

JSON File Format

The diff feature works with pycmdcheck’s JSON output format:

{
  "summary": {
    "total": 15,
    "passed": 10,
    "failed": 5,
    "skipped": 0,
    "errors": 0
  },
  "groups": [
    {
      "name": "structure",
      "results": [
        {
          "check_id": "ST001",
          "check_name": "HasReadme",
          "status": "passed",
          "message": "README file exists"
        }
      ]
    }
  ]
}

The diff feature also supports a flat format:

{
  "results": [
    {
      "check_id": "ST001",
      "status": "passed"
    }
  ]
}
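Both formats reduce to the same comparison. As a rough sketch of how such a diff could be computed by hand (illustrative only; `compare_runs` in `pycmdcheck.diff` is the supported entry point, and `load_statuses`/`bucket` below are hypothetical helpers):

```python
def load_statuses(data):
    """Extract {check_id: status} from either the grouped or the flat JSON format."""
    if "groups" in data:
        results = [r for group in data["groups"] for r in group["results"]]
    else:
        results = data.get("results", [])
    return {r["check_id"]: r["status"] for r in results}


def bucket(current, previous):
    """Classify check IDs present in both runs into the four diff categories."""
    shared = current.keys() & previous.keys()
    return {
        "new_failures": sorted(c for c in shared
                               if current[c] == "failed" and previous[c] == "passed"),
        "fixed_issues": sorted(c for c in shared
                               if current[c] == "passed" and previous[c] == "failed"),
        "unchanged_failures": sorted(c for c in shared
                                     if current[c] == previous[c] == "failed"),
        "unchanged_passes": sorted(c for c in shared
                                   if current[c] == previous[c] == "passed"),
    }
```

Note that checks appearing in only one run fall outside all four categories in this sketch; how the real tool treats added or removed checks is not specified here.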

DiffResult Properties

When using the Python API, the DiffResult object provides:

Property             Type        Description
-------------------  ----------  -------------------------------------------------------
new_failures         list[str]   Check IDs that failed in current but not in previous
fixed_issues         list[str]   Check IDs that passed in current but failed in previous
unchanged_failures   list[str]   Check IDs that failed in both runs
unchanged_passes     list[str]   Check IDs that passed in both runs
has_changes          bool        True if there are new failures or fixed issues
summary              str         Human-readable summary of the changes
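For illustration, these properties map naturally onto a dataclass. The following is a sketch of one plausible shape, not the library's actual source:

```python
from dataclasses import dataclass, field


@dataclass
class DiffResult:
    """Sketch of the comparison result described in the table above."""
    new_failures: list = field(default_factory=list)
    fixed_issues: list = field(default_factory=list)
    unchanged_failures: list = field(default_factory=list)
    unchanged_passes: list = field(default_factory=list)

    @property
    def has_changes(self):
        # Unchanged checks don't count as changes, in either direction.
        return bool(self.new_failures or self.fixed_issues)

    @property
    def summary(self):
        return (f"{len(self.new_failures)} new failure(s), "
                f"{len(self.fixed_issues)} fixed, "
                f"{len(self.unchanged_failures)} still failing")
```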

Python API

For programmatic comparison:

from pathlib import Path
from pycmdcheck import check
from pycmdcheck.diff import compare_runs, DiffResult

# Run current checks
results = check("./my-package")

# Compare with previous run
diff = compare_runs(
    current_results=results.all_results,
    previous_path=Path("previous.json")
)

# Inspect diff
if diff.has_changes:
    print(f"Changes detected: {diff.summary}")

    if diff.new_failures:
        print(f"New failures: {', '.join(diff.new_failures)}")

    if diff.fixed_issues:
        print(f"Fixed: {', '.join(diff.fixed_issues)}")
else:
    print("No changes from previous run")

# Use in CI logic
import sys
if diff.new_failures:
    print("Failing due to new issues")
    sys.exit(1)

Tips

  1. Store results consistently - Always generate the runs you compare with the same options (profile, selected checks); differing options produce spurious diffs

  2. Use meaningful filenames - Include date, branch, or commit hash in filenames

  3. Archive results - Keep historical results for trend analysis

  4. Fail on new issues - In CI, fail only on new_failures to avoid blocking on existing issues

  5. Celebrate fixes - Report fixed_issues in PR comments to highlight improvements
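To act on tip 5, a small script can turn the diff section of the JSON output into a Markdown comment body. This is a hypothetical helper (stdlib only); actually posting the comment is left to your CI, for example via the GitHub CLI's `gh pr comment --body-file -`:

```python
import json


def pr_comment_body(results_path):
    """Build a Markdown comment listing fixed checks; return None if nothing to report."""
    with open(results_path) as f:
        diff = json.load(f).get("diff", {})
    fixed = diff.get("fixed_issues", [])
    if not fixed:
        return None
    lines = ["### pycmdcheck: checks fixed in this PR", ""]
    lines += [f"- `{check_id}`" for check_id in fixed]
    return "\n".join(lines)
```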