SC009: NoUnsafeDeserialization

Overview

Property   Value
ID         SC009
Name       NoUnsafeDeserialization
Group      security
Severity   ERROR

Description

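Scans Python source files for deserialization calls into the pickle, marshal, and shelve modules. Unpickling untrusted data can execute arbitrary code (shelve uses pickle internally), and marshal is not designed to be safe against erroneous or maliciously constructed input.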

Deserializing untrusted data with these modules is a critical security vulnerability that can lead to:

  • Arbitrary code execution during deserialization
  • Remote code execution if processing user-supplied data
  • Complete system compromise
  • Privilege escalation and data theft
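
To make the risk concrete, here is a minimal and deliberately harmless sketch of how a pickle payload runs code at load time. The `Payload` class is hypothetical, and `eval("6 * 7")` stands in for whatever an attacker would actually call.

```python
import pickle


class Payload:
    # __reduce__ tells pickle how to rebuild the object: "call this
    # callable with these arguments" when the bytes are loaded.
    def __reduce__(self):
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# No Payload instance comes back. Instead, eval("6 * 7") runs inside
# pickle.loads() and its return value is the result.
result = pickle.loads(blob)
print(result)  # 42
```

Any importable callable can be substituted for `eval`, which is why loading a pickle from an untrusted source amounts to running untrusted code.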

What it checks

The check scans all .py files (excluding test files, .venv/, and __pycache__/) for:

  • pickle.load(): Load pickled object from file
  • pickle.loads(): Load pickled object from bytes
  • marshal.load(): Load marshalled object from file
  • marshal.loads(): Load marshalled object from bytes
  • shelve.open(): Open a persistent dictionary (uses pickle internally)
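
As an illustration of what such a scan can look like, the sketch below walks a module's syntax tree and reports calls matching the list above. It is not pycmdcheck's actual implementation: `UNSAFE_CALLS` and `find_unsafe_calls` are hypothetical names, and this simple version only matches the `module.function(...)` form, so `from pickle import loads` or an aliased import would go unnoticed.

```python
import ast

# (module, function) pairs taken from the list above.
UNSAFE_CALLS = {
    ("pickle", "load"),
    ("pickle", "loads"),
    ("marshal", "load"),
    ("marshal", "loads"),
    ("shelve", "open"),
}


def find_unsafe_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, dotted name) for each flagged call in source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only calls written as <name>.<attr>(...), e.g. pickle.loads(...)
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in UNSAFE_CALLS
        ):
            hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
    return sorted(hits)


sample = (
    "import pickle\n"
    "import json\n"
    "data = pickle.loads(b'')\n"
    "cfg = json.loads('{}')\n"
)
print(find_unsafe_calls(sample))  # [(3, 'pickle.loads')]
```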

Result states

  • PASSED: No unsafe deserialization calls found
  • FAILED: One or more unsafe deserialization calls detected

How to fix

Use JSON for data serialization

import json

# Bad: pickle for data storage
import pickle
with open("data.pkl", "rb") as f:
    data = pickle.load(f)

# Good: JSON is safe for data serialization
with open("data.json", "r") as f:
    data = json.load(f)

Use structured data formats

import tomllib  # standard library, Python 3.11+
import yaml     # third-party: provided by the PyYAML package

# Good: TOML for configuration
with open("config.toml", "rb") as f:
    config = tomllib.load(f)

# Good: YAML with safe_load for configuration
with open("config.yaml", "r") as f:
    config = yaml.safe_load(f)

Use dataclasses with JSON for complex objects

from dataclasses import dataclass, asdict
import json

@dataclass
class User:
    name: str
    email: str

# Good: serialize to JSON
user = User("Alice", "alice@example.com")
with open("user.json", "w") as f:
    json.dump(asdict(user), f)

# Good: deserialize from JSON
with open("user.json", "r") as f:
    data = json.load(f)
    user = User(**data)
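
JSON cannot run code on load, but the decoded data is still untrusted input: `User(**data)` raises `TypeError` on unexpected keys and accepts values of any type. A minimal validation sketch, assuming every field is a string (`user_from_json` is a hypothetical helper, not part of any library):

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    email: str


def user_from_json(text: str) -> User:
    """Parse JSON text and build a User, rejecting malformed input."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(User)}
    if set(data) != expected:
        # Symmetric difference lists both missing and unexpected keys.
        raise ValueError(f"key mismatch: {sorted(set(data) ^ expected)}")
    for key, value in data.items():
        if not isinstance(value, str):
            raise ValueError(f"{key} must be a string")
    return User(**data)


user = user_from_json('{"name": "Alice", "email": "alice@example.com"}')
```

For larger schemas, a dedicated validation library may be a better fit than hand-written checks.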

If pickle is absolutely necessary, use HMAC verification

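import hashlib
import hmac
import pickle

# Placeholder only: load the real key from a secret store or an
# environment variable, never from source control.
SECRET_KEY = b"your-secret-key"
SIG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def secure_dumps(obj):
    """Pickle obj and prepend an HMAC-SHA256 signature."""
    data = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return signature + data

def secure_loads(signed_data):
    """Verify the signature, then unpickle. Raises ValueError on mismatch."""
    signature = signed_data[:SIG_SIZE]
    data = signed_data[SIG_SIZE:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # compare_digest runs in constant time, which avoids timing attacks.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("Invalid signature: data may have been tampered with")
    return pickle.loads(data)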
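
The following self-contained sketch shows the same signing idea end to end, including what happens when a single bit of the payload changes. The helper names `sign` and `verify` and the randomly generated demo key are assumptions made for illustration. Note that a signature only proves the bytes came from someone holding the key; it does not make the pickle itself safe.

```python
import hashlib
import hmac
import pickle
import secrets

# Demo key only; a real key would come from a secret store.
key = secrets.token_bytes(32)


def sign(data: bytes) -> bytes:
    """Prepend an HMAC-SHA256 signature to data."""
    return hmac.new(key, data, hashlib.sha256).digest() + data


def verify(signed: bytes) -> bytes:
    """Return the payload if the signature matches, else raise ValueError."""
    size = hashlib.sha256().digest_size
    signature, data = signed[:size], signed[size:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    return data


signed = sign(pickle.dumps({"user": "alice"}))
restored = pickle.loads(verify(signed))  # round trip succeeds

# Flip one bit in the last payload byte; verification must now fail
# before pickle.loads is ever reached.
tampered = signed[:-1] + bytes([signed[-1] ^ 0x01])
try:
    verify(tampered)
    rejected = False
except ValueError:
    rejected = True
```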

Why ERROR severity?

This check is an ERROR because:

  • pickle (and shelve, which builds on it) can execute arbitrary code during deserialization, and marshal is not hardened against malicious input
  • Attackers can craft malicious payloads to compromise systems
  • The Python documentation explicitly warns against unpickling untrusted data
  • Safe alternatives like JSON exist for most use cases

Configuration

Skip this check

[tool.pycmdcheck]
skip = ["SC009"]

CLI

pycmdcheck --skip SC009