Model Supply Chain Security
AI model supply chains introduce unique security risks. Oculum detects unsafe model loading, unverified sources, and compromised fine-tuning pipelines.
Supply Chain Risks
Model supply chains can be compromised at multiple points:
- Model Hosting — Compromised model files on HuggingFace, etc.
- Fine-tuning — Backdoors introduced during training
- Serialization — Pickle exploits and unsafe formats
- Verification — Models downloaded without integrity checks
Detectors
unsafe_model_load
Severity: Critical
Detects unsafe model loading patterns that could execute arbitrary code.
Triggers on:
- pickle.load() with untrusted data
- torch.load() without weights_only=True
- joblib.load() from network sources
- eval() during model deserialization
Example Vulnerable Code:
# VULNERABLE: Pickle can execute arbitrary code
import pickle
with open("model.pkl", "rb") as f:
model = pickle.load(f) # Could run malicious code!
# VULNERABLE: PyTorch default allows pickle
import torch
model = torch.load("model.pt") # Unsafe by default
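Why pickle is dangerous: during deserialization, pickle calls whatever an object's __reduce__ method returns, so a crafted "model" file executes code the moment it is loaded. A minimal sketch of such a payload (the command here is a harmless placeholder):
# Illustrative only: a "model" object whose pickle runs a command on load
import os
import pickle

class MaliciousModel:
    def __reduce__(self):
        # Whatever this returns is called during unpickling
        return (os.system, ("echo compromised",))

payload = pickle.dumps(MaliciousModel())
pickle.loads(payload)  # Executes `echo compromised` before returning anything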
unverified_model_source
Severity: High
Detects models loaded from unverified sources without integrity checks.
Triggers on:
- Direct downloads without checksums
- Models from unknown repositories
- Missing signature verification
- HTTP (not HTTPS) model URLs
Example Vulnerable Code:
# VULNERABLE: No verification
import requests
model_url = "https://random-site.com/model.bin"
response = requests.get(model_url)
with open("model.bin", "wb") as f:
f.write(response.content) # No checksum!
unsafe_finetuning
Severity: High
Detects security issues in fine-tuning pipelines.
Triggers on:
- Training data from untrusted sources
- No data validation before training
- Missing output filtering
- Checkpoint saving to public locations
Example Vulnerable Code:
# VULNERABLE: Untrusted training data
training_data = load_from_public_source(url)
# No validation!
model.finetune(training_data)
model.save("./public_model") # Saved without review
Remediation
Safe Model Loading
# SAFE: Use weights_only mode
import torch
# weights_only is available in PyTorch 1.13+ (and the default in 2.6+)
model = torch.load("model.pt", weights_only=True)
# Or use safetensors (recommended)
from safetensors.torch import load_model
load_model(model, "model.safetensors")
Verify Model Integrity
# SAFE: Verify checksums
import hashlib
from huggingface_hub import hf_hub_download
# Download with verification
model_path = hf_hub_download(
    repo_id="organization/model-name",
    filename="model.safetensors",
    revision="abc123def",  # Pin to a specific commit, not a moving branch
    cache_dir="./models"
)
# Additional checksum verification
def verify_checksum(path: str, expected: str) -> bool:
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            sha256.update(chunk)
    return sha256.hexdigest() == expected

assert verify_checksum(model_path, EXPECTED_CHECKSUM)
Trusted Model Sources
# SAFE: Use trusted sources with verification
from transformers import AutoModel
# HuggingFace with trust_remote_code=False (default)
model = AutoModel.from_pretrained(
    "meta-llama/Llama-3-8b",
    trust_remote_code=False,  # Don't run arbitrary code
    revision="abc123def"  # Pin to specific commit
)
Safe Fine-tuning
# SAFE: Validate training data
from datasets import load_dataset
from transformers import Trainer, TrainingArguments

def validate_training_example(example):
    # Check for injection patterns
    if contains_injection_patterns(example["text"]):
        return False
    # Check for sensitive content
    if contains_pii(example["text"]):
        return False
    return True

# Load and filter
dataset = load_dataset("organization/dataset")
dataset = dataset.filter(validate_training_example)

# Fine-tune with validated data
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    args=TrainingArguments(
        output_dir="./private_output",  # Private location
        save_strategy="epoch"
    )
)
trainer.train()

# Review before publishing
review_model_outputs(trainer.model)
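The helpers used above (contains_injection_patterns, contains_pii, review_model_outputs) are not part of any library referenced here; you have to supply them. A minimal, illustrative sketch of the two filters, assuming simple regex-based checks (real pipelines should use dedicated PII and injection scanners):
# Illustrative sketch only; the patterns are examples, not an exhaustive list
import re

INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"reveal (the|your) system prompt",
]
PII_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",        # US SSN-like numbers
    r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",  # email addresses
]

def contains_injection_patterns(text: str) -> bool:
    return any(re.search(p, text, re.IGNORECASE) for p in INJECTION_PATTERNS)

def contains_pii(text: str) -> bool:
    return any(re.search(p, text) for p in PII_PATTERNS)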
Model Format Security
| Format | Risk Level | Notes |
|---|---|---|
| .safetensors | Low | No code execution, recommended |
| .pt (weights_only) | Low | Safe with weights_only=True |
| GGUF | Low | Safe format for llama.cpp |
| ONNX | Medium | Generally safe, verify source |
| .pt (default) | High | Uses pickle, can run code |
| .pkl / .pickle | Critical | Arbitrary code execution |
| .joblib | Critical | Uses pickle internally |
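If you have a pickle-based .pt checkpoint from a source you do trust, one option is to convert it to safetensors once, in an isolated environment, and only distribute and load the converted file afterwards. A rough sketch, assuming the checkpoint is a plain state dict of tensors (file names are illustrative):
# One-time conversion in a sandboxed, trusted environment
import torch
from safetensors.torch import save_file, load_file

state_dict = torch.load("model.pt", weights_only=True)  # tensors only, no pickle objects
save_file(state_dict, "model.safetensors")

# From now on, loading never touches pickle at all
state_dict = load_file("model.safetensors")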
Supply Chain Best Practices
1. Pin Model Versions
# Bad: Latest version
model = AutoModel.from_pretrained("org/model")
# Good: Pinned revision
model = AutoModel.from_pretrained(
    "org/model",
    revision="v1.2.3"  # or commit hash
)
2. Use Private Model Registry
# Self-hosted or private registry
import os
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "company/internal-model",
    token=os.environ["HF_TOKEN"],
    cache_dir="/secure/models"
)
3. Scan Models Before Use
# Use Oculum to scan model loading code
oculum scan ./model_loader.py --depth deep
4. Disable Remote Code
# Never trust remote code by default
model = AutoModel.from_pretrained(
    "some/model",
    trust_remote_code=False  # Explicit denial
)
Common Vulnerabilities
| Vulnerability | Attack Vector | Impact |
|---|---|---|
| Pickle deserialization | Malicious model file | Remote code execution |
| Model poisoning | Compromised HuggingFace repo | Backdoored outputs |
| Fine-tuning injection | Malicious training data | Behavior manipulation |
| Unverified download | Man-in-the-middle | Model substitution |
Related
- Package Hallucination — Malicious dependencies
- Unsafe Execution — Code execution risks
- CI/CD Setup — Automate security checks