Documentation
Everything you need to start catching prompt regressions.
Quick Start
Install the SDK and set your API keys. You'll be running checks in under 2 minutes.
pip install windtunnel-ai
export WINDTUNNEL_API_KEY=wt_your_key_here
export ANTHROPIC_API_KEY=sk-ant-...

Find your API key in the API Keys section of the dashboard.
Record Interactions
Wrap your agent to automatically record every interaction to Windtunnel.
from windtunnel import WindTunnel
wt = WindTunnel(api_key="wt_your_key")
# In your agent handler:
response = your_agent.run(user_message)
wt.record(
    user_input=user_message,
    agent_output=response,
    prompt_version="v1",        # track your prompt version
    model="claude-haiku-4-5",   # optional
)

Recorded interactions become the test suite for future checks. The more you record, the better your coverage.
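If you record from several places in your app, it can help to assemble the keyword arguments in one spot. A minimal sketch, assuming the `metadata` and `session_id` parameters documented in the SDK reference below; the helper name and the metadata keys (`channel`, `user_tier`) are made up for illustration:

```python
import uuid

def build_record_kwargs(user_message, response, prompt_version="v1",
                        session_id=None):
    """Assemble the arguments for wt.record() so the same payload
    can be logged or unit-tested before it is sent."""
    return {
        "user_input": user_message,
        "agent_output": response,
        "prompt_version": prompt_version,
        "model": "claude-haiku-4-5",
        "metadata": {"channel": "web", "user_tier": "free"},
        # wt.record() auto-generates a session_id, but passing your own
        # lets you group the turns of one conversation together:
        "session_id": session_id or str(uuid.uuid4()),
    }

kwargs = build_record_kwargs("What is 2+2?", "4")
# wt.record(**kwargs)  # requires a live WindTunnel client
print(kwargs["prompt_version"])
```

Reusing one session_id across turns keeps a multi-turn conversation grouped as a single recorded session.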
Run a Check
Compare your baseline prompt against a challenger. Windtunnel replays your recorded interactions through both and scores the results with an LLM judge.
windtunnel check \
--baseline @prompts/v1.txt \
--challenger @prompts/v2.txt \
--n 20 \
  --fail-on-regression

CI/CD Integration
Add Windtunnel to your GitHub Actions workflow to automatically block merges when prompt quality degrades. Copy windtunnel.yml from the dashboard into .github/workflows/ in your repo, then add your API key as a repository secret.
# .github/workflows/windtunnel.yml
name: Windtunnel Check

on:
  pull_request:
    branches: [main]

jobs:
  windtunnel-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install windtunnel-ai
      - name: Run Windtunnel check
        env:
          WINDTUNNEL_API_KEY: ${{ secrets.WINDTUNNEL_API_KEY }}
        run: |
          python - <<'EOF'
          import os, sys, json
          from pathlib import Path
          from windtunnel import WindTunnel

          wt = WindTunnel(api_key=os.environ["WINDTUNNEL_API_KEY"])

          # Read prompts from env vars, falling back to files
          baseline = os.environ.get("BASELINE_PROMPT") or Path("prompts/baseline.txt").read_text()
          challenger = os.environ.get("CHALLENGER_PROMPT") or Path("prompts/challenger.txt").read_text()

          # Load tests from windtunnel_tests.json, or fall back to 3 example interactions
          if Path("windtunnel_tests.json").exists():
              interactions = json.loads(Path("windtunnel_tests.json").read_text())
          else:
              interactions = [
                  {"user_input": "What is 2+2?",
                   "baseline_output": "4", "challenger_output": "4"},
                  {"user_input": "What is the capital of France?",
                   "baseline_output": "Paris.", "challenger_output": "Paris is the capital of France."},
                  {"user_input": "Reverse a string in Python.",
                   "baseline_output": "Use s[::-1].", "challenger_output": "Use s[::-1] or reversed(s)."},
              ]

          result = wt.check(baseline_prompt=baseline, challenger_prompt=challenger,
                            interactions=interactions)
          print(f"Verdict: {result['verdict']} | Regression rate: {result['regression_rate']:.0%}")
          sys.exit(1 if result["verdict"] == "BLOCKED" else 0)
          EOF

1. Download windtunnel.yml from the dashboard and place it in .github/workflows/.
2. Add WINDTUNNEL_API_KEY to your repo's Settings → Secrets and variables → Actions.
3. Optionally add windtunnel_tests.json to your repo root with your test interactions.
The workflow exits with code 1 when the verdict is BLOCKED, which fails the PR check and blocks the merge automatically.
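The CI script above expects windtunnel_tests.json to contain a JSON list of interactions in the same shape as its inline fallback. A minimal sketch that generates such a file (the two interactions are placeholders; substitute your own recorded data):

```python
import json
from pathlib import Path

# Each interaction mirrors the fallback records in the workflow above:
# a user input plus the output produced under each prompt version.
interactions = [
    {"user_input": "What is 2+2?",
     "baseline_output": "4",
     "challenger_output": "2+2 equals 4."},
    {"user_input": "Name a prime number.",
     "baseline_output": "7",
     "challenger_output": "7 is prime."},
]

Path("windtunnel_tests.json").write_text(json.dumps(interactions, indent=2))

# Round-trip to confirm the file parses back into the same structure
loaded = json.loads(Path("windtunnel_tests.json").read_text())
print(len(loaded))
```

Committing this file to your repo root gives the workflow a stable, reviewable test suite instead of the three built-in examples.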
Python SDK Reference
wt = WindTunnel(
    api_key: str,                    # required — your wt_* key
    anthropic_api_key: str = None,   # falls back to ANTHROPIC_API_KEY env
    supabase_url: str = None,        # optional override
    supabase_key: str = None         # optional override
)

wt.record(
    user_input: str,                 # required
    agent_output: str,               # required
    prompt_version: str = 'v1',
    model: str = 'claude-haiku-4-5',
    metadata: dict = {},
    session_id: str = None           # auto-generated if not provided
) -> dict

wt.run_windtunnel(
    baseline_prompt: str,            # required
    challenger_prompt: str,          # required
    n_interactions: int = 10,
    baseline_version: str = 'v1',
    challenger_version: str = 'v2',
    run_name: str = None
) -> {
    run_id: str,
    verdict: 'APPROVED' | 'BLOCKED' | 'NEUTRAL',
    total: int,
    better: int,
    worse: int,
    neutral: int,
    regression_rate: float           # 0.0 – 1.0
}

CLI Reference
windtunnel check [OPTIONS]
Options:
  --api-key TEXT          Windtunnel API key            [env: WINDTUNNEL_API_KEY]
  --anthropic-key TEXT    Anthropic API key             [env: ANTHROPIC_API_KEY]
  --baseline TEXT         Baseline prompt or @file.txt  [required]
  --challenger TEXT       Challenger prompt or @file.txt  [required]
  --n INTEGER             Interactions to test          [default: 10]
  --fail-on-regression    Exit 1 if verdict is BLOCKED

windtunnel status [OPTIONS]
Options:
  --api-key TEXT          Windtunnel API key            [env: WINDTUNNEL_API_KEY]
Verifies your connection and prints your project ID.
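To wire the SDK's return value into your own scripts, you can branch on the documented verdict values the same way the CLI's --fail-on-regression flag does. A sketch using a hand-written example dict in the documented return shape (the values are illustrative, not real output):

```python
# Example result matching the documented wt.run_windtunnel() return shape.
# All values below are made up for illustration.
result = {
    "run_id": "run_123",
    "verdict": "BLOCKED",      # 'APPROVED' | 'BLOCKED' | 'NEUTRAL'
    "total": 20,
    "better": 2,
    "worse": 6,
    "neutral": 12,
    "regression_rate": 0.30,   # worse / total
}

def exit_code(result: dict) -> int:
    """Mirror the CLI's --fail-on-regression: exit 1 only on BLOCKED."""
    return 1 if result["verdict"] == "BLOCKED" else 0

print(f"Verdict: {result['verdict']} | "
      f"Regression rate: {result['regression_rate']:.0%}")
code = exit_code(result)
# import sys; sys.exit(code)  # uncomment inside a real CI script
```

Treating NEUTRAL the same as APPROVED (exit 0) matches the CI workflow above, which only fails the build on a BLOCKED verdict.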