The method and harness are public; the corpus is private (anti-gaming). You can reproduce the pipeline on the shipped example corpus, or point it at your own (see CORPUS_FORMAT.md).
Source: everything here lives in one repo — github.com/kurtpayne/skillscan-scoreboard: the harness (scoreboard/), the site generator (scripts/build_site.py), this methodology, the example corpus, and the board JSON the site is rendered from. Clone it and the commands below run as-is.
| component | pin |
|---|---|
| SkillSpector | git cff7ecc (static layer) |
| Cisco AI Defense skill-scanner | git ff708ea (static layer) |
| Snyk Agent Scan | 0.5.10 (cloud) |
| LLM baseline (in-set) | Qwen/Qwen2.5-72B-Instruct-AWQ, temp 0, generic prompt |
| LLM baseline (disjoint control) | microsoft/phi-4, temp 0, generic prompt |
| Frontier baselines | gpt-4o, gpt-4o-mini, claude-sonnet-4-6, claude-haiku-4-5 (raw read, temp 0) |
The graded scanners are run from their published source, cloned into a staging dir with each tool's own venv (binaries at <staging>/<tool>/.venv/bin/<tool>). Point the harness at it:
export SCOREBOARD_STAGING=/path/to/_scoreboard_staging # else auto-resolves to ./_scoreboard_staging
# or ~/skillscan-family/_scoreboard_staging
If a scanner binary is missing, the harness exits with an error instead of silently scoring every sample as ERROR, which would produce a false 0% board. Static scanners (--no-llm) need no API key.
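The staging lookup and the fail-fast check described above can be pictured with a short Python sketch. This is not the harness's actual code: the function names `resolve_staging` and `missing_binaries` are hypothetical, and the order in which the two fallback directories are tried is an assumption. The only things taken from the text are the `SCOREBOARD_STAGING` variable, the two fallback paths, and the `<staging>/<tool>/.venv/bin/<tool>` layout.

```python
import os
from pathlib import Path


def resolve_staging(env=None, cwd=None, home=None):
    """Return the staging dir: SCOREBOARD_STAGING if set, else the first
    fallback directory that exists (fallback order is an assumption)."""
    env = os.environ if env is None else env
    explicit = env.get("SCOREBOARD_STAGING")
    if explicit:
        return Path(explicit)
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    candidates = [
        cwd / "_scoreboard_staging",
        home / "skillscan-family" / "_scoreboard_staging",
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(
        "no staging dir found; set SCOREBOARD_STAGING or create one of: "
        + ", ".join(str(c) for c in candidates)
    )


def missing_binaries(staging, tools):
    """List expected scanner binaries that are absent, using the
    <staging>/<tool>/.venv/bin/<tool> layout described in the text."""
    missing = []
    for tool in tools:
        binary = Path(staging) / tool / ".venv" / "bin" / tool
        if not binary.is_file():
            missing.append(str(binary))
    return missing
```

A caller would stop before scoring anything if `missing_binaries(...)` returns a non-empty list, which is the behavior that prevents a board full of ERROR rows. The tool names passed in are whatever your staging dir contains; none are assumed here.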
SKILLSCAN_CORPUS=example_corpus python3 -m scoreboard.run_board --no-baseline --benign-cap 0 \
--out board.json
python3 scripts/build_site.py --board board.json # renders docs/
Put your corpus in the layout in CORPUS_FORMAT.md, then:
python3 -m scoreboard.run_board --corpus /path/to/your/corpus --snyk --out board.json
Flags: --scanner-llm (+llm via an OpenAI-compatible endpoint), --frontier-model <id> (separate frontier board), --k N (repeats for stochastic scanners), --workers N.
Our corpus is private (anti-gaming), but the non-generated parts come from public, independent sources — linked here so the provenance is checkable. (What we did generate is disclosed separately: organic malicious via gpt-4o / claude / deepseek, defanged-synthetic malicious via open-weight models — see Methodology §2.6. Those are ours and are not in this list.)
| Set (provenance) | n | Source (external, not ours) | License |
|---|---|---|---|
| published_independent (the headline) | 84 | Skill-Inject: github.com/aisa-group/skill-inject · arXiv:2602.20156 | verify (reconstructed locally, not redistributed) |
| dual_use FP-bait (the X-axis) | 1588 | MaliciousAgentSkillsBench: github.com/protectskills/MaliciousAgentSkillsBench (mas-bench "suspicious"-recovered) | MIT |
| benign control | 500 | real public GitHub skills (harvested; the openclaw/skills archive is the canonical public index, and per-repo provenance is encoded in each sample id) | upstream repo licenses |
| wild_verbatim real malicious | 5 | real disclosed skills; full per-sample source URLs + sha256 in WILD_PROVENANCE.md | as published |
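Because the wild samples are published with sha256 digests, anyone who fetches them from the listed source URLs can check that they hold the same bytes. The sketch below is a minimal illustration in Python, not part of the harness: `sha256_of` and `verify_samples` are hypothetical helper names, and the `expected` mapping of path to digest is an assumed input that you would fill in by hand from WILD_PROVENANCE.md (the file's exact layout is not assumed here).

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_samples(expected):
    """Given {path: expected_hex_digest}, return the paths that are
    missing or whose digest does not match."""
    failures = []
    for path, want in expected.items():
        p = Path(path)
        if not p.is_file() or sha256_of(p) != want.strip().lower():
            failures.append(str(path))
    return failures
```

An empty list from `verify_samples` means every local copy matches its published digest.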
Threat-prevalence grounding (cited, not a corpus input): Liu et al., USENIX Security 2026 — arXiv:2602.06547 (157 malicious in 98,380 skills). Scanner sources + licenses are in Notices.
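To make the cited prevalence concrete, the arithmetic can be worked in a few lines of Python. The counts (157 and 98,380) come from the citation above; everything else is an illustrative assumption. In particular, `precision_at_prevalence` is a hypothetical helper that applies Bayes' rule to show how a scanner's precision would depend on the base rate, and it is not a metric this document claims the board reports.

```python
MALICIOUS = 157      # malicious skills reported in the cited study
TOTAL = 98_380       # total skills surveyed in the cited study

prevalence = MALICIOUS / TOTAL
percent = prevalence * 100          # share of skills that are malicious
one_in = TOTAL / MALICIOUS          # "one malicious skill per N skills"

print(f"prevalence: {percent:.2f}% (about 1 in {round(one_in)})")


def precision_at_prevalence(recall, false_positive_rate, base_rate):
    """Bayes' rule: fraction of flagged items that are truly malicious,
    given recall, false-positive rate, and the malicious base rate."""
    true_pos = recall * base_rate
    false_pos = false_positive_rate * (1.0 - base_rate)
    if true_pos + false_pos == 0:
        return 0.0
    return true_pos / (true_pos + false_pos)
```

Calling `precision_at_prevalence(recall, fpr, prevalence)` with a scanner's measured recall and false-positive rate gives a rough sense of how many of its alerts would be real at this base rate; any specific inputs you try are your own assumptions.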
Harness, adapters, analysis, and methodology are in this repo. Tests: pytest -q (40 tests).