skillscan.sh
independent scoreboard for AI-skill security scanners

Reproduce

The method and harness are public; the corpus is private (anti-gaming). You can reproduce the pipeline on the shipped example corpus, or point it at your own (see CORPUS_FORMAT.md).

Source: everything here lives in one repo — github.com/kurtpayne/skillscan-scoreboard: the harness (scoreboard/), the site generator (scripts/build_site.py), this methodology, the example corpus, and the board JSON the site is rendered from. Clone it and the commands below run as-is.

Pinned versions (this board run · 2026-06-17 · corpus v1.1)

| Component | Pin |
| --- | --- |
| SkillSpector | git cff7ecc (static layer) |
| Cisco AI Defense skill-scanner | git ff708ea (static layer) |
| Snyk Agent Scan | 0.5.10 (cloud) |
| LLM baseline (in-set) | Qwen/Qwen2.5-72B-Instruct-AWQ, temp 0, generic prompt |
| LLM baseline (disjoint control) | microsoft/phi-4, temp 0, generic prompt |
| Frontier baselines | gpt-4o, gpt-4o-mini, claude-sonnet-4-6, claude-haiku-4-5 (raw read, temp 0) |

Scanner staging (required for the real scanners)

The graded scanners are run from their published source, cloned into a staging dir with each tool's own venv (binaries at <staging>/<tool>/.venv/bin/<tool>). Point the harness at it:

export SCOREBOARD_STAGING=/path/to/_scoreboard_staging   # else auto-resolves to ./_scoreboard_staging
                                                         # or ~/skillscan-family/_scoreboard_staging

If a scanner binary is missing, the harness exits with an error instead of silently scoring every sample as ERROR, which would produce a misleading 0% board. Static scanners (--no-llm) need no API key.
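
The same fail-fast idea can be sketched as a pre-flight check you run before the harness. This is a minimal sketch, not the harness's own code: the function names are hypothetical, the tool directory names you pass in depend on your staging dir, and treating the two fallback locations as "first one that exists" is an assumption. Only the path layout (<staging>/<tool>/.venv/bin/<tool>) and the fallback locations come from the text above.

```python
import os
from pathlib import Path


def resolve_staging(env=None, cwd=None, home=None):
    """Return the staging dir: $SCOREBOARD_STAGING if set, else the first
    existing fallback (assumed order: ./_scoreboard_staging, then
    ~/skillscan-family/_scoreboard_staging)."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    explicit = env.get("SCOREBOARD_STAGING")
    if explicit:
        return Path(explicit)
    candidates = [
        cwd / "_scoreboard_staging",
        home / "skillscan-family" / "_scoreboard_staging",
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError("no staging directory found; set SCOREBOARD_STAGING")


def missing_binaries(staging, tools):
    """List the expected binaries (<staging>/<tool>/.venv/bin/<tool>) that are absent."""
    expected = [Path(staging) / tool / ".venv" / "bin" / tool for tool in tools]
    return [path for path in expected if not path.is_file()]
```

Calling missing_binaries(resolve_staging(), [...]) with your own tool directory names and stopping when the list is non-empty catches a broken staging dir before any sample is scored.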

Run the pipeline on the example corpus (offline, static scanners)

SKILLSCAN_CORPUS=example_corpus python3 -m scoreboard.run_board --no-baseline --benign-cap 0 \
    --out board.json
python3 scripts/build_site.py --board board.json     # renders docs/

Run on your own corpus

Put your corpus in the layout in CORPUS_FORMAT.md, then:

python3 -m scoreboard.run_board --corpus /path/to/your/corpus --snyk --out board.json

Flags: --scanner-llm (+llm via an OpenAI-compatible endpoint), --frontier-model <id> (separate frontier board), --k N (repeats for stochastic scanners), --workers N.

Data sources (the parts of the corpus we did not author)

Our corpus is private (anti-gaming), but the non-generated parts come from public, independent sources — linked here so the provenance is checkable. (What we did generate is disclosed separately: organic malicious via gpt-4o / claude / deepseek, defanged-synthetic malicious via open-weight models — see Methodology §2.6. Those are ours and are not in this list.)

| Set (provenance) | n | Source (external, not ours) | License |
| --- | --- | --- | --- |
| published_independent (the headline) | 84 | Skill-Inject: github.com/aisa-group/skill-inject · arXiv:2602.20156 | verify (reconstructed locally, not redistributed) |
| dual_use FP-bait (the X-axis) | 1588 | MaliciousAgentSkillsBench: github.com/protectskills/MaliciousAgentSkillsBench (mas-bench "suspicious"-recovered) | MIT |
| benign control | 500 | real public GitHub skills (harvested; the openclaw/skills archive is the canonical public index, and per-repo provenance is encoded in each sample id) | upstream repo licenses |
| wild_verbatim real malicious | 5 | real disclosed skills; full per-sample source URLs + sha256 in WILD_PROVENANCE.md | as published |
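
Because the wild_verbatim samples are published with a sha256 each, a locally fetched copy can be checked against the listed digest. The sketch below only hashes a file and compares it with a hex string you supply. It does not parse WILD_PROVENANCE.md, whose layout is not described here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the lowercase hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a local file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == published_hex.strip().lower()
```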

Threat-prevalence grounding (cited, not a corpus input): Liu et al., USENIX Security 2026, arXiv:2602.06547 (157 malicious in 98,380 skills). Scanner sources + licenses are in Notices.
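
To see why a base rate this low makes false positives matter, the cited counts can be turned into a prevalence and plugged into Bayes' rule. This is an illustrative sketch only: the detection rate and false-positive rate below are hypothetical numbers chosen for the example, not results from this board, and the function names are ours.

```python
def base_rate(malicious, total):
    """Fraction of skills that are malicious."""
    return malicious / total


def precision_at_prevalence(tpr, fpr, prevalence):
    """Share of flagged skills that are truly malicious, given a detection
    rate (tpr), a false-positive rate (fpr), and a prevalence."""
    true_pos = tpr * prevalence
    false_pos = fpr * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Counts cited above: 157 malicious in 98,380 skills (about 0.16%).
prevalence = base_rate(157, 98_380)

# HYPOTHETICAL scanner: catches 90% of malicious skills, flags 1% of benign ones.
example = precision_at_prevalence(tpr=0.90, fpr=0.01, prevalence=prevalence)
print(f"prevalence={prevalence:.4%}  precision={example:.1%}")
```

Under those assumed rates, only about one flag in eight would point at a truly malicious skill, which is the motivation for measuring false positives on a dedicated set.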

What's reproducible vs not

Code

Harness, adapters, analysis, and methodology are in this repo. Tests: pytest -q (40 tests).