Security Tooling Landscape

Research and evaluation of automated security tooling for open source projects. This directory tracks tools we've evaluated, how they compare to what we've built, and where they complement each other.

Tooling-Agents Pipelines

The Tooling team maintains two pipelines, both running on the Gofannon agent platform:

ASVS Security Audit Pipeline (ASVS/) — LLM-driven code analysis against OWASP ASVS v5.0.0 requirements. Uses architecture-aware domain scoping to audit 345 requirements across 3 levels. Produces per-requirement reports, consolidated findings, and GitHub issues. In production, piloted on ATR and Apache Steve.

GitHub Actions Security Review (gha-review/) — Automated analysis of GHA workflows across an entire GitHub organization. Combines LLM classification (which repos publish what, where) with static pattern matching (12 check types from CRITICAL to INFO). Scanned 2,500+ Apache repos, found 22 CRITICAL findings across publishing pipelines. Six agents: pre-fetch, publishing, security, review, brief, JSON export.
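The static pattern matching half of the GHA review can be pictured with a minimal sketch like the one below. It is not the pipeline's actual code: the check identifiers, the intermediate severity levels, and the severity assigned to each check are assumptions for illustration, and the line-based regexes are a heuristic rather than a real YAML parse.

```python
import re
from dataclasses import dataclass

# Assumed severity scale; the source only names the CRITICAL and INFO endpoints.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]


@dataclass(frozen=True)
class Finding:
    check: str     # hypothetical check identifier
    severity: str  # one of SEVERITY_ORDER
    line: int      # 1-based line number in the workflow file


# `uses: owner/repo@ref`, optionally preceded by a list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([\w./-]+)@(\S+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")
PR_HEAD_RE = re.compile(r"github\.event\.pull_request\.head\.(?:sha|ref)")
PRT_RE = re.compile(r"\bpull_request_target\b")


def scan_workflow(text: str) -> list[Finding]:
    """Run two illustrative checks over the raw text of one workflow file."""
    lines = text.splitlines()
    # Heuristic: any mention of the trigger counts, including in comments.
    has_prt = any(PRT_RE.search(line) for line in lines)
    findings = []
    for number, line in enumerate(lines, start=1):
        # Check 1: privileged trigger combined with a reference to PR head code.
        if has_prt and PR_HEAD_RE.search(line):
            findings.append(Finding("prt-checkout-head", "CRITICAL", number))
        # Check 2: action referenced by tag or branch instead of a full SHA.
        match = USES_RE.match(line)
        if match and not FULL_SHA_RE.match(match.group(2)):
            findings.append(Finding("unpinned-action", "MEDIUM", number))
    # Most severe first, then by position in the file.
    findings.sort(key=lambda f: (SEVERITY_ORDER.index(f.severity), f.line))
    return findings


SAMPLE = """\
on: pull_request_target
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
"""

for finding in scan_workflow(SAMPLE):
    print(finding.severity, finding.check, "line", finding.line)
```

A production scanner would parse the YAML and reason about job permissions and step ordering; the sketch only shows how pattern checks map to severity-ranked findings.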

External Tools

Code and Application Security

| | OpenSSF Scorecard | OSS-CRS | Strix | Brief | Scrutineer |
|---|---|---|---|---|---|
| What it does | Scores project security posture (practices, not code) | Autonomous bug-finding and patching via fuzzing + LLM | AI pentest agents that exploit and validate vulns with PoCs | Detects project toolchain, config, and conventions | Automated penetration testing with modular LLM skills |
| Approach | 20+ automated heuristic checks | Cyber Reasoning Systems: fuzz → find → confirm → patch | Multi-agent exploitation: recon, browser, proxy, terminal, code analysis | Single-binary CLI, static file detection, structured JSON/markdown output | Containerized Go tool, skill-based testing via LLM |
| What it finds | Missing branch protection, no SAST, unpinned deps, no SECURITY.md | Memory safety bugs, crashes, confirmed with PoV + patches | XSS, IDOR, SQLi, CSRF, auth bypass, validated with proof-of-concept | Languages, frameworks, package managers, CI config, conventions | TBD |
| Language support | Language-agnostic (checks practices, not code) | C, C++, Java (OSS-Fuzz format required) | Any (runs apps dynamically in Docker sandbox) | 35+ package managers, any language | TBD |
| LLM usage | None | Optional: LLM-augmented fuzz harness generation | Core: LLM plans and drives attack tools | None | Core: LLM skills for analysis |
| Maturity | Mature, widely adopted | Sandbox stage at OpenSSF (Apr 2026) | Active, 20k+ stars, 319 commits | Early (26 commits, 1 star) | Very early (2 commits) |
| Origin | OpenSSF / Google | DARPA AIxCC → OpenSSF | usestrix (Apache-2.0) | git-pkgs | Alpha-Omega |
| URL | scorecard.dev | ossf/oss-crs | usestrix/strix | git-pkgs/brief | alpha-omega-security/scrutineer |

CI/CD and Workflow Security

| | GHA Review Pipeline | ASF infrastructure-actions | zizmor |
|---|---|---|---|
| What it does | Org-wide GHA security audit: finds exploitable workflows across all repos | Allowlist governance: controls which external actions ASF projects can use | Per-repo static analysis of workflow YAML for security issues |
| Scope | All repos in a GitHub org (scanned 2,500+ Apache repos) | All apache/* repos (policy enforcement) | Single repo or directory of workflows |
| What it finds | pull_request_target + checkout exploits, unpinned actions, broad permissions, supply chain risks (12 check types) | N/A: it's a gate, not a scanner. utils/action-usage.sh checks if an action is still used org-wide | Injection via ${{ }} interpolation, unpinned actions, excessive permissions, cache poisoning, known vulnerable actions |
| Approach | LLM classification (publishing analysis) + static pattern matching (security scan), org-wide | Curated allowlist (actions.yml) with security review for each action/version. Dependabot updates pinned SHAs | Rust-based static analysis of workflow YAML, SARIF output for GitHub code scanning |
| LLM usage | Yes (Sonnet for publishing classification) | None | None |
| Output | Executive brief, combined review, per-repo security findings, JSON export | Allowlist config, action usage reports | SARIF findings, annotations, human-readable reports |
| Maturity | In production | In production (ASF Infra) | Mature, adopted by Grafana Labs and others at scale |
| URL | gha-review/ | apache/infrastructure-actions | docs.zizmor.sh |

How They Relate

Scorecard checks whether you have good security practices (branch protection, dependency management, SAST enabled) but never reads your code. It is complementary to our ASVS pipeline, which does the opposite: deep code analysis against a formal standard. When we ran Scorecard on ATR, the project scored 6.2/10, and the findings (missing SAST, branch protection gaps) were entirely disjoint from what our ASVS audit found (missing rate limiting, session fixation, weak crypto).

OSS-CRS finds memory safety bugs through fuzzing and generates patches, a completely different class of vulnerability from what ASVS covers. It requires OSS-Fuzz-compatible build configuration, so it's limited to projects already set up for fuzzing. Its 20-40% rate of semantically incorrect patches underscores the need for human review.

Strix is the closest analog to the ASVS pipeline in terms of ambition — it also uses LLMs to find security issues in application code. But the approach is fundamentally different: Strix acts as a pentester (running code, probing endpoints, exploiting vulns) while the ASVS pipeline acts as an auditor (reading code against a compliance standard). Strix finds exploitable runtime vulnerabilities with PoCs; the ASVS pipeline finds architectural gaps and missing controls against ASVS requirements. Both are valuable, and they'd find different things on the same codebase.

Brief is interesting as a potential complement to the discover_codebase_architecture agent. Brief does static project structure detection (languages, frameworks, package managers, CI config) as a fast Go binary. The discover agent uses an LLM to map code into ASVS-relevant security domains. Brief could provide the initial inventory that the LLM then reasons about, potentially reducing token usage and improving accuracy.
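As a rough illustration of that hand-off, the sketch below turns a static inventory into a short preamble that could be prepended to the discover agent's prompt. The JSON shape (a flat object mapping category names to lists of strings) and the function name `inventory_preamble` are hypothetical; Brief's actual output schema is not shown here and may differ.

```python
import json


def inventory_preamble(inventory_json: str) -> str:
    """Render a static inventory as a compact, deterministic prompt preamble.

    Assumes a hypothetical flat JSON object: {category: [values, ...]}.
    """
    inventory = json.loads(inventory_json)
    lines = ["Project inventory (from static detection):"]
    for category in sorted(inventory):
        values = inventory[category]
        if values:  # skip empty categories to keep the preamble short
            lines.append(f"- {category}: {', '.join(sorted(values))}")
    return "\n".join(lines)


# Hypothetical inventory for illustration only.
EXAMPLE_INVENTORY = json.dumps(
    {
        "languages": ["Python"],
        "package_managers": ["uv", "pip"],
        "frameworks": [],
    }
)

print(inventory_preamble(EXAMPLE_INVENTORY))
```

The idea is that the LLM starts from facts that were cheap to detect statically and spends its tokens on mapping code to security domains rather than rediscovering the toolchain.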

Scrutineer is early but likely headed toward automated pentesting, similar to Strix. The Alpha-Omega team's focus is finding real vulnerabilities in critical OSS projects, and the repo structure (Go binary, modular skills, Docker runner) suggests a tool for active security testing rather than static analysis. Worth monitoring — Alpha-Omega has the funding and OSS relationships to make this impactful.

The GHA review pipeline is unique among these tools — none of the others analyze CI/CD workflow security at the organization level. It complements two tools already in use at ASF:

ASF infrastructure-actions is the governance layer: it maintains an allowlist of approved actions and requires security review before any external action can be used. utils/action-usage.sh checks whether an action is still used anywhere across the org. This is policy enforcement — it controls what can run but doesn't audit what is running. The GHA review pipeline fills that gap by scanning all 2,500+ repos to find exploitable patterns in the workflows themselves.
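The gate behaviour described above can be sketched as a lookup of each `uses:` reference against an approved set. This is a minimal sketch, not the infrastructure-actions implementation: the in-memory `ALLOWLIST` structure, the placeholder SHA, and the function name are assumptions, and the real `actions.yml` schema may look quite different.

```python
import re

# Hypothetical allowlist: action name -> approved commit SHAs.
# The SHA below is a placeholder, not a real reviewed commit.
ALLOWLIST = {
    "actions/checkout": {"a" * 40},
}

# Captures `owner/repo[/path]` and the ref after `@`.
# Local actions (`./path`) have no `@ref` and are not matched.
USES_RE = re.compile(r"uses:\s*([\w./-]+)@(\S+)")


def disallowed_actions(workflow_text: str, allowlist: dict[str, set[str]]) -> list[str]:
    """Return every `action@ref` in the workflow that is not on the allowlist."""
    rejected = []
    for match in USES_RE.finditer(workflow_text):
        action, ref = match.group(1), match.group(2)
        if ref not in allowlist.get(action, set()):
            rejected.append(f"{action}@{ref}")
    return rejected


WORKFLOW = (
    "steps:\n"
    "  - uses: actions/checkout@" + "a" * 40 + "\n"
    "  - uses: other/action@v1\n"
)

print(disallowed_actions(WORKFLOW, ALLOWLIST))
```

A check like this answers "may this action run?" and nothing more, which is why it needs a scanner alongside it to assess what the approved workflows actually do.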

zizmor is the per-repo static scanner recommended by ASF Infra. It's excellent for individual repos — it finds injection via ${{ }} interpolation, unpinned actions, cache poisoning, and known vulnerable actions, and outputs SARIF for GitHub code scanning. The GHA review pipeline operates at a different level: org-wide risk assessment, cross-referencing security findings with publishing analysis to identify which vulnerable repos actually push packages to public registries. zizmor tells you “this workflow has an injection”; the GHA review pipeline tells you “this repo publishes to PyPI AND has a CRITICAL injection — this is your P0.”
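The cross-referencing step can be sketched as a join between the two analyses. The data shapes, repo names, and the function name `p0_repos` below are hypothetical; only the rule itself (publishes to a public registry and has a CRITICAL finding) comes from the description above.

```python
def p0_repos(
    publishing: dict[str, list[str]],
    findings: dict[str, list[str]],
) -> list[str]:
    """Return repos that publish to at least one registry AND have a CRITICAL finding.

    publishing: repo -> registries it publishes to (hypothetical shape)
    findings:   repo -> severities of its security findings (hypothetical shape)
    """
    return sorted(
        repo
        for repo, registries in publishing.items()
        if registries and "CRITICAL" in findings.get(repo, [])
    )


# Hypothetical inputs for illustration only.
PUBLISHING = {
    "example/repo-a": ["PyPI"],
    "example/repo-b": [],
    "example/repo-c": ["Maven Central"],
}
FINDINGS = {
    "example/repo-a": ["CRITICAL", "LOW"],
    "example/repo-b": ["CRITICAL"],
    "example/repo-c": ["INFO"],
}

print(p0_repos(PUBLISHING, FINDINGS))
```

In this example only `example/repo-a` qualifies: `repo-b` has a CRITICAL finding but publishes nothing, and `repo-c` publishes but has no CRITICAL finding.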

Evaluation Results

| Tool | Project | Date | Report location |
|---|---|---|---|
| Scorecard | ATR | Jan 2026 | results/scorecard-atr.md (details) |
| ASVS Pipeline (L1+L2) | ATR (da901ba) | Mar 2026 | ASVS/reports/tooling-trusted-releases/da901ba/ |
| ASVS Pipeline (L3) | Steve v3 (d0aa7e9) | Apr 2026 | ASVS/reports/steve/v3/d0aa7e9/ |
| GHA Review | Apache org | Apr 2026 | gha-review/reports/ |

Links