Verify Engine

The Verify Engine checks that code changes are correct. Deterministic tools find issues. AI generates fixes. Humans review. Accept/reject feedback tunes the prompts used for future fixes.

The Problem

AI-generated code ships with vulnerabilities, dead code, hallucinated imports, and copy-paste patterns that no single linter catches. Manual code review cannot keep pace with AI generation speed.

The solution: a hybrid verification pipeline that runs deterministic tools first, filters to changed lines only, and uses AI only for fix generation and review.

Pipeline

Code change (diff)
    |
    v
Syntax Guard (Biome) ---------- REJECT if invalid (< 500ms)
    |                            Prevents bad states early
    v
Parallel Deterministic Analysis
    |   Biome (423+ rules)
    |   Semgrep (2,000+ SAST rules + custom)
    |   SonarQube CE (quality gates)
    |   Secretlint (secrets detection)
    |   Trivy (dependency CVEs)
    |   diff-cover (test coverage on changed code)
    |   Stryker (mutation testing)
    |   AI Slop Detector
    |   typecheck (tsc --noEmit)
    |   consistency (AST cross-function check)
    |   doc-claims (import claims in changed .md / .mdx)
    |   ZAP (DAST)
    |   Lighthouse (perf/a11y/SEO)
    |
    v
Diff-only Filter --------------- Only findings on changed lines
    |                            No pre-existing noise
    v
AI Fix Layer ------------------- Context Engine provides surrounding code
    |                            Prompt Engine provides team-tuned prompts
    |                            Cache checks for identical findings
    |                            Generates contextual fix as diff
    |                            Validates fix compiles + tests pass
    v
Two-stage Review
    |   Stage 1: Spec compliance (matches PLAN.md?)
    |   Stage 2: Code quality (clean, tested, safe?)
    v
Feedback Loop ------------------ Accept/reject stored in feedback.db
                                 Prompt Engine evolves from outcomes

Key Principles

Syntax guard

Biome runs first, in under 500ms. If the code does not parse, the rest of the pipeline is skipped, so no time or tokens are spent on syntactically invalid code. The design is inspired by SWE-agent’s finding that preventing bad states beats recovering from them.
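The short-circuit can be sketched as follows. This is a minimal illustration under stated assumptions, not Maina's implementation: `runWithSyntaxGuard`, `parses`, and `runAnalysis` are hypothetical names, and in the real pipeline the parse check is delegated to Biome.

```typescript
// Hypothetical result type; not Maina's actual API.
type GateResult =
  | { status: "rejected"; reason: string }
  | { status: "analyzed"; findings: string[] };

// Runs the cheap parse check first. The expensive analysis callback is
// only invoked when the source parses, so invalid code costs nothing more.
function runWithSyntaxGuard(
  source: string,
  parses: (source: string) => boolean,
  runAnalysis: (source: string) => string[],
): GateResult {
  if (!parses(source)) {
    return { status: "rejected", reason: "source does not parse" };
  }
  return { status: "analyzed", findings: runAnalysis(source) };
}
```

Passing the parser and the analysis step as callbacks keeps the sketch independent of any particular tool; the point is only the ordering of the two steps.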

Diff-only filtering

Only findings on changed lines are reported. If a file had 50 pre-existing warnings and you changed 3 lines, you see findings for those 3 lines only. This eliminates the noise that makes developers ignore tool output. The approach is inspired by Reviewdog.
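A minimal sketch of the filter, assuming findings carry a file path and a line number and that the changed lines have already been extracted from the diff. The `Finding` and `ChangedLines` shapes and the function name are hypothetical, not Maina's internal types.

```typescript
// Hypothetical shapes for illustration; Maina's internal types may differ.
interface Finding {
  file: string;
  line: number;
  rule: string;
}

// Maps each changed file to the set of line numbers touched by the diff.
type ChangedLines = Map<string, Set<number>>;

// Keeps only findings that land on a changed line of a changed file.
function filterToChangedLines(
  findings: Finding[],
  changed: ChangedLines,
): Finding[] {
  return findings.filter((f) => changed.get(f.file)?.has(f.line) ?? false);
}
```

With `src/app.ts` changed on lines 10 to 12, a finding on line 11 of that file is kept, while findings on line 3 of the same file or on any line of an untouched file are dropped.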

Auto-detection

Tools are auto-detected at runtime. If Semgrep is installed, it runs. If it is not, it is skipped — not an error. Maina works with zero external tools installed (Biome is built-in), and gets more powerful as you add tools.

Check what is installed:

maina doctor
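The skip-if-missing behavior could look roughly like this. It is a sketch only: `planTools`, `ToolPlan`, and the `isInstalled` probe are hypothetical names, and how Maina actually probes for a tool is not specified here (a real probe might look the binary up on `PATH`).

```typescript
// Hypothetical plan entry; the action names are illustrative.
interface ToolPlan {
  tool: string;
  action: "run" | "skip";
}

// Built-in tools always run. Optional tools run only when the probe
// reports them installed; a missing tool is skipped, never an error.
function planTools(
  builtIn: string[],
  optional: string[],
  isInstalled: (tool: string) => boolean,
): ToolPlan[] {
  const plan: ToolPlan[] = builtIn.map(
    (tool): ToolPlan => ({ tool, action: "run" }),
  );
  for (const tool of optional) {
    plan.push({ tool, action: isInstalled(tool) ? "run" : "skip" });
  }
  return plan;
}
```

Because the probe is injected, the same planning logic works with no optional tools installed and picks up new tools as soon as the probe reports them.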

Single LLM call

Every command makes at most one AI call. All intelligence goes into what context enters that call. The exception: PR review gets two calls (spec compliance + code quality), because splitting catches more than combining.

Slop detection

The AI Slop Detector catches patterns common in AI-generated code:

Filler phrases (“certainly”, “as you can see”, “it’s worth noting”)
Hallucinated imports (modules that do not exist)
Dead code (unused variables, unreachable branches)
Copy-paste patterns (duplicated blocks with minor variations)

Slop detection runs as part of the parallel analysis phase and uses the mechanical model tier for cost efficiency.
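As an illustration of the first category only, filler phrases can be flagged with a plain text scan. This sketch is an assumption about one possible approach, not the detector's implementation; the phrase list is taken from the examples above, and `findFillerPhrases` and `FillerHit` are hypothetical names.

```typescript
// Phrase list taken from the examples in this section.
const FILLER_PHRASES = ["certainly", "as you can see", "it's worth noting"];

// Hypothetical hit shape: 1-based line number plus the matched phrase.
interface FillerHit {
  line: number;
  phrase: string;
}

// Scans each line case-insensitively. Curly apostrophes are normalized so
// that both spellings of "it's" match. Matching is by substring, so a word
// like "uncertainly" would also be flagged; a real detector would need
// word-boundary handling.
function findFillerPhrases(text: string): FillerHit[] {
  const hits: FillerHit[] = [];
  text.split("\n").forEach((content, index) => {
    const normalized = content.toLowerCase().replace(/\u2019/g, "'");
    for (const phrase of FILLER_PHRASES) {
      if (normalized.includes(phrase)) {
        hits.push({ line: index + 1, phrase });
      }
    }
  });
  return hits;
}
```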

Doc-claim verification

The doc-claims tool catches a specific class of bug: a subagent asked to summarize a package’s “API surface” returns a narrative that mixes real exports with plausible-looking fabrications, and the fabrications ship to docs. (The bug that motivated this gate: workkit#43, which shipped 20+ wrong API claims.)

For each changed .md / .mdx file the tool:

Parses fenced code blocks for import and require statements.
Resolves the module specifier to a file inside the workspace — relative paths resolve against the doc’s directory; workspace package names (@scope/name) are resolved by scanning <cwd>/packages/* for a matching package.json#name.
Collects every top-level exported identifier from the resolved source (direct exports, export { ... } lists, re-exports).
Emits a warning finding for each imported symbol that is not in the export set, pointing at the import line.

The check is mechanical — no LLM in the loop — and runs in parallel with the other verify tools.
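The core of the check can be sketched as below. This is a simplified illustration, not the tool's code: it handles only single-line named imports (no `require`, default imports, or multi-line import lists), it takes the export sets as an input instead of resolving files, and every name in it (`extractImportClaims`, `findMissingExports`, `ImportClaim`) is hypothetical.

```typescript
// Hypothetical shape for one imported symbol claimed by a doc.
interface ImportClaim {
  specifier: string;
  symbol: string;
  line: number; // 1-based line in the Markdown file
}

// Built at runtime so this sample contains no literal fence marker.
const FENCE = "`".repeat(3);

// Matches single-line named imports such as: import { a, b as c } from "x";
const NAMED_IMPORT =
  /^\s*import\s*(?:type\s+)?\{([^}]*)\}\s*from\s*["']([^"']+)["']/;

// Collects named imports that appear inside fenced code blocks.
function extractImportClaims(markdown: string): ImportClaim[] {
  const claims: ImportClaim[] = [];
  let inFence = false;
  markdown.split("\n").forEach((text, index) => {
    if (text.trim().startsWith(FENCE)) {
      inFence = !inFence;
      return;
    }
    if (!inFence) return;
    const match = NAMED_IMPORT.exec(text);
    if (!match) return;
    const [, names = "", specifier = ""] = match;
    for (const part of names.split(",")) {
      // "foo as bar" claims the export "foo"; drop inline "type" modifiers.
      const cleaned = part.trim().replace(/^type\s+/, "");
      const symbol = (cleaned.split(/\s+as\s+/)[0] ?? "").trim();
      if (symbol) claims.push({ specifier, symbol, line: index + 1 });
    }
  });
  return claims;
}

// Returns claims whose symbol is absent from the resolved export set.
// Specifiers with no entry (for example external packages) are skipped.
function findMissingExports(
  claims: ImportClaim[],
  exportsBySpecifier: Map<string, Set<string>>,
): ImportClaim[] {
  return claims.filter((claim) => {
    const exported = exportsBySpecifier.get(claim.specifier);
    return exported !== undefined && !exported.has(claim.symbol);
  });
}
```

Each returned claim carries the line of the import statement, which is where the warning finding would point.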

Limitations (v1):

External packages (react, lodash, anything in node_modules) are skipped silently. We do not walk node_modules; verification of external surface is a future improvement.
Member-access claims (obj.method() in a code sample) are not validated. That requires type information, which is out of scope for v1.
export * re-exports are treated as a wildcard — any symbol claimed against such a package is accepted. This trades precision for fewer false positives.
Severity is warning, not error. Promote it via .maina/constitution.md once the false-positive rate is well understood.

Silencing a finding: if a flagged import is actually correct (for example the source uses an exotic export form the regex misses), tune the rule through .maina/constitution.md — drop its severity to info or skip it entirely:

## Verification

- `doc-claims/missing-export`: severity = info

The same constitution surface controls every other verify tool, so the silencing path is uniform across the pipeline.
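To make the effect of such an override concrete, here is a sketch of how a per-rule override might be applied to findings once the constitution has been read. The override map, the `"skip"` value, and all type and function names are assumptions for illustration; parsing of `.maina/constitution.md` is not shown.

```typescript
type Severity = "info" | "warning" | "error";

// Hypothetical override value: a new severity, or "skip" to drop the rule.
type RuleOverride = Severity | "skip";

// Hypothetical finding shape keyed by rule id.
interface RuleFinding {
  rule: string;
  severity: Severity;
  message: string;
}

// Applies per-rule overrides: skipped rules are removed, others have
// their severity replaced; findings without an override pass through.
function applyOverrides(
  findings: RuleFinding[],
  overrides: Map<string, RuleOverride>,
): RuleFinding[] {
  const result: RuleFinding[] = [];
  for (const finding of findings) {
    const override = overrides.get(finding.rule);
    if (override === "skip") continue;
    result.push(override ? { ...finding, severity: override } : finding);
  }
  return result;
}
```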

Tools

Tool	What it checks	Availability
Biome	Lint (423+ rules) + formatting	Built-in (always available)
Semgrep	2,000+ SAST rules + custom rules	Optional
SonarQube CE	Quality gates, code smells, complexity	Optional
Secretlint	Hardcoded secrets, API keys, tokens	Optional
Trivy	Dependency CVEs, container vulnerabilities	Optional
diff-cover	Test coverage on changed lines only	Optional
Stryker	Mutation testing — are your tests actually testing?	Optional
AI Slop Detector	AI-generated filler, hallucinations, dead code	Built-in
typecheck	`tsc --noEmit` type checking with zero extra installs	Built-in (TS projects)
consistency	AST-based cross-function consistency check	Built-in
doc-claims	Verifies `import` statements in changed `.md` / `.mdx` files reference symbols actually exported by the resolved package source	Built-in
ZAP	Dynamic Application Security Testing (Docker-based)	Optional
Lighthouse	Performance, accessibility, and SEO auditing	Optional

Commands

Command	Description
`maina verify`	Run the full pipeline on your diff. Diff-only by default.
`maina commit`	Syntax guard + parallel gates + git commit. The fast path.
`maina review`	Two-stage code review: spec compliance then code quality.
`maina pr`	Create a PR with two-stage review attached.
`maina doctor`	Check which tools are installed and engine health.

Two-Stage Review

The review splits into two focused passes:

Stage 1 — Spec compliance: Does the code match the plan? Are the requirements in PLAN.md satisfied? Are there missing implementations or scope creep?

Stage 2 — Code quality: Is the code clean, tested, and safe? Does it follow the constitution? Are there performance, security, or maintainability concerns?

Splitting the review into two stages catches more issues than a single combined review. Each stage has a focused prompt tuned for its specific concern.

Cloud Verification

Run the full verification pipeline on Maina Cloud instead of locally. This is useful when you want consistent results across a team without requiring every developer to install 19+ tools, or when running verification in CI.

How it works

maina verify --cloud
    |
    v
Authenticate (maina login, one-time)
    |
    v
Collect diff (staged changes or branch diff)
    |
    v
Submit to Maina Cloud API
    |
    v
Workers Queue picks up the job (@workkit/queue)
    |
    v
Full pipeline runs in the cloud
    |   Syntax guard
    |   Parallel analysis (all 19+ tools)
    |   Diff-only filter
    |   AI fix suggestions
    |   Two-stage review
    |
    v
Proof artifacts stored in R2 (@workkit/r2)
    |
    v
Results returned to CLI / posted as PR comments

Authentication

maina login     # GitHub OAuth device flow, stores token in ~/.maina/auth.json
maina logout    # Clear stored credentials

Usage

maina verify --cloud          # Run hosted verification on current diff
maina verify --cloud --deep   # Include AI semantic review in the cloud

Flags like --deep and --visual work the same way in cloud mode. The cloud pipeline has all tools pre-installed, so results are consistent regardless of your local setup.

Proof artifacts

Every cloud verification produces a proof artifact stored in R2. These artifacts include:

Full tool output from all 19+ tools
Diff-only filtered findings
AI fix suggestions (if AI review was enabled)
Timestamp and content hash for auditability

Proof artifacts are referenced in PR bodies created by maina pr and in GitHub Action check runs.

CI integration

Use the mainahq/maina/.github/actions/verify@main GitHub Action for cloud verification in CI:

- uses: mainahq/maina/.github/actions/verify@main
  with:
    token: ${{ secrets.MAINA_TOKEN }}

See the CI Integration docs for full configuration options.