🎲

StackScan Fuzz Harness

500-scan invariant sweeps across 12 rules. Catches pricing drift, circular logic, and phantom savings before users do.

Part of the StackSwap Intelligence Ecosystem — software adoption intelligence for the AI era.

What Is the StackScan Fuzz Harness?

The StackScan Fuzz Harness (scripts/stackscan-fuzz.ts) is a property-based tester that runs hundreds of synthetic stacks through the full scan pipeline (scanStack → plan-row builder → LLM fallback → final plan) and asserts 12 invariants on every run. The invariants codify the bug classes surfaced in manual QA, including: money sanity (no negative savings, savings ≤ spend); tool coverage (every submitted tool ends up in the final plan); every REPLACE must name a swap target; REMOVE impact must not exceed the removed tool's monthly cost; canonical savings must equal the plan-row sum; post-LLM savings must not exceed pre-LLM savings; and no rationale may cite a tool that is itself being removed (the circular-redundancy trap). A seeded PRNG makes runs reproducible, and 500 scans complete in about 30 seconds.
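To make the invariant idea concrete, here is a minimal sketch of a seeded generator and a handful of the checks described above. It is not the code in scripts/stackscan-fuzz.ts: the PlanRow and Plan shapes, the violation names, and the choice of a Park-Miller generator are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real types in the harness are not shown in this document.
type Action = "KEEP" | "REPLACE" | "REMOVE";

interface PlanRow {
  tool: string;
  action: Action;
  monthlyCost: number;    // current monthly spend on this tool
  monthlySavings: number; // savings this row claims
  swapTarget?: string;    // expected when action is REPLACE
}

interface Plan {
  rows: PlanRow[];
  canonicalSavings: number; // headline savings figure
}

// Park-Miller "minimal standard" generator. The harness's actual PRNG is not
// specified; any seeded generator gives the same reproducibility property.
function seededRandom(seed: number): () => number {
  let state = Math.floor(Math.abs(seed)) % 2147483647;
  if (state === 0) state = 1; // state must stay in 1..2147483646
  return () => {
    state = (state * 48271) % 2147483647;
    return (state - 1) / 2147483646; // value in [0, 1)
  };
}

// Returns the names of violated invariants; an empty array means the plan passed.
function checkInvariants(submitted: string[], plan: Plan): string[] {
  const violations: string[] = [];
  const EPS = 0.005; // tolerance for floating-point sums
  const rowSum = plan.rows.reduce((sum, r) => sum + r.monthlySavings, 0);
  const spend = plan.rows.reduce((sum, r) => sum + r.monthlyCost, 0);
  const planned = plan.rows.map((r) => r.tool);

  if (plan.rows.some((r) => r.monthlySavings < 0)) violations.push("negative-savings");
  if (plan.canonicalSavings > spend + EPS) violations.push("savings-exceed-spend");
  if (submitted.some((t) => planned.indexOf(t) === -1)) violations.push("tool-coverage");
  if (plan.rows.some((r) => r.action === "REPLACE" && !r.swapTarget)) {
    violations.push("replace-without-target");
  }
  if (plan.rows.some((r) => r.action === "REMOVE" && r.monthlySavings > r.monthlyCost + EPS)) {
    violations.push("phantom-savings");
  }
  if (Math.abs(plan.canonicalSavings - rowSum) > EPS) violations.push("canonical-mismatch");
  return violations;
}

// Usage: a REMOVE row that claims more savings than the tool costs.
const rand = seededRandom(42);
const cost = Math.round(rand() * 500);
const phantomPlan: Plan = {
  rows: [{ tool: "ToolA", action: "REMOVE", monthlyCost: cost, monthlySavings: cost + 100 }],
  canonicalSavings: cost + 100,
};
const found = checkInvariants(["ToolA"], phantomPlan);
```

Returning violation names instead of throwing on the first failure lets one scan report every invariant it breaks, which makes it easier to tell related failures apart.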

How It Fits the StackSwap Intelligence Ecosystem

The harness runs against the real scan engine in-process, with no browser and no mocks, so it catches pipeline regressions that the unit tests miss and that the UI smoke tests are too slow to find. Each invariant failure dumps the offending input (tools, team size, and industry), so a ready-made repro is in the report. Re-running after every fix confirms that the fix actually holds: a recent pass caught "phantom savings" violations (REMOVE claimed more savings than the tool cost) in 293 of 500 scans; one clamp later, 2,500 runs across two seeds were clean.
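The report-and-rerun loop described above can be sketched as follows. This is an illustration under stated assumptions, not the harness's code: ScanInput, FailureReport, sweep, clampRemoveSavings, the flat PRICE, and the toy savings rule are all hypothetical names invented for the example.

```typescript
// Hypothetical shapes for a synthetic stack and a failure record.
interface ScanInput {
  tools: string[];
  teamSize: number;
  industry: string;
}

interface FailureReport {
  run: number;
  seed: number;
  invariant: string;
  input: ScanInput; // the offending input, kept so the report doubles as a repro
}

// A clamp in the spirit of the fix described above: REMOVE savings stay
// within [0, monthlyCost]. The real clamp may differ.
function clampRemoveSavings(claimed: number, monthlyCost: number): number {
  return Math.min(Math.max(claimed, 0), monthlyCost);
}

// Runs `runs` scans and records every violation together with its input.
function sweep(
  seed: number,
  runs: number,
  generate: (seed: number, run: number) => ScanInput,
  check: (input: ScanInput) => string[]
): FailureReport[] {
  const failures: FailureReport[] = [];
  for (let run = 0; run < runs; run++) {
    const input = generate(seed, run);
    for (const invariant of check(input)) {
      failures.push({ run, seed, invariant, input });
    }
  }
  return failures;
}

// Toy setup: a flat per-tool price and a deterministic generator.
const PRICE = 50;
const gen = (seed: number, run: number): ScanInput => ({
  tools: ["ToolA", "ToolB"],
  teamSize: 1 + ((seed + run) % 20),
  industry: "saas",
});

// Buggy toy rule: claimed savings grow with team size and can exceed the price.
const buggyClaim = (input: ScanInput): number => input.teamSize * 5;

const checkWith = (claim: (i: ScanInput) => number) =>
  (input: ScanInput): string[] => (claim(input) > PRICE ? ["phantom-savings"] : []);

const before = sweep(1, 500, gen, checkWith(buggyClaim));
const after = sweep(1, 500, gen, checkWith((i) => clampRemoveSavings(buggyClaim(i), PRICE)));
```

Here `before` holds one record per violating run, each carrying the input that triggered it, while `after` is empty once the clamp is applied. Re-running the same seed after the fix is what shows the fix holds for exactly the inputs that failed.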

Why This Matters for Launch Readiness and Trust

Manual QA via screenshots scales linearly with effort and misses most combinations. The fuzz harness scales with compute: we can go from 500 scans to 50,000 for a release gate without writing more tests. Based on analysis of 100k+ scans, most GTM teams waste 30-40% of their stack spend on overlapping tools, and the harness checks that the savings math behind those numbers is internally consistent before a customer sees it. Any metric shown on a StackScan report comes from a pipeline that has already passed all 12 invariants on thousands of synthetic stacks. That's the foundation we use to stand behind the numbers a customer sees on their unlock page.