StackSwap · Methodology v1.0.0
How we model 100,000 GTM stacks
Every cited statistic on stackswap.ai — "38% of modeled stacks contained both Outreach and HubSpot," "median annual recoverable: $93,240," "81.66% of stacks have at least one detected overlap" — comes from this simulation. Synthetic stacks, real engine, reproducible code. Here's exactly how it works.
The pipeline
Three steps, all deterministic from a fixed seed:
- Generate. A seeded RNG (Mulberry32) picks an archetype weighted by a realistic operator distribution, samples team headcount within that archetype's range, picks an industry, then assembles a tool list from the archetype's core tools, common additions (probabilistic), optional additions (probabilistic), and legacy drift (probabilistic). Every tool name is validated against `TOOL_LIST` at script start.
- Score. Each generated stack is passed to `scanStack(input)`, the same pure function that powers /stackscan. No HTTP, no DB. Output: current spend, optimized spend, per-tool verdicts, modeled recovery, overlap pairs detected, AI readiness scores, and tier pricing.
- Aggregate. Across all 100,000 runs we compute: tool prevalence (% of stacks containing each tool), overlap-pair prevalence and recovery distributions (median, p25/p75/p95), team-size-bucketed recovery medians, and headline summary stats. The result writes to `data/stack-simulation/aggregates.json` (committed in the repo, ~25KB). The raw per-stack CSV (~18MB) writes to `simulations/`, which is gitignored.
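The Generate step's determinism comes entirely from the seeded RNG. Here is a minimal TypeScript sketch of Mulberry32 (the standard public-domain construction; the function name is illustrative, not necessarily the script's actual export):

```typescript
// Mulberry32: a tiny 32-bit seeded PRNG. Same seed in, same stream out,
// which is what makes the whole pipeline reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

const rng = mulberry32(42);
const first = rng();
console.log(first === mulberry32(42)()); // → true: identical seeds replay the stream
```

Every probabilistic decision in Generate (archetype, headcount, industry, tool additions) draws from this single stream, so one seed fixes the entire 100,000-stack population.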
The 12 archetypes
Weights sum to 1.0. The distribution reflects estimated prevalence in the operator population, not an even split. Tools assigned to each archetype are sourced from realistic patterns we observe in the GTM operator community.
| Archetype | Team-size range | Weight |
|---|---|---|
| Early-stage B2B SaaS (founder-led) | 1-10 | 12% |
| Mid-market B2B SaaS (sales-led) | 15-50 | 20% |
| Growth-stage mid-market (multi-channel) | 30-100 | 15% |
| Enterprise RevOps | 100-500 | 8% |
| Dev-tools PLG | 10-50 | 10% |
| AI-native modern team | 10-40 | 6% |
| Late-stage enterprise (multi-region) | 500-2000 | 5% |
| Post-acquisition tangled (Outreach + Salesloft) | 50-300 | 3% |
| PLG + sales-assist hybrid | 25-150 | 7% |
| Mid-market sales with Apollo + HubSpot | 10-80 | 6% |
| Bootstrapped lean (1-15) | 1-15 | 5% |
| Marketing-led B2B (HubSpot heavy) | 20-80 | 3% |
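Weighted archetype selection reduces to a cumulative-weight walk over the table above. A sketch, using the first three rows (the function name is illustrative; the sketch normalizes so it also works on a subset of archetypes):

```typescript
interface Archetype {
  name: string;
  weight: number;
}

// First three rows of the archetype table above.
const ARCHETYPES: Archetype[] = [
  { name: "Early-stage B2B SaaS (founder-led)", weight: 0.12 },
  { name: "Mid-market B2B SaaS (sales-led)", weight: 0.20 },
  { name: "Growth-stage mid-market (multi-channel)", weight: 0.15 },
  // ...remaining nine archetypes omitted for brevity
];

// u is a uniform draw in [0, 1) from the seeded RNG. Walk the cumulative
// weights until the draw falls inside a bucket.
function pickArchetype(u: number, archetypes: Archetype[]): Archetype {
  const total = archetypes.reduce((sum, a) => sum + a.weight, 0);
  let cum = 0;
  for (const a of archetypes) {
    cum += a.weight / total; // normalize for the subset used in this sketch
    if (u < cum) return a;
  }
  return archetypes[archetypes.length - 1]; // guard against FP rounding
}
```

Because `u` comes from the seeded stream, the sequence of archetypes picked across 100,000 iterations is fully determined by the seed.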
What the engine produces
From the current run (`SIM_SEED=42 npm run simulate:100k`):
- 81.66% of modeled stacks contained at least one detected overlap pair.
- Median monthly recoverable per stack: $7,770 (annual: $93,240).
- Distribution: p25 $280/mo · p75 $21,321/mo · p95 $119,460/mo.
- Median current monthly stack spend: $20,700.
- Median overlap-pair count per stack: 2.
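The percentile figures above come from standard order statistics over the per-stack results. A sketch using linear interpolation between closest ranks (the aggregation script may use a different percentile definition; this is one common choice):

```typescript
// Percentile with linear interpolation between the two closest ranks.
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  // When idx is an integer, lo === hi and the interpolation term is zero.
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
}

console.log(percentile([1, 2, 3, 4, 5], 50)); // → 3
```

Run once over the 100,000 per-stack recoverable values, this yields the p25/p50/p75/p95 cut points reported above.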
Top citable findings
- 81.66% of modeled stacks contained at least one overlap pair flagged by the engine.
- Median annual recoverable across 100k modeled stacks: $93,240.
- 29.97% of modeled stacks contained both HubSpot Marketing Hub and Salesforce; median modeled annual recovery from consolidating: $1,800.
- 23.79% of modeled stacks contained both Clari and Gong; median modeled annual recovery from consolidating: $1,200.
- 20.49% of modeled stacks contained both Apollo.io and ZoomInfo; median modeled annual recovery from consolidating: $2,400.
- 18.74% of modeled stacks contained both Outreach and Salesloft; median modeled annual recovery from consolidating: $1,200.
- 17.75% of modeled stacks contained both Apollo.io and Outreach; median modeled annual recovery from consolidating: $1,200.
Tool prevalence and engine verdicts
Top 20 tools by prevalence across 100,000 modeled stacks. Replace rate = % of stacks containing that tool where the engine flagged a replacement candidate (e.g. Apollo for Outreach). Remove rate = % where another tool already in the stack made it redundant.
| Tool | % of stacks | Replace rate | Remove rate |
|---|---|---|---|
| Slack | 100% | — | — |
| Notion | 62.81% | — | — |
| ZoomInfo | 61.37% | 84.19% | 1.78% |
| Outreach | 54.38% | 100% | — |
| Salesforce | 50.88% | — | 39.51% |
| Gong | 50.88% | — | 46.75% |
| HubSpot | 49.12% | — | 1.31% |
| LinkedIn Sales Navigator | 46.49% | — | — |
| Calendly | 44.6% | — | — |
| Apollo.io | 34.69% | — | 0.5% |
| HubSpot Marketing Hub | 32.94% | — | 29.95% |
| Marketo | 27.67% | 18.4% | — |
| Clari | 23.79% | — | — |
| Loom | 20.11% | — | — |
| Mixpanel | 19.94% | — | — |
| Linear | 19.68% | — | — |
| Salesloft | 18.74% | 100% | — |
| Segment | 18.07% | — | — |
| Clearbit | 13.99% | 61.76% | 32.17% |
| Chorus | 12.98% | 100% | — |
Most prevalent redundant pairs
The 15 overlap pairs that show up most often across the modeled population. Recovery values are deterministic per pair (see limitations) — consider them modeled annual upper bounds for the consolidation move.
| Pair | % of stacks with both | Median annual recovery |
|---|---|---|
| HubSpot Marketing Hub + Salesforce | 29.97% | $1,800/yr |
| Clari + Gong | 23.79% | $1,200/yr |
| Apollo.io + ZoomInfo | 20.49% | $2,400/yr |
| Outreach + Salesloft | 18.74% | $1,200/yr |
| Apollo.io + Outreach | 17.75% | $1,200/yr |
| Linear + Notion | 15.2% | $960/yr |
| Clearbit + ZoomInfo | 13.99% | $3,600/yr |
| Chorus + Gong | 12.98% | $1,200/yr |
| HubSpot Marketing Hub + Marketo | 11.91% | $17,280/yr |
| HubSpot + Mailchimp | 11.78% | $1,200/yr |
| 6sense + ZoomInfo | 11.49% | $15,600/yr |
| Clearbit + LinkedIn Sales Navigator | 9.63% | $540/yr |
| Salesforce + Salesforce Tableau | 5.97% | $7,200/yr |
| Marketo + Pardot | 5.75% | $33,000/yr |
| Bombora + ZoomInfo | 3.11% | $15,600/yr |
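Detection itself reduces to set membership: a pair fires when both tools are present in a stack, and the modeled recovery is a fixed lookup. A sketch with three pairs from the table above (the `OVERLAPS` shape here is an assumption for illustration, not the engine's actual schema):

```typescript
// Assumed shape: "ToolA|ToolB" key → fixed modeled annual recovery (USD/yr).
const OVERLAPS: Record<string, number> = {
  "Apollo.io|ZoomInfo": 2400,
  "Outreach|Salesloft": 1200,
  "Marketo|Pardot": 33000,
};

// A pair is flagged iff every tool in the pair is present in the stack.
function detectOverlaps(stack: string[]): { pair: string; recovery: number }[] {
  const have = new Set(stack);
  return Object.entries(OVERLAPS)
    .filter(([pair]) => pair.split("|").every((tool) => have.has(tool)))
    .map(([pair, recovery]) => ({ pair, recovery }));
}

console.log(detectOverlaps(["Slack", "Apollo.io", "ZoomInfo"]));
// one hit: Apollo.io|ZoomInfo at the fixed modeled $2,400/yr
```

The fixed per-pair recovery is why the distributions are tight (see limitations): every detection of a given pair contributes the same dollar value.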
Highest-value consolidation opportunities
The top 10 overlap pairs ranked by modeled annual recovery. Less prevalent than the list above, but higher-impact when present.
| Pair | Median annual recovery | % of stacks with both |
|---|---|---|
| Marketo + Pardot | $33,000/yr | 5.75% |
| 6sense + Demandbase | $24,000/yr | 2.72% |
| HubSpot Marketing Hub + Marketo | $17,280/yr | 11.91% |
| 6sense + ZoomInfo | $15,600/yr | 11.49% |
| Bombora + ZoomInfo | $15,600/yr | 3.11% |
| 6sense + Bombora | $12,000/yr | 1.76% |
| 6sense + Cognism | $12,000/yr | 1.16% |
| Bombora + Cognism | $12,000/yr | 0.53% |
| Salesforce + Salesforce Tableau | $7,200/yr | 5.97% |
| HubSpot + HubSpot Marketing Hub | $7,200/yr | 2.97% |
Recoverable spend by team size
Median modeled monthly waste scales sharply with headcount. The biggest single jump is from the 6-15 bucket to 16-25, the inflection point where stack drift starts costing real money. Useful framing for "when to audit your stack" conversations.
| Team size | % of all stacks | Median monthly recoverable | Median monthly spend |
|---|---|---|---|
| 1-5 | 7.81% | $0/mo | $580/mo |
| 6-15 | 13.04% | $80/mo | $1,950/mo |
| 16-25 | 11.05% | $5,120/mo | $11,750/mo |
| 26-50 | 32.03% | $8,680/mo | $21,090/mo |
| 51-100 | 18.01% | $21,410/mo | $46,030/mo |
| 101-200 | 5.97% | $25,890/mo | $74,050/mo |
| 201-500 | 7.15% | $76,280/mo | $178,300/mo |
| 501-1000 | 1.61% | $131,550/mo | $407,360/mo |
| 1000+ | 3.33% | $242,910/mo | $751,620/mo |
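The bucketing behind the table is a simple ascending upper-bound walk. A sketch (labels mirror the table; the function name is illustrative):

```typescript
// Upper bound → label, ascending; anything above the last bound is "1000+".
const BUCKETS: Array<[number, string]> = [
  [5, "1-5"], [15, "6-15"], [25, "16-25"], [50, "26-50"],
  [100, "51-100"], [200, "101-200"], [500, "201-500"], [1000, "501-1000"],
];

function teamSizeBucket(headcount: number): string {
  for (const [max, label] of BUCKETS) {
    if (headcount <= max) return label;
  }
  return "1000+";
}

console.log(teamSizeBucket(30)); // prints 26-50
```

Each stack's modeled headcount lands in exactly one bucket, and the per-bucket medians above are computed over the stacks in that bucket.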
Independent validation
Modeled stacks are how StackSwap measures the patterns. The State of GTM Engineering 2026 (OneGTM, n=228) is how the operators in those stacks describe themselves. The numbers below are theirs — included here to show that the things our engine scores for (overlap, consolidation, AI-readiness, fit-for-stage) match what the audience actually says they need.
Honest limitations
- Synthetic, not empirical. These are modeled stacks, not real customer scans. We're explicit about that everywhere the data is cited. Pre-revenue means no customer base; we'd rather model honestly than fabricate a customer count.
- Slack at 100% prevalence. Every archetype includes Slack as a core tool, so it shows up in 100% of stacks — an over-representation we know about. This doesn't affect overlap-pair statistics, but it inflates Slack's prevalence number.
- Operator-judged weights. The 12 archetype weights are estimated from operator pattern recognition, not external market research. We could be off by 5-10 percentage points on any given archetype's real-world prevalence.
- Deterministic recovery values. Each overlap pair has a fixed modeled annual recovery from the OVERLAPS table (e.g. Apollo.io + ZoomInfo = $2,400/yr modeled recovery on every detection). The p25-p75 tightness in our distributions reflects this engine determinism, not a bug.
- No business-context override. The engine flags overlap based on capability redundancy. In practice some teams run both tools for legitimate reasons (regulated industry, post-acquisition transition, specific motion). The modeled recovery is the upper bound, not the recommended action.
Reproducibility contract
Same seed (`SIM_SEED=42`) + same archetype templates + same engine version = bit-identical `aggregates.json`. Anyone can verify the cited statistics by running the script themselves. The `methodology_version` field bumps when:
- The scoring engine logic changes (new overlap pairs, updated cost modeling, etc.)
- Archetype templates are added, removed, or rebalanced
- Aggregation logic changes (new metrics, different percentile cutoffs)
When the version bumps, all content pages automatically pick up the new numbers on next deploy — the citation helpers in `lib/stack-simulation/citations.ts` always read the latest aggregates.
How to cite this dataset
Journalists and researchers are welcome to cite these statistics. Below are ready-to-paste citation formats. Please link back to this page so readers can verify the methodology.
Short citation (inline)
Source: StackSwap, "100,000 GTM Stack Simulation" (methodology v1.0.0, https://stackswap.ai/methodology).
Long citation (academic / report style)
StackSwap. (2026). 100,000 GTM Stack Simulation: Modeled tool prevalence, overlap, and recoverable spend across 12 operator archetypes (methodology v1.0.0, seed 42). Retrieved from https://stackswap.ai/methodology.
Example sentences (for journalists / writers)
- “According to a 100,000-stack simulation by StackSwap, 81.66% of B2B SaaS GTM stacks contain at least one redundant tool pair.”
- “StackSwap's modeled dataset of 100,000 synthetic GTM stacks puts the median annual recoverable spend at $93,240 per company.”
- “A modeled analysis of 100,000 GTM configurations found that [Tool A] and [Tool B] appear together in [X]% of stacks — see stackswap.ai/methodology for the full table.”
Pre-built citation helpers for every overlap pair and tool are available in `lib/stack-simulation/citations.ts` — `citationForOverlapPair()` and `citationForToolPrevalence()` return formatted sentences for any tool or pair in the dataset.
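A plausible shape for one of those helpers, for readers who want to see the mechanics. The input interface and field names here are assumptions for illustration; the real helper in lib/stack-simulation/citations.ts reads these values from aggregates.json and may differ in signature:

```typescript
// Assumed input shape; field names are illustrative.
interface OverlapPairStat {
  toolA: string;
  toolB: string;
  prevalencePct: number;        // % of modeled stacks containing both tools
  medianAnnualRecovery: number; // fixed modeled recovery, USD/yr
}

// Returns a ready-to-paste sentence in the same format as "Top citable findings".
function citationForOverlapPair(s: OverlapPairStat): string {
  return `${s.prevalencePct}% of modeled stacks contained both ${s.toolA} and ` +
    `${s.toolB}; median modeled annual recovery from consolidating: $${s.medianAnnualRecovery}.`;
}

console.log(citationForOverlapPair({
  toolA: "Outreach",
  toolB: "Salesloft",
  prevalencePct: 18.74,
  medianAnnualRecovery: 1200,
}));
```

Generating citations from the committed aggregates (rather than hand-writing them) is what keeps every published number in sync with the current methodology version.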
Source
- Simulator script:
scripts/simulate-stacks-100k.ts - Scoring engine:
lib/scanStack.ts(same function powers /stackscan) - Citation helpers:
lib/stack-simulation/citations.ts - Output dataset:
data/stack-simulation/aggregates.json(committed) - Tool universe:
data/tools.json
Related
- 90-day consolidation runbook (uses these statistics)
- Are you wasting money on Outreach? (cites Outreach prevalence + overlap stats)
- Are you wasting money on Apollo? (cites Apollo + ZoomInfo overlap)
- All tool overlap pairs (per-pair pages)
- GTM tools directory (every tool reviewed)
Canonical URL: https://stackswap.ai/methodology
Generated at: 2026-05-06T01:56:31.303Z