Data Ethics

Why Data Brokers Are Dying (and What's Replacing Them)

Roughly a decade ago, serious outbound still meant wiring budget to a broker. ZoomInfo, DiscoverOrg before the merger, later generations of Clearbit-class products - you paid for the database, dumped it into Salesforce, and argued about credits. The category was expensive, entrenched, and painful to rip out. Plenty of teams renewed because the alternative sounded like unemployment for whoever owned the renewal.

That posture is weak now - not because one hero vendor won a shootout, but because the economics underneath static brokerage are sliding. Four pressures arrived together: privacy law raising the cost and scrutiny of supply, sources locking the front door, mid-market budgets refusing premium seat math, and on-demand research replacing "biggest CSV" as the enrichment job.

Below is the structural case, not a morality play. If you are staring at a six-figure data renewal, you deserve a straight answer about whether the category still earns the check. Brokers are not about to vanish overnight - legacy coverage still wins some rooms - but the default "sign and forget" stance is harder to defend every quarter. For how enrichment work actually shifted in reps' day jobs, read how AI is changing sales operations after you finish the business-model lens here.

The business model that built the category

Broker revenue always rested on three boring competencies:

1. Aggregation at scale - acquire millions of rows from scraping, licensors, acquisitions, and public filings, then stuff them behind a login.
2. Normalization - dedupe titles, stitch accounts, append phones, and ship a schema Salesforce admins recognize.
3. Access gating - charge per seat or per credit for that warehouse.

The moat was never a clever model weight. It was logistics at pile height: burn cash on ingestion pipelines, legal cover, and sales teams wide enough to justify enterprise ACV. ZoomInfo, Cognism, Lusha, Seamless.AI, legacy Clearbit placements, Apollo's data-heavy SKU - same shape, different bundling. Smaller shops often relicensed upstream or leaned on the same public sources everyone else used.

Strong claim: data brokers never defended a pure tech moat; they defended a throughput moat. When upstream supply frays and buyers can assemble cheaper routes to the same answer, throughput stops translating to pricing power. That is why the current squeeze feels like category fatigue instead of a single bad product cycle. None of this requires cheering for any particular startup - it requires understanding that the broker P&L was always sensitive to supply cost, and supply cost went up while substitutes got cheaper.

The four forces eroding the model

Each vector below hurts margins a little on its own. Together they change renewal math.

1. Regulatory pressure is making the supply side expensive

GDPR in the EU, CCPA/CPRA in California, plus a growing U.S. state patchwork tightened how personal data can be collected, resold, and queried. DSAR workflows, documented bases for processing, opt-out mechanics, and vendor diligence are now part of the contract packet - not a footnote. Brokers responded with compliance teams, "legitimate interest" playbooks, and cleaner supplier contracts, but each increment adds lawyer hours and caps cavalier collection. Fortune 500 procurement often runs privacy review like security review: if legal cannot trace consent or legitimate-interest documentation for a dataset, the deal stalls even when GTM wants the workflow. Mid-market teams feel the same drag later - usually one bad audit away from a forced template.

Practical effect: sourcing the same rows costs more; renewals carry thicker security addenda; verbal assurances from 2016 do not clear a modern vendor questionnaire. The work still happens in many jurisdictions - the difference is friction and price, which is enough to open the door to orchestration stacks that can prove lineage credit by credit. When you model the renewal, ask legal what they would need to defend the vendor in a regulator inquiry tomorrow, not what passed five years ago.

2. Supply is collapsing faster than aggregation can replace it

Brokers lived on the long tail of inputs: LinkedIn-shaped professional graphs, public web scraps, smaller data shops happy to wholesale rows, email-validation vendors, intent feeds. That tail is thinner. LinkedIn has spent years litigating and tooling against scraping. hiQ Labs v. LinkedIn is the headline reminder: although the Ninth Circuit held that scraping public profiles likely did not violate the Computer Fraud and Abuse Act, the case ended in 2022 with a ruling that hiQ had breached LinkedIn's user agreement and a settlement that barred further scraping. Treating member data as a public commons does not fly cleanly anymore. Mailbox providers tightened APIs and anti-abuse rules. Niche licensors got bought, shut down, or raised rates. Every lost feeder shows up six months later as stale titles, blank direct dials, and "coverage" charts that look better in the demo org than in your territory. Sales engineering will swear freshness improved; reps mutter otherwise.

Practical effect: the database advantage brokers sell is perishable. Freshness SLAs matter more because decay accelerated. Teams that budgeted a broker as "set and forget truth" now pay the same renewal for data that behaves more like a perishable good - expensive if you still treat it like infrastructure set in stone. That decay is structural: you cannot scrape your way back to 2015 abundance when platforms gate the graph and regulators watch the pipes.

3. The economics don't work for mid-market anymore

Enterprise pricing assumed no viable substitute. ZoomInfo-class quotes of $40k-$80k+ for a mid-market pod made sense when the alternative was hiring researchers. They sting now that Apollo bundles data with outbound execution at a fraction of the cash, and Clay charges against actual waterfall usage instead of every seat inheriting an expensive login "just in case." Annual all-you-can-eat contracts also clash with how lean teams buy SaaS: month-to-month trials, credit pools, and swap-in vendors when a waterfall underperforms. Brokers built for CFOs who sign once a year; modern stacks want finance to see utilization per workflow, not a flat entitlement spreadsheet. Seat models bill everyone with a login; credit models bill the work that actually ran.

Practical effect: the rep's pitch - "nobody else has this file" - dies faster when procurement can open Apollo versus ZoomInfo or Clay versus ZoomInfo and see landed cost for the same motion. You may still pick the broker after that exercise - but you are choosing it eyes open, which is new. Mid-market teams feel this first because they lack the procurement theater that lets enterprise bury line items.
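The seat-versus-credit contrast above is simple arithmetic, and it can help to write it down before a pricing call. What follows is a minimal sketch, not anyone's actual price list: the class names, the per-seat price, the per-credit price, and the usage volumes are all hypothetical placeholders to replace with the numbers on your own quotes.

```python
from dataclasses import dataclass


@dataclass
class SeatPlan:
    """Seat-based contract: every licensed login is billed, used or not."""
    seats: int
    annual_price_per_seat: float  # hypothetical figure, not a vendor quote

    def annual_cost(self) -> float:
        return self.seats * self.annual_price_per_seat


@dataclass
class CreditPlan:
    """Credit-based contract: only enrichment work that actually ran is billed."""
    price_per_credit: float        # hypothetical figure, not a vendor quote
    credits_used_per_month: int    # taken from your own usage logs

    def annual_cost(self) -> float:
        return self.price_per_credit * self.credits_used_per_month * 12


def cost_per_active_rep(annual_cost: float, active_reps: int) -> float:
    """Landed cost per rep who actually used the tool, not per licensed seat."""
    if active_reps <= 0:
        raise ValueError("active_reps must be positive")
    return annual_cost / active_reps


if __name__ == "__main__":
    # Illustrative inputs only: 20 licensed seats, 8 reps who logged in.
    seat_plan = SeatPlan(seats=20, annual_price_per_seat=3000.0)
    credit_plan = CreditPlan(price_per_credit=0.05, credits_used_per_month=10_000)
    active_reps = 8

    for name, plan in (("seat", seat_plan), ("credit", credit_plan)):
        total = plan.annual_cost()
        per_rep = cost_per_active_rep(total, active_reps)
        print(f"{name}: {total:,.0f} per year, {per_rep:,.0f} per active rep")
```

Dividing by active reps rather than licensed seats is the design choice that matters here: it is the number that exposes the gap between a flat entitlement and billed work.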

4. AI research workflows are eating the static enrichment job

The broker promise was deterministic: hand me a domain, get a row. The modern job is contextual: what changed since last quarter, who cares, and what proof matters for this account. AI-assisted research stacks (Clay tables, Apollo research modes, signal products like Common Room for community, Koala-style site telemetry, others) assemble live context from websites, filings, social posts, and job ladders without pretending one warehouse holds the world. That shifts spend from "rent the pile" to "pay for the answer." The pile still matters at the bottom of the waterfall, but the differentiation moved upstack to orchestration and QA - exactly where mid-market teams feel pain today.

Practical effect: "largest static database" is a weaker solo argument every quarter. Buyers compare outcome quality on a dozen accounts, not row counts. For how that workflow reads across layers, see how AI is changing sales operations; this section explains why the bill attached to that workflow is moving off pure brokers. Brokers are responding with Assist features and partnerships - the counter is bundling, not denying that the job moved.

What's still working for data brokers (the honest counter-view)

Writing brokers off entirely misreads the market. Three strengths still clear procurement:

1. Compliance packaging for global enterprises. Vendors like ZoomInfo and Cognism employ armies of counsel and field maps you can hand auditors. If your procurement desk demands paper trails across regions, that matters more than a slick orchestration demo.
2. Salesforce entrenchment. Years of native field mapping, packaged flows, and admin habit mean switching costs are real. Rip-and-replace is a quarter-long project with political risk - brokers earn fees for that stickiness.
3. Systematic reach for classic prospecting fields - direct dials, functional headcount, org chart guesses - where research agents still wobble on accuracy or policy.

Large enterprises with embedded workflows and global compliance load can still justify the renewal. Mid-market teams with lighter governance overhead rarely get the same return per dollar. Do not confuse "still useful somewhere" with "automatically worth last year's renewal." The honest split is segment-specific.

What operators should do at renewal time

Run four checks before you countersign:

1. Login reality: what share of licensed reps used the broker last month? Most teams over-seat; trimming inactive logins alone often cuts thirty to fifty percent of spend without touching workflow.
2. Replacement pricing: price Apollo for full-stack motion, Clay for research-orchestration, and whatever sits beside them for send - before you negotiate the incumbent discount. Use the compare pages as anchors: Apollo versus ZoomInfo, Clay versus ZoomInfo.
3. Migration feasibility: can RevOps port field mappings and playbooks to the alternative in a quarter without breaking pipeline? If yes, integration lock-in was overstated. If no, document the real exit plan anyway; dependence without a map is how renewals get punitive.
4. Legal exposure: have counsel review DPAs against current GDPR/CCPA/CPRA expectations for your business - not 2019 folklore. Answers are rarely "zero risk" anymore.

If three of four point to exit, plan the swap across one quarter, validate on a thirty-day pilot, and stage cutover on Salesforce sandboxes first.

For the broader sequence (owners, overlap, savings), use how to audit your GTM stack. If you are building the privacy-first side of the same story for leads who opt in through product, read privacy-first lead intelligence next in this cluster.
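The four checks and the three-of-four rule above can be written down as a small scoring sketch so the renewal conversation starts from recorded answers instead of recollection. Everything named here is an assumption for illustration: the `RenewalAudit` fields, the 60 percent utilization cutoff, and the idea that a cheaper replacement quote counts as an exit signal are placeholders to tune, not thresholds taken from the article.

```python
from dataclasses import dataclass


@dataclass
class RenewalAudit:
    """Answers to the four renewal checks (all field names are hypothetical)."""
    licensed_seats: int
    active_seats_last_month: int
    incumbent_annual_cost: float
    replacement_annual_cost: float
    migration_fits_one_quarter: bool   # RevOps estimate
    counsel_found_dpa_gaps: bool       # outcome of the legal review


def seat_utilization(audit: RenewalAudit) -> float:
    """Share of licensed seats that were actually used last month."""
    if audit.licensed_seats <= 0:
        raise ValueError("licensed_seats must be positive")
    return audit.active_seats_last_month / audit.licensed_seats


def exit_signals(audit: RenewalAudit, min_utilization: float = 0.6) -> list[str]:
    """Return the names of the checks that point toward leaving the broker.

    min_utilization is an assumed cutoff; choose one that fits your team.
    """
    signals = []
    if seat_utilization(audit) < min_utilization:
        signals.append("login reality")
    if audit.replacement_annual_cost < audit.incumbent_annual_cost:
        signals.append("replacement pricing")
    if audit.migration_fits_one_quarter:
        signals.append("migration feasibility")
    if audit.counsel_found_dpa_gaps:
        signals.append("legal exposure")
    return signals


def recommend(audit: RenewalAudit) -> str:
    """Apply the three-of-four rule from the checklist."""
    if len(exit_signals(audit)) >= 3:
        return "plan exit"
    return "renegotiate or renew"


if __name__ == "__main__":
    # Illustrative inputs only.
    audit = RenewalAudit(
        licensed_seats=20,
        active_seats_last_month=8,
        incumbent_annual_cost=60_000.0,
        replacement_annual_cost=20_000.0,
        migration_fits_one_quarter=True,
        counsel_found_dpa_gaps=False,
    )
    print(exit_signals(audit), "->", recommend(audit))
```

The score is only a prompt for the conversation. The legal check in particular is a judgment call by counsel, and a boolean flattens it.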

What this looks like in practice (the StackSwap moment)

StackScan tends to flag legacy brokers as obvious consolidation candidates in mid-market stacks - not out of ideology, but because landed cost per engaged rep rarely beats a modern bundle once you line up ZoomInfo-plus-nav-plus-sidecar enrichers against Apollo or Clay-class spend. The delta shows up as real cash, usually with less legal ambiguity than execs fear if orchestration replaces bulk resale. When the scan leaves enterprise Salesforce plumbing alone but tears the data line item apart, that is this thesis in invoice form: logistics moats deflating while workflows move to on-demand research. Brokers are not dead - but the renewal deck should prove value, not assume it.