Is AI Software? The $230,000 Question Hiding in Every Revenue Dollar.
Part 2 of The Services-as-Software Files. Read Part 1: Is SaaS Dead? The $2 Trillion Question Hiding in Every AI Pitch Deck. Wondering where your company fits? Here's the Test.
There's a comfortable assumption sitting underneath every services-as-software pitch deck. It goes like this: the AI delivers the work, so there are no humans on payroll, so the gross margins should be at least as good as traditional software — maybe better, because you're capturing the full value of a service without the labor cost.
It's wrong. The data showing it's wrong is now public, and the problem it describes is structural and getting worse on the timescales most founders are pricing rounds on. If you took anything from Part 1, you should be asking the obvious follow-up: if Sierra and Greenlite both deliver outcomes, why is one priced like software and the other priced like a services firm? The answer, which most founders never get told before they sign a term sheet, is that they actually have similar gross margins. The difference is which one is allowed to claim software multiples in the market right now.
The intuition the deck is selling you
Picture two companies, both at $50M ARR, both delivering outcomes to customers.
Company A runs a tech-enabled service. They employ thirty credentialed humans — let's say licensed plan reviewers, CPAs, paralegals, or registered nurses, depending on the vertical. The AI accelerates their work. They deliver the outcome. Labor is the biggest line item. Gross margin lands somewhere in the 40-55% range.
Company B runs a pure AI agent. No human professionals on staff. The agent delivers the outcome end-to-end. AWS and Anthropic are the suppliers, not humans. Gross margin should be... what? 75%? 85%? It's just software, right?
Sit with that intuition for a second, because it's exactly the one the pitch decks are exploiting.
The numbers
Here's what the data actually shows, drawn from the 2025 and 2026 reports of the VCs watching this most carefully:
ICONIQ Growth's January 2026 State of AI snapshot put the average AI product gross margin at 52%, up from 41% in 2024 and 45% in 2025. The trajectory is improving. The structural cost floor is also real. Bessemer Venture Partners' 2025 State of AI placed LLM-native company gross margins at 50-60%, well below the 80-90% ceiling that defined the prior decade of cloud businesses.
The math is straightforward. For every $1M in AI product revenue a company books in 2026, roughly $230,000 walks out the door as inference cost before a single engineer, AE, or marketer gets paid. That's not a problem you outgrow with scale: ICONIQ's data shows inference averages 23% of revenue across scaling-stage AI B2B companies, and the number doesn't meaningfully decline as those companies grow.
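The arithmetic behind that headline figure is short enough to write down. This is a minimal sketch that takes the 23% inference share cited above as given. The function name is made up for the illustration.

```python
def inference_cost(revenue: float, inference_share: float = 0.23) -> float:
    """Dollars of revenue consumed by inference at a given share of revenue."""
    return revenue * inference_share


# The 23% default is the ICONIQ average cited in the text.
cost_per_million = inference_cost(1_000_000)
print(f"${cost_per_million:,.0f} of every $1,000,000 goes to inference")
```

Swap in your own inference share to see what your version of the $230,000 looks like.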
For agentic architectures — the kind every services-as-software company is building, where a single user action triggers a sequence of model calls across multiple specialized agents — inference costs multiply 5-20x per user action compared to a single-model setup. The more sophisticated the agent stack, the deeper the margin pressure.
In March 2026, Chamath Palihapitiya described his own AI startup's cost trajectory on the All-In Podcast: between AWS inference, Cursor, and Anthropic, costs had more than tripled since November 2025. That's not a fringe data point. That's a senior VC reporting from inside a portfolio company that the cost line is moving faster than the revenue line, and saying it on the podcast he hosts.
Why "no humans" doesn't mean "no costs"
The mental model that gets founders into trouble is treating inference like cloud hosting circa 2015 — a fixed cost that gets cheaper as you scale.
It isn't. Cloud hosting got cheaper per user because the underlying server costs were amortized across more users. Inference cost is variable in a different way: every single query runs the model again, consuming GPU compute, memory bandwidth, and energy. As your product gains users, training costs stay fixed but inference costs grow linearly with usage. For multi-agent architectures, they grow super-linearly with sophistication.
This is the Jevons paradox for AI businesses. As inference gets cheaper per token, your customers ask the AI to do more, and your total inference bill goes up faster than the per-token price comes down. Your supplier (Anthropic, OpenAI, Google) sets the price. Your customer sets the usage volume. You are the spread between them, and that spread is structurally narrower than the SaaS spread between hosting cost and subscription price ever was.
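To see how a falling per-token price can still produce a rising bill, here is a small sketch. Every number in it is hypothetical and chosen only to show the mechanism: the supplier sets the price, the customer sets the volume, and the bill is the product of the two.

```python
def inference_bill(price_per_million_tokens: float, tokens_millions: float) -> float:
    """Total inference spend: supplier-set price times customer-set usage."""
    return price_per_million_tokens * tokens_millions


# Hypothetical figures, not drawn from any report.
before = inference_bill(price_per_million_tokens=10.0, tokens_millions=1_000)
# The price falls 40%, but customers respond by consuming 2.5x the tokens.
after = inference_bill(price_per_million_tokens=6.0, tokens_millions=2_500)

change = after / before - 1
print(f"Bill before: ${before:,.0f}, after: ${after:,.0f} ({change:+.0%})")
```

The bill rises whenever usage grows by more than the inverse of the price cut. With a 40% price drop, anything above roughly 1.67x usage growth leaves you paying more in total.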
So when you compare Greenlite (paying humans) to Sierra (paying inference), you find something the deck doesn't tell you: they have approximately the same gross margins. Greenlite spends on plan reviewers. Sierra spends on Anthropic. Gross margin lands in a similar neighborhood: roughly 40-60% for the tech-enabled services path, roughly 50-60% for the autonomous-AI path.
The cost is just showing up in your AWS bill instead of your W-2s.
The real divergence: predictability and supplier control
Where the two outsource paths actually differ isn't gross margin. It's the direction and predictability of margin change.
Greenlite-style margins are stable and supplier-controlled by you. Labor cost grows linearly with revenue. Wage inflation is manageable. Your gross margin in Q4 looks like your gross margin in Q1, plus or minus normal hiring drift. You can build a real EBITDA story and walk into a Series C with margin visibility. PE buyers can underwrite it. Strategic acquirers in your vertical can model the integration. Boring, in the good way.
Sierra-style margins are variable and supplier-controlled by Anthropic. Inference cost is set by your model providers, who can move their prices 20-30% on a quarter's notice and have done so before. If your customers' usage spikes during a viral product launch, your inference cost spikes with them — and outcome-based pricing means you don't always have a corresponding revenue spike to match it. The Q4 gross margin disclosure can move several hundred basis points on supplier dynamics you don't control.
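A rough way to size that exposure: if inference is a fixed share of revenue, a supplier price move shifts gross margin by that share times the size of the move. The sketch below assumes the 23% inference share cited earlier and holds revenue and usage constant, which is a simplification. The function name is illustrative.

```python
def margin_shift_bps(inference_share: float, price_change: float) -> float:
    """Gross-margin change in basis points when inference prices move by
    price_change (0.25 means +25%), with revenue and usage held constant."""
    return -inference_share * price_change * 10_000


# 20-30% is the range of supplier price moves mentioned above.
for move in (0.20, 0.25, 0.30):
    shift = margin_shift_bps(0.23, move)
    print(f"{move:+.0%} supplier price move: {shift:+.0f} bps of gross margin")
```

Under those assumptions, a 20-30% price move works out to several hundred basis points of gross margin, before any change in customer usage is layered on top.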
This matters for one specific reason: predictable margins close rounds at higher multiples than variable margins, even when the average is the same. The market pays for boring. The market discounts surprises.
The exit multiple arbitrage
This is where the VCs funding the services-as-software pitch are running a specific play.
Tech-enabled services companies trade at services multiples in M&A. The median IT services M&A transaction over the last decade clears at 1.3x revenue and 10x EBITDA, implying an EBITDA margin of about 13% on a stable services business. A solid SaaS business clears at 5-7x revenue. AI-native category leaders are clearing 10-25x revenue right now, with the top tier well above that.
So a $50M ARR Greenlite-style business is probably worth $65M-$150M to a strategic acquirer — Sodexo, ABM Industries, a construction services rollup, or the equivalent buyer in your vertical. The same $50M ARR business positioned as "AI-native services-as-software" can claim a $500M-$1B valuation in the private market, if the buyer believes the software margin story.
That's the arbitrage. VCs are paying SaaS multiples for AI-services companies that have services-level economics, betting that one of three things happens:
Margins climb toward SaaS as inference gets cheaper, vindicating the multiple
M&A buyers pay SaaS multiples anyway because the AI-native label clears the diligence vibe check
Aggressive growth creates lock-in that gives the company pricing power later, recovering margin through outcome-based pricing capture
When Thomson Reuters acquired Casetext for $650M in 2023 against roughly $25M ARR — a 26x revenue multiple — that more or less validated the arbitrage for legal AI. Whether the next round of strategic deals will validate it again is the open question every late-stage VC is asking themselves right now, and most of them aren't telling their portfolio founders about.
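The multiples in this section reduce to a few lines of arithmetic. The sketch below reproduces them under two assumptions: the 3x upper bound on the services range and the 10-20x bounds on the AI-native range are back-solved from the dollar figures quoted above, and the ratio of the two services multiples is read as EBITDA over revenue. Function names are made up for the example.

```python
def implied_ebitda_margin(ev_to_revenue: float, ev_to_ebitda: float) -> float:
    """(EV / revenue) divided by (EV / EBITDA) leaves EBITDA / revenue."""
    return ev_to_revenue / ev_to_ebitda


def value_at(arr: float, revenue_multiple: float) -> float:
    """Enterprise value implied by a revenue multiple."""
    return arr * revenue_multiple


ARR = 50_000_000

services_margin = implied_ebitda_margin(1.3, 10)
# Assumed bounds: 1.3x to 3x for services, 10x to 20x for the AI-native story.
services_range = (value_at(ARR, 1.3), value_at(ARR, 3.0))
ai_native_range = (value_at(ARR, 10), value_at(ARR, 20))
casetext_multiple = 650_000_000 / 25_000_000

print(f"Implied services EBITDA margin: {services_margin:.0%}")
print(f"Services-style value: ${services_range[0] / 1e6:.0f}M to ${services_range[1] / 1e6:.0f}M")
print(f"AI-native value: ${ai_native_range[0] / 1e6:.0f}M to ${ai_native_range[1] / 1e6:.0f}M")
print(f"Casetext revenue multiple: {casetext_multiple:.0f}x")
```

Same ARR, same cost structure, and the gap between the two ranges comes entirely from which multiple the buyer agrees to apply.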
What this means if you're choosing a path
Let me get specific. If you're a founder deciding between the two outsource paths from Part 1, here's the actual choice, stripped of the marketing language.
The tech-enabled services path has a lower exit ceiling but a higher floor. You're not getting acquired for $10B. You'll probably be acquired by a strategic in your vertical for 1-3x revenue, or you'll keep running the company profitably for decades because the margins are predictable and the supplier risk is manageable. You can usually run this business without a massive raise because labor costs grow in step with revenue. The defensibility is operational — relationships, licensure, workflow integration.
The pure-AI path has a higher exit ceiling and a much lower floor. The bet is that you become a category-defining company with massive scale and improving margins. The risk is that inference costs don't fall fast enough, customers compress pricing because "it's just AI," and you raise three more rounds chasing a margin profile that never arrives. The 11x story from late 2025 — revenue compression at scale as enterprises pushed back on pricing — is the canary for this entire category. The defensibility is reputational — being the category-defining brand in customer service agents, app building, talent sourcing, or wherever you've planted your flag.
The single most important thing I want founders to take from Parts 1 and 2 together:
Both outsource paths are services-flavored businesses on the cost side. The choice between them is mostly a choice of exit narrative, capital intensity, and supplier risk — not gross margin.
That's not a depressing conclusion. It's clarifying. The companies that will look extraordinary in retrospect are the ones whose founders understood which game they were playing and priced their rounds accordingly. The companies that will look like cautionary tales are the ones whose founders signed up for software multiples on services economics and got compressed at the next round.
A working diagnostic for the term sheet
A few questions to actually ask before you sign the next term sheet.
What's your real gross margin at scale? Not your projected gross margin after fictional inference price drops. Your current gross margin. If it's below 60% and your investors are pricing you above 15x revenue, somebody is making an assumption that needs to be tested.
Who controls your largest variable cost? If the answer is "Anthropic and AWS," your cost line is exposed to supplier moves you don't control. Model what happens to your P&L if your inference costs move 25% in a quarter. If the answer surprises you, that's the conversation to have with your board now, not later.
What's your category's actual M&A comp set? Not the aspirational comp set — the actual one. Look at the deals that have closed in your vertical in the last 18 months. The buyers and the multiples tell you what your exit looks like. Funding announcements from your competitors tell you what they convinced their investors to pay, which is not the same thing.
Does your pricing scale with cost or independently of it? Seat-based pricing on usage-based costs is the worst possible combination: you pay per query, you get paid per seat. Outcome-based pricing on outcome-based costs is the cleanest. Most companies are somewhere in between and don't know exactly where, which means they don't know exactly how a usage spike hits their margin until it happens.
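The pricing question is easiest to see with a toy account. Everything below is hypothetical: 100 seats at $100 each, or alternatively $0.50 per query, against an inference cost of $0.23 per query. Per-query pricing stands in here for any pricing that moves with the same driver as cost. It is a sketch of the mechanism, not a model of any real company.

```python
def gross_margin(revenue: float, cost: float) -> float:
    return (revenue - cost) / revenue


# Hypothetical account parameters, chosen so both models start at the same margin.
SEATS, SEAT_PRICE = 100, 100.0
PRICE_PER_QUERY, COST_PER_QUERY = 0.50, 0.23


def seat_based_margin(queries: int) -> float:
    revenue = SEATS * SEAT_PRICE         # fixed, regardless of usage
    return gross_margin(revenue, queries * COST_PER_QUERY)


def usage_based_margin(queries: int) -> float:
    revenue = queries * PRICE_PER_QUERY  # moves with the same driver as cost
    return gross_margin(revenue, queries * COST_PER_QUERY)


# Baseline month, then a month where usage doubles.
for queries in (20_000, 40_000):
    print(
        f"{queries:,} queries: seat-based {seat_based_margin(queries):.0%}, "
        f"usage-based {usage_based_margin(queries):.0%}"
    )
```

In this toy setup both models start at the same margin, and a doubling of usage collapses the seat-based margin while the usage-based margin does not move. Most real pricing sits between the two, which is why it is worth running your own numbers through something like this.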
If you can't answer those four questions cleanly, you're underwriting a round on vibes. The cost of doing that in 2026 is significantly higher than it was in 2023.
Getting the margin story straight before the diligence partner asks is the difference between a clean round and one that closes 30% lower than the term sheet. The Investor Readiness Vault™ is built for founders who want to walk into the next conversation with the structural answer already in hand. Most founders I work with realize they should have started this two raises ago.
Part 3 of The Services-as-Software Files lands in two weeks: a methodology piece on how to stress-test a VC thesis without falling for the deck — including the specific moves I used to pressure-test Parts 1 and 2 before publishing.