Thin wrappers, thinner margins

Notes on the voice AI frenzy

By Harshad Saykhedkar | 12 May 2026

There’s a growing frenzy in Indian startups and venture funding around voice AI bots for customer calls, especially outbound calls. Multiple startups assert that they will dent this large market by replacing human agents with bots. Most of these systems are assembled by stitching together APIs from larger players (for example, ElevenLabs, Deepgram, and GPT-4o). What the wrappers actually add is a thin layer of prompt tuning and application-specific logic that packages those APIs as “agents.”

A founder’s job is to try and predict the market for the next few years and hedge their bets. I’ll outline my view, which is cautious and contrarian, in this post. It comes from both my experience as a founder working in the trenches of GenAI since October 2022, and from a decade of hands-on AI/ML work. I’ll also look at this through macroeconomics, my vocation beyond deep tech.

Thin margins

For any voice AI business built on top of model API providers, cost control is limited. It resembles a restaurant that buys finished dishes from a cloud kitchen and resells them. Today, bringing the unit cost of the full pipeline below ₹2–3 per minute is difficult.
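The arithmetic behind that squeeze can be sketched in a few lines. This is a minimal illustration, not a cost model: the four cost components and every rupee value below are hypothetical placeholders chosen to make the calculation visible, not quotes from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineCosts:
    """Per-minute costs in rupees. All values used below are placeholders."""
    stt: float        # speech-to-text API
    llm: float        # language model tokens
    tts: float        # text-to-speech API
    telephony: float  # call carriage

    def cost_per_minute(self) -> float:
        # Every component is bought from someone else, so the wrapper
        # can only add these up, not reduce them.
        return self.stt + self.llm + self.tts + self.telephony


def gross_margin(price_per_minute: float, costs: PipelineCosts) -> float:
    """Gross margin as a fraction of the price charged per minute."""
    if price_per_minute <= 0:
        raise ValueError("price_per_minute must be positive")
    return (price_per_minute - costs.cost_per_minute()) / price_per_minute


# Hypothetical numbers, for illustration only.
example = PipelineCosts(stt=0.5, llm=0.75, tts=1.0, telephony=0.25)
total = example.cost_per_minute()
margin = gross_margin(3.0, example)
print(f"cost/min = {total:.2f}, gross margin = {margin:.1%}")
```

With these placeholder inputs, a small price rise from any one upstream provider would wipe out most of the margin, which is the exposure the section describes.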

Admittedly, large language model (LLM) token costs have dropped over the last two years, though it’s unclear how much of that reduction will carry over to speech-to-text (STT) and text-to-speech (TTS). The structural issue is upstream revenue pressure. Large AI infrastructure players have raised significant capital and will face revenue expectations to justify it. That pressure will show up in two ways: in their push to close large enterprise clients themselves, and in how they use their pricing power.

There are already signals of this. A recent Reddit example showed the risk clearly: a startup built the agent layer, but the underlying infrastructure provider pushed them to become a low-margin service provider instead of owning the product. The thin-wrapper strategy leads towards low double-digit service margins, not the high gross margins expected from deep tech products. Even worse, Anthropic and OpenAI are themselves launching service arms via joint ventures.

Limited latency control

Latency in voice interaction is largely constrained by physics — how fast data moves through compute (GPUs) and across networks and telephony.

Without fine-grained control over these layers, it’s hard to outperform the latency of underlying API providers. Let me be direct: a startup that neither runs its own models nor controls compute location has no real moat in the long term.
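A rough latency budget makes the point concrete. The sketch below is an assumption-laden simplification: the stage names and millisecond values are hypothetical, not measurements, and it adds the stages sequentially, whereas real pipelines stream and overlap them.

```python
from typing import Dict

# Hypothetical per-turn latencies in milliseconds. These are placeholders,
# not benchmarks of any provider.
VENDOR_STAGES_MS: Dict[str, float] = {
    "telephony_and_network": 150.0,
    "stt": 200.0,
    "llm_first_token": 300.0,
    "tts_first_audio": 150.0,
}


def turn_latency_ms(vendor_stages: Dict[str, float], orchestration_ms: float) -> float:
    """Total response latency for one turn, assuming sequential stages."""
    return sum(vendor_stages.values()) + orchestration_ms


def latency_floor_ms(vendor_stages: Dict[str, float]) -> float:
    """Best case for a wrapper: its own code adds zero overhead."""
    return turn_latency_ms(vendor_stages, 0.0)


def controllable_share(vendor_stages: Dict[str, float], orchestration_ms: float) -> float:
    """Fraction of the turn latency that the wrapper's own code accounts for."""
    total = turn_latency_ms(vendor_stages, orchestration_ms)
    return orchestration_ms / total if total else 0.0


floor = latency_floor_ms(VENDOR_STAGES_MS)
share = controllable_share(VENDOR_STAGES_MS, 50.0)
print(f"floor = {floor:.0f} ms, wrapper-controlled share = {share:.1%}")
```

Under these assumptions, even a wrapper with perfectly optimised orchestration cannot go below the floor set by the providers' stages, and the part it does control is a small fraction of the total.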

Yes, there are smaller gains from optimising pipelines or using lower-level languages (pitch decks are replete with these claims). But these gains are at best secondary and not defensible, especially as coding models make such optimisations easier to replicate.

Build depth, not wrappers

India has a history of importing technology and focusing on assembly and distribution — often called “screwdriver-giri” in manufacturing. In physical industries, this worked because local distribution is hard to build.

SaaS and GenAI don’t have that barrier. Distribution is easier, and defensibility is weaker. Over time, thin wrappers risk being displaced by well-funded global players. Durable advantage will come from building low-cost, useful solutions from the ground up in India.

That said, I’m a pragmatist and know the costs of building LLMs firsthand, so I’m not advocating that everyone build their own LLM from scratch. Peter Thiel’s distinction between technological breakthroughs (0 to 1) and incremental scaling (1 to N) is often applied to how Chinese companies focussed on 1 to N scaling. There’s no structural reason why India can’t follow the 1 to N path in AI.

Chinese companies like DeepSeek didn’t start with wrappers. They focused on understanding existing systems and iterating to build better ones. European players like Mistral, Kyutai, and Gradium are taking a similar path. My bet is that technically strong, profitable AI solutions can be built from India. Startups may not train large models from scratch, but building smaller, fine-tuned, purpose-built models is both feasible and useful.

The way forward for startups

Build depth! At AbleCredit, we don’t think human interactions will disappear any time soon. But AI will help in making humans 10x more effective in their jobs. That’s one lesson we are clearly taking away from coding LLMs. We expect winning solutions to expand the pie — improving productivity, topline and outcomes, not just cutting costs. The opportunity for Indian startups is not to compete on hype or capital, but on engineering depth, cost efficiency, and solving messy real-world problems at scale. The winners will be companies that deeply understand the stack, build domain-specific systems and models, and turn India’s constraints into an advantage rather than a handicap.
