Skip to content

Commit 733d14a

Browse files
committed
new post
1 parent 5c9e96f commit 733d14a

File tree

1 file changed

+38
-0
lines changed

1 file changed

+38
-0
lines changed
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
---
2+
layout: post
3+
title: Building comparevoiceai.com - Voice agent pricing calculator
4+
categories: [ai]
5+
---
6+
7+
Over the last week, I was reading, building and researching about voice agents. I made a few prototypes and along the way tried to understand the nuances of building voice agent applications. What started as casual curiosity about calculating pricing for voice agents culminated into vibe coding a voice ai agent pricing calculator. You can find it at [comparevoiceai.com](https://comparevoiceai.com/)
8+
9+
<div align = "center">
10+
<img src="/assets/files/comparevoiceai.png">
11+
</div>
12+
13+
After reading a bunch of blogs and comparing benchmark data across providers, I realized just how difficult it is to estimate the true cost of running voice agents in production. Most people focus on the obvious components (LLM costs, perhaps transcription), but miss the quadratic growth pattern that makes long conversations exponentially more expensive. A 40-min session can cost 100x more than a 4-min session. The costs of each component (STT, LLM, TTS) don't necessarily scale linearly.
14+
15+
It currently lets you select different providers for each component of the voice AI stack:
16+
17+
- LLM Provider (like GPT-4o)
18+
- STT Provider (Speech-to-Text like GPT-4o-Transcribe)
19+
- TTS Provider (Text-to-Speech like Sonic from Cartesia)
20+
21+
The calculator then provides a detailed breakdown showing the individual costs for transcription, LLM processing, voice synthesis, and even infrastructure hosting(for running your orchestration, VAD, any audio processing etc).
22+
23+
Another super important thing for building voice agents is latency. Cost is only half the equation, though. Voice-to-Voice latency measures the total time from when a user finishes speaking to when they hear the AI response. Lower latency creates more natural conversational experiences. However, its not super intuitive where the latency is coming from and how you can optimise it. If you want to learn more about this, you can read this [blog](https://rnikhil.com/2025/05/18/how-to-reduce-latency-voice-agents) I wrote for reducing latency in your voice agent application. The website also has a simulator, where you can simulate a conversation with various latency figures.
24+
25+
<div align = "center">
26+
<img src="/assets/files/latencybreakdown.png">
27+
</div>
28+
29+
---
30+
31+
Personally this project has been my playground for learning SEO. I've been:
32+
33+
- Researching high-value keywords in the voice AI space
34+
- Creating content that addresses specific pain points for developers. This is mostly AI generated based on all the notes, docs I have dumped to a Claude project
35+
- Figuring out how to earn backlinks to my site and what should be the strategy for this
36+
- Learning basic site structure guidelines for SEO
37+
38+
I have never marketed anything in my life before. If I am going to start a company and sell software, I better learn how to make my calculator website come on the first page of Google results (or LLM results) soon. This was one of the main motivations of actually seeing this project to completion(with a separate domain name too). I am still super early on figuring out this SEO stuff (have a day job these days, so can't do this fulltime) but I will update this blog if my website starts ranking high on search results.

0 commit comments

Comments
 (0)