AI Agent Session-Cost Calculator

By Sanjay Saini | Updated: June 12, 2026 | 8 min read

A chat call costs what one prompt costs. An agent costs what a whole loop costs — and that's the part finance never sees coming. Every reasoning step resends the system prompt, the tool schema and the entire history so far, so token usage compounds with each iteration. This calculator models that accumulation step by step, shows where the money actually goes, and exposes your runaway-loop worst case so you can size step limits and kill-switches before a stuck agent runs up the bill.

Agent cost scales ~quadratically with steps because history is resent every iteration.
The fixed system prompt + tool schema is paid for on every step — caching targets exactly this.
A runaway loop that hits its step cap can cost many times the average session.
True cost = accumulated input + per-step output + retries + fixed overhead.

Model your agent's cost per session

Volume & loop shape

Agent sessions per month: one session = one completed task
Avg loop steps per session: reasoning / tool-call iterations
Max steps (runaway cap): your hard loop limit

Tokens per step

System + tools prompt: fixed, resent every step
Initial user request: persists across the session
Tool result / observation: added to history each step
Model output: generated per step

Adjustments

Prompt cache hit rate (%): cached input billed at ~10%
Retry / overhead (%): failed steps, fallbacks, evals
Fixed monthly overhead (USD): observability, gateway, ops

Model	Input $/1M	Output $/1M

Estimated agent cost

Model	Tokens/session	Cost/session	Monthly	Annual TCO

How agent cost is calculated

A single completion is one input and one output. An agent loop is different: to decide its next action the model must see everything that happened before it, so on every step the runtime resends the fixed system prompt and tool schema, the original request, and the full running transcript of prior outputs and tool observations. Early steps are cheap; later steps drag a long history behind them. Summed across all steps, the fixed prompt grows linearly with the step count while the resent history grows quadratically, so the history dominates long sessions. This calculator computes each step's input as the fixed prompt, paid once per step, plus an accumulating block that grows by your output and observation sizes each iteration.
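The accumulation described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `session_tokens` and the example token sizes are assumptions, and it assumes each step's output and tool observation are appended to the history before the next step runs.

```python
def session_tokens(steps, system_tokens, request_tokens, observation_tokens, output_tokens):
    """Return (total_input, total_output) tokens for one agent session.

    Assumed model: every step resends the fixed system + tools prompt, the
    original request, and all prior outputs and observations.
    """
    history = 0        # transcript accumulated from earlier steps
    total_input = 0
    total_output = 0
    for _ in range(steps):
        total_input += system_tokens + request_tokens + history
        total_output += output_tokens
        # This step's output and tool result are resent on every later step.
        history += output_tokens + observation_tokens
    return total_input, total_output


# Illustrative sizes only: 2,000-token system prompt, 200-token request,
# 500-token observations, 300-token outputs.
for n in (5, 10, 20):
    print(n, session_tokens(n, 2000, 200, 500, 300))
# 5 (19000, 1500)
# 10 (58000, 3000)
# 20 (196000, 6000)
```

Under these assumed sizes, doubling the steps from 10 to 20 more than triples the input tokens, while output tokens only double.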

Three levers change the bill. Prompt caching bills the repeated prefix (your system prompt and the stable head of the transcript) at a fraction of the input rate, which is why the calculator's breakdown flags when most of your spend is re-sent context. The retry buffer inflates the total to cover failed steps and fallback hops. The runaway figure recomputes a session at your maximum step cap instead of the average, showing the worst case a stuck agent can reach before a kill-switch intervenes. Together they turn a per-call price into a defensible per-session and monthly cost for agent FinOps planning.
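The adjustments described above could be applied roughly as follows. This is a sketch under stated assumptions rather than the calculator's real code: the cache hit rate is applied to all input tokens, cached input is billed at 10% of the input rate, the retry buffer is a flat multiplier, and the rates of $3 and $15 per 1M tokens are hypothetical placeholders, not any provider's price list.

```python
def session_cost(input_tokens, output_tokens, input_rate, output_rate,
                 cache_hit_rate=0.0, cached_price_factor=0.10, retry_rate=0.0):
    """Cost in USD of one session; rates are USD per 1M tokens.

    Assumptions: the hit rate applies to every input token, cached input
    costs `cached_price_factor` of the normal rate, and retries scale the
    whole session cost by (1 + retry_rate).
    """
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    input_cost = (uncached + cached * cached_price_factor) * input_rate / 1_000_000
    output_cost = output_tokens * output_rate / 1_000_000
    return (input_cost + output_cost) * (1 + retry_rate)


def monthly_cost(per_session, sessions_per_month, fixed_overhead=0.0):
    """Variable session spend plus fixed monthly overhead, in USD."""
    return per_session * sessions_per_month + fixed_overhead


# Hypothetical inputs: 58,000 input and 3,000 output tokens per session,
# 50% cache hit rate, 10% retry buffer, 10,000 sessions, $500 overhead.
per_session = session_cost(58_000, 3_000, 3.0, 15.0,
                           cache_hit_rate=0.5, retry_rate=0.10)
monthly = monthly_cost(per_session, 10_000, fixed_overhead=500.0)
print(f"per session: ${per_session:.4f}")   # per session: $0.1548
print(f"monthly:     ${monthly:.2f}")       # monthly:     $2047.70
print(f"annual:      ${monthly * 12:.2f}")  # annual:      $24572.40
```

Passing the maximum step cap's token totals instead of the average session's gives the runaway figure with the same function.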

Frequently Asked Questions (FAQ)

Why do AI agents cost so much more than a single chat call?

An agent completes one task over many loop steps, and each step resends the system prompt, tool schema and the entire growing history. Because context accumulates, total tokens scale roughly with the square of the number of steps, not linearly.
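The "roughly with the square" claim can be made explicit with a short derivation. The symbols are introduced here for illustration and assume the history grows by one output and one observation per step: S is the system prompt plus tool schema, U the user request, O the output per step, T the tool observation per step, and n the number of steps. Total input tokens I(n) are then:

```latex
I(n) = \sum_{k=1}^{n} \bigl[ S + U + (k-1)(O + T) \bigr]
     = n\,(S + U) + \frac{n(n-1)}{2}\,(O + T)
```

The first term is linear in n and the second is quadratic. The quadratic term becomes the larger of the two once n - 1 > 2(S + U)/(O + T), which is why short sessions with a large system prompt look almost linear while long sessions do not.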

What is context accumulation in an agent loop?

Every reasoning step appends the model's output and the tool result to the conversation, and that longer history is resent as input on the next step. Late steps therefore carry far more input tokens than early ones, which is the main hidden cost driver.

How do I estimate tokens per agent step?

Add the fixed system prompt and tool schema, the user request, plus the running history of prior outputs and observations. Measure a few real sessions with your provider's tokenizer, then enter average system, observation and output token sizes into the calculator.

How much can prompt caching save for agents?

Caching is most valuable for agents because the system prompt and conversation prefix repeat on every step. Billing that repeated context at roughly ten percent of the input rate can cut a large share of total cost, especially for long sessions with big tool schemas.
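As a rough illustration of that saving, the sketch below assumes cached input is billed at 10% of the input rate and that the hit rate applies uniformly to input tokens. Both are simplifications, and the function name is hypothetical.

```python
def input_cost_multiplier(cache_hit_rate, cached_price_factor=0.10):
    """Fraction of the uncached input bill that remains after caching."""
    return (1 - cache_hit_rate) + cache_hit_rate * cached_price_factor


for hit in (0.0, 0.5, 0.8, 0.95):
    saved = 1 - input_cost_multiplier(hit)
    print(f"hit rate {hit:.0%}: input cost reduced by {saved:.1%}")
```

Output tokens are unaffected by caching, so the saving on the total bill is smaller than the saving on input alone.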

What is a runaway loop and why does it matter for cost?

A runaway loop is an agent that keeps iterating without converging until it hits its maximum step limit. Because cost grows roughly quadratically with steps, a session that runs to the cap can cost many times the average, which is why step limits and kill-switches protect your budget.
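To put a number on "many times the average", the sketch below compares input tokens at the step cap with input tokens at the average step count. The function names and all values are illustrative assumptions, and output tokens are ignored for simplicity.

```python
def session_input_tokens(steps, fixed_tokens, growth_tokens):
    """Input tokens for a session, assuming every step resends `fixed_tokens`
    plus `growth_tokens` for each earlier step."""
    return steps * fixed_tokens + growth_tokens * steps * (steps - 1) // 2


def runaway_multiple(avg_steps, max_steps, fixed_tokens, growth_tokens):
    """How many times more input a capped-out session uses than an average one."""
    worst = session_input_tokens(max_steps, fixed_tokens, growth_tokens)
    typical = session_input_tokens(avg_steps, fixed_tokens, growth_tokens)
    return worst / typical


# Illustrative values: 8 average steps, a cap of 40, 2,200 fixed tokens
# (system prompt, tool schema and request) and 800 tokens added per step.
multiple = runaway_multiple(8, 40, 2200, 800)
print(f"{multiple:.1f}x")  # 17.8x
```

With these assumed sizes, five times the steps costs nearly eighteen times the input tokens, so lowering the cap has an outsized effect on the worst case.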

How do I cap an agent's cost per session?

Set a hard maximum on loop steps and on total tokens, add a budget kill-switch that stops a session when it exceeds a cost threshold, and trim or summarise history so context stops growing without bound. The runaway figure here shows your worst case when the step cap is the only guard in place.
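One way these guards might be wired into an agent loop is sketched below. `SessionBudget` and `BudgetExceeded` are hypothetical names rather than part of any framework, and the per-step token and cost figures stand in for whatever your provider reports after each call.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a session has reached one of its hard limits."""


@dataclass
class SessionBudget:
    max_steps: int
    max_tokens: int
    max_cost_usd: float
    steps: int = 0
    tokens: int = 0
    cost_usd: float = 0.0

    def check(self) -> None:
        """Call before each step; raises once any limit has been reached."""
        if self.steps >= self.max_steps:
            raise BudgetExceeded(f"step limit of {self.max_steps} reached")
        if self.tokens >= self.max_tokens:
            raise BudgetExceeded(f"token limit of {self.max_tokens} reached")
        if self.cost_usd >= self.max_cost_usd:
            raise BudgetExceeded(f"cost limit of ${self.max_cost_usd:.2f} reached")

    def record_step(self, tokens_used: int, cost_usd: float) -> None:
        """Add one completed step's usage to the running totals."""
        self.steps += 1
        self.tokens += tokens_used
        self.cost_usd += cost_usd


budget = SessionBudget(max_steps=25, max_tokens=200_000, max_cost_usd=0.50)
reason = ""
try:
    while True:  # stand-in for an agent that never converges
        budget.check()
        budget.record_step(tokens_used=12_000, cost_usd=0.04)
except BudgetExceeded as stop:
    reason = str(stop)

print(f"stopped after {budget.steps} steps: {reason}")
```

In this simulated run the cost threshold trips first, after 13 steps, well before the 25-step cap. Checking before each step rather than after means a session that finishes exactly on its last allowed step is not reported as a failure.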

Do tool calls add to agent cost?

Yes. Tool results are fed back into the model as observation tokens and then persist in the history for every later step. Large tool outputs, such as full API responses or documents, inflate accumulation quickly, so summarising observations before reinsertion saves money.
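A crude stand-in for summarisation is to truncate each tool result before it enters the history. The sketch below is only an illustration: `trim_observation` is a hypothetical helper, and the four-characters-per-token ratio is a rough rule of thumb, not a real tokenizer count.

```python
def trim_observation(text: str, max_tokens: int, chars_per_token: int = 4) -> str:
    """Keep roughly the first `max_tokens` of a tool result.

    Token count is approximated from character length, so the real token
    count may differ; use your provider's tokenizer for accurate limits.
    """
    limit = max_tokens * chars_per_token
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    # Tell the model that content was cut so it can re-query if needed.
    return text[:limit] + f"\n[truncated {omitted} characters]"


raw = "x" * 10_000                      # e.g. a full API response
trimmed = trim_observation(raw, max_tokens=500)
print(len(raw), "->", len(trimmed), "characters")
```

If each observation is resent on every later step, trimming d tokens from every observation removes d × n(n-1)/2 input tokens over an n-step session, so the saving grows with session length.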

Should I use a cheaper model for agent steps?

Mixing models often helps: a smaller, cheaper model can handle routine steps while a frontier model handles hard reasoning. The trade-off is that weaker models may take more steps or retries, so compare total session cost here rather than the per-token rate alone.
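The comparison can be sketched as follows. Both models, their prices, step counts and retry rates are hypothetical values chosen for illustration, and the token model assumes each step resends a fixed prompt plus all earlier outputs and observations.

```python
def session_cost(steps, input_rate, output_rate, retry_rate,
                 fixed_tokens=2200, observation_tokens=500, output_tokens=300):
    """USD per session; rates are USD per 1M tokens. Token sizes are illustrative."""
    growth = observation_tokens + output_tokens
    input_total = steps * fixed_tokens + growth * steps * (steps - 1) // 2
    output_total = steps * output_tokens
    base = (input_total * input_rate + output_total * output_rate) / 1_000_000
    return base * (1 + retry_rate)


# Hypothetical models: the cheap one needs more steps and more retries.
small = session_cost(steps=14, input_rate=0.50, output_rate=2.00, retry_rate=0.25)
frontier = session_cost(steps=8, input_rate=3.00, output_rate=15.00, retry_rate=0.05)
print(f"small: ${small:.4f}  frontier: ${frontier:.4f}  ratio: {frontier / small:.1f}x")
```

In this made-up case the smaller model's input rate is six times cheaper, but the session works out only about 2.2 times cheaper once its extra steps and retries are counted. With a larger gap in step counts the ordering could flip.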

How does this calculator model agent cost?

It treats each step's input as the fixed prompt plus all accumulated history, sums input across every step, adds output per step, then applies caching, a retry buffer and fixed overhead. The result is a realistic per-session and monthly cost rather than a single-call estimate.

Does this calculator send my data anywhere?

No. All calculation runs in your browser and your inputs are saved only in local storage so the tool remembers them next time. Nothing is transmitted to a server, and the reset button clears everything instantly.

Sanjay Saini

Product leader and Agile coach at AgileWoW, writing on agentic AI, LLM cost engineering and developer productivity for AI Dev Day India.