Ollama Cloud’s New “Extra Usage” Option: The End of the Weekly Lockout
Ollama Cloud introduces "Extra Usage" credits, allowing Pro and Max users to bypass weekly limits. Learn how this new pay-as-you-go option brings more flexibility—and a little more transparency—to the platform.
Ollama has quietly added a feature that many of us in the dev community have been screaming for: a way to keep working when you hit the ceiling.
For the uninitiated, Ollama Cloud is a different beast than the local CLI tool we all use to run Llama 3 or Mistral on our own rigs. It’s a managed service that gives you access to "frontier" models (the heavy hitters that usually require a cluster of A100s to run) via a subscription. While the local tool is free and unlimited, the Cloud version has always been governed by a somewhat mysterious usage metric.
The new Extra Usage purchase option finally offers a "pay-as-you-go" escape hatch for those who burn through their weekly limits.
The Mystery of "Compute Time"
If you’ve used the $20/month Pro plan, you’ve likely seen the usage bar in your settings. It doesn't track tokens or requests. Instead, it shows a percentage of your weekly allowance, measured in compute time.
This has always been a point of friction. Because Ollama doesn't explicitly state how many seconds of GPU time a specific prompt consumes, it’s hard to budget your workload. You might run a complex RAG pipeline and see your weekly limit vanish, or you might chat for days and barely move the needle.
Users on Reddit and Discord have frequently complained about this "black box" approach. It’s a stark contrast to the transparent per-token pricing we see from OpenAI or Anthropic.
How Extra Usage Changes the Math
The introduction of the $5 credit top-up is a significant shift. Based on early reports from users who have already poked at the feature, adding these credits transitions your account into a hybrid model.
- Primary Limit: Your $20 Pro (or $100 Max) subscription covers your initial "generous" weekly usage.
- Secondary Credit: Once that is exhausted, the system pulls from your purchased balance.
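To make the two-tier order concrete, here is a toy model of it in Python. Nothing in it comes from Ollama: the Account class, the bill function, and every number except the $5 top-up are invented for illustration, and it assumes a request is charged to the plan while any allowance remains and to credits only once the bar reads 100%.

```python
from dataclasses import dataclass


@dataclass
class Account:
    weekly_used_pct: float = 0.0  # share of the weekly plan allowance consumed
    credit_cents: int = 0         # purchased Extra Usage balance, in cents


def bill(account: Account, plan_cost_pct: float, credit_cost_cents: int) -> str:
    """Charge one request and return which bucket paid for it.

    Assumption (not confirmed by Ollama): the plan pays while any
    allowance remains; purchased credits pay only after it hits 100%.
    """
    if account.weekly_used_pct < 100.0:
        account.weekly_used_pct = min(100.0, account.weekly_used_pct + plan_cost_pct)
        return "plan"
    if account.credit_cents >= credit_cost_cents:
        account.credit_cents -= credit_cost_cents
        return "credits"
    return "locked_out"


# Hypothetical account: 99% of the week used, one $5 top-up purchased.
account = Account(weekly_used_pct=99.0, credit_cents=500)
outcomes = [bill(account, plan_cost_pct=2.0, credit_cost_cents=3) for _ in range(3)]
print(outcomes)
print(account)
```

The first request finishes off the plan allowance, and the next two come out of the purchased balance. How Ollama actually handles a request that straddles the boundary is not something the early reports cover.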
The interesting part? This is the first time we can actually see a dollar-to-usage correlation. Early testers report that the $5 credit is drawn down per request, much as a standard metered API would bill. If you monitor your balance while running a specific model, you can finally start to reverse-engineer what Ollama is charging for these frontier models.
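If you want to try that reverse-engineering yourself, the bookkeeping is simple. The sketch below assumes you jot down your credit balance by hand after each request; the readings are made-up numbers, not real measurements, and the cost_per_request helper is hypothetical rather than anything Ollama provides.

```python
from statistics import mean


def cost_per_request(balance_cents: list[int]) -> list[int]:
    """Return the deduction between consecutive balance readings, in cents."""
    return [before - after for before, after in zip(balance_cents, balance_cents[1:])]


# Hypothetical balance readings (in cents), noted by hand after each request.
readings = [500, 497, 492, 489]
deductions = cost_per_request(readings)

print(deductions)        # cents deducted per request
print(mean(deductions))  # average cents per request for this model
```

Keep one list of readings per model, since the whole point is to compare what different models cost you.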
The Good, The Bad, and The Obscure
The "Good" here is obvious: no more lockouts. If you’re in the middle of a sprint and hit 100% usage on a Thursday, you aren't forced to pivot your entire stack to a different provider until Sunday. You can drop $5 and keep the momentum. It keeps you firmly within the Ollama ecosystem.
The "Bad" is that the transparency still isn't where it needs to be. While we can now see $0.03 being deducted for a code review, Ollama still doesn't provide a dashboard showing "Price per 1k Tokens." We are essentially paying for a service where the bill is itemized in a language we don't quite speak yet.
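Until that dashboard exists, you can approximate the missing number yourself. A rough sketch: divide a deduction by the tokens the request used. The $0.03 figure is the one above; the 6,000-token count is a hypothetical value you would have to measure on your own, and implied_price_per_1k is an illustrative helper, not an Ollama API.

```python
def implied_price_per_1k(deduction_usd: float, total_tokens: int) -> float:
    """Convert one observed deduction into an implied price per 1,000 tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return deduction_usd / total_tokens * 1000


# $0.03 deducted for a code review; 6,000 total tokens is an assumed figure.
price = implied_price_per_1k(0.03, 6000)
print(f"${price:.4f} per 1k tokens")  # prints "$0.0050 per 1k tokens"
```

Treat the result as a ballpark. Because usage is metered in compute time rather than tokens, the implied rate will likely shift from model to model and from prompt to prompt.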
If you’re a Pro or Max subscriber, this is a massive quality-of-life win. Just don't expect a clear breakdown of your ROI quite yet.