TokenSurf routes your GPT, Claude, and Gemini calls to cheaper models when the query is simple. No new SDK. No lock-in.
Same OpenAI SDK. Works with GPT, Claude, and Gemini. Just change the URL.
```python
from openai import OpenAI

client = OpenAI(
    api_key="ts_your_tokensurf_key",
    base_url="https://api.tokensurf.io/v1"  # That's it
)

# OpenAI — gpt-4o routed to gpt-4o-mini (94% savings)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}]
)

# Anthropic — claude-opus-4 routed to claude-haiku-3.5 (95% savings)
response = client.chat.completions.create(
    model="claude-opus-4",
    messages=[{"role": "user", "content": "Translate hello to French"}]
)

# Google — gemini-2.5-pro routed to gemini-2.5-flash (76% savings)
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Define photosynthesis"}]
)
```
One proxy between you and the providers. Change one URL.
Step 1
Point your SDK at TokenSurf. Same code, same models. One line change.
Step 2
"What is 2+2?" goes to a cheap model. "Write me a React app" keeps yours.
Step 3
Save 50-99% on simple calls. Same quality. Your keys, your providers.
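The routing itself happens server-side, but the idea behind Step 2 can be sketched in a few lines. This is an illustrative toy heuristic only: `CHEAP_MODEL`, `pick_model`, the word-count threshold, and the keyword list are all made up for the example and are not TokenSurf's actual classifier.

```python
# Toy sketch of query-complexity routing. The real router is server-side;
# every name and threshold here is hypothetical.

CHEAP_MODEL = {
    "gpt-4o": "gpt-4o-mini",
    "claude-opus-4": "claude-haiku-3.5",
    "gemini-2.5-pro": "gemini-2.5-flash",
}

def pick_model(requested: str, prompt: str) -> str:
    """Route short, simple prompts to a cheaper sibling model."""
    complex_markers = ("write", "build", "refactor", "debug", "design")
    is_simple = len(prompt.split()) < 20 and not any(
        m in prompt.lower() for m in complex_markers
    )
    if is_simple and requested in CHEAP_MODEL:
        return CHEAP_MODEL[requested]
    return requested  # complex queries keep the model you asked for

print(pick_model("gpt-4o", "What is 2+2?"))          # gpt-4o-mini
print(pick_model("gpt-4o", "Write me a React app"))  # gpt-4o
```

The point is the contract, not the heuristic: a simple query is downgraded, anything else passes through to exactly the model you requested.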
Works with OpenAI, Anthropic, Google, and 300+ models via OpenRouter. Read the full architecture →
When a query is simple, we route it to a cheaper model. Here's exactly what you pay and save.
| You Request | Cost per 1M tokens | | We Route To | Cost per 1M tokens | You Save |
|---|---|---|---|---|---|
| gpt-4 | $30 / $60 | → | gpt-4o-mini | $0.15 / $0.60 | 99% |
| gpt-4-turbo | $5 / $15 | → | gpt-4o-mini | $0.15 / $0.60 | 97% |
| gpt-4o | $2.50 / $10 | → | gpt-4o-mini | $0.15 / $0.60 | 94% |
| claude-opus-4 | $15 / $75 | → | claude-haiku-3.5 | $0.80 / $4 | 95% |
| claude-sonnet-4 | $3 / $15 | → | claude-haiku-3.5 | $0.80 / $4 | 73% |
| gemini-2.5-pro | $1.25 / $10 | → | gemini-2.5-flash | $0.30 / $2.50 | 76% |
Prices shown as input / output per 1M tokens. Already on a cheap model? We pass through unchanged. 300+ models available via OpenRouter.
Every feature you need to run LLMs at scale. All included on every plan.
All features. Every plan. No gating.
The only difference between plans is volume and price per request.
Bring your own API keys. $0.001 per request. Commit to volume, pay less.
- 1,000 requests/month
- Top up anytime; credits never expire
- $0.0008/req (save 20%)
- $0.0006/req (save 40%)
Need 50M+ requests/month?
Contact for Enterprise Pricing
Enterprise: volume pricing from $0.0004/req · 99.9% SLA · dedicated account manager · annual contracts
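To see what the per-request rates above mean in practice, here is the monthly bill for 1M requests at each listed rate. The tier labels are illustrative; only the rates come from the pricing above.

```python
# Monthly bill for 1M requests at each listed per-request rate.
# Tier names are illustrative labels, not official plan names.

RATES = {"pay-as-you-go": 0.001, "20% tier": 0.0008, "40% tier": 0.0006}

for tier, rate in RATES.items():
    print(f"{tier}: 1M requests = ${rate * 1_000_000:,.0f}/month")
```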
Sign up in 30 seconds. Get 1,000 free requests. No credit card required.
```python
from openai import OpenAI

client = OpenAI(
    api_key="ts_your_key",
    base_url="https://api.tokensurf.io/v1"
)
# That's it. You're saving money.
```