OpenAI · Anthropic · Google · OpenRouter · 300+ models

One line change.
Half the cost.

TokenSurf routes your ChatGPT, Claude, and Gemini calls to cheaper models when the query is simple. No SDK. No lock-in.

70+ Models
4 Providers
3 Regions
40-94% Savings

Integration in 1 Line

Same OpenAI SDK. Works with GPT, Claude, and Gemini. Just change the URL.

from openai import OpenAI

client = OpenAI(
    api_key="ts_your_tokensurf_key",
    base_url="https://api.tokensurf.io/v1",  # That's it
)

# OpenAI — gpt-4o routed to gpt-4o-mini (94% savings)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}]
)

# Anthropic — claude-opus-4 routed to claude-haiku-3.5 (95% savings)
response = client.chat.completions.create(
    model="claude-opus-4",
    messages=[{"role": "user", "content": "Translate hello to French"}]
)

# Google — gemini-2.5-pro routed to gemini-2.5-flash (76% savings)
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Define photosynthesis"}]
)

How It Works

One proxy between you and the providers. Change one URL.

Step 1

Swap one URL

Point your SDK at TokenSurf. Same code, same models. One line change.

Step 2

We classify & route

"What is 2+2?" goes to a cheap model. "Write me a React app" keeps yours.

Step 3

Cut your bill in half

Save 50-99% on simple calls. Same quality. Your keys, your providers.

Works with OpenAI, Anthropic, Google, and 300+ models via OpenRouter. Read the full architecture →
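
The three steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not TokenSurf's actual classifier: the `is_simple` heuristic (word count plus a keyword check), the `COMPLEX_HINTS` list, and the `route` function are all hypothetical names invented for the example. Only the model pairs come from this page.

```python
# Hypothetical routing sketch. The real classifier is AI-based and is not shown here.
CHEAPER_MODEL = {
    "gpt-4o": "gpt-4o-mini",
    "claude-opus-4": "claude-haiku-3.5",
    "gemini-2.5-pro": "gemini-2.5-flash",
}

# Assumed keywords that suggest a request needs the original model.
COMPLEX_HINTS = ("write", "build", "refactor", "debug", "implement")


def is_simple(prompt: str) -> bool:
    """Toy stand-in for the classifier: short prompts without build-style verbs."""
    words = [w.strip(".,!?") for w in prompt.lower().split()]
    return len(words) <= 12 and not any(w in COMPLEX_HINTS for w in words)


def route(model: str, prompt: str) -> str:
    """Return the model that should actually serve the request."""
    if is_simple(prompt):
        # Models with no cheaper sibling pass through unchanged.
        return CHEAPER_MODEL.get(model, model)
    return model
```

With this heuristic, `route("gpt-4o", "What is 2+2?")` returns the cheaper model, while `route("gpt-4o", "Write me a React app")` keeps the requested one.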

Real Pricing, Real Savings

When a query is simple, we route it to a cheaper model. Here's exactly what you pay and save.

You Request      Cost per 1M tokens   We Route To        Cost per 1M tokens   You Save
gpt-4            $30 / $60            gpt-4o-mini        $0.15 / $0.60        99%
gpt-4-turbo      $5 / $15             gpt-4o-mini        $0.15 / $0.60        97%
gpt-4o           $2.50 / $10          gpt-4o-mini        $0.15 / $0.60        94%
claude-opus-4    $15 / $75            claude-haiku-3.5   $0.80 / $4           95%
claude-sonnet-4  $3 / $15             claude-haiku-3.5   $0.80 / $4           73%
gemini-2.5-pro   $1.25 / $10          gemini-2.5-flash   $0.30 / $2.50        76%

Prices shown as input / output per 1M tokens. Already on a cheap model? We pass through unchanged. 300+ models available via OpenRouter.
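
The "You Save" figures follow from simple arithmetic on the listed prices. The sketch below reproduces that calculation for two rows of the table. The function names are hypothetical, and the prices are copied from this page rather than fetched from any provider.

```python
def savings_pct(original: float, routed: float) -> float:
    """Percentage saved when a price drops from `original` to `routed`."""
    return (1 - routed / original) * 100


# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-sonnet-4": (3.00, 15.00),
    "claude-haiku-3.5": (0.80, 4.00),
}


def route_savings(requested: str, routed: str) -> tuple[float, float]:
    """Return (input savings %, output savings %) for a routed request."""
    req_in, req_out = PRICES[requested]
    new_in, new_out = PRICES[routed]
    return savings_pct(req_in, new_in), savings_pct(req_out, new_out)


in_pct, out_pct = route_savings("gpt-4o", "gpt-4o-mini")
print(f"gpt-4o -> gpt-4o-mini: {in_pct:.0f}% input, {out_pct:.0f}% output")
```

Actual savings on a bill depend on the input/output token mix and on how many requests are classified as simple.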

Built for Production

Every feature you need to run LLMs at scale. All included on every plan.

Routing
Smart Cost Routing
AI classifies every request as simple or complex. Simple queries go to cheaper models automatically. Complex queries stay on your original model. You save 40-94% without losing quality.
📋
Content-Based Rules
Define regex patterns that override the classifier. "If the prompt contains a code block, never downgrade." "If the system prompt says translate, always downgrade." Your domain knowledge, our routing engine.
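
As a rough sketch of how regex overrides like the two quoted rules might be evaluated: the `(pattern, field, action)` tuple format and the `rule_override` function are assumptions made for illustration, not TokenSurf's actual rule schema.

```python
import re
from typing import Optional

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable

# Hypothetical rule format: (compiled pattern, field to search, action).
RULES = [
    (re.compile(re.escape(FENCE)), "prompt", "never_downgrade"),
    (re.compile(r"\btranslate\b", re.IGNORECASE), "system", "always_downgrade"),
]


def rule_override(system: str, prompt: str) -> Optional[str]:
    """Return the first matching rule's action, or None to defer to the classifier."""
    fields = {"system": system, "prompt": prompt}
    for pattern, field, action in RULES:
        if pattern.search(fields[field]):
            return action
    return None
```

Rules are checked in order, so the first match wins. A `None` result means no rule applied and the classifier decides.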
Priority & Latency Routing
Tag requests as high-priority (never downgrade) or low-priority (always downgrade). Set max latency targets — if a provider is slow, auto-switch to a faster model.
🧠
Custom Classifier
Replace our default AI classifier with your own prompt. Define what "simple" means for your business. A home improvement chatbot's "simple" is different from a coding assistant's.
Caching & Performance
Semantic Response Cache
Identical requests return cached responses instantly. Zero API call, zero token cost, zero credits consumed. Your chatbot answers "What are your hours?" 500 times a day — you pay once.
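
A minimal sketch of the exact-match case described above, assuming an in-memory dictionary and a caller-supplied `call_llm` function (both hypothetical). A semantic cache would also match near-duplicate prompts, for example by embedding similarity, which this sketch does not attempt.

```python
import hashlib
import json
from typing import Callable

_cache: dict[str, str] = {}


def cache_key(model: str, messages: list[dict]) -> str:
    """Hash the model and messages into a stable key (sorted keys for determinism)."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_completion(
    model: str,
    messages: list[dict],
    call_llm: Callable[[str, list[dict]], str],
) -> str:
    """Return a cached response for identical requests; call the provider only on a miss."""
    key = cache_key(model, messages)
    if key not in _cache:
        _cache[key] = call_llm(model, messages)
    return _cache[key]
```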
📈
Context Window Management
Long conversations that exceed model limits get auto-trimmed. System prompt preserved, oldest messages dropped, newest kept. No more 400 errors on long chats.
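
The trimming policy described above (keep the system prompt, drop the oldest messages first) could look roughly like this. The `trim_messages` name and the four-characters-per-token estimate are assumptions for the example. A real implementation would use the target model's tokenizer.

```python
def trim_messages(messages: list[dict], max_tokens: int) -> list[dict]:
    """Keep system messages and the newest turns that fit within max_tokens."""

    def cost(msg: dict) -> int:
        # Crude estimate (assumption): about 4 characters per token.
        return max(1, len(msg["content"]) // 4)

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(cost(m) for m in system)
    kept: list[dict] = []
    # Walk from newest to oldest, stopping at the first message that no longer fits.
    for msg in reversed(rest):
        if cost(msg) > budget:
            break
        kept.append(msg)
        budget -= cost(msg)
    return system + kept[::-1]
```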
Reliability
🛡
Cross-Provider Fallbacks
If OpenAI is down, your request automatically goes to Claude or Gemini. Circuit breakers detect outages, retry logic handles transient errors, and fallback chains keep your app running.
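
A simplified sketch of a fallback chain guarded by circuit breakers. The `CircuitBreaker` class, its thresholds, and `call_with_fallback` are hypothetical and only illustrate the general pattern. Retry logic for transient errors is omitted.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow a retry after `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = 0.0

    def available(self) -> bool:
        if self.failures < self.threshold:
            return True
        return time.monotonic() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()


def call_with_fallback(
    chain: list[tuple[str, Callable[[], str]]],
    breakers: dict[str, CircuitBreaker],
) -> tuple[str, str]:
    """Try each (provider, call) in order, skipping providers whose breaker is open."""
    last_error: Optional[Exception] = None
    for name, call in chain:
        breaker = breakers.setdefault(name, CircuitBreaker())
        if not breaker.available():
            continue
        try:
            result = call()
        except Exception as exc:  # a real proxy would catch provider-specific errors
            breaker.record(False)
            last_error = exc
            continue
        breaker.record(True)
        return name, result
    raise RuntimeError("all providers failed or are unavailable") from last_error
```

Once a provider's breaker opens, later requests skip it immediately instead of waiting on another failed call, until the cooldown has passed.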
🌐
Multi-Region Deployment
Deployed in US, EU, and Asia. Requests are served from the nearest region. Redis caching, connection pooling, and rate limiting built in.
Observability
📊
Analytics Dashboard
See every request: which model was used, was it downgraded, how much you saved. Cost breakdown by model and by feature tag. "Your checkout flow costs $12K/month — 60% is one bad prompt."
🔔
Alerts & Quality Scoring
Get warned when credits run low, error rates spike, or daily spend exceeds budget. 5% of responses are auto-scored for quality so you can verify downgraded models still meet your bar.
📨
Webhooks
POST every routing decision to your endpoint. Feed into Slack, PagerDuty, Datadog, or your own analytics. Know exactly what's happening in real time.
Security
🔒
PII Redaction
Auto-detect and strip emails, Social Security numbers, credit card numbers, phone numbers, and IP addresses from prompts before they reach the LLM provider. One toggle. Enterprise compliance checkbox done.
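
To illustrate the idea, here is a regex-based redaction sketch covering the five categories listed above. The patterns and the `redact` function are illustrative assumptions, not TokenSurf's detectors. Production-grade detection needs more validation, such as Luhn checks for card numbers and locale-aware phone formats.

```python
import re

# Illustrative patterns only (assumption). Order matters: more specific formats run first.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("CARD", re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")),
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("PHONE", re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b")),
]


def redact(text: str) -> str:
    """Replace each detected value with a [LABEL] placeholder."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```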
🔑
Bring Your Own Keys
Your API keys are encrypted with AES-256-GCM at rest. We never store them in plaintext. Key rotation with 24-hour grace period. Audit logging on every security event. Your keys, your control.
Platform
👥
Teams & Organizations
Create teams, invite members with roles (owner/admin/member), issue labeled API keys with per-key budget caps and rate limits. One bill, full control over who uses what.
📝
Prompt Template Library
Store system prompts server-side and reference them by ID. Change how your AI responds without redeploying your app. Version and manage prompts from the dashboard.
API Playground
Test any model from the dashboard. Send a prompt, see the response, the routing decision, cache status, and cost breakdown side by side. No code required.

All features. Every plan. No gating.

The only difference between plans is volume and price per request.

Get Started Free · See the Roadmap

Simple, Predictable Pricing

Bring your own API keys. $0.001 per request. Commit to volume, pay less.

Free

$0

1,000 requests/month

  • No credit card required
  • All providers (BYOK)
  • Basic smart routing
  • Dashboard analytics
Get Started

Pay As You Go

$0.001/req

Top up anytime, credits never expire

  • AI-powered routing
  • All providers (BYOK)
  • No commitment required
  • Full analytics dashboard
  • $10 minimum top-up
Buy Credits

Scale

$3,000/mo

$0.0006/req · Save 40%

  • 5M requests/month included
  • Dedicated support
  • Unlimited team keys
  • Custom routing rules
  • Quality scoring
  • Overage at $0.0007/req
Start Scale Plan

Need 50M+ requests/month?

Contact for Enterprise Pricing

Enterprise: volume pricing from $0.0004/req · 99.9% SLA · dedicated account manager · annual contracts
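
To compare Pay As You Go with the Scale plan at a given monthly volume, the listed rates can be plugged into a short calculation. The function names are hypothetical and the rates are copied from this page. The sketch covers TokenSurf request fees only and ignores the free tier and provider token costs, which are billed on your own keys.

```python
PAYG_RATE = 0.001            # USD per request, Pay As You Go
SCALE_FEE = 3000.0           # USD per month, Scale plan
SCALE_INCLUDED = 5_000_000   # requests included in the Scale fee
SCALE_OVERAGE = 0.0007       # USD per request beyond the included volume


def payg_cost(requests: int) -> float:
    """Monthly fee on Pay As You Go."""
    return requests * PAYG_RATE


def scale_cost(requests: int) -> float:
    """Monthly fee on Scale, including any overage."""
    overage = max(0, requests - SCALE_INCLUDED)
    return SCALE_FEE + overage * SCALE_OVERAGE


def cheaper_plan(requests: int) -> str:
    """Name the cheaper plan for a monthly request volume (ties go to Pay As You Go)."""
    return "Scale" if scale_cost(requests) < payg_cost(requests) else "Pay As You Go"
```

With these rates the break-even point is 3M requests per month ($3,000 / $0.001). Below that, Pay As You Go costs less.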

Ready to Start?

Sign up in 30 seconds. Get 1,000 free requests. No credit card required.

from openai import OpenAI

client = OpenAI(
    api_key="ts_your_key",
    base_url="https://api.tokensurf.io/v1"
)  # That's it. You're saving money.