Terse API
Token optimization for vibe coding projects. Reduce LLM API costs by 30–60% with one call — the same engine powering the Terse Mac app.
Overview
The Terse API lets you integrate the same token optimization engine from the Terse Mac Tauri app directly into your own project. It exposes three core capabilities:
- Optimize — send a prompt string, get back a compressed version with token counts
- Scan — pass source code to detect every LLM API call site and get recommendations
- Projects platform — publish your project to the Terse showcase for community discovery
The optimizer uses the same three-mode pipeline as the Mac app: soft (typo fix + phrase shortening), normal (+ filler/hedging removal, question→imperative), and aggressive (+ abbreviations, markdown strip, telegraph compression).
Base URL: https://www.terseai.org/api/v1
Quickstart
Get your API key, then optimize your first prompt in under 60 seconds:
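A minimal sketch of that first call, assuming Node 18+ (for the built-in fetch) and a key stored in a TERSE_API_KEY environment variable. The optimize helper and its fetchImpl parameter are illustrative names, not part of the API; the endpoint, request fields, and response fields are the ones documented under POST /optimize.

```javascript
// quickstart.js (run with: node quickstart.js)
const TERSE_BASE_URL = 'https://www.terseai.org/api/v1';

// Hypothetical helper: POSTs text to /optimize and returns the parsed JSON.
// fetchImpl is injectable so the function can be exercised without network access.
async function optimize(text, mode = 'normal', fetchImpl = fetch) {
  const res = await fetchImpl(`${TERSE_BASE_URL}/optimize`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TERSE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ text, mode }),
  });
  if (!res.ok) throw new Error(`Terse API returned HTTP ${res.status}`);
  return res.json();
}

// Only contact the live API when a key is configured.
if (process.env.TERSE_API_KEY) {
  optimize('Can you please help me understand how the auth middleware works?')
    .then(r => console.log(`${r.optimized} (${r.tokens_saved} tokens saved)`))
    .catch(err => console.error(err.message));
}
```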
Authentication
All API endpoints require an API key passed as a Bearer token in the Authorization header:
Authorization: Bearer tsk_your_key_here
Alternatively, pass the key in the X-Api-Key header:
X-Api-Key: tsk_your_key_here
Keys follow the format tsk_<28-char random>. They are hashed before storage, so a lost key cannot be recovered: revoke it and create a new one.
Getting a key
Sign in at terseai.org, then call POST /api/v1/keys with your Clerk session token, or use the dashboard on the landing page. Keys are scoped to your account and track usage stats.
Optimization modes
Every optimize call accepts a mode parameter:
| Mode | What it does | Typical reduction |
|---|---|---|
| soft | Typo correction, whitespace compression, safe phrase shortening ("in order to" → "to"), contraction, greeting/thanks removal. Meaning 100% preserved. | 10–25% |
| normal | Everything in soft, plus: politeness removal, hedging removal ("I think", "maybe"), question→imperative ("Can you explain..." → "Explain..."), meta-language removal, redundant modifier collapse, sentence deduplication, numeralize. | 30–50% |
| aggressive | Everything in normal, plus: abbreviations (w/, bc, fn, db...), markdown header/bold strip, article removal in instruction contexts, telegraph compression, low-info sentence dropping. | 50–70% |
Plans & limits
Every account starts on the Free plan — no credit card required. The API plan is separate from the Terse macOS app subscription. Two limits are enforced on every request:
| Plan | Rate limit | Monthly tokens | Price |
|---|---|---|---|
| Free | 60 requests / min | 500,000 tokens | $0 — no card needed |
| Pro | 600 requests / min | 50,000,000 tokens | $29 / mo, no trial |
The monthly token counter resets automatically at the start of each calendar month (UTC). Upgrade any time at terseai.org/#api-pricing or from your dashboard.
When a limit is hit
Both limits return HTTP 429 with a JSON body explaining which limit was exceeded:
// Rate limit (too many requests this minute)
{ "error": "Rate limit exceeded (60 req/min on free plan).",
"upgrade": "https://terseai.org/#api-pricing" }
// Monthly token quota exceeded
{ "error": "Monthly token quota exceeded (500,000 tokens on free plan).",
"tokens_used": 500000, "tokens_limit": 500000,
"upgrade": "https://terseai.org/#api-pricing" }
On a rate-limit 429, back off and retry once the one-minute window has passed. On a quota 429, wait for the monthly reset or upgrade for higher limits. Usage is metered on the original (input) token count of each request and is visible at any time on your dashboard.
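One way to handle both 429 cases in client code is sketched below. The helper name withRetryOn429 and its options are hypothetical. Telling the two cases apart by the presence of tokens_limit is an inference from the sample bodies above, not a documented contract.

```javascript
// Retry a request on HTTP 429, pausing between attempts.
// doRequest: any function returning a fetch-style Response.
// sleep is injectable so the retry logic can be tested without real waiting.
async function withRetryOn429(doRequest, options = {}) {
  const {
    maxRetries = 3,
    waitMs = 60_000, // the rate limit is counted per minute
    sleep = ms => new Promise(resolve => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 0; ; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res;

    const body = await res.json();
    // ASSUMPTION: only the monthly quota error carries tokens_limit.
    // Waiting a minute cannot fix a monthly quota, so give up immediately.
    if ('tokens_limit' in body || attempt >= maxRetries) {
      throw new Error(body.error);
    }
    await sleep(waitMs);
  }
}
```

A fixed one-minute wait is used instead of exponential backoff because the documented window is one minute; adjust waitMs if that turns out to be too conservative for your traffic.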
POST /optimize
Optimize a prompt or text string. Returns the compressed version plus token stats. This is the core endpoint — call it before every LLM API call to reduce costs.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | required | The prompt or text to optimize. Max 50,000 characters. |
| mode | string | optional | One of soft, normal, aggressive. Default: normal. |
Response
{
"original": "Can you please help me understand how the auth middleware works?",
"optimized": "Explain the auth middleware.",
"tokens_original": 16,
"tokens_optimized": 6,
"tokens_saved": 10,
"reduction_pct": 62,
"techniques": ["phrase_optimization", "filler_removal"],
"mode": "normal"
}
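The numeric fields in this response are related by simple arithmetic, which the sketch below recomputes. The function name deriveStats is illustrative. Rounding reduction_pct down is an assumption: it matches the sample (10 / 16 = 62.5%, reported as 62), but the rounding rule is not documented.

```javascript
// Recompute the derived fields of an /optimize response from the two token counts.
// ASSUMPTION: reduction_pct is rounded down, as the sample response suggests.
function deriveStats(tokensOriginal, tokensOptimized) {
  const tokensSaved = tokensOriginal - tokensOptimized;
  const reductionPct = tokensOriginal > 0
    ? Math.floor((tokensSaved / tokensOriginal) * 100)
    : 0; // avoid dividing by zero for empty input
  return { tokens_saved: tokensSaved, reduction_pct: reductionPct };
}

console.log(deriveStats(16, 6)); // { tokens_saved: 10, reduction_pct: 62 }
```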
Example
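The sketch below builds an /optimize request with client-side checks for the documented limits, using the X-Api-Key header and an explicit mode. buildOptimizeRequest is a hypothetical helper name. How the server counts "characters" is not specified, so the length check (JavaScript string length) is only an approximation.

```javascript
const OPTIMIZE_URL = 'https://www.terseai.org/api/v1/optimize';
const OPTIMIZE_MODES = ['soft', 'normal', 'aggressive'];
const MAX_TEXT_CHARS = 50_000;

// Validate the inputs locally, then return the URL and fetch options.
function buildOptimizeRequest(text, mode = 'normal', apiKey = process.env.TERSE_API_KEY) {
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text must be a non-empty string');
  }
  if (text.length > MAX_TEXT_CHARS) {
    throw new RangeError(`text exceeds ${MAX_TEXT_CHARS} characters`);
  }
  if (!OPTIMIZE_MODES.includes(mode)) {
    throw new RangeError(`mode must be one of: ${OPTIMIZE_MODES.join(', ')}`);
  }
  return {
    url: OPTIMIZE_URL,
    init: {
      method: 'POST',
      headers: { 'X-Api-Key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ text, mode }),
    },
  };
}

// Usage (needs a real key, so it is left commented out):
// const { url, init } = buildOptimizeRequest(systemPrompt, 'aggressive');
// const result = await (await fetch(url, init)).json();
// console.log(result.optimized, result.techniques);
```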
POST /scan
Scan source code to detect LLM API call sites. Returns a list of findings with line numbers, types, and specific recommendations for each call site. Use this during development to audit where tokens are being spent.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| code | string | required | Source code to scan. Max 500,000 characters. |
| language | string | optional | One of javascript, typescript, python. Default: javascript. |
Detected call site types
- anthropic_messages: client.messages.create() calls
- openai_chat: client.chat.completions.create() calls
- google_gemini: Gemini API calls
- anthropic_http / openai_http: raw HTTP fetch to API endpoints
- system_prompt / user_prompt: prompt variable assignments
- messages_array: "messages" array construction sites
Response
{
"findings": [
{
"line": 12,
"type": "anthropic_messages",
"preview": "const response = await client.messages.create({",
"estimated_tokens": 85,
"potential_savings_pct": 35,
"recommendation": "Wrap this call with Terse to auto-optimize the 'content' field before it reaches the API."
},
{
"line": 34,
"type": "system_prompt",
"preview": "const system_prompt = `You are a helpful assistant...",
"estimated_tokens": 140,
"potential_savings_pct": 45,
"recommendation": "System prompts are ideal candidates for aggressive mode (removes filler, compresses markdown)."
}
],
"total_findings": 2,
"estimated_monthly_savings_tokens": 78540,
"recommendation": "Found 2 optimization opportunities. Wrapping these calls with Terse could save ~78540 tokens/month."
}
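To decide which findings to address first, you can rank them by estimated tokens saved per call. This is a sketch: rankFindings and the derived estimated_saved_tokens field are illustrative and not returned by the API. The per-call figure is also unrelated to estimated_monthly_savings_tokens, which presumably depends on call volume.

```javascript
// Sort findings by estimated tokens saved per call, largest first.
// estimated_saved_tokens is a locally derived field, not part of the API response.
function rankFindings(findings) {
  return findings
    .map(f => ({
      ...f,
      estimated_saved_tokens: Math.round((f.estimated_tokens * f.potential_savings_pct) / 100),
    }))
    .sort((a, b) => b.estimated_saved_tokens - a.estimated_saved_tokens);
}

// With the two sample findings above:
const ranked = rankFindings([
  { line: 12, type: 'anthropic_messages', estimated_tokens: 85, potential_savings_pct: 35 },
  { line: 34, type: 'system_prompt', estimated_tokens: 140, potential_savings_pct: 45 },
]);
console.log(ranked.map(f => `line ${f.line}: ~${f.estimated_saved_tokens} tokens`));
// [ 'line 34: ~63 tokens', 'line 12: ~30 tokens' ]
```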
Example — scan a whole file
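// Wrapped in an async function: CommonJS modules do not allow top-level await.
const fs = require('fs');

async function main() {
  const code = fs.readFileSync('./src/agent.js', 'utf8');
  const res = await fetch('https://www.terseai.org/api/v1/scan', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TERSE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ code, language: 'javascript' }),
  });
  // Surface auth and rate-limit errors instead of parsing an error body as a result.
  if (!res.ok) throw new Error(`Scan failed with HTTP ${res.status}`);
  const result = await res.json();
  console.log(`Found ${result.total_findings} LLM calls`);
  result.findings.forEach(f => {
    console.log(`  Line ${f.line} [${f.type}]: ${f.recommendation}`);
  });
}

main().catch(err => {
  console.error(err);
  process.exit(1);
});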
POST /keys
Create a new developer API key. Requires a valid Clerk session token (sign in at terseai.org first). The full key is only returned once — store it securely.
Headers
Pass your Clerk session JWT as Authorization: Bearer <clerk_token> (not a tsk_ key — this endpoint uses your account login token).
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| label | string | optional | Human-readable label for this key. Max 60 chars. Default: "Default". |
Response
{
"key": "tsk_AbCdEfGhIjKlMnOpQrStUvWxYz12",
"prefix": "tsk_AbCdEfGh...",
"label": "My vibe project",
"id": "uuid-here",
"created_at": "2026-05-15T12:00:00.000Z"
}
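A sketch of creating a key from a script. createKey is a hypothetical helper. How you obtain the Clerk session JWT depends on your Clerk setup and is outside the scope of this example, so the token is simply passed in.

```javascript
const KEYS_URL = 'https://www.terseai.org/api/v1/keys';

// clerkToken: your Clerk session JWT (not a tsk_ key).
// fetchImpl is injectable so the helper can be tested without network access.
async function createKey(clerkToken, label = 'Default', fetchImpl = fetch) {
  if (label.length > 60) {
    throw new RangeError('label must be at most 60 characters');
  }
  const res = await fetchImpl(KEYS_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${clerkToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ label }),
  });
  if (!res.ok) throw new Error(`Key creation failed with HTTP ${res.status}`);
  const { key, id, prefix } = await res.json();
  // The full key is returned only this once, so pass it straight to the caller to store.
  return { key, id, prefix };
}
```

Store the returned key in a secret manager or environment variable, and avoid logging it.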
GET /keys
List all API keys for your account. Returns metadata only — full key values are never returned after creation.
Response
{
"keys": [
{
"id": "uuid",
"key_prefix": "tsk_AbCdEfGh...",
"label": "My vibe project",
"is_active": 1,
"requests_total": 1247,
"tokens_optimized": 892430,
"last_used_at": "2026-05-15T11:22:00",
"created_at": "2026-05-01T09:00:00"
}
]
}
DELETE /keys/:id
Revoke an API key. Requests made with the revoked key will immediately return 401. Use the key ID from GET /keys.
Response
{ "ok": true }
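Listing and revoking can be combined, for example to retire every active key that carries a given label. This sketch uses a hypothetical helper, revokeKeysByLabel, and assumes that GET /keys and DELETE /keys/:id accept the same Clerk session token as POST /keys, which the text above does not state explicitly.

```javascript
const KEYS_ENDPOINT = 'https://www.terseai.org/api/v1/keys';

// ASSUMPTION: both endpoints authenticate with the Clerk session token.
// Returns the IDs of the keys that were revoked.
async function revokeKeysByLabel(label, token, fetchImpl = fetch) {
  const headers = { 'Authorization': `Bearer ${token}` };

  const listRes = await fetchImpl(KEYS_ENDPOINT, { headers });
  if (!listRes.ok) throw new Error(`Listing keys failed with HTTP ${listRes.status}`);
  const { keys } = await listRes.json();

  const revoked = [];
  for (const k of keys) {
    // Skip keys with another label and keys that are already inactive.
    if (k.label !== label || !k.is_active) continue;
    const delRes = await fetchImpl(`${KEYS_ENDPOINT}/${encodeURIComponent(k.id)}`, {
      method: 'DELETE',
      headers,
    });
    if (delRes.ok) revoked.push(k.id);
  }
  return revoked;
}
```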
GET /projects
List published vibe coding projects from the platform. No auth required. Featured projects appear first, then sorted by upvotes.
Query parameters
| Param | Type | Required | Description |
|---|---|---|---|
| limit | number | optional | Max results (1–100). Default: 50. |
Response
{
"projects": [
{
"id": "uuid",
"name": "ClaudeFlow",
"description": "Multi-agent workflow builder...",
"github_url": "https://github.com/...",
"website_url": "https://claudeflow.dev",
"tags": ["Node.js", "Agents"],
"tokens_saved_monthly": 38000,
"cost_saved_monthly_cents": 570,
"upvotes": 247,
"is_featured": true,
"submitted_at": "2026-05-10T08:00:00"
}
]
}
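A small sketch for querying the showcase. buildProjectsUrl and formatMonthlySavings are illustrative helper names. The limit is clamped client-side to the documented 1–100 range, and cents are converted to dollars for display.

```javascript
const PROJECTS_URL = 'https://www.terseai.org/api/v1/projects';

// Build the list URL; omit limit to get the server default of 50.
function buildProjectsUrl(limit) {
  const url = new URL(PROJECTS_URL);
  if (limit !== undefined) {
    if (!Number.isFinite(limit)) throw new TypeError('limit must be a finite number');
    const clamped = Math.min(100, Math.max(1, Math.trunc(limit)));
    url.searchParams.set('limit', String(clamped));
  }
  return url.toString();
}

// cost_saved_monthly_cents is in cents, so divide by 100 for dollars.
function formatMonthlySavings(project) {
  return `${project.name}: $${(project.cost_saved_monthly_cents / 100).toFixed(2)}/mo`;
}

console.log(buildProjectsUrl(500)); // https://www.terseai.org/api/v1/projects?limit=100
console.log(formatMonthlySavings({ name: 'ClaudeFlow', cost_saved_monthly_cents: 570 }));
// ClaudeFlow: $5.70/mo

// Usage (no auth required):
// const { projects } = await (await fetch(buildProjectsUrl(10))).json();
```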
POST /projects
Submit your vibe coding project to the showcase. Accepts either a Clerk session token or a developer API key (tsk_...). Projects are reviewed before being featured.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | required | Project name. Max 100 chars. |
| description | string | required | What it does and how Terse helps. Max 500 chars. |
| github_url | string | optional | GitHub repository URL. |
| website_url | string | optional | Live project or landing page URL. |
| tags | string[] | optional | Array of tags (max 5, each max 30 chars). e.g. ["Python", "Agents", "RAG"]. |
| tokens_saved_monthly | number | optional | Estimated tokens saved per month using Terse. |
| cost_saved_monthly_cents | number | optional | Estimated cost saved per month in cents. |
Response
{ "ok": true, "id": "uuid" }
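Checking a submission against the field limits in the table before sending it avoids a wasted round trip. The sketch below is a hypothetical client-side validator (validateProject is not part of the API), and the server may apply further rules that are not documented here.

```javascript
// Returns a list of problems; an empty list means the payload fits the documented limits.
function validateProject(p) {
  const errors = [];

  if (typeof p.name !== 'string' || p.name.length === 0) errors.push('name is required');
  else if (p.name.length > 100) errors.push('name exceeds 100 characters');

  if (typeof p.description !== 'string' || p.description.length === 0) errors.push('description is required');
  else if (p.description.length > 500) errors.push('description exceeds 500 characters');

  if (p.tags !== undefined) {
    if (!Array.isArray(p.tags)) {
      errors.push('tags must be an array');
    } else {
      if (p.tags.length > 5) errors.push('at most 5 tags are allowed');
      if (p.tags.some(t => typeof t !== 'string' || t.length > 30)) {
        errors.push('each tag must be a string of at most 30 characters');
      }
    }
  }

  for (const field of ['tokens_saved_monthly', 'cost_saved_monthly_cents']) {
    if (p[field] !== undefined && typeof p[field] !== 'number') {
      errors.push(`${field} must be a number`);
    }
  }
  return errors;
}
```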
Submit via API key (programmatic)
const res = await fetch('https://www.terseai.org/api/v1/projects', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.TERSE_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
name: 'My Vibe Project',
description: 'An AI coding tool that uses Terse to cut API costs by 40%.',
github_url: 'https://github.com/you/myproject',
tags: ['TypeScript', 'Agents'],
tokens_saved_monthly: 50000,
}),
});
POST /projects/:id/upvote
Upvote a project. No auth required. Pass the project ID from GET /projects.
Response
{ "ok": true }
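A minimal sketch of an upvote call. upvoteProject is a hypothetical helper name, and no Authorization header is sent because the endpoint does not require auth.

```javascript
// Upvote a project by ID; resolves to true when the API answers { "ok": true }.
// fetchImpl is injectable so the helper can be tested without network access.
async function upvoteProject(projectId, fetchImpl = fetch) {
  const url = `https://www.terseai.org/api/v1/projects/${encodeURIComponent(projectId)}/upvote`;
  const res = await fetchImpl(url, { method: 'POST' });
  if (!res.ok) throw new Error(`Upvote failed with HTTP ${res.status}`);
  const body = await res.json();
  return body.ok === true;
}
```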
Guide: Wrapper pattern
The simplest integration is a thin wrapper around your LLM client. Optimize every prompt before it's sent:
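A sketch of such a wrapper, assuming Node 18+ and a TERSE_API_KEY environment variable. The names terseOptimize and withTerse are illustrative, and callLLM stands for whatever function you already use to send a prompt to your LLM provider. Falling back to the original prompt when optimization fails is a design choice made for this example, not documented behavior.

```javascript
// Hypothetical default optimizer: calls POST /optimize and returns the parsed JSON.
async function terseOptimize(text, mode = 'normal') {
  const res = await fetch('https://www.terseai.org/api/v1/optimize', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TERSE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ text, mode }),
  });
  if (!res.ok) throw new Error(`Terse API returned HTTP ${res.status}`);
  return res.json();
}

// Wrap any callLLM(prompt, ...rest) function so the prompt is optimized first.
function withTerse(callLLM, optimizeFn = terseOptimize) {
  return async function optimizedCall(prompt, ...rest) {
    let finalPrompt = prompt;
    try {
      const { optimized } = await optimizeFn(prompt);
      if (typeof optimized === 'string' && optimized.length > 0) finalPrompt = optimized;
    } catch (err) {
      // Fail open: if Terse is unreachable or rate limited, send the original prompt.
      console.warn(`Terse optimization skipped: ${err.message}`);
    }
    return callLLM(finalPrompt, ...rest);
  };
}

// Usage, where askModel is your existing LLM call:
// const ask = withTerse(askModel);
// const answer = await ask('Can you please explain how the auth middleware works?');
```

Failing open keeps your application working during a Terse outage at the cost of sending an unoptimized prompt for that call.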
Guide: Scan your project
Use the scan endpoint as a pre-commit hook or CI step to audit every LLM call site in your codebase:
// audit-llm-calls.js (run with: node audit-llm-calls.js)
const fs = require('fs');
const path = require('path');
const { globSync } = require('glob'); // or any file walker

const API_KEY = process.env.TERSE_API_KEY;
// The scan endpoint expects full language names, not file extensions.
const LANGUAGES = { '.js': 'javascript', '.ts': 'typescript' };

// Wrapped in an async function: CommonJS modules do not allow top-level await.
async function main() {
  const files = globSync('./src/**/*.{js,ts}');
  let totalFindings = 0;
  for (const file of files) {
    const code = fs.readFileSync(file, 'utf8');
    const res = await fetch('https://www.terseai.org/api/v1/scan', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ code, language: LANGUAGES[path.extname(file)] }),
    });
    // Report the failure and move on instead of reading findings from an error body.
    if (!res.ok) {
      console.error(`${file}: scan failed with HTTP ${res.status}`);
      continue;
    }
    const result = await res.json();
    if (result.total_findings > 0) {
      console.log(`\n${file}: ${result.total_findings} LLM call(s):`);
      result.findings.forEach(f => {
        console.log(`  Line ${f.line} [${f.type}] ~${f.estimated_tokens} tokens`);
        console.log(`    → ${f.recommendation}`);
      });
      totalFindings += result.total_findings;
    }
  }
  console.log(`\nTotal: ${totalFindings} call sites across ${files.length} files.`);
}

main().catch(err => {
  console.error(err);
  process.exit(1);
});
Guide: Publishing to the platform
Once your project uses the Terse API, submit it to the showcase to get traffic from the Terse developer community. Projects are visible on the landing page and searchable by other developers looking for vibe coding tools.
Submit via the API using your tsk_... key, or use the Submit your project button on the landing page:
// Submitting programmatically
const res = await fetch('https://www.terseai.org/api/v1/projects', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.TERSE_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
name: 'VibeFlow',
description: 'A CLI tool that orchestrates Claude Code agents for full-stack feature development. Integrates Terse API to compress prompts between agent turns, reducing session cost by ~42%.',
github_url: 'https://github.com/yourname/vibeflow',
website_url: 'https://vibeflow.dev',
tags: ['CLI', 'Agents', 'TypeScript'],
tokens_saved_monthly: 65000,
}),
});
Questions? Open an issue on GitHub or reach out to the Terse team by email.