Research

Norvig Spelling Algorithm — How Terse Implements Edit-Distance Spell Correction

Peter Norvig's elegant spelling corrector meets QWERTY proximity weighting and bigram context. Here's how Terse catches the typos that standard spellcheck misses — and prevents the costly agent retry loops they cause.

The Original Norvig Algorithm

In 2007, Peter Norvig — then Director of Research at Google — published "How to Write a Spelling Corrector," a concise essay that demonstrated how a functional spelling corrector could be built in 21 lines of Python. The algorithm became one of the most widely studied pieces of code in natural language processing, not because it was the most accurate spellchecker ever built, but because it elegantly captured the core principles of probabilistic spell correction.

The algorithm works by combining two ideas. First, given a misspelled word, generate all possible corrections within a certain edit distance. Second, rank those corrections by how frequently they appear in a reference corpus. The most probable correction is the one that is both close to the misspelling (low edit distance) and common in real text (high frequency).

Formally, the algorithm computes argmax(P(c) * P(w|c)) for each candidate correction c, where P(c) is the prior probability of the word (its frequency) and P(w|c) is the likelihood that the user intended to type c but produced w instead. Norvig simplified P(w|c) by assuming all edit-distance-1 errors are equally likely and all edit-distance-2 errors are equally likely but less probable than distance-1 errors.

How Edit Distance Works

Edit distance (also called Levenshtein distance) measures the minimum number of single-character operations needed to transform one string into another. The four operations that define the edit space are:

Deletion: Remove one character. fucntion to funtion (delete the first 'c').
Insertion: Add one character. funtion to function (insert 'c').
Substitution: Replace one character with another. requst to request (substitute 'u' with 'ue' — actually this requires two operations, but recust to request is a single substitution).
Transposition: Swap two adjacent characters. teh to the, reqct to rect.

For a word of length n, there are n possible deletions, 26(n+1) insertions, 25n substitutions, and n-1 transpositions at edit distance 1. That produces a large candidate set — roughly 114 candidates for a 5-letter word. At edit distance 2, the set explodes to over 10,000 candidates. Norvig's insight was that checking these candidates against a word frequency dictionary is fast enough to be practical, making the brute-force generation approach viable.

How Terse Enhances Norvig's Approach

Terse uses Norvig's algorithm as its foundation but adds four significant enhancements that make it substantially more accurate for the specific domain of AI prompt correction.

QWERTY Keyboard Proximity Weighting

Norvig's original algorithm treats all substitution errors as equally likely. In practice, they are not. A user is far more likely to type r instead of e (adjacent keys on QWERTY) than r instead of m (opposite sides of the keyboard). Terse assigns a proximity score to each key pair based on their physical distance on the QWERTY layout.

When ranking correction candidates, substitutions involving adjacent keys receive a higher likelihood score. This means that thr is more likely to be corrected to the (r is adjacent to e) than to thr being treated as a valid but rare word. The proximity map covers the full QWERTY layout including the number row, and the weighting is tuned empirically against a corpus of real developer typos.

// Simplified proximity scoring
const QWERTY_NEIGHBORS = {
  'q': ['w','a'],      'w': ['q','e','a','s'],
  'e': ['w','r','s','d'], 'r': ['e','t','d','f'],
  't': ['r','y','f','g'], ...
};

function proximityScore(typed, intended) {
  if (QWERTY_NEIGHBORS[typed]?.includes(intended)) return 0.8;
  return 0.2; // non-adjacent substitution
}

Word Frequency Ranking

Terse maintains a frequency-ranked dictionary of over 1,500 common English words, with additional weighting for words that appear frequently in programming and AI contexts. Words like function, authentication, component, configuration, and implementation receive boosted frequency scores because they appear disproportionately often in developer prompts.

This domain-specific frequency ranking means that when a typo like authetication generates multiple edit-distance-1 candidates, authentication wins decisively because of its high frequency in the programming domain, even if other valid English words are technically closer by some metrics.

nspell/Hunspell Integration

For candidate validation, Terse integrates with nspell (a JavaScript port of Hunspell) to verify that correction candidates are real words. This prevents the algorithm from "correcting" a misspelling to another misspelling that happens to have a high edit-distance score. The Hunspell dictionary provides comprehensive coverage of English (and other languages via system dictionaries), catching edge cases that a frequency-only approach would miss.

Additionally, Terse leverages macOS NSSpellChecker through its Swift helper binary. This gives access to the system's full multilingual spell checking capabilities, handling cases where the user writes prompts that mix English with other languages — a common pattern in international development teams.

Context-Aware Bigram Probabilities

The most significant enhancement is bigram-based context awareness. Norvig's algorithm corrects each word in isolation, which can produce incorrect results when a misspelling is also a valid word. The classic example: "I went to the stroe" — stroe is clearly store, but what about "I red the book"? The word red is valid, but in this context the user meant read.

Terse uses bigram probabilities (the likelihood of two specific words appearing adjacent to each other) to detect and correct these real-word errors. If the bigram "I red" has a much lower probability than "I read", the system flags it as a potential error and offers the correction. This catches a class of errors that purely edit-distance-based systems miss entirely.

Why Standard Spellcheck Fails for AI Prompts

You might wonder why a dedicated spelling correction system is necessary when browsers, IDEs, and operating systems all have built-in spellcheck. The answer is that standard spellcheck has significant blind spots in the contexts where AI prompts are actually written.

Browser Spellcheck Only Works in Browsers

Chrome and Safari underline misspelled words in text fields, but many AI interactions happen outside the browser. Terminal-based tools like Claude Code, shell prompts for Copilot CLI, and VS Code's integrated chat panels either lack spellcheck entirely or have it disabled by default. If you type authetication into a terminal prompt, nothing will catch it.

Grammarly and Similar Tools Miss Terminals and IDEs

Grammar-checking extensions like Grammarly work well in browsers and some desktop apps, but they do not integrate with terminals, VS Code's editor panels, or command-line interfaces. The places where developers most need spell correction are precisely the places where these tools do not operate.

Neither Catches Coding-Specific Typos

Standard spellcheckers are trained on general English text. They do not know that fucntion should be function, that reqct should be react, or that compnent should be component. These words are not in Merriam-Webster, so conventional spellcheckers either ignore them or suggest inappropriate corrections. Terse's domain-specific dictionary handles exactly these cases.

The Hardcoded TYPOS Dictionary

Before the Norvig algorithm even runs, Terse applies a hardcoded dictionary of over 400 coding-specific and prompt-specific typos. This dictionary is the fastest path to correction — a direct lookup table with zero ambiguity.

// Sample entries from Terse's TYPOS dictionary
"fucntion"    -> "function"
"authetication" -> "authentication"
"reqct"       -> "react"
"compnent"    -> "component"
"implmentation" -> "implementation"
"confguration" -> "configuration"
"dependncy"   -> "dependency"
"respnse"     -> "response"
"asynchrnous" -> "asynchronous"
"middlewre"    -> "middleware"

The dictionary was compiled by analyzing thousands of real prompts and identifying the typos that appear most frequently in developer workflows. Each entry is a direct mapping — no edit-distance computation, no probability ranking, no ambiguity. If the typo appears in the dictionary, it is corrected instantly.

This two-tier approach (hardcoded dictionary first, then Norvig algorithm for everything else) gives Terse both speed and coverage. The most common typos are caught in constant time. Less common typos are caught by the probabilistic algorithm. Together, they achieve correction coverage that exceeds what either approach provides alone.

Protecting Code: What Terse Does Not Correct

A spell corrector running on AI prompts must be careful about what it changes. Prompts often contain code snippets, variable names, acronyms, and technical terms that would be mangled by aggressive spell correction. Terse applies several protective heuristics:

ALL-CAPS words are skipped — they are almost always acronyms (API, HTTP, JSON, URL).
Capitalized words (PascalCase) are skipped — they are likely class names or proper nouns (React, TypeScript, WebSocket).
camelCase tokens are identified and skipped — getElementById, useState, fetchData.
Words containing numbers are skipped — likely variable names or identifiers (v2, config3, user1).
URLs and file paths are detected and excluded from correction entirely.
Content inside backtick code blocks is left untouched.

These rules ensure that spell correction improves the natural language portions of a prompt without interfering with the technical content. The corrector fixes "Please refactor the authetication compnent" to "Please refactor the authentication component" while leaving any code references completely intact.

Impact on Agent Sessions

Spell correction might seem like a minor optimization — fixing a few typos saves a handful of tokens at most. But in agent sessions, the impact is dramatically larger because of how AI models process misspelled instructions.

When an agent encounters a typo like authetication in a prompt, several things can go wrong. The model might misinterpret the word, generating code that references a misspelled variable or module name. The model might ask for clarification, consuming a full round-trip of tokens. In tool-using agent sessions, the model might execute a search or file read to resolve the ambiguity, costing additional tool-call tokens.

Consider this scenario in Claude Code: a developer types "refactor the authetication middlewre to use asynchrnous handlers". Without correction, Claude might:

Search the codebase for files matching "authetication" (tool call, no results)
Search again for "authentication" (another tool call, finds results)
Read the file to understand the context (tool call)
Generate the refactored code, possibly with "middlewre" appearing in comments

That extra search in step 1 costs tokens for the tool call, the failed result, and the context it adds to the conversation. Multiply this across a 40-turn agent session and the cost of uncorrected typos compounds rapidly. A single typo in turn 3 can cause retry loops, failed tool calls, and wasted context in every subsequent turn.

Terse's spell correction runs before the prompt ever reaches the model, eliminating these failure cascades at the source. The corrected prompt "refactor the authentication middleware to use asynchronous handlers" resolves cleanly on the first attempt, saving both tokens and time.

The Two-Pass Pipeline

Terse's spell correction operates as the first stage of a multi-pass optimization pipeline. The full sequence is:

Pass 1 — Hardcoded TYPOS dictionary: Direct lookup, constant time. Catches the 400+ most common developer typos with zero false positives.
Pass 2 — macOS NSSpellChecker: System-level spellcheck via the Swift helper binary. Handles multilingual text and words not in the hardcoded dictionary.
Pass 3 — Norvig algorithm with enhancements: QWERTY proximity, frequency ranking, and bigram context for remaining errors.
Pass 4+ — Pattern optimization, NLP analysis, and telegraph compression (depending on the selected mode).

Spell correction runs in every mode — Soft, Normal, and Aggressive. Even the lightest compression mode fixes typos, because typo correction never changes the user's intended meaning. It is the safest, most universally beneficial optimization Terse performs.

Stop Typos Before They Cost You Tokens

Terse catches the typos your spellchecker misses — in terminals, IDEs, and agent interfaces. 400+ hardcoded corrections plus Norvig-enhanced spell correction, all running on-device in under a millisecond.

Download Terse