We write if, for, and return every day without a second thought. These are the primitives of computation — the building blocks we use to tell machines what to do.
But we’re in a new era now. AI isn’t a service you call occasionally — it’s becoming a fundamental computing primitive. So why are we still using it like this?
const response = await anthropic.messages.create({
  model: "claude-opus-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Analyze the sentiment of: " + text }],
});
const result = JSON.parse(response.content[0].text);
// hope it's the right shape...
API boilerplate. Manual prompt construction. Untyped JSON parsing. Praying the response matches what you expected.
What if instead, you could just write:
let sentiment = think("Analyze the sentiment of this review")
That’s why I built ThinkLang — an open-source, AI-native programming language where think is a keyword.
What Is ThinkLang?
ThinkLang is a transpiler. You write .tl files, and the compiler turns them into TypeScript that calls an LLM runtime. The language has its own parser (PEG grammar), type checker, code generator, LSP server, VS Code extension, testing framework, and CLI.
The compilation pipeline: parse → resolve imports → type check → code generate → execute.
The core idea is simple: AI should be a language-level primitive, not a library call.
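To make that concrete, here's a rough sketch of the kind of TypeScript a think call could compile down to. This is my reconstruction from the description above, not ThinkLang's literal codegen; the runtimeThink helper and its signature are invented for illustration:

import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Hypothetical runtime helper: a think(...) expression could lower to a call
// like this, with the schema derived from the .tl type declaration.
async function runtimeThink<T>(prompt: string, schema: z.ZodType<T>): Promise<T> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });
  const block = response.content[0];
  const text = block.type === "text" ? block.text : "";
  return schema.parse(JSON.parse(text)); // throws if the output violates the type
}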
The Basics: think, infer, reason
ThinkLang has three AI primitives built into the language:
think(prompt)
The primary primitive. Give it a type and a prompt, and it returns structured, type-safe output.
type MovieReview {
  title: string
  rating: int
  pros: string[]
  cons: string[]
  verdict: string
}

let review = think<MovieReview>("Review the movie Inception")

print(review.title)  // type-safe access
print(review.rating) // guaranteed to be an int
The compiler turns your type declaration into a JSON schema that constrains the LLM's output. The AI cannot return anything that violates your type. No parsing. No validation. No hoping.
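The tech stack listed at the end of this post includes Zod for runtime validation, so the generated artifact for MovieReview plausibly looks something like this sketch (schema and type names are my guesses, not the actual codegen output):

import { z } from "zod";

// Plausible generated schema for the MovieReview type above.
const MovieReviewSchema = z.object({
  title: z.string(),
  rating: z.number().int(),
  pros: z.array(z.string()),
  cons: z.array(z.string()),
  verdict: z.string(),
});

type MovieReview = z.infer<typeof MovieReviewSchema>;

// A response that fails MovieReviewSchema.parse() never reaches user code.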
infer(value)
Lightweight inference on existing values — when you already have data and want the AI to derive something from it.
type Category {
  label: string
  confidence: float
}

let data = "The patient presents with a persistent cough and fever"
let category = infer<Category>(data)
reason {}
Multi-step reasoning with explicit goals and steps:
type Analysis {
  findings: string[]
  recommendation: string
  risk_level: string
}

let analysis = reason {
  goal: "Evaluate this investment opportunity"
  steps:
    1. "Analyze the financial fundamentals"
    2. "Assess market conditions and competition"
    3. "Evaluate risk factors"
    4. "Form a final recommendation"
  with context: { portfolio, market_data }
}
This compiles into a structured chain-of-thought prompt. The AI follows your steps explicitly rather than reasoning however it wants.
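The exact template is the compiler's business, but the compiled prompt presumably has a shape like this hand-written approximation (the context values here are made up for illustration):

// Hand-written approximation of the structured chain-of-thought prompt.
const portfolio = { holdings: ["AAPL", "BND"] };
const market_data = { sp500_pe: 24.1 };

const prompt = [
  "Goal: Evaluate this investment opportunity",
  "Work through these steps in order, stating your reasoning for each:",
  "1. Analyze the financial fundamentals",
  "2. Assess market conditions and competition",
  "3. Evaluate risk factors",
  "4. Form a final recommendation",
  `Context: ${JSON.stringify({ portfolio, market_data })}`,
  "Return a JSON object matching the Analysis type.",
].join("\n");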
Confidence as a Language Concept
Here’s something most AI wrappers get wrong: they treat every AI response as equally certain. But AI outputs have varying levels of confidence, and your code should reflect that.
ThinkLang has Confident<T>:
type Diagnosis {
  condition: string
  severity: string
}

let result = think<Confident<Diagnosis>>("Diagnose based on these symptoms")

print(result.value)      // the actual diagnosis
print(result.confidence) // 0.0 to 1.0
print(result.reasoning)  // why the AI is this confident
And the uncertain modifier forces you to handle uncertainty explicitly:
uncertain let diagnosis = think<Confident<Diagnosis>>("Diagnose this")

// This won't compile:
// print(diagnosis.value)

// You must explicitly unwrap:
let safe = diagnosis.expect(0.8)         // throws if confidence < 0.8
let fallback = diagnosis.or(default_val) // use fallback if low confidence
let raw = diagnosis.unwrap()             // explicit "I accept the risk"
The compiler enforces this. You can’t silently ignore uncertainty — you have to make a conscious decision about how to handle it.
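For readers who want to picture the runtime side, here is a minimal sketch of what a Confident<T> value could look like in the generated TypeScript. The method names come from the examples above; everything else, including the cutoff used by or(), is my assumption:

// Minimal sketch of a Confident<T> runtime wrapper (not ThinkLang's actual code).
class Confident<T> {
  constructor(
    readonly value: T,
    readonly confidence: number, // 0.0 to 1.0
    readonly reasoning: string,
  ) {}

  // Throws if confidence is below the caller's threshold.
  expect(threshold: number): T {
    if (this.confidence < threshold) {
      throw new Error(`Confidence ${this.confidence} < ${threshold}: ${this.reasoning}`);
    }
    return this.value;
  }

  // Falls back when confidence is low; the 0.5 cutoff is an assumption.
  or(fallback: T): T {
    return this.confidence >= 0.5 ? this.value : fallback;
  }

  // Explicit "I accept the risk".
  unwrap(): T {
    return this.value;
  }
}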
Output Guards
What if the AI returns something that’s technically the right type but semantically wrong? A summary that’s too long. A response that contains placeholder text. A rating that’s out of range.
Guards are declarative validation rules with automatic retry:
let summary = think("Summarize this article")
guard {
  length: 50..200
  contains_none: ["TODO", "placeholder", "as an AI"]
}
on_fail: retry(3) then fallback("Could not generate summary")
If the output fails validation, ThinkLang automatically retries up to 3 times. If all retries fail, it falls back to your default. No manual retry loops. No callback hell.
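The control flow a guard replaces is easy to hand-roll, which is exactly why it's nice not to. A sketch of the equivalent loop (function names and shapes are illustrative, not ThinkLang's runtime API):

// Retry-with-validation loop that a guard block plausibly compiles into.
async function withGuard(
  generate: () => Promise<string>,
  isValid: (output: string) => boolean,
  retries: number,
  fallback: string,
): Promise<string> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const output = await generate();
    if (isValid(output)) return output;
  }
  return fallback;
}

// The summary guard above corresponds to a validator like this:
const validSummary = (s: string) =>
  s.length >= 50 &&
  s.length <= 200 &&
  !["TODO", "placeholder", "as an AI"].some((bad) => s.includes(bad));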
You can also guard numeric fields:
let review = think("Review this product")
guard {
  rating: 1..5
  length: 100..500
}
on_fail: retry(2)
Pattern Matching on AI Outputs
ThinkLang has structural pattern matching that works beautifully with AI-generated data:
type Sentiment {
  label: string
  intensity: int
}

let sentiment = think<Sentiment>("Analyze: 'This is the best day ever!'")

match sentiment {
  { label: "positive", intensity: >= 8 } => print("Extremely positive!")
  { label: "positive" } => print("Positive")
  { label: "negative", intensity: >= 8 } => print("Extremely negative")
  { label: "negative" } => print("Negative")
  _ => print("Neutral")
}
Pattern matching + typed AI outputs = clean, readable branching on AI results.
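If you're curious what that buys you, here is the roughly equivalent hand-written TypeScript (my approximation, not the actual codegen):

type Sentiment = { label: string; intensity: number };

// The if/else ladder that the match expression replaces.
function describe(s: Sentiment): string {
  if (s.label === "positive" && s.intensity >= 8) return "Extremely positive!";
  if (s.label === "positive") return "Positive";
  if (s.label === "negative" && s.intensity >= 8) return "Extremely negative";
  if (s.label === "negative") return "Negative";
  return "Neutral";
}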
Context Management
Control exactly what data the AI sees:
let user_profile = { name: "Alice", preferences: ["sci-fi", "thriller"] }
let secret_key = "sk-abc123"

let recommendation = think("Recommend a movie for this user")
with context: user_profile
without context: secret_key
with context scopes data into the prompt. without context explicitly excludes sensitive data. No accidentally leaking API keys into prompts.
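A plausible desugaring (my sketch, not the actual compiler output): the whitelisted binding is serialized into the prompt text, and the excluded one simply never reaches the serializer.

// Only the whitelisted value is serialized into the prompt.
const user_profile = { name: "Alice", preferences: ["sci-fi", "thriller"] };
const secret_key = "sk-abc123"; // excluded: never serialized

const prompt =
  "Recommend a movie for this user\n\n" +
  `Context:\n${JSON.stringify({ user_profile }, null, 2)}`;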
Pipeline Operator
Chain AI operations with |>:
let result = "raw text input"
  |> think("Summarize this")
  |> think("Translate to French")
  |> think("Rate the translation quality")
Readable, composable, functional-style AI pipelines.
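Semantically the chain is just left-to-right application: each stage's output becomes the next stage's input. A hand-rolled TypeScript equivalent, with a stub standing in for the real LLM call:

// Stub in place of a real LLM call; the piped value feeds the next prompt.
async function stage(prompt: string, input: string): Promise<string> {
  return `[${prompt}] applied to: ${input}`;
}

const result = await stage(
  "Rate the translation quality",
  await stage(
    "Translate to French",
    await stage("Summarize this", "raw text input"),
  ),
);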
Built-in Testing Framework
This is one of my favorite features. ThinkLang has a built-in test framework that understands AI:
test "sentiment analysis" {
  let result = think("Analyze: 'I love this product'")
  assert result.label == "positive"
  assert result.intensity > 5
}

test "summary quality" {
  let summary = think("Summarize the theory of relativity")
  assert.semantic(summary, "explains relationship between space and time")
  assert.semantic(summary, "mentions Einstein")
}
assert.semantic() is an AI-powered assertion. It uses the LLM to evaluate whether the output meets qualitative criteria. No brittle string matching.
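An LLM-as-judge assertion like this can be implemented in a few lines; here is a sketch of the mechanism (my guess, not ThinkLang's actual test runner):

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Ask the model a yes/no question about the output and fail the test on "no".
async function assertSemantic(output: string, criterion: string): Promise<void> {
  const response = await client.messages.create({
    model: "claude-opus-4-6",
    max_tokens: 8,
    messages: [{
      role: "user",
      content:
        `Does the following text satisfy this criterion?\n` +
        `Criterion: ${criterion}\nText: ${output}\nAnswer only "yes" or "no".`,
    }],
  });
  const block = response.content[0];
  const answer = block.type === "text" ? block.text.trim().toLowerCase() : "";
  if (!answer.startsWith("yes")) {
    throw new Error(`Semantic assertion failed: ${criterion}`);
  }
}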
Snapshot Replay
AI tests are non-deterministic and cost money. ThinkLang solves both problems:
# Record AI responses to snapshot files
thinklang test --update-snapshots

# Replay from snapshots — zero API calls, deterministic results
thinklang test --replay
Record once, replay forever. Your CI pipeline runs deterministic AI tests without an API key.
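The mechanism is simple to picture: key each AI call by a hash of its inputs, write the response to disk when recording, and read it back when replaying. A sketch (the file layout and names are my invention):

import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";

function snapshotPath(prompt: string): string {
  const key = createHash("sha256").update(prompt).digest("hex");
  return `.thinklang/snapshots/${key}.json`;
}

// Replay mode returns the stored response without touching the network.
async function callWithSnapshots(
  prompt: string,
  live: (p: string) => Promise<string>,
  mode: "record" | "replay",
): Promise<string> {
  const path = snapshotPath(prompt);
  if (mode === "replay" && existsSync(path)) {
    return JSON.parse(readFileSync(path, "utf8")); // zero API calls
  }
  const output = await live(prompt);
  mkdirSync(".thinklang/snapshots", { recursive: true });
  writeFileSync(path, JSON.stringify(output));
  return output;
}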
Cost Tracking
Every AI call is automatically metered:
thinklang run my-program.tl --show-cost

Cost Summary:
  Total calls: 5
  Input tokens: 2,847
  Output tokens: 1,203
  Estimated cost: $0.0234

By operation:
  think: 3 calls, $0.0156
  reason: 1 call, $0.0062
  assert.semantic: 1 call, $0.0016
You can also run thinklang cost-report to see aggregated costs across runs. No surprises on your API bill.
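Metering falls out of the API itself: every Anthropic response reports its token usage, so cost tracking is plausibly just an accumulator over those counts. A sketch (the per-token prices below are placeholders, not real pricing):

// Accumulate usage reported on each API response.
let inputTokens = 0;
let outputTokens = 0;

const PRICE_PER_INPUT_TOKEN = 0.000015;  // placeholder rate
const PRICE_PER_OUTPUT_TOKEN = 0.000075; // placeholder rate

function recordUsage(usage: { input_tokens: number; output_tokens: number }): void {
  inputTokens += usage.input_tokens;
  outputTokens += usage.output_tokens;
}

function estimatedCost(): number {
  return inputTokens * PRICE_PER_INPUT_TOKEN + outputTokens * PRICE_PER_OUTPUT_TOKEN;
}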
Full Developer Ecosystem
ThinkLang isn’t just a language — it’s a complete toolchain:
- CLI with run, compile, repl, test, and cost-report commands
- VS Code extension with syntax highlighting and 11 code snippets
- Language Server (LSP) providing real-time diagnostics, hover tooltips, code completion, go-to-definition, document symbols, and signature help
- Module system with import/export across .tl files
- Response caching — identical prompts skip the API call automatically
- 17 example programs covering every feature
How I Built It
I built ThinkLang as a solo developer, and I have to be honest — Claude Code was an incredible partner throughout the process. It helped me move fast, iterate on the parser grammar, debug the type checker, and ship a project of this scope. Building an entire language ecosystem (parser, checker, compiler, runtime, LSP, testing framework, VS Code extension, docs) solo would have taken significantly longer without it. AI-assisted development is real, and this project is proof of it.
The technical stack:
- PEG grammar (Peggy) for parsing
- TypeScript for the compiler, runtime, and tooling
- Zod for runtime validation
- Anthropic SDK for the AI runtime
- vscode-languageserver for the LSP
- VitePress for the documentation site
What’s Next
ThinkLang is at v0.1.1 and I have big plans:
Model-Agnostic Support — Right now ThinkLang uses Anthropic’s Claude. The next milestone is supporting any provider: OpenAI, Gemini, Mistral, local models via Ollama, or any OpenAI-compatible endpoint. Same language, any brain.
Agentic Native Coding — First-class language primitives for building AI agents. Think tool use, planning loops, and multi-agent coordination as language keywords, not library patterns.
Try It
npm install -g thinklang
Create hello.tl:
type Greeting {
  message: string
  emoji: string
}

let greeting = think<Greeting>("Say hello to a developer trying ThinkLang for the first time")

print(greeting.message)
print(greeting.emoji)
Run it:
thinklang run hello.tl
Links:
The project is MIT licensed. I’d love stars, feedback, issues, or contributions. And if you’re thinking about what AI-native development looks like — let’s talk.