Matt Goren
Topic hub · 20 pieces

Cost & Models

Choosing models and controlling spend.

Guide · 8 min

Building With Claude: Strengths, Quirks, and How to Get the Most Out of It

How I build with Claude in production: where it shines, which tier to use, prompt caching, structured output, extended thinking, and the honest limits.

Models & Capabilities
Guide · 7 min

Self-Hosting Open Models: Llama, Mistral, and When It's Worth It

The real case for running Llama and Mistral yourself — privacy, cost at scale, and control — versus the operational burden that eats the savings.

Models & Capabilities
Comparison · 8 min

AEO vs Paid Ads: Where Should Your Next Dollar Go?

A side-by-side on cost curve, durability, trust, and speed so you know exactly where your next acquisition dollar should land.

AI for Operators
Pillar · 12 min

AI Leverage: The Operator's Playbook

How a solo operator or small team turns AI into cheap senior labor you direct — where it pays off, where it wastes time, and how leverage compounds.

AI for Operators
FAQ · 6 min

AI for Operators: Frequently Asked Questions

Straight answers to the questions operators actually ask about AI: cost, headcount, where to start, quality, data safety, and ROI.

AI for Operators
Comparison · 8 min

Big Model vs Small Model: When Cheap and Fast Wins

Frontier model or small fast one? Quality, cost, latency, and reliability head to head, plus the fan-out-cheap, escalate-to-frontier pattern.

Models & Capabilities
FAQ · 6 min

Building With AI: Frequently Asked Questions

Practical answers for builders: model choice, RAG vs fine-tuning, agents, hallucinations, evals, cost, latency, and getting started with an LLM.

Building with LLMs
Pillar · 13 min

Building With LLMs: An Operator's Field Guide

How I actually build with large language models: model tiers, prompting as spec, structured output, evals, guardrails, and what breaks in production.

Building with LLMs
Comparison · 8 min

Claude vs GPT vs Gemini: Picking a Model as a Builder

Choosing an LLM to build on, not chat with: reasoning, tool use, context, cost tiers, and where each family actually wins.

Models & Capabilities
Guide · 8 min

Context Engineering: The Skill That Replaced Prompt Hacking

Managing the context window is the real craft now. What to put in, retrieval vs stuffing, ordering, caching, compaction, token budgets, and multi-turn memory.

Building with LLMs
Guide · 8 min

How to Choose an LLM (and Switch Without Pain)

A practical decision process for picking an LLM: define the task, pick a tier, test on your data, and architect so switching is a config change.

Models & Capabilities
Guide · 8 min

How to Cut Your LLM Costs (Without Cutting Quality)

Prompt caching, batching, model routing, leaner context, output caps — the levers that drop your AI bill without touching output quality.

Building with LLMs
Comparison · 7 min

In-House AI Content vs Hiring It Out

Build the AI content engine yourself or hire an agency? A clear breakdown of cost, control, quality, and what to never outsource.

AI for Operators
FAQ · 7 min

Models & Capabilities: Frequently Asked Questions

Straight answers to the questions builders actually ask about LLMs: tokens, context windows, cost, hallucination, multimodality, and more.

Models & Capabilities
Comparison · 9 min

Open-Weight vs Closed Models: What Builders Should Actually Use

Open-weight or closed API model for your product? Capability, cost, privacy, control, support, and total cost of ownership, decided by use case.

Models & Capabilities
Comparison · 9 min

RAG vs Fine-Tuning vs Long Context: How to Give a Model Your Knowledge

Three ways to put your proprietary knowledge into an LLM — retrieval, fine-tuning, long context. What each costs, when each wins, how they combine.

Building with LLMs
Guide · 8 min

Reasoning Models, Explained: When Thinking Longer Helps

What reasoning and extended-thinking models actually do, where step-by-step deliberation beats a fast answer, and when it's just burning money.

Models & Capabilities
Pillar · 10 min

The Frontier Model Landscape: A Builder's Map

A builder's map of the frontier LLM landscape: the families, the dimensions that matter, and why you should design to swap models.

Models & Capabilities
Comparison · 8 min

Vector Search vs Keyword Search for RAG

Semantic embedding retrieval vs lexical keyword search for RAG — accuracy, cost, setup, failure modes, and why hybrid usually wins.

Building with LLMs
Guide · 7 min

When Fine-Tuning Is Actually Worth It

The honest cases for fine-tuning versus prompting, RAG, and long context, plus the maintenance cost that explains why most teams shouldn't start here.

Models & Capabilities