
Answer Engine Optimization: The Complete Playbook

How to get your business cited inside ChatGPT, Claude, Perplexity, and Google AI answers — the mechanics, the process, and how to measure it.

By Matt Goren · Updated June 25, 2026 · 12 min read

For fifteen years the entire game was: rank a blue link, earn the click. That game isn't over, but it now sits inside a bigger one. When someone asks ChatGPT how to choose between two products, or asks Perplexity for the best way to do a thing, or types a question into Google and gets an AI answer above the links, a machine is reading sources and writing the answer for them. The question that decides whether your business exists in that moment isn't "do I rank?" It's "did the model cite me?"

That's the whole shift, and it's what answer engine optimization is about. I build an engine that does this for a living — Otto, the system behind RunOctopus — so this playbook is the mechanics as I actually understand them from shipping against real answer engines, not theory. By the end you'll know what AEO is, how these systems decide what to quote, the concrete steps to get cited, and how to measure whether any of it is working. If you want the SEO-vs-AEO framing in isolation, I wrote that up separately in AEO vs SEO; here we go deep on the playbook.

What AEO actually is (and what it isn't)

Answer engine optimization is structuring your content and your site so AI answer engines retrieve it, ground their answers in it, and name you as a source. Three verbs, in order: retrieve, ground, cite. If any one of them fails, you're invisible in the answer.

It is not keyword-stuffing for robots, and it is not a replacement for SEO. The honest framing is that search is being rewritten and expanded into AI answers, and classic SEO and AEO are complementary disciplines that share a spine. A page that no crawler can read, that has no topical authority, and that's badly written will lose in both. AEO adds a specific layer on top of good SEO: writing in a way a model can lift a clean, correct, quotable claim out of, and marking it up so machines aren't guessing.

The mental shift that matters most: in SEO you're competing for position on a list of ten links, and being number six still gets you some traffic. In AEO there is no number six. An answer engine typically synthesizes from a small handful of sources (often three to six) and cites only some of them. You're either in that tiny set or you don't exist for that query. The distribution is brutally winner-take-most, which is exactly why the structural work is worth it.

How answer engines actually retrieve, ground, and cite

You can't optimize a system you treat as a black box, so here's the pipeline as it really works under the hood across the major engines.

Retrieval. When you ask Perplexity, Google's AI mode, ChatGPT with search on, or Claude with web access a question, the system almost always runs a search first. It rewrites your question into one or more queries, hits an index (its own or a partner search engine), and pulls back a set of candidate pages. This is the gate. If your page isn't in the index, or isn't returned for the rewritten query, nothing else you do matters. This is why classic crawlability and relevance still feed everything — retrieval is built on the same foundations SEO has always cared about.

Chunking and passage selection. The engine doesn't read your whole page with equal weight. It splits content into passages and scores them against the query. The passage that most directly answers the question is the one that gets pulled into the model's context. This is the single most important mechanical fact in AEO: you are optimizing passages, not pages. A 3,000-word article where the answer to a specific question is buried in paragraph nineteen will lose to a page that answers it cleanly in the first two sentences under a matching heading.
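To make "passages, not pages" concrete, here is a minimal sketch of passage selection. It assumes a toy scoring rule (the share of query terms found in each blank-line-separated block); real engines use far richer relevance models, and the function names and sample page are hypothetical.

```python
import re

# Tiny illustrative stopword list (an assumption, not any engine's real list).
STOPWORDS = {"a", "an", "the", "is", "to", "of", "in", "and", "how", "do", "i", "for", "it", "on"}

def tokens(text: str) -> set[str]:
    """Lowercased word set with the stopwords above removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def split_passages(page: str) -> list[str]:
    """Treat each blank-line-separated block as one passage."""
    return [p.strip() for p in page.split("\n\n") if p.strip()]

def score(query: str, passage: str) -> float:
    """Fraction of query terms that also appear in the passage."""
    q = tokens(query)
    return len(q & tokens(passage)) / len(q) if q else 0.0

def best_passage(query: str, page: str) -> tuple[float, str]:
    """Return (score, passage) for the highest-scoring passage."""
    return max((score(query, p), p) for p in split_passages(page))

# Hypothetical page: the answer sits under a question-style lead in the second block.
page = (
    "Invoices have a long history in bookkeeping and many teams still handle them by hand.\n\n"
    "How long does invoice processing take? Manual invoice processing takes about three days; "
    "automation cuts it to under an hour.\n\n"
    "Contact our team to learn more about pricing."
)
query = "How long does invoice processing take?"
top_score, top_passage = best_passage(query, page)
```

Under this toy rule the second block wins because it restates the question and answers it in the same passage, while the intro and the sales line barely register. The same intuition carries over to more sophisticated scorers.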

Grounding. The model then writes an answer "grounded" in those retrieved passages — meaning it's instructed to base its claims on the provided sources rather than its own memory, and to attach citations. Grounding is where extractability pays off. If your passage states a clear, self-contained claim, the model can quote it confidently and cite you. If your claim is vague, hedged, or only makes sense with three paragraphs of surrounding context, the model paraphrases from a cleaner source and cites them.

Citation. Finally the engine renders the answer with source links. Which sources get named depends on which passages it actually leaned on, how much it trusted them, and sometimes diversity rules so it doesn't cite one domain five times. Trust here is a blend of the page's authority signals and how well the passage matched.
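The diversity idea can be sketched as a simple per-domain cap. This is an illustration only: the actual selection rules of any engine are not public, and the function name, the score values, and the default limits here are all assumptions.

```python
from urllib.parse import urlparse

def pick_citations(scored, k=3, per_domain=1):
    """Pick up to k URLs by descending score, allowing at most per_domain per host.

    scored is a list of (score, url) pairs. Hypothetical helper, not a real engine's logic.
    """
    chosen = []
    used = {}  # host -> number of URLs already chosen from it
    for _, url in sorted(scored, reverse=True):
        host = urlparse(url).netloc
        if used.get(host, 0) < per_domain:
            chosen.append(url)
            used[host] = used.get(host, 0) + 1
        if len(chosen) == k:
            break
    return chosen

# Hypothetical scored passages: two from the same domain, two from others.
scored = [
    (0.9, "https://a.example/x"),
    (0.8, "https://a.example/y"),
    (0.7, "https://b.example/z"),
    (0.6, "https://c.example/w"),
]
citations = pick_citations(scored)
```

With a cap of one per domain, the second a.example passage is skipped even though it outscores the others, which is one reason a single strong passage per question matters more than many near-duplicates on the same site.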

For models drawing on training data rather than live retrieval, there's a parallel path: if your content and your entity were well-represented and consistently described across the web when the model was trained, it may "know" you without a live lookup. You can't reverse-engineer a training cutoff, but the same things that make you citable in retrieval — clarity, consistency, authority — are what get you absorbed into model memory too. Anthropic's current Claude family (Opus 4.8, Sonnet 4.6, Haiku 4.5, and Fable 5) and the frontier models from other labs all combine learned knowledge with live retrieval; specifics of how each weights them move fast, so optimize for the durable mechanics rather than any one engine's current behavior.

The mechanics of getting cited

Everything above collapses into a handful of levers you can actually pull. Here's each one and why it works.

Answer-first structure

Lead with the answer. State the conclusion in the first sentence or two under a heading that matches the question, then expand with the reasoning, caveats, and detail. This is the inverse of how most people write — we love to build up to the point. Answer engines reward the opposite, because the passage selector is looking for a chunk that is the answer, not a chunk that promises one is coming. Every section on a page should be able to stand alone as a quotable unit. I treat the first two sentences of every section as the thing the model will lift.

Extractable, specific claims

A claim is extractable when it's true, self-contained, and specific enough to quote without surrounding context. "Our software is fast" is not extractable. "X reduces invoice processing from three days to under an hour" is. Numbers, named methods, concrete comparisons, and direct definitions all extract cleanly. The discipline here is to write sentences that survive being copied out of the page and pasted into an answer with a citation after them. If a sentence only makes sense in situ, the model won't risk quoting it.

A hard rule, and one I hold the engine I build to: never fabricate the specifics. A made-up statistic that gets cited is a liability that detonates the moment someone checks. Speak from real numbers you can stand behind, or speak from principle and experience. Trust is the whole asset in AEO; don't trade it for a punchier sentence.

Schema and JSON-LD

Structured data is how you tell machines, unambiguously, what your claims and entities are. FAQPage schema maps your questions and answers into a format engines parse directly. Article schema declares author, publish date, and last-modified date. Organization and Product schema pin down your entity and its attributes. None of this rescues weak content, but on a genuinely useful page it removes ambiguity and friction between your answer and the engine trying to extract it. JSON-LD, embedded in a script tag of type application/ld+json (typically in the page head), is the format to use. Keep it accurate and in sync with the visible content, because markup that contradicts what's on the page gets distrusted, not rewarded.
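As a rough sketch of what FAQPage markup looks like, the snippet below builds the JSON-LD from the same question-and-answer pairs that appear on the page, which is one way to keep markup and visible text in sync. The helper names are hypothetical; the @context, @type, mainEntity, and acceptedAnswer fields follow the schema.org FAQPage vocabulary.

```python
import json

def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs as schema.org FAQPage JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)

def script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in a script tag; escape '</' so the text cannot close the tag early."""
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'

# The pairs should be the exact text shown on the page.
pairs = [
    (
        "Is AEO different from SEO?",
        "Yes, but they overlap heavily.",
    ),
]
markup = faq_jsonld(pairs)
tag = script_tag(markup)
```

Generating the markup from the page's own content, rather than maintaining it by hand, makes the "schema that lies" failure much harder to ship by accident.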

llms.txt and crawler access

If AI crawlers can't access your site, you've lost at the retrieval gate before the race starts. Two things matter: don't accidentally block the crawlers you want (check your robots rules and any bot-management or firewall settings), and consider an llms.txt file — an emerging convention for giving AI systems a clean, curated map of your most important content. I wrote the full implementation guide in llms.txt and AI crawlers. The short version: make it trivially easy for these systems to find and read your best material.
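A quick way to check the robots side of this is Python's standard urllib.robotparser. The sketch below assumes a handful of crawler tokens (GPTBot, ClaudeBot, PerplexityBot, Google-Extended); vendors change and add tokens, so treat that list as an assumption to verify against each vendor's documentation. The robots.txt content and URL are made up, and the check says nothing about firewall or bot-management rules, which need to be tested separately.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler tokens; confirm the current names with each vendor.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]

def crawler_access(robots_txt: str, url: str, agents=AI_CRAWLERS) -> dict[str, bool]:
    """Report, per user agent, whether the given robots.txt text allows fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

# Hypothetical robots.txt that blocks one AI crawler site-wide.
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

access = crawler_access(robots, "https://example.com/blog/aeo-playbook")
blocked = [agent for agent, allowed in access.items() if not allowed]
```

Here only GPTBot ends up in the blocked list; the other agents fall through to the wildcard group, which restricts /admin/ alone. To check a live site, the same parser can load the file with set_url and read instead of parse.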

Entity and topical authority

Answer engines don't cite random pages; they lean toward sources that are clearly authoritative on the topic. You build that by covering a topic comprehensively — a cluster of interlinked pages around a pillar, not one orphan article — and by being described consistently as an authority across the web. Topical depth is a citation signal because depth is what authority looks like to a machine. One great page on a subject you otherwise never discuss is a weaker bet than that page sitting inside a cluster of fifteen that all reinforce your expertise.

Freshness and being quotable

Many queries have a freshness bias — the engine prefers recently updated sources, especially for anything time-sensitive. Visible, honest update dates and genuinely maintained content help. And "quotable" is its own quality: write the sentence you'd want to see appear, verbatim, inside an AI answer with your name on it. If you can picture the exact pull-quote, you've written an extractable claim. If you can't, the model can't either.

The AEO process, step by step

Here's the actual operating procedure I'd run for a business starting from zero.

1. Map the questions. List the real questions your customers ask before, during, and after buying — in their words, not your marketing language. These are your target prompts. Don't guess at clever keywords; capture genuine questions, because that's what people type into answer engines.

2. Baseline your visibility. Take that prompt list to the real engines — ChatGPT, Claude, Perplexity, Google AI mode — and ask each question. Record whether you're cited, who is cited instead, and what the consensus answer looks like. This is your before picture and it's usually humbling.

3. Build answer-first pages. For each cluster of questions, write or rewrite a page that leads with the answer, uses headings that match the questions, makes specific extractable claims, and goes deep enough to be the authoritative source. Depth wins — match or exceed the best existing answer, never go thin to seem tidy.

4. Mark it up. Add accurate JSON-LD — FAQPage for Q&A sections, Article for the page, Organization or Product where relevant. Keep markup synced to the visible text.

5. Open the gates. Confirm AI crawlers can reach the content, ship or update your llms.txt, and make sure nothing in robots or firewall config is silently blocking the bots you want.

6. Build the cluster and the authority. Interlink related pages so the topic reads as a coherent body of expertise. Earn mentions on the third-party sources these models already trust — that off-site corroboration is what turns a good page into a cited one. The tactical detail on this lives in how to get cited by AI search.

7. Re-test and iterate. Run the prompt list again on a schedule. Watch which questions start citing you, which don't, and rewrite the laggards. AEO is a loop, not a launch.

How to measure AEO visibility

The metric that matters is prompt coverage: of the real questions your customers ask, on what share does each engine cite you? Fix a list of prompts, run them across the engines on a regular cadence (monthly is a sane default), and log a simple grid — prompt by engine, cited or not. That trend line is your scoreboard.
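The grid reduces to a few lines of code. This is a minimal sketch that assumes you log each manual check as a (prompt, engine, cited) record; the prompts and results below are invented for illustration.

```python
from collections import defaultdict

def prompt_coverage(results: list[tuple[str, str, bool]]) -> dict[str, float]:
    """Per engine, the share of logged prompts on which you were cited."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for _prompt, engine, was_cited in results:
        total[engine] += 1
        cited[engine] += int(was_cited)
    return {engine: cited[engine] / total[engine] for engine in total}

# Hypothetical log from one monthly run: (prompt, engine, were we cited?).
run = [
    ("how to automate invoices", "ChatGPT", True),
    ("how to automate invoices", "Perplexity", False),
    ("invoice processing time", "ChatGPT", False),
    ("invoice processing time", "Perplexity", False),
]
coverage = prompt_coverage(run)
```

Keeping the prompt list fixed between runs is what makes the numbers comparable month to month; changing the prompts changes the denominator.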

Layer in two more reads. Share of voice: when you're not cited, who is? Tracking the sources that consistently win your queries tells you exactly who to study and where to earn corroboration. And citation accuracy: when you are cited, is the engine representing you correctly? A citation that misquotes you is a content bug to fix at the source. Don't lean on vanity numbers like "AI traffic" alone — referral attribution from inside AI answers is still messy and undercounts reality. The cleanest signal remains: ask the engine the question, look at who it names.
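Share of voice can be tallied from the same log if you also record the cited URLs. The sketch below counts, for prompts where your domain was not cited, how often each other domain was. The domain names are placeholders, and str.removeprefix needs Python 3.9 or later.

```python
from collections import Counter
from urllib.parse import urlparse

def share_of_voice(cited_urls_by_prompt: dict[str, list[str]], own_domain: str) -> list[tuple[str, int]]:
    """Count competing domains on prompts where own_domain is absent, most frequent first."""
    counts = Counter()
    for urls in cited_urls_by_prompt.values():
        # One vote per domain per prompt, ignoring a leading "www.".
        domains = {urlparse(u).netloc.removeprefix("www.") for u in urls}
        if own_domain not in domains:
            counts.update(domains)
    return counts.most_common()

# Hypothetical citations collected for three prompts.
cited_urls = {
    "best invoice automation tool": [
        "https://www.rival.example/guide",
        "https://reviews.example/top-10",
    ],
    "how to automate invoices": [
        "https://rival.example/how-to",
        "https://mysite.example/blog",
    ],
    "invoice processing time": ["https://rival.example/benchmarks"],
}
winners = share_of_voice(cited_urls, "mysite.example")
```

The top of that list is the short list of sources to study first, and the places where third-party corroboration is most likely to move the needle.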

Common mistakes

The ones I see kill AEO efforts, in rough order of how often they do damage:

  • Burying the answer. Beautiful long intros, answer in paragraph nineteen. The passage selector never finds it.
  • Vague claims. Nothing specific enough to quote, so the model paraphrases a competitor.
  • Fabricated specifics. Made-up stats that read great until someone checks. This is self-sabotage; never do it.
  • Blocking the crawlers. Silent robots or firewall rules that lock AI bots out at the retrieval gate. You can do everything else perfectly and lose here.
  • Thin coverage. One orphan page on a topic instead of a real cluster, so you never read as an authority.
  • Schema that lies. JSON-LD that doesn't match the visible page, which gets distrusted rather than rewarded.
  • Treating it as one-and-done. Publishing and never re-testing. AEO is a measured loop or it's nothing.
  • Declaring SEO obsolete. It isn't. Retrieval rides on the same foundations; you need both.

The businesses that win the next decade of search aren't the ones chasing a trick. They're the ones who decided to be the genuinely clearest, most authoritative, most quotable source on the questions they want to own — and then made it effortless for machines to find and extract that. Do that, measure it honestly, and iterate. For more on the specifics, see how to get cited by AI search and the broader AI search FAQ.

FAQ

What is answer engine optimization?

Answer engine optimization (AEO) is the practice of structuring your content so AI answer engines like ChatGPT, Claude, Perplexity, and Google's AI overviews retrieve it, ground their answers in it, and cite you as a source. The goal isn't ranking a blue link — it's becoming one of the few sources the model quotes when it writes the answer.

Is AEO different from SEO?

Yes, but they overlap heavily. SEO optimizes for a ranked list of links a human clicks; AEO optimizes for being the source a model extracts and cites inside a synthesized answer. The same fundamentals — crawlable pages, real authority, clear writing — feed both, but AEO adds answer-first structure, extractable claims, and machine-readable markup.

How do AI answer engines decide what to cite?

Most answer engines run a retrieval step (often a live or indexed search), pull a handful of candidate passages, and have the model write an answer grounded in those passages with citations attached. You get cited when your page is retrievable, the passage directly and clearly answers the query, and your claim is specific enough to quote without the model having to hedge.

How do I know if AI search is already citing me?

Test it directly. Ask the real engines — ChatGPT, Claude, Perplexity, Google AI mode — the actual questions your customers ask, and see whether your domain shows up in the cited sources. Do it across a fixed list of prompts, log which ones cite you, and re-run monthly. That prompt-coverage rate is your real AEO scoreboard.

Does schema markup help with AEO?

It helps, but it's not magic. JSON-LD (FAQPage, Article, Organization, Product) makes your claims and entities unambiguous to machines and can make pages eligible for the rich results that feed AI overviews. It won't rescue thin content, but on a genuinely useful page, clean structured data removes friction between your answer and the engine extracting it.

How long does AEO take to work?

Faster than classic SEO in some ways, slower in others. Live-retrieval engines like Perplexity can pick up a strong new page within days of crawling it. Models with training-data memory move on their own slower cycle. In my experience you start seeing citation pickup in weeks on the retrieval engines, with authority compounding over months.
