How to write the first 200 tokens so AI engines cite your article
The first 30% of an article drives 44% of AI citations. A practical guide to writing an opener that ChatGPT, Claude, and Perplexity can quote cleanly.
By the end of this guide, you will write the first 200 tokens of any article in a shape that ChatGPT, Claude, Perplexity, and Google AI Overviews can quote verbatim. That block does most of the work: an Ahrefs analysis of 1.4 million AI prompts found that 44.2% of all LLM citations come from the first 30% of an article, versus 24.7% from the last third. If you want to get cited by ChatGPT, the opener is where the decision is made.
A token is roughly 0.75 English words. 200 tokens is about 150 words, roughly the first two paragraphs of a blog post. That window is what AI engines scan first during retrieval, and what they extract when the page is ranked high enough to be quoted.
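The conversion above can be sketched in a few lines of Python. This is a rough estimate built on the 0.75 words-per-token rule of thumb, not a real tokenizer; the function names `estimated_tokens` and `opener_window` are illustrative and do not come from any library.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb from the text, not a real tokenizer


def estimated_tokens(text: str) -> int:
    """Estimate the token count from a whitespace word count."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def opener_window(text: str, token_budget: int = 200) -> str:
    """Return the leading words that fit inside the token budget."""
    word_budget = int(token_budget * WORDS_PER_TOKEN)  # 200 tokens -> 150 words
    return " ".join(text.split()[:word_budget])
```

Calling `opener_window(article_text)` returns the block you should be editing most carefully. Real tokenizers vary by model, so treat the result as a budget, not a measurement.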
What you need before starting
- A concrete topic and a target query (for example, "how to get cited by ChatGPT"), not a theme.
- One authoritative source with a fact or number you can quote. One is the minimum, three is better.
- The entity-category-differentiator template from GEO research (Step 1).
- A slug that matches the topic literally, not cleverly. Pages with descriptive URLs are cited 89.78% of the time they appear in search results, versus 81.11% for vague ones.
Step 1: Name the topic in entity-category-differentiator form
Open the first sentence with the shape "[Topic] is a [category] that [differentiator]". This is the Definition Lead pattern studied in the Princeton GEO paper accepted at KDD 2024. It works because LLMs retrieve sentences that answer the literal question "what is X", and this template answers it in one clause.
Example, for this article: a Definition Lead is a first-sentence template that names the topic, its category, and its differentiator so AI engines can quote one self-contained line. The sentence is complete on its own. An extracted quote carries full meaning without the surrounding paragraph.
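To make the template concrete, here is a minimal sketch that fills the three slots and checks whether an existing first sentence has the same shape. The names `definition_lead` and `looks_like_definition_lead` are hypothetical, and the regular expression is a loose heuristic for the pattern, not a definition taken from the GEO paper.

```python
import re

# Loose shape check: "<topic> is a/an <category> that <differentiator>."
LEAD_SHAPE = re.compile(r"^.+? is an? .+? that .+[.]$")


def definition_lead(topic: str, category: str, differentiator: str) -> str:
    """Fill the entity-category-differentiator template."""
    article = "an" if category.lower().startswith(tuple("aeiou")) else "a"
    return f"{topic} is {article} {category} that {differentiator}."


def looks_like_definition_lead(sentence: str) -> bool:
    """Return True when the sentence matches the template shape."""
    return LEAD_SHAPE.match(sentence.strip()) is not None
```

Passing "A Definition Lead", "first-sentence template", and "names the topic, its category, and its differentiator" reproduces the core of the example sentence above. The checker only tests the shape; whether the differentiator is actually informative still needs a human read.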
Step 2: Pack one statistic with its source in the first two paragraphs
The Princeton GEO experiment tested six content modifications across 10,000 queries. Adding statistics improved AI citation visibility by 41% for lower-ranked pages. Adding external citations moved the same metric by 115%. Adding more words without facts did nothing. Keyword stuffing performed 10% below the baseline.
The practical rule: put one real number, with the source linked inline, inside the first 150 words. Not "recent research shows", but a specific figure, a year, a source name, and a link. An LLM can extract a concrete sentence and cite you. Fuzzy claims get ignored.
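A quick way to enforce this rule before publishing is a small lint pass over the draft. The sketch below assumes the draft is Markdown or HTML source in which links appear as literal http(s) URLs. The function name `check_opener_evidence` is hypothetical, and the vague-phrase list beyond "recent research shows" is an illustrative assumption.

```python
import re

NUMBER = re.compile(r"\d")
LINK = re.compile(r"https?://\S+")
# Only the first phrase comes from the article; the others are assumed examples.
VAGUE_PHRASES = ("recent research shows", "studies show", "experts say")


def check_opener_evidence(source_text: str, word_budget: int = 150) -> dict:
    """Report whether the first `word_budget` words carry a number and a link."""
    opener = " ".join(source_text.split()[:word_budget])
    lowered = opener.lower()
    return {
        "has_number": bool(NUMBER.search(opener)),
        "has_link": bool(LINK.search(opener)),
        "vague_phrases": [p for p in VAGUE_PHRASES if p in lowered],
    }
```

A passing opener reports `has_number` and `has_link` as true with an empty `vague_phrases` list. The check cannot tell whether the link points to a primary source, so that part stays a manual review.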
Step 3: Follow the lead with one paragraph naming who, when, and the outcome
The second paragraph answers three questions the LLM uses to rank your page: who this applies to, when it becomes relevant, and what the reader will get. Three sentences are enough. This paragraph often becomes the passage that ChatGPT Search quotes when it cites you, because it carries scope and context in one block.
Avoid filler like "in today's fast-paced world". Avoid self-promotion like "our team has spent years". Both get stripped at retrieval because neither carries an extractable fact.
Step 4: Match the page title and URL to a real sub-query
ChatGPT Search decomposes prompts into multiple sub-queries (the fan-out step), then retrieves pages whose title, URL, and first paragraph match one of those sub-queries. The title should contain the target query verbatim when the phrase reads naturally. The URL should repeat the core noun phrase.
Counter-example: a clever URL like /blog/words-that-win for an article on how to get cited by ChatGPT buries the signal. /blog/how-to-get-cited-by-chatgpt wins retrieval. The measured gap between descriptive and vague URLs in the Ahrefs study was close to 9 percentage points of citation rate.
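The literal-slug rule is easy to automate. The sketch below derives a slug from the target query and checks whether an existing URL path contains every word of the query. `slugify` and `slug_covers_query` are hypothetical helper names, and the word-overlap test is a simplification of whatever matching the engines actually perform.

```python
import re


def slugify(query: str) -> str:
    """Turn a target query into a literal, hyphenated slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", query.lower())
    return slug.strip("-")


def slug_covers_query(url_path: str, query: str) -> bool:
    """Return True when every word of the query appears in the URL path."""
    path_words = set(re.findall(r"[a-z0-9]+", url_path.lower()))
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    return query_words <= path_words


# Gap between descriptive and vague URLs, using the figures quoted earlier.
GAP_POINTS = round(89.78 - 81.11, 2)  # 8.67 percentage points
```

With the query "how to get cited by ChatGPT", `slugify` yields `how-to-get-cited-by-chatgpt`, and `slug_covers_query` rejects `/blog/words-that-win`.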
Step 5: Add structural cues for the extractor
After the opening two paragraphs, place an H2 that begins with a question or a direct noun phrase, not a metaphor. Under it, put the body: short paragraphs, one list, one table if the topic deserves one. The AI does not parse your metaphor; it parses your structure. Article and BlogPosting schema add a further signal: at SMX Munich in March 2025, Microsoft's Fabrice Canel publicly confirmed that schema markup helps LLMs understand content.
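For the schema part, a minimal sketch of BlogPosting markup generated as JSON-LD might look like the following. The property names (`headline`, `url`, `datePublished`, `author`) are standard schema.org vocabulary. The function name `blogposting_jsonld` and the choice of exactly these fields are assumptions for illustration, not a complete or required set.

```python
import json


def blogposting_jsonld(headline: str, url: str, date_published: str, author_name: str) -> str:
    """Build a JSON-LD script tag describing the article as a BlogPosting."""
    data = {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "url": url,
        "datePublished": date_published,  # ISO 8601 date, e.g. "2025-01-31"
        "author": {"@type": "Person", "name": author_name},
    }
    # Escape "</" so a value can never close the script tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"
```

Place the returned tag in the page head. Keep `headline` identical to the visible title so the markup and the page agree.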
Verifying it works
- Read the first 150 words alone. Do they stand as an answer to the target query without the rest of the article? If no, the opener fails retrieval.
- Count quotable sentences. A quotable sentence is fact-bearing, self-contained, and under 25 words. Aim for three in the first 200 tokens.
- Test the actual prompt. Run the target query on ChatGPT (with search enabled), Perplexity, and Claude. If your URL surfaces within two weeks of publishing, the opener is doing its job.
- Check the slug. The URL should be readable without visiting the page. If a human cannot guess what the article is about from the URL, an AI will not either.
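The quotable-sentence count from the checklist can be approximated in code. This sketch treats "fact-bearing" as "contains a digit" and "self-contained" as "does not open with a dangling pronoun". Both are crude assumptions, and `quotable_sentences` is a hypothetical helper, so use the result as a prompt for manual review.

```python
import re

# Openers that usually depend on the previous sentence (assumed list).
DANGLING_OPENERS = {"this", "that", "it", "these", "those", "they"}


def quotable_sentences(text: str, max_words: int = 25) -> list:
    """Return sentences that are short, contain a number, and stand alone."""
    # Split after ., ! or ? followed by whitespace; "44.2%" stays intact.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    result = []
    for sentence in sentences:
        words = sentence.split()
        if not words or len(words) >= max_words:
            continue
        if words[0].lower().strip(",") in DANGLING_OPENERS:
            continue
        if re.search(r"\d", sentence):
            result.append(sentence)
    return result
```

Run it on the first 150 words of the draft and aim for a list of three. Abbreviations such as "e.g." will confuse the splitter, which is acceptable for a pre-publish check.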
Common failures and fixes
The Medium-lede opener. A narrative opening like "it was a Tuesday in 2024 when we realized the world had changed" works for human readers, but AI engines cannot extract a fact from it. Fix: replace line one with the Definition Lead and keep the narrative for paragraph three.
The corporate brochure opener. A first line about the author ("at Studio we help companies navigate") carries near-zero signal. Fix: the first sentence is about the topic, not about the author.
The stat without a link. A naked number reads like a claim, not a fact. Fix: link inline to the primary source, right where the claim is made, never in a bottom-of-page footnote.
The front-loaded keyword dump. The Princeton GEO paper measured this: keyword stuffing scored 10% below the baseline. Fix: one natural occurrence of the target query in the first paragraph, then write for a human.
A slug that is too clever. /blog/the-first-impression loses to /blog/first-200-tokens-ai-citation. Fix: rename the file and emit a 301 if the URL is already live.
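How you emit the 301 depends on your stack. As one illustration, here is a minimal WSGI sketch in standard-library Python that permanently redirects the old slug to the new one. The `REDIRECTS` table and the name `redirect_app` are assumptions for this example; most sites would configure the same mapping in their web server or CMS instead.

```python
# Old path -> new path; extend this table as slugs are renamed.
REDIRECTS = {
    "/blog/the-first-impression": "/blog/first-200-tokens-ai-citation",
}


def redirect_app(environ, start_response):
    """WSGI app that answers known old paths with a permanent redirect."""
    path = environ.get("PATH_INFO", "")
    target = REDIRECTS.get(path)
    if target is not None:
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

To try it locally, serve it with `wsgiref.simple_server.make_server("", 8000, redirect_app)` and request the old path.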
Going further
The opener is the strongest lever, not the only one. Freshness matters: AI-cited content is on average 393 to 458 days newer than the top-ranked Google results for the same query, and pages not updated within 90 days lose citations three times faster. Domain authority matters: sites with more than 32,000 referring domains are 3.5 times more likely to be cited than sites with fewer than 200. Neither replaces a strong first 200 tokens, but both amplify a strong one. For the broader playbook, see the 12 GEO best practices that actually move citation rates in 2026 and the underlying definition of GEO versus SEO.