Move off WordPress without dropping a single ranking URL
Content exported via the REST API into typed MDX, media rehosted on Cloudflare R2 behind stable URLs, every legacy slug preserved through a 301 map on Vercel Edge, sitemap and RSS continuity verified before cutover. The new site renders in 80 ms instead of 1.4 s and the organic traffic curve stays flat through the switch.
The problem
WordPress migrations break because nobody plans the URL graph
The pattern that loses organic traffic looks like this. A team rebuilds the marketing site in Next.js, ships it on a Friday, points the DNS at Vercel, and watches Search Console light up red on Monday. Every old URL returns 404. The team scrambles to bolt on redirects, but the slug history (years of permalinks, attachment URLs, paginated archives, category and tag pages) was never inventoried. Rankings that took three years to earn disappear in a week. We treat the migration as a graph problem first: every URL the old site served gets an entry in a 301 map, every embedded image gets a stable new home, every RSS subscriber keeps receiving posts. Content moves into typed MDX so the editorial team writes in their tooling and the developer sees real diffs in pull requests. The new build renders in under 200 ms because there is no database query in the request path, and the SEO curve stays flat across the cutover.
Our approach
The seven steps of a WordPress migration that keeps rankings
The order matters. Inventory before export, export before transform, transform before redirects, redirects before cutover. Each step closes a class of bug that bites teams who try to compress two steps into one.
- 01
Inventory the URL graph
Crawl the live WordPress site with a real crawler (Screaming Frog, Sitebulb, or a Playwright script) and produce a CSV of every indexed URL. Pages, posts, categories, tags, attachment pages, paginated archives, author pages, custom post types. The CSV is the contract: nothing on it is allowed to 404 after cutover. This step alone catches the URLs nobody remembered: a 2017 press release, a category page that ranks for a buyer query, an attachment URL someone linked from Twitter.
- 02
Export content via the WordPress REST API
We pull posts, pages and custom post types through `/wp-json/wp/v2/*` with pagination, including raw HTML, Gutenberg block JSON when present, Yoast or RankMath SEO fields, featured image IDs and taxonomies. The export script writes one JSON file per content item to disk; the JSON is the source we transform from. WP-CLI export is the fallback when the REST API is locked down.
- 03
Transform HTML to MDX with typed frontmatter
Each post becomes an MDX file with frontmatter that mirrors the production schema (`title`, `slug`, `published_at`, `updated_at`, `description`, `category`, `tags`, `cover`, `canonical`, `noindex`). The body is HTML cleaned and converted: shortcodes resolved to React components, `wp-block-*` divs unwrapped, inline styles stripped, image tags rewritten to point at R2. The transform is deterministic; re-running it produces a byte-identical output.
- 04
Migrate media to Cloudflare R2 with stable URLs
Every image, PDF and embed reachable from the WordPress upload directory gets re-uploaded to a single R2 bucket under a content-addressed path. The old `/wp-content/uploads/2023/04/cover.jpg` resolves to the new `https://cdn.adamarant.com/media/<hash>/cover.jpg` via a permanent map. R2 charges no egress fees; the marketing site pays for storage, not for traffic.
- 05
Build the 301 redirect map on Vercel Edge
The URL inventory CSV becomes a `redirects.json` file the Next.js app reads at build time. Vercel serves 301 redirects at the Edge before the Node runtime starts; the latency cost is single-digit milliseconds. Every old slug, attachment page, paginated archive and category alias gets a destination on the new site or a graceful fallback to the parent.
- 06
Mirror sitemap, RSS and structured data
The new site emits a `sitemap.xml` that includes every migrated URL with `lastmod` derived from the original `updated_at`. The RSS feed at `/feed/` keeps the same GUID per post so existing subscribers do not see duplicate entries. JSON-LD `Article` schema with `author`, `datePublished` and `dateModified` lands on every post; the canonical URL on the new domain is set explicitly to avoid duplicate-content debt.
- 07
Cutover with a 24-hour verification window
The day before DNS switch, we run a final crawl against the staging build and diff it against the production inventory. Any URL that fails to resolve fails the deploy. After DNS cut, a synthetic monitor hits the top 200 ranking URLs every five minutes for 24 hours; alerts fire on any 4xx or 5xx. The team is on call until the window closes.
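The pre-cutover diff in step 07 can be sketched as a pure function. This is a minimal illustration, not the production check: the `InventoryRow` and `CrawlResult` shapes and the `diffInventory` name are assumptions made for the example, and the crawl is assumed to record the first status the staging build answers with for each legacy path.

```typescript
// Hypothetical shapes; neither type comes from the original pipeline.
interface InventoryRow {
  path: string // legacy path, e.g. "/category/engineering"
  expected: 200 | 301 | 410 // redirect type recorded in the inventory CSV
}

interface CrawlResult {
  path: string
  status: number // first status the staging build answered with
}

interface InventoryFailure {
  path: string
  expected: number
  actual: number | null // null means the crawl never saw this path
}

// Returns every inventory row the staging crawl did not satisfy.
function diffInventory(
  inventory: InventoryRow[],
  crawl: CrawlResult[],
): InventoryFailure[] {
  const seen = new Map<string, number>()
  for (const result of crawl) seen.set(result.path, result.status)

  const failures: InventoryFailure[] = []
  for (const row of inventory) {
    const actual = seen.has(row.path) ? (seen.get(row.path) as number) : null
    if (actual !== row.expected) {
      failures.push({ path: row.path, expected: row.expected, actual })
    }
  }
  return failures
}
```

A deploy script would exit non-zero whenever the returned array is not empty, which is one way to make "any URL that fails to resolve fails the deploy" enforceable.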
What we deliver
URL inventory CSV
Every URL the WordPress site served, with traffic data from GA4 or Plausible, target URL on the new site, and a redirect type (`301` permanent, `410` gone, or `200` direct render). The artefact the migration is measured against.
REST API export pipeline
A Node script that paginates through `/wp-json/wp/v2/*`, handles authentication, and writes one JSON file per content item. Re-runnable; idempotent on re-export.
HTML-to-MDX transformer
A typed transformer that converts post HTML to MDX, with handlers for Gutenberg blocks, shortcodes, oembeds, and inline styles. Each handler is a unit-tested function; new handlers are PRs.
R2 media bucket and CDN
A Cloudflare R2 bucket with content-addressed paths, a Worker that serves images with `Cache-Control: immutable`, and an old-to-new path map committed to the repo. Zero egress cost on traffic.
301 redirect map
A `redirects.json` consumed by `next.config.ts`, serving at the Vercel Edge with sub-10 ms overhead. Every URL on the inventory is covered; CI fails if a new build drops an entry.
Sitemap and RSS continuity
A dynamic `sitemap.xml` with stable URLs and `lastmod`, an RSS feed that preserves GUIDs per post, and `robots.txt` that allows the LLM and search crawlers the team agreed on.
JSON-LD structured data
`Article` schema on posts, `BreadcrumbList` on every page, `Organization` and `WebSite` sitewide. `Person` schema on author pages with `knowsAbout` and `sameAs`.
Cutover runbook
Step-by-step DNS switch with rollback plan, monitoring setup, and a 24-hour verification window. Co-signed by the engineering and marketing leads before execution.
Synthetic monitoring
A Playwright-based monitor that hits the top 200 URLs every five minutes and asserts on status, title and meta description. Alerts route to PagerDuty or Slack.
GA4 / Plausible continuity
Analytics keep their property; events fire from the same logical points; cross-domain tracking handles the cutover week. Sessions per landing page reconcile within 5%.
Editorial workflow in MDX
Posts become PRs; the editorial team writes in Markdown with a Decap or Sanity layer when preferred. Drafts preview in branch deployments; publish is a merge to `main`.
Post-cutover Search Console review
A weekly Search Console diff for the first month. URLs that lose impressions get a root cause; the redirect map is amended; the team learns what bit the migration.
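Of the deliverables above, the JSON-LD `Article` schema is the easiest to show in a few lines. The sketch below is illustrative only: `PostMeta`, `authorName`, `articleJsonLd` and `toJsonLdScript` are hypothetical names, and the `/blog/<slug>` URL shape is taken from the redirect examples on this page.

```typescript
// Hypothetical post shape; the field names mirror the MDX frontmatter, plus an assumed authorName.
interface PostMeta {
  title: string
  slug: string
  published_at: string
  updated_at: string
  description: string
  authorName: string
}

// Builds the schema.org Article object for one post.
function articleJsonLd(post: PostMeta, siteUrl: string): Record<string, unknown> {
  const url = `${siteUrl}/blog/${post.slug}`
  return {
    '@context': 'https://schema.org',
    '@type': 'Article',
    headline: post.title,
    description: post.description,
    datePublished: post.published_at,
    dateModified: post.updated_at,
    author: { '@type': 'Person', name: post.authorName },
    mainEntityOfPage: url,
    url,
  }
}

// Serialises for a <script type="application/ld+json"> tag. Escaping "<" keeps
// a stray "</script>" inside a title from closing the tag early.
function toJsonLdScript(data: Record<string, unknown>): string {
  return JSON.stringify(data).replace(/</g, '\\u003c')
}
```

The serialised string stays valid JSON, because `\u003c` decodes back to `<` when a crawler parses it.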
Five concrete files that compose a non-breaking WordPress to Next.js migration
A WordPress migration is five files glued together. The REST exporter that pulls content, the HTML-to-MDX transformer that converts it, the R2 uploader that rehouses media, the redirect map that protects rankings, and the sitemap that announces continuity to crawlers. Each one is small, deterministic, and re-runnable.
A WordPress migration is a graph problem before it is a code problem. Every URL the old site served is a contract with a crawler, a backlink, an RSS subscriber or an email sent five years ago. The migration succeeds when every URL on the inventory still resolves to the right destination after cutover. It fails when somebody forgets the attachment pages.
The five files below compose a migration that keeps rankings: the REST exporter, the HTML-to-MDX transformer, the R2 media uploader, the Edge redirect map, and the sitemap that announces continuity to Googlebot.
1. The REST API exporter
WordPress exposes content at `/wp-json/wp/v2/*` with pagination via the `page` and `per_page` query parameters. The exporter walks every endpoint, writes one JSON file per item, and is idempotent on re-run. We export raw HTML for the transformer to consume and metadata fields directly for the frontmatter.
// scripts/migration/export-wordpress.ts
import { writeFile, mkdir } from 'node:fs/promises'
import { join } from 'node:path'

const WP_API = process.env.WP_API_URL // e.g. https://old.example.com/wp-json/wp/v2
const OUT_DIR = './data/wp-export'
const TYPES = ['posts', 'pages', 'categories', 'tags', 'media'] as const

// Every object these endpoints return carries a numeric id.
type WPItem = { id: number } & Record<string, unknown>

async function fetchPaginated(type: string): Promise<WPItem[]> {
  const items: WPItem[] = []
  let page = 1
  // The REST API caps per_page at 100; the X-WP-TotalPages header drives the loop.
  while (true) {
    const res = await fetch(
      `${WP_API}/${type}?per_page=100&page=${page}&_embed=1`,
      { headers: { 'User-Agent': 'adamarant-migration/1.0' } },
    )
    if (!res.ok) throw new Error(`${type} page ${page}: ${res.status}`)
    const batch = (await res.json()) as WPItem[]
    items.push(...batch)
    const totalPages = Number(res.headers.get('X-WP-TotalPages') ?? '1')
    if (page >= totalPages) break
    page += 1
  }
  return items
}

async function main(): Promise<void> {
  if (!WP_API) throw new Error('WP_API_URL is not set')
  for (const type of TYPES) {
    console.log(`exporting ${type}…`)
    const items = await fetchPaginated(type)
    const dir = join(OUT_DIR, type)
    await mkdir(dir, { recursive: true })
    // One file per item, keyed by WordPress id, so a re-run overwrites in place.
    for (const item of items) {
      await writeFile(join(dir, `${item.id}.json`), JSON.stringify(item, null, 2), 'utf8')
    }
    console.log(`  ${items.length} ${type} written`)
  }
}

// Surface failures with a non-zero exit code instead of a floating promise.
main().catch((err) => {
  console.error(err)
  process.exit(1)
})
2. The HTML-to-MDX transformer
Each post in the export becomes an MDX file with typed frontmatter and a body that is HTML stripped of WordPress noise. Inline styles disappear, Gutenberg block wrappers unwrap, shortcodes become React components, image URLs swap to the R2 path. The transformer is a pure function; running it twice produces the same output.
// scripts/migration/transform.ts
import { writeFile, mkdir } from 'node:fs/promises'
import { join } from 'node:path'
import { JSDOM } from 'jsdom'
// media-map.json is the flat old-URL to new-URL object the uploader writes,
// so it is a default import, not a named one.
import mediaMap from './media-map.json' with { type: 'json' }

const MEDIA_MAP: Record<string, string> = mediaMap

interface WPPost {
  id: number
  slug: string
  date_gmt: string
  modified_gmt: string
  guid: { rendered: string }
  title: { rendered: string }
  excerpt: { rendered: string }
  content: { rendered: string }
  categories: number[]
  tags: number[]
  featured_media: number
  yoast_head_json?: { canonical?: string; og_description?: string }
}

function rewriteImages(html: string): string {
  const dom = new JSDOM(`<body>${html}</body>`)
  const doc = dom.window.document
  doc.querySelectorAll('img').forEach((img) => {
    const src = img.getAttribute('src') ?? ''
    const newSrc = MEDIA_MAP[src]
    if (newSrc) img.setAttribute('src', newSrc)
    img.removeAttribute('srcset') // R2 + next/image handles this
    img.removeAttribute('sizes')
  })
  doc.querySelectorAll('[style]').forEach((el) => el.removeAttribute('style'))
  // Unwrap Gutenberg wrappers: hoist the children, then drop the empty shell.
  doc.querySelectorAll('.wp-block-image, .wp-block-paragraph').forEach((el) => {
    while (el.firstChild) el.parentNode?.insertBefore(el.firstChild, el)
    el.remove()
  })
  return doc.body.innerHTML
}

function toFrontmatter(post: WPPost, categorySlug: string): string {
  const lines: (string | null)[] = [
    '---',
    `title: ${JSON.stringify(post.title.rendered)}`,
    `slug: ${post.slug}`,
    `published_at: ${post.date_gmt}Z`,
    `updated_at: ${post.modified_gmt}Z`,
    `description: ${JSON.stringify(post.excerpt.rendered.replace(/<[^>]+>/g, '').trim())}`,
    `category: ${categorySlug}`,
    post.yoast_head_json?.canonical
      ? `canonical: ${post.yoast_head_json.canonical}`
      : null,
    // The RSS feed reads this back as <guid>, so it has to survive the export.
    `wp_guid: ${JSON.stringify(post.guid.rendered)}`,
    '---',
    '',
  ]
  // Drop only the null placeholder. filter(Boolean) would also drop the
  // trailing '' and glue the body onto the closing '---'.
  return lines.filter((line): line is string => line !== null).join('\n')
}

export async function transformPost(
  post: WPPost,
  categorySlug: string,
  outDir: string,
): Promise<void> {
  // Note: innerHTML serialises void tags (<img>, <br>) without a closing slash;
  // MDX parses them as JSX, so a later pass has to self-close them.
  const body = rewriteImages(post.content.rendered)
  const mdx = toFrontmatter(post, categorySlug) + body
  await mkdir(outDir, { recursive: true })
  await writeFile(join(outDir, `${post.slug}.mdx`), mdx, 'utf8')
}
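The prose for this step says shortcodes become React components, which the transformer above does not show. One way that pass could look is sketched below, under several assumptions: the shortcodes are still present in the exported HTML (WordPress normally expands them in `content.rendered` while the plugin is active), their attributes use straight double quotes, and the `button` and `callout` tags with their `Button` and `Callout` components are hypothetical names.

```typescript
// Hypothetical mapping from shortcode tag to React component name.
const SHORTCODE_COMPONENTS: Record<string, string> = {
  button: 'Button',
  callout: 'Callout',
}

// Matches paired shortcodes such as [button url="/x"]Label[/button].
const PAIRED_SHORTCODE = /\[(\w+)((?:\s+\w+="[^"]*")*)\s*\]([\s\S]*?)\[\/\1\]/g

// Rewrites known shortcodes to JSX and leaves unknown ones untouched,
// so they stay visible for manual review.
function resolveShortcodes(html: string): string {
  return html.replace(
    PAIRED_SHORTCODE,
    (match: string, tag: string, attrs: string, inner: string) => {
      const component = SHORTCODE_COMPONENTS[tag]
      if (!component) return match
      return `<${component}${attrs}>${inner}</${component}>`
    },
  )
}
```

Because the function is pure, it can run before `rewriteImages` without affecting the determinism of the transform. Self-closing and nested shortcodes are out of scope for this sketch.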
3. The R2 media uploader
Every image in the WordPress upload directory moves to a single R2 bucket under a content-addressed path. The hash makes the URL stable forever; cache invalidation stops being a problem because the URL changes when the bytes change. The uploader emits a map from old URL to new URL that the transformer reads.
// scripts/migration/upload-media.ts
import { createHash } from 'node:crypto'
import { writeFile } from 'node:fs/promises'
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3'

const r2 = new S3Client({
  region: 'auto',
  endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
})

interface MediaItem {
  source_url: string
  mime_type: string
}

async function uploadOne(item: MediaItem): Promise<[string, string]> {
  const res = await fetch(item.source_url)
  if (!res.ok) throw new Error(`fetch ${item.source_url}: ${res.status}`)
  const buf = Buffer.from(await res.arrayBuffer())
  const hash = createHash('sha256').update(buf).digest('hex').slice(0, 16)
  // Keep the original filename after the hash: media/<hash>/cover.jpg.
  // Parsing the URL first keeps any query string out of the key.
  const filename = new URL(item.source_url).pathname.split('/').pop() || 'file'
  const key = `media/${hash}/${filename}`
  await r2.send(
    new PutObjectCommand({
      Bucket: process.env.R2_BUCKET!,
      Key: key,
      Body: buf,
      ContentType: item.mime_type,
      CacheControl: 'public, max-age=31536000, immutable',
    }),
  )
  return [item.source_url, `https://cdn.adamarant.com/${key}`]
}

export async function uploadAll(items: MediaItem[]): Promise<void> {
  const map: Record<string, string> = {}
  // Concurrency of 8 keeps R2 happy and the local network saturated.
  const queue = [...items]
  const workers = Array.from({ length: 8 }, async () => {
    while (queue.length) {
      const item = queue.shift()!
      const [oldUrl, newUrl] = await uploadOne(item)
      map[oldUrl] = newUrl
    }
  })
  await Promise.all(workers)
  // Workers finish in arbitrary order; sorting the keys keeps the committed
  // map byte-identical between runs.
  const sorted = Object.fromEntries(
    Object.entries(map).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)),
  )
  await writeFile(
    './scripts/migration/media-map.json',
    JSON.stringify(sorted, null, 2),
    'utf8',
  )
}
4. The Edge redirect map
Every URL on the inventory CSV becomes an entry in `redirects.json`. Next.js consumes it at build time and Vercel serves the redirects from the Edge, before the Node runtime is involved. The 301 latency is in the single-digit milliseconds; the rankings carry over because Googlebot honours permanent redirects within the same domain reliably.
// next.config.ts
import type { NextConfig } from 'next'
import redirectsFromInventory from './data/redirects.json' with { type: 'json' }

interface RedirectEntry {
  source: string
  destination: string
  permanent: true
  // Next.js matches `source` against the path only; query strings go in `has`.
  has?: { type: 'query'; key: string; value?: string }[]
}

const config: NextConfig = {
  async redirects() {
    // The inventory CSV is processed offline into redirects.json; CI fails if
    // the count drops below the previous build (rankings are downstream of
    // this contract).
    // Next.js answers `permanent: true` with a 308; `statusCode: 301` keeps
    // the literal 301 the inventory promises.
    return (redirectsFromInventory as RedirectEntry[]).map((r) => ({
      source: r.source,
      destination: r.destination,
      statusCode: 301 as const,
      ...(r.has ? { has: r.has } : {}),
    }))
  },
}

export default config

// data/redirects.json (excerpt)
[
  { "source": "/2023/04/how-we-think-about-design-systems", "destination": "/blog/how-we-think-about-design-systems", "permanent": true },
  { "source": "/category/engineering", "destination": "/blog/category/engineering", "permanent": true },
  { "source": "/", "has": [{ "type": "query", "key": "p", "value": "1247" }], "destination": "/blog/multi-tenant-saas-architecture", "permanent": true },
  { "source": "/wp-content/uploads/2023/04/cover.jpg", "destination": "https://cdn.adamarant.com/media/<hash>/cover.jpg", "permanent": true },
  { "source": "/author/ricardo", "destination": "/authors/ricardo", "permanent": true },
  { "source": "/feed/", "destination": "/feed.xml", "permanent": true }
]
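The offline step that turns inventory rows into redirect entries could look roughly like this. It is a sketch under assumptions: `InventoryEntry` with `oldUrl` and `newPath` columns is a hypothetical CSV shape, and `toRedirectRule` is an illustrative name. The one detail worth showing is that legacy query-string permalinks such as `/?p=1247` cannot live in `source`, because Next.js matches `source` against the path only; the query parameters move into a `has` condition instead.

```typescript
// Hypothetical inventory columns; the real CSV may name them differently.
interface InventoryEntry {
  oldUrl: string // absolute or root-relative legacy URL
  newPath: string // destination on the new site
}

interface QueryCondition {
  type: 'query'
  key: string
  value: string
}

interface RedirectRule {
  source: string
  destination: string
  permanent: true
  has?: QueryCondition[]
}

// Converts one inventory row into a redirects.json entry.
function toRedirectRule(entry: InventoryEntry): RedirectRule {
  // The base is only used to parse root-relative URLs; its host is discarded.
  const parsed = new URL(entry.oldUrl, 'https://old.example.com')

  const conditions: QueryCondition[] = []
  parsed.searchParams.forEach((value, key) => {
    conditions.push({ type: 'query', key, value })
  })

  const rule: RedirectRule = {
    source: parsed.pathname,
    destination: entry.newPath,
    permanent: true,
  }
  if (conditions.length > 0) rule.has = conditions
  return rule
}
```

Mapping the whole inventory through this function and writing the result with `JSON.stringify` yields a file in the shape of the excerpt above.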
5. The sitemap, the RSS feed and the cutover monitor
The sitemap is dynamic; it reads the MDX directory at build time and emits one entry per post with `lastmod` derived from frontmatter. The RSS feed preserves the GUID per post so existing subscribers do not get a wall of "new" items on cutover. A Playwright monitor hits the top 200 URLs every five minutes for 24 hours after DNS switch.
// app/sitemap.ts
import type { MetadataRoute } from 'next'
import { getAllPosts } from '@/lib/blog'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await getAllPosts()
  const site = 'https://adamarant.com'
  return [
    { url: site, lastModified: new Date(), changeFrequency: 'weekly', priority: 1 },
    { url: `${site}/blog`, lastModified: new Date(), changeFrequency: 'weekly', priority: 0.9 },
    ...posts.map((p) => ({
      url: `${site}/blog/${p.slug}`,
      lastModified: new Date(p.updated_at),
      changeFrequency: 'monthly' as const,
      priority: 0.7,
    })),
  ]
}

// app/feed.xml/route.ts
import { getAllPosts } from '@/lib/blog'

export async function GET(): Promise<Response> {
  const posts = await getAllPosts()
  const items = posts
    .map(
      (p) => `
    <item>
      <title><![CDATA[${p.title}]]></title>
      <link>https://adamarant.com/blog/${p.slug}</link>
      <!-- GUID preserved from WordPress export so subscribers see no duplicates -->
      <guid isPermaLink="false">${p.wp_guid}</guid>
      <pubDate>${new Date(p.published_at).toUTCString()}</pubDate>
      <description><![CDATA[${p.description}]]></description>
    </item>`,
    )
    .join('\n')
  const xml = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Adamarant</title>
    <link>https://adamarant.com</link>
    <description>Design and engineering studio</description>
    <language>en</language>
${items}
  </channel>
</rss>`
  return new Response(xml, {
    headers: { 'Content-Type': 'application/xml; charset=utf-8' },
  })
}
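// scripts/monitor/cutover.ts
// Performs a single pass over the list; a scheduler is expected to invoke it
// every five minutes during the verification window.
import { chromium, type Browser } from 'playwright'
import topUrls from './top-200.json' with { type: 'json' }

interface CheckResult {
  url: string
  status: number
  title: string
}

async function check(browser: Browser, url: string): Promise<CheckResult> {
  const page = await browser.newPage()
  try {
    const res = await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 })
    return { url, status: res?.status() ?? 0, title: await page.title() }
  } catch {
    // Timeouts and network errors count as a failed URL instead of aborting the run.
    return { url, status: 0, title: '' }
  } finally {
    await page.close()
  }
}

async function main(): Promise<void> {
  // One shared browser, URLs checked in sequence: launching 200 Chromium
  // instances at once would exhaust the monitor host.
  const browser = await chromium.launch()
  const results: CheckResult[] = []
  try {
    for (const url of topUrls as string[]) results.push(await check(browser, url))
  } finally {
    await browser.close()
  }
  const bad = results.filter((r) => r.status >= 400 || !r.title)
  if (bad.length > 0) {
    console.error(`FAIL: ${bad.length} bad URLs`, bad)
    process.exit(1)
  }
  console.log(`OK: ${results.length} URLs healthy`)
}

void main()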
6. What this composes
The exporter pulls the WordPress content. The transformer converts it to typed MDX. The uploader rehouses media on R2 under stable URLs. The redirect map serves every old URL at the Edge with a 301. The sitemap and RSS announce continuity to crawlers and subscribers. The cutover monitor catches the bugs nobody saw in staging.
WordPress stops being the production system. Next.js renders the marketing site in under 200 ms because the database is gone from the request path. The build pipeline replaces the WordPress admin: editorial works in MDX (or a CMS layer on top of MDX); developers see real diffs in pull requests. Search Console stays flat through the cutover because the URL graph survived intact. The migration was a graph problem first; the rendering speed was a side effect of doing it right.
Frequently asked questions
How long does a typical WordPress to Next.js migration take?
For a marketing site with 100 to 500 posts and standard taxonomies, three to four weeks from kickoff to DNS cutover. The first week is inventory and export; the second is transform and redirect mapping; the third is staging build and crawler verification; the fourth is cutover and the 24-hour monitoring window. Sites with custom plugins, WooCommerce or multilingual setups extend by one to two weeks per axis of complexity.
What happens to plugin functionality we relied on (forms, search, comments)?
Each plugin gets replaced or dropped based on real usage. Forms move to Resend with a typed Server Action; site search moves to Algolia, Typesense or Postgres full-text depending on volume; comments either retire (most marketing sites) or move to a moderated thread on Linear / Discord. The decision happens during inventory, not after the fact.
Will we lose SEO rankings during the migration?
Not if the URL inventory is complete and the 301 map is correct. The pattern that loses rankings is forgetting URLs (an attachment page, a paginated archive on page 7, an old category alias). The pattern that keeps rankings is treating every URL on the inventory as a contract. We have run migrations with sub-5% organic traffic variance through cutover week.
Can the editorial team still write posts after the migration?
Yes, in two ways. Either as MDX pull requests directly (best for technical teams that already use GitHub), or through a headless CMS layer like Decap, Sanity or Sveltia mounted on the same MDX files (best for non-technical editors). The MDX in the repo is the source of truth; the CMS is a UI on top.
How do you handle WooCommerce or membership plugins?
Out of scope for a marketing-to-Next.js migration; in scope for a full SaaS rebuild. If the site has e-commerce or memberships, we either keep WooCommerce on a subdomain (woocommerce.example.com) and migrate only the marketing content, or rebuild the commerce layer on Stripe and Supabase. Either path is decided in the first scoping call.
How do you migrate Gutenberg blocks that have no Next.js equivalent?
We catalogue every block used on the site during inventory. The high-volume ones (paragraph, heading, image, list, quote, embed) map to MDX directly. The lower-volume ones (galleries, accordions, columns) become React components with the same name. The long-tail (15+ rarely-used blocks) gets flattened to HTML on export and reviewed manually for the posts that use them.
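A block catalogue of this kind can be built by counting Gutenberg block delimiters. The sketch below assumes access to the raw post content, where blocks appear as HTML comments such as `<!-- wp:image {"id":12} -->`; the rendered HTML does not carry them, so the export would need the REST API's `context=edit` with authentication, or a database dump. `countBlocks` and the sample block names are illustrative.

```typescript
// Matches block openers such as <!-- wp:paragraph --> or the self-closing
// <!-- wp:acme/pricing-table /-->. Closers start with "/wp:" and are skipped.
const BLOCK_OPENER = /<!--\s+wp:([a-z][a-z0-9-]*(?:\/[a-z][a-z0-9-]*)?)[\s/]/g

// Counts how often each block type appears across all raw post bodies.
function countBlocks(rawPosts: string[]): Map<string, number> {
  const counts = new Map<string, number>()
  for (const raw of rawPosts) {
    BLOCK_OPENER.lastIndex = 0 // the regex is global, so reset it per post
    let found = BLOCK_OPENER.exec(raw)
    while (found !== null) {
      const blockName = found[1]
      counts.set(blockName, (counts.get(blockName) ?? 0) + 1)
      found = BLOCK_OPENER.exec(raw)
    }
  }
  return counts
}
```

Sorting the resulting map by count gives the split between high-volume blocks, lower-volume blocks that deserve a React component, and the long tail that gets flattened to HTML.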
What about multilingual WordPress (WPML, Polylang)?
We extend the inventory and the transformer per locale. Each language gets its own MDX directory; `hreflang` alternate links are emitted from `generateMetadata` through its `alternates.languages` field (as `<link rel="alternate">` tags in the page head, not as HTTP headers); the URL structure on the new site mirrors the WordPress convention or moves to a cleaner `/<lang>/` prefix with a redirect per old URL. Multilingual sites add one to two weeks of work versus single-language.
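As a rough illustration of the `hreflang` part, the helper below builds the `languages` map that `generateMetadata` can return under `alternates`. The `hreflangAlternates` name, the locale list and the `/<lang>/blog/<slug>` URL shape are assumptions made for this example.

```typescript
// Builds the value for the `alternates` field of a Next.js metadata object:
// one absolute URL per locale, keyed by language code.
function hreflangAlternates(
  slug: string,
  locales: string[],
  siteUrl: string,
): { languages: Record<string, string> } {
  const languages: Record<string, string> = {}
  for (const locale of locales) {
    languages[locale] = `${siteUrl}/${locale}/blog/${slug}`
  }
  return { languages }
}

// Usage inside a route's generateMetadata (shown as a comment to keep the
// sketch free of framework imports):
//
//   export async function generateMetadata({ params }) {
//     return { alternates: hreflangAlternates(params.slug, ['en', 'es'], 'https://example.com') }
//   }
```

Every locale variant of a post should list the same set of URLs, including its own, so the helper is called with the full locale list on each page.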
Can we keep the same domain or do we need to switch?
Same domain. The DNS cuts from WordPress hosting to Vercel; the domain stays. The HTTPS certificate transitions; we coordinate the cut with a short, planned downtime window (typically under 60 seconds) or no downtime at all using a DNS warm-up strategy. Subdomain staging (`new.example.com`) is the verification environment; the production domain only sees the new site after the URL inventory passes.
Plan your WordPress migration
A scoping call, a URL inventory in the first reply, a number we will not change once we agree on scope. Three to four weeks from kickoff to a faster site that kept its rankings.