Redis on HTTP, priced per request, with no connection pool to babysit
Sliding-window rate limits at the edge, idempotency tokens that survive retries, cache-aside reads with stale-while-revalidate. We wire Upstash into the application so the operational story matches the serverless deploy: no connections to manage, no provisioned capacity to plan for, no PagerDuty for Redis.
Why this stack
HTTP API means it works from the edge
Traditional Redis needs a long-lived TCP connection. Serverless functions are frozen and recycled between invocations, so pooled connections go stale, and many edge runtimes restrict or lack raw TCP sockets. Upstash exposes Redis over HTTP/REST, so the same code that runs in Node also runs at the Cloudflare or Vercel edge with zero changes.
Pay per request, not for provisioned capacity
A normal Redis instance bills by node size and runs whether you use it or not. Upstash bills by command count. For bursty workloads (rate limiting, idempotency, session lookup) this is the right shape. We model the cost up front based on real request patterns.
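The up-front cost model is plain arithmetic on request patterns. A minimal sketch, assuming a flat per-command price that the caller supplies; the function names, the traffic figures and the price used below are placeholders for illustration, not Upstash's actual rates.

```typescript
// Hypothetical cost model: every name and number here is illustrative.
interface TrafficProfile {
  requestsPerDay: number     // application requests that touch Redis
  commandsPerRequest: number // Redis commands issued per request
}

// Total Redis commands in a month for the given traffic profile.
function monthlyCommands(p: TrafficProfile, daysPerMonth = 30): number {
  return p.requestsPerDay * p.commandsPerRequest * daysPerMonth
}

// Monthly bill for a flat price per 100,000 commands (price supplied by the caller).
function monthlyCost(p: TrafficProfile, pricePer100kCommands: number): number {
  return (monthlyCommands(p) / 100_000) * pricePer100kCommands
}

const profile: TrafficProfile = { requestsPerDay: 1_000_000, commandsPerRequest: 3 }
console.log(monthlyCommands(profile)) // 90000000
console.log(monthlyCost(profile, 0.2)) // placeholder price, not a quote
```

The useful input is `commandsPerRequest`: a route that runs three limiter checks and one idempotency lookup issues several commands per request, which is what the bill tracks.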
Built-in rate limiting library
`@upstash/ratelimit` ships with sliding-window, token-bucket and fixed-window primitives, implemented as Lua scripts that execute atomically inside Redis. The implementation is correct under concurrency and the API is one line at the call site.
Global replication when you need it
Upstash global databases replicate writes across regions and serve reads from the nearest replica. For a multi-region deploy that needs the same rate-limit and session view everywhere, this collapses an entire infrastructure problem into a config flag.
No connection management is a feature
HTTP is stateless. There is no connection pool to size, no warm-up step on cold start, no leaked connection that pages someone at 3am. The serverless function makes an HTTP request and either gets a result or an error. That is the entire operational surface.
What we build with it
Rate limiting per IP, per user, per tenant
Sliding-window or token-bucket per case, distinct keys per dimension, fail-open or fail-closed configured per route, X-RateLimit-* response headers so clients can back off.
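The response headers can be derived directly from a limiter result. A minimal sketch, assuming the result exposes `limit`, `remaining` and `reset` (a Unix timestamp in milliseconds), as the object returned by `limit()` in `@upstash/ratelimit` does. The helper name `rateLimitHeaders` and the choice to report the reset time in epoch seconds are our assumptions; header conventions vary between APIs.

```typescript
// Fields used from a limiter result; `reset` is a Unix timestamp in milliseconds.
interface LimitResult {
  success: boolean
  limit: number
  remaining: number
  reset: number
}

// Hypothetical helper: maps a limiter result to conventional rate-limit headers.
function rateLimitHeaders(r: LimitResult, nowMs: number = Date.now()): Record<string, string> {
  const headers: Record<string, string> = {
    'X-RateLimit-Limit': String(r.limit),
    'X-RateLimit-Remaining': String(Math.max(0, r.remaining)),
    'X-RateLimit-Reset': String(Math.ceil(r.reset / 1000)), // seconds since epoch
  }
  if (!r.success) {
    // Tell a blocked client how many whole seconds to wait before retrying.
    headers['Retry-After'] = String(Math.max(0, Math.ceil((r.reset - nowMs) / 1000)))
  }
  return headers
}
```

In a Route Handler the returned record can be passed straight into `new Response(body, { status: 429, headers })`.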
Session storage with TTL
Cookie-backed sessions stored in Redis with explicit TTL, rotation on privilege change, server-side revocation that propagates within milliseconds.
Cache-aside for slow queries
A typed `getOrSet` helper that reads from Redis, falls through to the source, writes back with TTL, supports stale-while-revalidate for resilience.
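A minimal sketch of what such a `getOrSet` helper can look like. The `KV` interface is a hypothetical stand-in that mirrors the shape of `get` and `set` in `@upstash/redis`; the envelope format, option names and the injectable clock are our assumptions, not a fixed API.

```typescript
// Minimal subset of the Redis client this sketch relies on (hypothetical interface;
// the method shapes mirror `get` / `set` from @upstash/redis).
interface KV {
  get<T>(key: string): Promise<T | null>
  set(key: string, value: unknown, opts?: { ex?: number }): Promise<unknown>
}

interface Envelope<T> {
  value: T
  freshUntil: number // epoch ms after which the value counts as stale
}

interface CacheOptions {
  freshSeconds: number // serve without revalidating
  staleSeconds: number // extra window in which stale data may still be served
}

async function getOrSet<T>(
  kv: KV,
  key: string,
  load: () => Promise<T>,
  opts: CacheOptions,
  now: () => number = Date.now,
): Promise<T> {
  const refresh = async (): Promise<T> => {
    const value = await load()
    const envelope: Envelope<T> = { value, freshUntil: now() + opts.freshSeconds * 1000 }
    // Redis drops the key once both the fresh and the stale window have passed.
    await kv.set(key, envelope, { ex: opts.freshSeconds + opts.staleSeconds })
    return value
  }

  const cached = await kv.get<Envelope<T>>(key)
  if (cached === null) return refresh() // miss: fall through to the source
  if (now() >= cached.freshUntil) {
    // Stale: answer immediately and revalidate in the background.
    void refresh().catch(() => undefined)
  }
  return cached.value
}
```

On serverless platforms a background promise can be cut off when the response is sent, so the revalidation may need to be handed to the platform's `waitUntil`-style hook where one exists.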
Idempotency tokens
A unique key per logical request, recorded with the response, replayed verbatim on retry. Combined with the Stripe webhook pattern, this is the production-grade dedupe layer.
Edge Config patterns
Feature flags, allowlists, kill switches read from Redis at the edge in single-digit milliseconds. No central service to call, no JWT to validate.
Short-lived job queues
List-based queues for short-running work (email send, image transform, search reindex) with a worker that polls or a webhook that drains. Heavy-duty queues stay on a real queue system.
Counters for analytics
Atomic increment counters per tenant, per route, per metric. Aggregated into time buckets, flushed periodically to the analytics warehouse.
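A minimal sketch of the time-bucketed counter idea, assuming hourly UTC buckets and a 48-hour TTL on each bucket. The key layout, the helper names and the `CounterStore` interface (which mirrors `incr` and `expire` in `@upstash/redis`) are illustrative choices, not a fixed scheme.

```typescript
interface CounterDims {
  tenantId: string
  route: string
  metric: string
}

// Hour-granularity bucket in UTC, e.g. "2024-01-01T10".
function hourBucket(at: Date): string {
  return at.toISOString().slice(0, 13)
}

// Hypothetical key layout: one counter per tenant, route, metric and hour.
function counterKey(d: CounterDims, at: Date): string {
  return `ctr:${d.tenantId}:${d.route}:${d.metric}:${hourBucket(at)}`
}

// Minimal client subset; shapes mirror `incr` / `expire` in @upstash/redis.
interface CounterStore {
  incr(key: string): Promise<number>
  expire(key: string, seconds: number): Promise<unknown>
}

async function bump(store: CounterStore, d: CounterDims, at: Date = new Date()): Promise<number> {
  const key = counterKey(d, at)
  const count = await store.incr(key) // INCR is atomic on the server
  // First write to a bucket: set a TTL so unflushed buckets do not pile up.
  if (count === 1) await store.expire(key, 48 * 60 * 60)
  return count
}
```

The periodic flush job can then enumerate the buckets for a closed hour, copy the totals to the warehouse and let the TTL clean up.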
Distributed lock primitive
Locks built on SET with NX and an expiry, plus fencing tokens, for critical sections that cannot run concurrently across function invocations.
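A minimal sketch of such a lock. It takes the lock and issues the fencing token in a single Lua script, so an expiring lock cannot interleave with token assignment, and it releases with a compare-and-delete script. The `ScriptRunner` interface mirrors the shape of `eval` in `@upstash/redis`; the key names, helper names and caller-supplied `owner` id (for example a UUID) are our assumptions.

```typescript
// Minimal client subset; the shape mirrors `eval` in @upstash/redis.
interface ScriptRunner {
  eval(script: string, keys: string[], args: string[]): Promise<unknown>
}

// Take the lock with SET NX PX and, only if that succeeded, bump the fencing counter.
// Returns 0 when the lock is held by someone else, else the new fencing token.
const ACQUIRE_SCRIPT = `
local ok = redis.call("set", KEYS[1], ARGV[1], "NX", "PX", ARGV[2])
if not ok then return 0 end
return redis.call("incr", KEYS[2])`

// Delete the lock only if the caller still owns it.
const RELEASE_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
  return redis.call("del", KEYS[1])
end
return 0`

interface Lock {
  owner: string
  fencingToken: number // increases with every acquisition; pass it to the protected resource
}

async function acquire(
  store: ScriptRunner,
  name: string,
  owner: string, // unique per invocation, e.g. a UUID generated by the caller
  ttlMs: number,
): Promise<Lock | null> {
  const token = await store.eval(
    ACQUIRE_SCRIPT,
    [`lock:${name}`, `lock:${name}:fence`],
    [owner, String(ttlMs)],
  )
  if (typeof token !== 'number' || token < 1) return null // someone else holds the lock
  return { owner, fencingToken: token }
}

async function release(store: ScriptRunner, name: string, lock: Lock): Promise<boolean> {
  const deleted = await store.eval(RELEASE_SCRIPT, [`lock:${name}`], [lock.owner])
  return deleted === 1
}
```

The lock alone only narrows the window for overlap, because a slow holder can outlive its TTL. The fencing token closes that gap when the protected resource rejects any write carrying a token lower than the highest one it has seen.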
Multi-region replication setup
Global database configured, read regions selected, write region documented, fallback behaviour spelled out in the runbook.
Cost monitoring + alerting
Daily query volume against quota, alerts at 80 percent of budget, monthly trend reports tied to the admin dashboard.
Migration from self-hosted or ElastiCache
Existing keyspace exported, replayed into Upstash, application config switched, old cluster decommissioned with a rollback window.
Upstash Workflow for durable execution
Long-running, retry-safe workflows defined as code and orchestrated by Upstash. Used where a background job has multiple steps that must all complete eventually.
Sliding-window rate limiting and idempotency in one Server Action
One helper rate-limits the call by IP, user and tenant; another reads or writes the idempotency record. The Server Action runs both before touching Stripe or the database, so a retried POST is still counted by the limiter but replays the recorded response instead of writing twice.
Most "Upstash quickstart" guides show you a single GET and SET. The pattern that earns Upstash its place in a serverless stack is the one below: a sliding-window rate limiter and an idempotency-token check, both running inside a Server Action, both backed by the same Redis instance, with the Stripe webhook pattern from earlier in the stack borrowing the same idempotency primitive.
1. The rate limiter
@upstash/ratelimit keeps the algorithm correct by running it as a Lua script inside Redis, so each check is atomic. The application code is one line at the call site.
// src/lib/upstash/ratelimit.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_REST_URL!,
  token: process.env.UPSTASH_REDIS_REST_TOKEN!,
})

// 20 requests per 10 seconds per IP address.
export const ipLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(20, '10 s'),
  prefix: 'rl:ip',
  analytics: true,
})

// 100 requests per minute per signed-in user.
export const userLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(100, '60 s'),
  prefix: 'rl:user',
  analytics: true,
})

// Token bucket per tenant: refills 1000 tokens every 60 seconds, holds at most 1000.
export const tenantLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.tokenBucket(1000, '60 s', 1000),
  prefix: 'rl:tenant',
  analytics: true,
})

// Checks run in order and stop at the first failure, so a request blocked
// by IP does not consume user or tenant quota.
export async function checkRate(opts: {
  ip: string
  userId: string
  tenantId: string
}): Promise<
  | { ok: true }
  | { ok: false; scope: 'ip' | 'user' | 'tenant'; reset: number }
> {
  const ipResult = await ipLimiter.limit(opts.ip)
  if (!ipResult.success) return { ok: false, scope: 'ip', reset: ipResult.reset }

  const userResult = await userLimiter.limit(opts.userId)
  if (!userResult.success) return { ok: false, scope: 'user', reset: userResult.reset }

  const tenantResult = await tenantLimiter.limit(opts.tenantId)
  if (!tenantResult.success) return { ok: false, scope: 'tenant', reset: tenantResult.reset }

  return { ok: true }
}
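The limiter calls above throw if Redis is unreachable. The fail-open or fail-closed choice per route can be layered on top without touching the limiter itself. A minimal sketch, assuming a hypothetical wrapper named `limitWithPolicy` with a default timeout of one second; neither is part of `@upstash/ratelimit`.

```typescript
// Result fields used here; `limit()` in @upstash/ratelimit resolves to a superset.
interface Verdict {
  success: boolean
  reset: number
}

type OnError = 'open' | 'closed'

// Hypothetical wrapper: runs a limiter check and applies a per-route policy
// when the check throws or takes longer than `timeoutMs`.
async function limitWithPolicy(
  check: () => Promise<Verdict>,
  onError: OnError,
  timeoutMs = 1000,
): Promise<Verdict> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('rate-limit check timed out')), timeoutMs)
  })
  try {
    return await Promise.race([check(), timeout])
  } catch {
    // Redis unreachable or too slow: apply the route's policy instead of throwing.
    return { success: onError === 'open', reset: Date.now() }
  } finally {
    if (timer !== undefined) clearTimeout(timer)
  }
}
```

A public read route might call `limitWithPolicy(() => ipLimiter.limit(ip), 'open')`, while a payment route would pass `'closed'` and reject traffic it cannot verify.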
2. The idempotency helper
A unique key encodes the request shape; the response is cached against it. A retry with the same key returns the original response without re-running the work.
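// src/lib/upstash/idempotency.ts
import { Redis } from '@upstash/redis'

const redis = new Redis({
  url: process.env.UPSTASH_REDIS_REST_URL!,
  token: process.env.UPSTASH_REDIS_REST_TOKEN!,
})

const TTL_SECONDS = 24 * 60 * 60
const PENDING_TTL_SECONDS = 60
const ERROR_TTL_SECONDS = 60

type RecordedResult<T> =
  | { status: 'pending' }
  | { status: 'success'; response: T }
  | { status: 'error' }

export async function withIdempotency<T>(
  key: string,
  run: () => Promise<T>,
): Promise<T> {
  const fullKey = `idem:${key}`

  // Claim the key atomically. SET NX succeeds for exactly one caller, so two
  // concurrent requests with the same key cannot both run the work.
  const claimed = await redis.set(
    fullKey,
    { status: 'pending' } satisfies RecordedResult<T>,
    { nx: true, ex: PENDING_TTL_SECONDS },
  )

  if (claimed === null) {
    // Someone else owns the key: replay their outcome instead of re-running.
    const existing = await redis.get<RecordedResult<T>>(fullKey)
    if (existing?.status === 'success') return existing.response
    if (existing?.status === 'error') {
      throw new Error('previous attempt failed; retry with a different key')
    }
    throw new Error('a request with this key is still in progress; retry shortly')
  }

  try {
    const result = await run()
    await redis.set(
      fullKey,
      { status: 'success', response: result } satisfies RecordedResult<T>,
      { ex: TTL_SECONDS },
    )
    return result
  } catch (err) {
    // Record the failure briefly so an immediate retry does not repeat the work.
    await redis.set(
      fullKey,
      { status: 'error' } satisfies RecordedResult<T>,
      { ex: ERROR_TTL_SECONDS },
    )
    throw err
  }
}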
3. The Server Action that uses both
The Server Action takes an idempotency key as part of its input (the Next.js client gives no supported way to attach a custom Idempotency-Key header to a Server Action call), runs the rate limiter, then runs the business logic inside the idempotency helper. A retried call with the same key returns the original response instead of creating a second invoice.
// app/[lang]/(app)/invoices/actions.ts
'use server'

import { headers } from 'next/headers'
import { z } from 'zod'
import { checkRate } from '@/lib/upstash/ratelimit'
import { withIdempotency } from '@/lib/upstash/idempotency'
import { getServerSession } from '@/lib/auth/server'
import { stripe } from '@/lib/stripe/server'

const Input = z.object({
  // Generated once per logical submit on the client and reused on every retry.
  idempotencyKey: z.string().min(8).max(128),
  customerId: z.string(),
  amountCents: z.number().int().positive(),
  description: z.string().min(1).max(200),
})

export async function createInvoice(
  raw: unknown,
): Promise<
  | { ok: true; invoiceId: string }
  | { ok: false; error: string; reset?: number }
> {
  const session = await getServerSession()
  if (!session) return { ok: false, error: 'unauthorised' }

  // First entry of x-forwarded-for is the client address as seen by the proxy.
  const hdrs = await headers()
  const ip = hdrs.get('x-forwarded-for')?.split(',')[0]?.trim() ?? '0.0.0.0'

  const rate = await checkRate({
    ip,
    userId: session.userId,
    tenantId: session.tenantId,
  })
  if (!rate.ok) {
    return { ok: false, error: `rate limited (${rate.scope})`, reset: rate.reset }
  }

  const parsed = Input.safeParse(raw)
  if (!parsed.success) return { ok: false, error: 'invalid input' }
  const { idempotencyKey, customerId, amountCents, description } = parsed.data

  // Scope the key by tenant so two tenants can never collide on the same token.
  return withIdempotency(`invoice:${session.tenantId}:${idempotencyKey}`, async () => {
    const invoice = await stripe.invoices.create({
      customer: customerId,
      collection_method: 'send_invoice',
      days_until_due: 30,
      description,
      metadata: { tenant_id: session.tenantId },
    })
    await stripe.invoiceItems.create({
      customer: customerId,
      amount: amountCents,
      currency: 'eur',
      invoice: invoice.id,
    })
    return { ok: true as const, invoiceId: invoice.id }
  })
}
4. What this buys you
Each limiter check is one HTTP round trip to Upstash, typically a few milliseconds from a function in the same region. The idempotency helper turns a retried POST into a cache lookup, so a retry that arrives after the first attempt has been recorded never reaches the Stripe API. The whole flow lives in three files; the Redis layer never appears in the application logic except as two helpers that read like normal functions.
This is the Upstash that earns its place: not a "we replaced Redis with HTTP" curiosity, but the operational layer that makes serverless deploys actually safe under retries and high traffic.
Frequently asked questions
Upstash versus ElastiCache or Memorystore?
Upstash for serverless deploys, edge runtimes and bursty workloads where pay-per-request pricing is the right shape. ElastiCache or Memorystore when you have a long-running Node service, an in-house infra team, and a workload where provisioned capacity beats per-request billing.
HTTP versus TCP — what is the latency hit?
Single-digit milliseconds from a Vercel or Cloudflare function to the nearest Upstash region. For the common rate-limit and session-lookup case, this is faster than the cold-start hit of opening a fresh TCP connection. We measure the actual latency in the staging environment.
Does the latency at edge match a regional Redis?
For reads from the global database, yes — the read hits the nearest replica. For writes that must round-trip to the primary region, you accept the inter-region latency. We document the read-versus-write split per use case.
How does pricing scale at high request volume?
Linearly. Upstash charges per command, so high traffic means a bigger bill, but one you can predict from the request rate. For very high request volumes a Pro plan with reserved capacity pays off; we model it in the scoping phase based on expected RPS.
Can we run Lua scripts on Upstash?
Yes. `EVAL` and `EVALSHA` work, and complex atomic operations (the rate-limit primitives themselves) ship as scripts inside the official libraries. Custom scripts are supported for our own application logic where the round-trip count matters.
What about persistence guarantees?
Upstash Redis persists writes durably with replication. A standard read-after-write is consistent within a region. Across regions, the global database is eventually consistent; replication lag is bounded below by inter-region network latency, so expect tens of milliseconds or more rather than single digits. We pick the consistency model per use case and document it.
How does migration from self-hosted Redis work?
Two paths. For small keyspaces, dump and replay. For larger ones, run dual-write for a window (the application writes to both, reads from the old until catch-up is verified), then cut over. The runbook keeps the rollback path open until you sign off.
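The dual-write window can be sketched as a thin wrapper around two clients. This is a minimal illustration, assuming a hypothetical `KVClient` interface that mirrors `get` and `set` in `@upstash/redis`; the `dualWrite` name and the `readFrom` switch are our own, and a real migration would also cover TTLs, deletes and non-string data types.

```typescript
// Minimal client subset (hypothetical); shapes mirror `get` / `set` in @upstash/redis.
interface KVClient {
  get<T>(key: string): Promise<T | null>
  set(key: string, value: unknown): Promise<unknown>
}

type ReadFrom = 'old' | 'new'

// Writes go to both stores; reads come from whichever store is authoritative.
function dualWrite(oldStore: KVClient, newStore: KVClient, readFrom: ReadFrom): KVClient {
  const [primary, shadow] = readFrom === 'old' ? [oldStore, newStore] : [newStore, oldStore]
  return {
    get<T>(key: string): Promise<T | null> {
      return primary.get<T>(key)
    },
    async set(key: string, value: unknown): Promise<unknown> {
      // The authoritative store must accept the write.
      const result = await primary.set(key, value)
      // Shadow failures are swallowed in this sketch; in practice, log and count them.
      await shadow.set(key, value).catch(() => undefined)
      return result
    },
  }
}
```

Cutting over is then a config change from `readFrom: 'old'` to `'new'`, and flipping it back is the rollback path while both stores keep receiving writes.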
Tell us what you are caching, rate-limiting, or coordinating
A scoping call, a concrete number in the first reply, no agency theater. Upstash integration in a week.