Time to First Byte (TTFB) is the time from a user’s HTTP request to the moment the browser receives the first response byte from your server (or CDN). It spans DNS → connect → TLS → request queueing → application work → first byte. While not a Core Web Vital, TTFB influences LCP, INP (via network backlog/queuing), user-perceived snappiness, crawl budgets, and conversion on slow/variable networks.
Below is a practical, sysadmin-friendly guide: what drives TTFB, how to measure it correctly, and a prioritized, step-by-step plan to reduce it without breaking your stack.
1) What exactly does TTFB include?
A classic breakdown of an HTTPS request:
- DNS resolution (recursive resolver to authoritative).
- TCP/QUIC connect (handshake; for TCP also slow-start).
- TLS handshake (cert exchange, key agreement; resumption/0-RTT can shorten this).
- Request send (client → edge/origin).
- Server queueing (waiting for a worker/thread).
- Application work (routing, DB queries, cache lookups, templates).
- First byte leaves the server and reaches the browser.
TTFB captures the whole chain above, from DNS lookup through the arrival of the first byte. If any one piece drags (high RTT, a cold Lambda/Function, a slow DB, an overworked PHP-FPM pool), TTFB grows.
Rules of thumb (good global sites):
- p50 ≤ 200–300 ms in-region, ≤ 500–700 ms intercontinental.
- p95 should not be 3–5× your p50. High tail latency = contention, GC pauses, DB locks, cold starts, or bad peering.
- Focus on percentiles (p50/p90/p95/p99), not just averages.
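To make the percentile habit concrete, here is a minimal TypeScript sketch (nearest-rank method; your RUM tool may interpolate slightly differently) for turning raw TTFB samples into p50/p95:

```ts
// Percentile helper for raw TTFB samples (milliseconds), nearest-rank method.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

const ttfbMs = [180, 210, 190, 950, 230, 205, 1400, 220, 215, 240]; // illustrative samples
console.log("p50:", percentile(ttfbMs, 50), "p95:", percentile(ttfbMs, 95));
// A p95 several times the p50 points at tail problems: cold starts, locks, GC.
```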
2) How to measure TTFB (properly)
Use multiple vantage points and layers:
- Real User Monitoring (RUM): TTFB as seen by real browsers across ISPs/regions. Segment by country/ASN/device.
- Synthetic: headless browsers/cURL from multiple regions. Test:
- Direct to origin and via CDN (compare delta).
- With and without cold caches (serverless cold starts, microservice wakeups).
- Server metrics: accept queue length, web workers busy, DB slow query log, cache hit ratios, TLS handshake stats, CPU steal/wait, GC pauses.
- Network path: traceroute/MTR/HTTP/3 reachability; verify DNS is close to users (EDNS Client Subnet/GeoDNS correctness).
Check the waterfall: resolving → connecting → TLS → waiting (TTFB) → content. A long “waiting” often means application (or origin) work; a long connect/TLS points to network/TLS.
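As an illustration of that breakdown, a minimal browser-side RUM sketch using the standard Navigation Timing API (the /rum endpoint is a placeholder for whatever collector you use):

```ts
// Browser-side RUM sketch: split navigation TTFB into its phases and ship
// the numbers to an analytics endpoint.
const [nav] = performance.getEntriesByType("navigation") as PerformanceNavigationTiming[];
if (nav) {
  const phases = {
    dnsMs: nav.domainLookupEnd - nav.domainLookupStart,
    connectMs: nav.connectEnd - nav.connectStart, // includes TLS
    tlsMs: nav.secureConnectionStart > 0 ? nav.connectEnd - nav.secureConnectionStart : 0,
    waitingMs: nav.responseStart - nav.requestStart, // server/app work plus one RTT
    ttfbMs: nav.responseStart - nav.startTime,       // the headline TTFB number
  };
  // "/rum" is a hypothetical collection endpoint; replace with your own.
  navigator.sendBeacon("/rum", JSON.stringify(phases));
}
```

Segment whatever you collect this way by country/ASN/device before drawing conclusions.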
3) The TTFB improvement playbook (priority ordered)
A. Put an optimized edge in front (CDN/reverse proxy)
- Terminate TLS at the edge (TLS 1.3, HTTP/2 and HTTP/3/QUIC enabled).
- Use Anycast + good peering to be physically/network-close to users.
- Enable origin shielding (one regional “shield” PoP fetches from origin; others cache from shield) to slash origin fan-out.
- Cache aggressively where legal:
- Set `Cache-Control: public, max-age=..., stale-while-revalidate=...` (a Node-side sketch follows this subsection).
- For static assets: `max-age=31536000, immutable`, with content hashes in filenames.
- For semi-static HTML: a short TTL (e.g., 30–120 s) plus stale-while-revalidate to hide the revalidation cost.
- For dynamic pages, cache fragments/ESI/edge includes or use edge compute for personalization at the edge while keeping the base HTML hot.
Why it helps: The edge knocks out DNS/connect/TLS RTT and avoids origin trips for hot content; first byte can come from a nearby PoP in tens of ms.
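A sketch of those header recipes on a Node origin, assuming Express; the TTLs are illustrative, not recommendations:

```ts
import express from "express";

const app = express();

// Semi-static HTML: short TTL plus stale-while-revalidate so the CDN can
// serve a slightly stale copy while it revalidates in the background.
app.get("/", (_req, res) => {
  res.set("Cache-Control", "public, max-age=60, stale-while-revalidate=300");
  res.type("html").send("<!doctype html><title>Home</title>");
});

// Hashed static assets: cache for a year and mark immutable.
app.use(
  "/assets",
  express.static("dist/assets", {
    immutable: true,
    maxAge: "365d", // serve-static emits Cache-Control: public, max-age=31536000, immutable
  }),
);

app.listen(3000);
```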
B. Reduce handshake & transport overhead
- TLS 1.3 everywhere; enable session resumption and, if appropriate, 0-RTT for idempotent GETs.
- Prefer HTTP/2 (multiplexing) and HTTP/3/QUIC (transport and TLS handshakes combined, no TCP head-of-line blocking, faster loss recovery on mobile/Wi-Fi).
- Keep keep-alive timeouts long enough to reuse connections across requests and avoid connection churn (a shared-agent sketch follows this list).
- Enable OCSP stapling; keep certificate chains short; prefer ECDSA certificates for lighter handshakes.
- On origins under your control:
- Use modern TCP congestion control (e.g., BBR) if it performs better on your network.
- Make sure NIC offloads (TSO/GRO) and MTU are sane; avoid PMTUD black holes.
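For the connection-reuse point, a minimal Node sketch of a shared keep-alive agent for service-to-backend calls (the upstream host is a placeholder):

```ts
import https from "node:https";

// One shared agent per upstream: sockets are reused instead of paying a
// fresh TCP + TLS handshake on every call from this service to its backend.
const upstreamAgent = new https.Agent({
  keepAlive: true,        // reuse sockets across requests
  maxSockets: 50,         // cap concurrent connections to the upstream
  keepAliveMsecs: 10_000, // initial delay for TCP keep-alive probes on idle sockets
});

function fetchFromUpstream(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    https
      .get({ host: "api.internal.example", path, agent: upstreamAgent }, (res) => {
        let body = "";
        res.on("data", (chunk) => (body += chunk));
        res.on("end", () => resolve(body));
      })
      .on("error", reject);
  });
}
```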
C. Kill redirect chains and geo-mismatches
- Collapse `http → https → www → locale → final` into a single hop (or use HSTS preload plus canonical URLs); a quick hop-counting sketch follows this list.
- Make sure DNS/edge geolocation aligns with end users; wrong GeoDNS can send EU users to US PoPs, adding 80–120 ms.
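A minimal Node sketch for auditing a URL's redirect chain (the starting URL is a placeholder); every hop costs at least one round trip, often a fresh DNS lookup and TLS handshake as well:

```ts
import http from "node:http";
import https from "node:https";

// Follow redirects manually and count the hops before the final response.
function hop(url: string, n = 0, maxHops = 10): void {
  if (n >= maxHops) {
    console.log("gave up: possible redirect loop");
    return;
  }
  const client = url.startsWith("https:") ? https : http;
  client.get(url, (res) => {
    res.resume(); // drain and discard the body
    const location = res.headers.location;
    if (res.statusCode && res.statusCode >= 300 && res.statusCode < 400 && location) {
      console.log(`hop ${n + 1}: ${url} -> ${location} (${res.statusCode})`);
      hop(new URL(location, url).toString(), n + 1, maxHops);
    } else {
      console.log(`${n} redirect(s) before final ${res.statusCode} at ${url}`);
    }
  });
}

hop("http://example.com/");
```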
D. Make the origin fast (when you must hit it)
- Web server
- Nginx: `worker_processes auto;`, `worker_connections` sized to the workload, `keepalive_requests 1000;` or higher, `http2`/`quic` enabled (if available), `sendfile on; tcp_nodelay on; tcp_nopush on;`.
- Apache: `mpm_event`, `KeepAlive On`, `MaxRequestWorkers` sized to CPU/IO, `Protocols h2 h2c http/1.1`.
- App runtime
- Pooling: PHP-FPM, Node.js cluster/PM2, JVM thread pools sized via profiling.
- Opcode caches/JIT (OPcache), warmed bundles, avoid dev-mode.
- Avoid synchronous external calls on request path; move to async/queue.
- Database
- Add a caching layer (Redis/Memcached) for read-heavy paths; set proper TTLs/keys (a read-through sketch follows this list).
- Indexes on hot queries; use the DB slow log; reduce N+1 queries.
- Connection pool tuned (not too small to queue, not too big to thrash).
- Static offload
- Serve images/CSS/JS from the CDN; don’t let origin waste CPU on bytes.
- Serverless / Functions
- Use provisioned concurrency or warmers for endpoints with strict TTFB targets.
- Trim bundle size and cold-start dependencies; keep connections warm to backends.
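The read-through pattern from the database bullet, sketched with ioredis (connection details, the key scheme, `loadProductFromDb`, and the 60 s TTL are all placeholders):

```ts
import Redis from "ioredis";

const redis = new Redis(); // assumes a local Redis; point at your cache tier

interface Product { id: string; name: string; priceCents: number; }

// Read-through cache: hot read paths hit Redis; only misses pay the database
// round trip, and the result is stored with a TTL for the next request.
async function getProduct(id: string): Promise<Product> {
  const key = `product:${id}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached) as Product;

  const fresh = await loadProductFromDb(id);              // the expensive path
  await redis.set(key, JSON.stringify(fresh), "EX", 60);  // 60 s TTL
  return fresh;
}

async function loadProductFromDb(id: string): Promise<Product> {
  // Stand-in for a real SQL query.
  return { id, name: "example", priceCents: 1999 };
}
```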
E. Smarter HTML caching for dynamic sites
- Cache entire HTML for anonymous users; personalize via:
- Edge keys (cookie-less) and client-side fetch for private bits (see the hydration sketch after this list).
- Signed/cryptographic cookies as cache keys only when needed (beware of cache fragmentation).
- For logged-in users, cache expensive fragments (menus, product tiles, recommendations) at the edge and stitch server-side or via client hydration.
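A minimal client-side hydration sketch for the "private bits" approach; the /api/me endpoint and the #account-menu element are assumptions about your markup:

```ts
// Hydrate private data on top of fully cached, anonymous HTML.
async function hydrateAccountMenu(): Promise<void> {
  const slot = document.getElementById("account-menu");
  if (!slot) return;
  const res = await fetch("/api/me", { credentials: "same-origin" });
  if (!res.ok) return; // anonymous user: keep the cached, generic markup
  const me: { name: string; cartCount: number } = await res.json();
  slot.textContent = `${me.name} · cart (${me.cartCount})`;
}

document.addEventListener("DOMContentLoaded", () => void hydrateAccountMenu());
```

The base HTML stays identical for everyone (and therefore cacheable); only the small personal payload travels per user.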
F. Control queueing & overload
- Cap concurrency at a sustainable level: use queueing (e.g., a limited worker pool plus 503 with `Retry-After`) rather than over-committing and inflating TTFB for everyone (a middleware sketch follows this list).
- Apply circuit breakers to slow or back-pressuring dependencies (payments, search, recommendation APIs).
- Auto-scale before queues explode; scale based on queue depth and p95 latency, not only CPU.
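The load-shedding idea above, sketched as Express middleware; the in-flight cap and the Retry-After value are illustrative and should come from load testing:

```ts
import express from "express";

const app = express();

const MAX_IN_FLIGHT = 100; // size this from load tests, not guesswork
let inFlight = 0;

// Shed load early instead of queueing forever: beyond the sustainable
// concurrency, answer 503 with Retry-After so clients and the CDN back off.
app.use((_req, res, next) => {
  if (inFlight >= MAX_IN_FLIGHT) {
    res.set("Retry-After", "2").status(503).send("busy, retry shortly");
    return;
  }
  inFlight++;
  let released = false;
  const release = () => {
    if (!released) { released = true; inFlight--; }
  };
  res.on("finish", release);
  res.on("close", release); // also covers aborted connections
  next();
});
```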
G. Don’t break caches
- Vary only on what you must: `Vary: Accept-Encoding` is fine; avoid `Vary: *` or volatile cookies.
- Normalize query strings; strip tracking parameters before cache lookups where allowed.
- Use consistent casing in headers/hosts; cache keys should be stable.
- For APIs, design cacheable GETs with ETags/Last-Modified.
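A sketch of such a cacheable GET with a strong ETag, again assuming an Express-style origin (`loadWidgets` is a placeholder):

```ts
import { createHash } from "node:crypto";
import express from "express";

const app = express();

// Cacheable GET with a strong ETag: unchanged payloads revalidate with a
// cheap 304 instead of re-sending the body.
app.get("/api/widgets", async (req, res) => {
  const body = JSON.stringify(await loadWidgets());
  const etag = `"${createHash("sha1").update(body).digest("hex")}"`;

  res.set("Cache-Control", "public, max-age=30");
  res.set("ETag", etag);

  if (req.headers["if-none-match"] === etag) {
    res.status(304).end(); // client (or CDN) already has this version
    return;
  }
  res.type("application/json").send(body);
});

async function loadWidgets(): Promise<unknown[]> {
  return [{ id: 1, name: "example" }];
}
```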
4) A pragmatic 14-day TTFB improvement plan
Day 1–2: Measure & baseline
- RUM dashboard segmented by country/ASN/device; record p50/p95 TTFB.
- Synthetic runs from 5–10 regions: CDN vs origin, HTTP/2 vs HTTP/3, warm vs cold caches.
- Capture server metrics (web workers busy, accept queues, DB p95, cache hit ratio).
Day 3–5: Quick network wins
- Ensure TLS 1.3, resumption, OCSP stapling, and HTTP/2 + HTTP/3 at the edge.
- Fix redirect chains; force canonical routes; enable HSTS.
- Confirm users hit nearest PoP; correct GeoDNS/peering issues with your provider.
Day 6–9: Cache more
- Static assets: long-TTL immutable + hashed filenames.
- HTML for anon users: enable short-TTL caching + stale-while-revalidate.
- Turn on origin shielding; inspect origin hit reduction.
Day 10–12: Origin / app
- Profile app endpoints with high TTFB; add Redis read-through where viable.
- Fix top slow queries; add missing indexes; lift connection pool caps if queueing.
- Tune PHP-FPM/Node/JVM pools to avoid both thrash and starvation.
Day 13–14: Guardrails
- Add autoscaling triggers on p95 latency & queue depth.
- Add SLO alerts: TTFB p50/p95 per main geos; synthetic canary checks for login/checkout.
Re-measure; expect a lower p50 and a smaller gap between p50 and p95. Iterate on the worst geos/ASNs.
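For the synthetic canary checks mentioned on days 13–14, a minimal Node sketch that approximates TTFB as time-to-response-headers from one vantage point (the probe URL is a placeholder):

```ts
import https from "node:https";

// Minimal synthetic canary: time from request start to response headers,
// a close proxy for TTFB from this vantage point. Run it from several
// regions and alert when it drifts past your SLO.
function probeTtfb(url: string): Promise<number> {
  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint();
    const req = https.get(url, (res) => {
      const ttfbMs = Number(process.hrtime.bigint() - start) / 1e6;
      res.resume(); // drain the body; we only care about the first byte
      resolve(ttfbMs);
    });
    req.on("error", reject);
  });
}

probeTtfb("https://example.com/").then((ms) => console.log(`TTFB ~ ${ms.toFixed(0)} ms`));
```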
5) Special cases
- E-commerce & personalization: cache category/listing pages fully; for PDP add fragment cache for reviews/recommendations; push cart/account data via client fetch.
- APIs: implement ETag/If-None-Match and `Cache-Control: public, max-age=...` where safe; coalesce duplicate backend requests at the edge (request collapsing; a coalescing sketch follows this list).
- Serverless: use provisioned concurrency for hot paths; move auth/session validation to the edge when possible.
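Request collapsing can also be approximated in your own service layer. A minimal coalescing sketch (the backend URL is a placeholder):

```ts
// Concurrent requests for the same key share one in-flight backend call
// instead of stampeding the origin.
const inFlightCalls = new Map<string, Promise<string>>();

async function coalescedFetch(key: string): Promise<string> {
  const pending = inFlightCalls.get(key);
  if (pending) return pending; // piggyback on the call already in flight

  const call = fetchFromBackend(key).finally(() => inFlightCalls.delete(key));
  inFlightCalls.set(key, call);
  return call;
}

async function fetchFromBackend(key: string): Promise<string> {
  const res = await fetch(`https://origin.example/api/${encodeURIComponent(key)}`);
  return res.text();
}
```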
6) Validation: did it work?
- Graphs: TTFB p50/p95 per region before/after; cache hit ratios; origin requests per 1k hits; DB p95.
- Tail reduction: p95/p99 improvement is often worth more than a small p50 drop.
- User outcomes: LCP improvement, conversion/engagement, crawl rate (for SEO-sensitive sites).
7) Common pitfalls that inflate TTFB
- Accidentally disabling CDN caching of HTML via `Set-Cookie` or an overly broad `Vary` header.
- Per-user cache keys (cookie explosion) killing hit ratio.
- Redirect chains for locale or A/B experiments.
- Serverless cold starts on peak traffic.
- DB connection pools too small (queueing) or too large (thrash).
- Edge → origin fan-out without shielding.
FAQ
Is TTFB the same as LCP?
No. TTFB ends at the first byte; LCP measures when the largest above-the-fold element renders. Lower TTFB generally helps LCP but they’re distinct.
Can HTTP/3 alone fix poor TTFB?
It can shave handshake/transport overhead and improve loss recovery, especially on mobile/wifi, but application and origin latency still dominate dynamic pages.
What’s a “good” TTFB target?
Aim for ≤ 200–300 ms p50 in-region and ≤ 500–700 ms intercontinental; keep p95 within ~2–3× p50. Targets depend on product and audience.
Does caching HTML hurt personalization/analytics?
Not if you split public content (cacheable) and private bits (hydrated client-side or served as edge fragments). Avoid per-user cache keys unless necessary.
Why did my TTFB worsen after adding a WAF/CDN?
Usually mis-routing (wrong PoP/GeoDNS), disabled caching, or extra redirects. Validate PoP proximity, cache rules, and TLS/HTTP versions.
Bottom line
TTFB is a composite metric: network + TLS + server + app. The fastest path to better TTFB is a good edge (HTTP/2/3, TLS 1.3, shielding, smart caching) plus a disciplined origin (pooling, DB/indexes, fragment caching). Measure from the user’s perspective, attack tail latency, and iterate region-by-region.