Cloudflare has rebuilt the “brain” of its edge from the legacy FL1 (NGINX/OpenResty/LuaJIT) to FL2 on Rust + Oxy. The new stack is modular, typed, and hot-reloadable, delivering ~10 ms lower median response times, ~25% higher CDN performance, and <½ CPU/RAM usage versus FL1—while making deployments and rollbacks far less disruptive to live traffic.
What actually changed
- Language/runtime: FL1 (NGINX/OpenResty/LuaJIT) → FL2 in Rust on Oxy. Rust brings compile-time memory safety and predictable performance; Oxy provides proxy primitives, telemetry, soft reloads, and dynamic config.
- Architecture: product logic is split into strict modules with declared inputs/outputs and no direct I/O. Contracts are enforced at compile time; modules run only if filters say they’re relevant to a request.
- Deployments: graceful restarts + systemd socket activation. New versions come up while old processes keep serving long-lived connections (WebSockets/streams) until they close naturally.
- Migration path: Rust modules were embedded in the old stack during cut-over, with automated E2E tests comparing FL1 vs FL2 and a byte-level fallback to FL1 if FL2 couldn’t handle a case. Most traffic now runs on FL2; HTTP/TLS termination is the last big piece being rewritten.
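Cloudflare’s implementation is internal, but the systemd socket-activation handshake that makes those zero-drop restarts possible follows the standard sd_listen_fds(3) convention: the manager holds the listening sockets and passes them to the new process as inherited file descriptors. A minimal sketch of the receiving side (the `inherited_fds` helper is illustrative, not Cloudflare’s code):

```python
import os

SD_LISTEN_FDS_START = 3  # systemd passes inherited sockets starting at fd 3


def inherited_fds(env=os.environ, pid=None):
    """Return the socket fds handed over via systemd socket activation.

    Per the sd_listen_fds(3) convention: LISTEN_PID must name this
    process, and LISTEN_FDS counts descriptors starting at fd 3.
    Returns [] when the process was not socket-activated.
    """
    if pid is None:
        pid = os.getpid()
    try:
        if int(env.get("LISTEN_PID", "")) != pid:
            return []
        n = int(env.get("LISTEN_FDS", ""))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + n))
```

Because the listener lives with the manager rather than the worker, a restart swaps processes without ever closing the accept socket, which is why in-flight WebSockets and streams survive.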
Why sysadmins should care
- Latency & tail behavior: Expect lower TTFB and less variance at p95/p99 during busy windows. That’s real money for ecommerce and fewer UX hiccups for real-time apps.
- Capacity headroom: With <½ CPU and memory for the same workloads, you’re less likely to see noisy-neighbor symptoms when Cloudflare adds features (more inspection, rulesets, Workers) on their side.
- Fewer brownouts on deploys: Socket activation + graceful restarts reduce connection resets during Cloudflare rollouts, hotfixes, and incident response.
- Faster feature velocity: The modular design and automated rollout gates mean new features and fixes ship in days, not weeks—useful when you depend on a CDN/WAF vendor to unblock you.

Ops impact you’ll actually notice
- TTFB/TTI improvements across sites and APIs that front with Cloudflare (especially under load or in geos where peering was weaker).
- More stable long-lived sessions (WebSockets, SSE, streaming) during Cloudflare maintenance windows.
- Cleaner change windows: fewer “is it us or them?” moments—though you should still correlate with their status feed and your own SLOs.
What stays on your plate
FL2’s gains don’t absolve first-mile problems:
- Origin tuning: keep TLS session reuse, HTTP/2 or HTTP/3, keep-alives, and cache headers in order. Bad cacheability still costs you.
- Regional architecture: if you serve from one region only, cross-ocean RTT will dominate even with a fast edge.
- Health & surge control: retain circuit breakers, autoscaling, and backpressure on origins—faster edges can amplify stampedes.
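The backpressure point above can be sketched with a minimal circuit breaker in front of an origin call. The class name and thresholds here are illustrative, not tied to any particular library:

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: open after `max_failures` consecutive
    failures, then allow a probe again after `reset_after` seconds."""

    def __init__(self, max_failures=5, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: permit a request only after the cooldown elapses.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()
```

Injecting the clock keeps the breaker testable; in production you would wrap origin fetches with `allow()` / `record_*` and shed load when the circuit is open.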
Observability: metrics to watch (and why)
- TTFB (p50/p95/p99) per colo/region: confirms edge-side wins and flags regressions tied to a rollout.
- Origin response time vs edge TTFB: isolates first-mile vs last-mile gains.
- Connection reset rates around Cloudflare deploy windows: should trend down with FL2.
- Cache hit ratio (overall & hot paths): higher effective hit ratios may appear as FL2 eliminates per-request overhead.
- Bandwidth/CPU at origin: look for reduced spikes when the edge is under seasonal or event load.
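For the first metric on that list, a lightweight TTFB probe is easy to run from multiple regions. This sketch assumes a plain-HTTP endpoint and a hypothetical `measure_ttfb` helper; real probes would use HTTPS and feed a time-series store:

```python
import http.client
import time


def measure_ttfb(host, port=80, path="/", timeout=5.0):
    """Return (status, seconds-to-first-byte) for one HTTP GET.

    TTFB is measured from just before the request is sent until the
    response status line and headers arrive.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.monotonic()
        conn.request("GET", path, headers={"User-Agent": "ttfb-probe"})
        resp = conn.getresponse()   # returns once status line + headers arrive
        ttfb = time.monotonic() - start
        resp.read()                 # drain the body so the connection closes cleanly
        return resp.status, ttfb
    finally:
        conn.close()
```

Collect p50/p95/p99 of these samples per colo/region and you have the baseline the checklist below asks for.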
Risk/compat notes
- Module filtering: by design, only relevant modules run per request. If you rely on nuanced WAF/Workers interactions, validate with synthetic tests and canaries.
- Feature parity: most is done; HTTP/TLS termination is the last major piece migrating to Rust. Keep an eye on release notes for corner-case behaviors.
- Edge Partner POPs: performance improvements can be geo-dependent; your ASN may benefit earlier/later than global averages.
Practical checklist for sysadmins
- Baseline now. Capture a week of TTFB/TTI and origin latency (p50/p95/p99) per region before you flip any config.
- Canary critical paths. Run synthetic checks for login, checkout, and hot APIs at 1–5 min intervals from multiple geos.
- Audit cacheability. Verify `Cache-Control`, `Vary`, ETags, and CDN rules. FL2 reduces edge overhead; let it cache.
- Harden origin. Enable TLS session tickets/0-RTT where safe, tune keep-alives, and confirm HTTP/2 multiplexing isn’t blocked by middleboxes.
- Alert for deploy windows. Add a change calendar entry pulling Cloudflare’s status feed; correlate with your SLO dashboards.
- Fail-safe posture. Keep your origin surge protections (rate limits, queueing, autoscale). Faster edges hit harder when traffic spikes.
- Document assumptions. If you depend on order-of-execution between WAF, Workers, and routing rules, write a short test plan; FL2’s contracts should preserve behavior, but verify.
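The canary and cacheability items above can be combined into one synthetic probe: check the status code of a critical path and confirm the caching headers you rely on are actually present. The `canary_check` helper and its defaults are illustrative:

```python
import http.client


def canary_check(host, path, port=80, expect_status=200,
                 require_headers=("Cache-Control",)):
    """Run one synthetic probe; return (ok, problems).

    `problems` lists a message per mismatch: unexpected status code
    or a required response header that is missing.
    """
    problems = []
    conn = http.client.HTTPConnection(host, port, timeout=5.0)
    try:
        conn.request("GET", path, headers={"User-Agent": "canary"})
        resp = conn.getresponse()
        resp.read()
        if resp.status != expect_status:
            problems.append(f"status {resp.status} != {expect_status}")
        for name in require_headers:
            if resp.getheader(name) is None:
                problems.append(f"missing header {name}")
    finally:
        conn.close()
    return (not problems), problems
```

Schedule this from several geos at 1–5 min intervals and alert on consecutive failures rather than single blips.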
What this means for multi-cloud / hybrid
- Consistent front-door behavior for global apps (less jitter at the edge), better baseline for anycast and origin routing policies.
- Cleaner DR drills: graceful restarts reduce incidental client churn while the edge updates; combine with your GSLB/traffic manager for fewer false alarms.
Roadmap signals to watch
- Rust-based HTTP/TLS termination rollout (final big cut-over).
- Non-HTTP protocols (RPC/streams) support expansion.
- More last-mile POPs/peering in “next-gen” markets (biggest wins lately came from Africa and dense markets like Japan).
FAQ
Will FL2 change my cache or WAF rules?
Behavior is intended to be equivalent; logic moved into typed modules with explicit contracts. Still, run synthetics for your critical paths after major vendor updates.
Can FL2 fix my origin slowness?
No. It lowers edge overhead and improves routing/deploy resilience. Slow origins remain slow—tune cacheability and capacity.
Should I expect fewer dropped connections during maintenance?
Yes. Graceful restarts with socket activation keep sockets open and hand them to the new process. WebSockets/streams should survive routine rollouts.
What if I run long-lived HTTP/2 streams?
Expect fewer mid-rollout interruptions. Keep your client retry/backoff logic anyway.
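The retry/backoff logic that answer recommends keeping can be as simple as capped exponential backoff with full jitter. A generic sketch, not any specific library’s API:

```python
import random
import time


def retry(fn, attempts=4, base=0.25, cap=5.0,
          sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on exception with capped exponential
    backoff plus full jitter; re-raise the last error when the
    attempt budget is exhausted."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            delay = min(cap, base * (2 ** i)) * rng()  # full jitter
            sleep(delay)
```

Jitter matters here: after a rollout-induced blip, synchronized retries from many clients are themselves a stampede; randomizing the delay spreads the reconnect wave out.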
Do I need to change anything in my config to “use” FL2?
No—migration is vendor-side. Your wins surface through lower TTFB and more stable deploys. Keep your origin/CDN configs clean to exploit the gains.
Is there any overhead from the module system?
The opposite in practice: modules are filtered per request, so irrelevant logic doesn’t run—removing the “every product runs every time” tax.
When will everything be Rust-based?
Most traffic already runs on FL2; HTTP/TLS termination is the remaining big piece slated to complete next, after which FL1 is retired.
Bottom line for sysadmins
FL2 is not “Rust for Rust’s sake.” It’s a cleaner control/data plane with typed contracts, selective per-request execution, and deployments that don’t kick users off. Treat it as free latency and stability budget at the edge—and reinvest that budget by tightening cacheability, smoothing origin load, and raising your own SLOs.