Google has put numbers—and a method—behind something sysadmin teams have long debated: running the same workloads on x86 and Arm side by side. With its Axion (Arm) CPUs serving YouTube, Gmail, and BigQuery alongside x86, the company reports up to 65% better price-performance and up to 60% better energy efficiency versus comparable instances inside Google Cloud. Beyond the headline, Google’s experience is a useful playbook for system administrators operating VMware/Proxmox, Kubernetes, and bare metal who want to prepare their software for a multi-architecture world.

“The friction wasn’t in instructions or assembly—it was in tests, builds, pipelines, and configuration that assumed x86 as the only path,” says David Carrero, co-founder of Stackscale – Grupo Aire (a European private-cloud provider on VMware and Proxmox). “If you harden that layer, moving to multi-architecture stops being an epic and becomes an industrializable operation.”

Important note: Stackscale does not offer Arm nodes today. Stackscale’s offering is x86 private cloud (VMware and Proxmox) plus advisory to ensure customer software is multi-arch-ready at the application and container levels—so it can run on x86 now and be ready for Arm when and where the customer chooses (their own lab, public clouds, or future market options).


What Google actually did (and why sysadmins should care)

  • Real multiarch execution: 30,000+ internal applications compiled and running on x86 and Arm.
  • Production reality: mixed clusters where the scheduler places jobs on the ISA that best fits, raising overall cluster utilization.
  • Mass automation: 38,156 commits (~700,000 lines changed) with early peaks in tooling/tests, later dominated by config and process.
  • AI as a multiplier: CogniPort, an agent that reacts to build/test failures and proposes edits; in a retrospective benchmark it fixed 30% of failing tests without special prompting.

Operational translation: the real challenge wasn’t “hand-porting code,” but taming the periphery so the same artifacts (images, packages, manifests) coexist per ISA without breaking deployments, observability, or SLOs.


Where the friction lies moving from x86 to Arm (and how to attack it)

1) Tooling, CI/CD, and pipelines

  • Multiarch builds: enable arm64 targets in CI (GCC/Clang) and matrix builds (amd64/arm64).
  • Multiarch images: publish OCI manifests with per-ISA variants (e.g., docker buildx) and add tests to verify the runtime pulls the correct one.
  • Emulation (QEMU): fine in CI for quick tests, not a substitute for performance or capacity planning.

Carrero: “In CI/CD, treat arm64 as a first-class target. If a merge breaks Arm, it fails like a lint. That’s the only way to keep the debt from growing.”

2) Tests that assume x86

  • Determinism: avoid tests dependent on tight timings or memory-model quirks.
  • Sanitizers/fuzzing: enable on both ISAs to expose data races masked by x86’s stronger TSO memory ordering.
  • Fixtures: make tolerances explicit (floating point) and document platform-reasonable differences.
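The floating-point point deserves a concrete shape: exact-equality assertions flake when FMA contraction, libm differences, or summation order shift the last ulp between ISAs. A small sketch of making the tolerance explicit (the `total_energy` reduction and the tolerance values are illustrative):

```python
import math

def total_energy(samples):
    """Toy reduction whose low-order bits can legitimately differ
    across ISAs (FMA contraction, library math, summation order)."""
    return sum(x * x for x in samples)

def assert_close(actual, expected, rel_tol=1e-12, abs_tol=1e-15):
    """Replace `actual == expected` with a documented tolerance."""
    assert math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol), \
        f"{actual} != {expected} within tolerance"

# Brittle on a second ISA:  assert total_energy(data) == 0.3
# Portable: tolerance is explicit and justified in the fixture.
assert_close(total_energy([0.1, 0.2, 0.3, 0.4]), 0.3)
```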

3) Code and libraries

  • Intrinsics: hide AVX/SSE behind abstractions; provide NEON/SVE paths or fallbacks where it makes sense.
  • Packing and binaries: review bitfields, struct packing, and binary serialization (modern Arm is little-endian).
  • Crypto/compression: verify hardware accelerations and coherent fallbacks across ISAs.
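On packing and serialization, the defensive pattern is to never let the platform decide layout: pin the byte order and field widths in the wire format itself. A minimal sketch using Python’s `struct` module (the record layout is invented for illustration):

```python
import struct

# Wire format for a record: explicit little-endian ("<"), explicit field
# widths, and no native padding. Native mode ("@") would inherit the
# platform's alignment and packing rules; "<" fixes them in the format.
# x86-64 and AArch64 Linux are both little-endian, but being explicit
# also guards against struct-packing differences between compilers/ABIs.
RECORD = struct.Struct("<IHB")   # u32 id, u16 flags, u8 version

def encode(record_id: int, flags: int, version: int) -> bytes:
    return RECORD.pack(record_id, flags, version)

def decode(payload: bytes):
    return RECORD.unpack(payload)

wire = encode(1, 0x0200, 3)
assert wire == b"\x01\x00\x00\x00\x00\x02\x03"   # 7 bytes, no padding
assert decode(wire) == (1, 0x0200, 3)
```

The same principle applies in C/C++: serialize field by field with fixed-width types instead of `memcpy`-ing structs onto the wire.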

4) Configuration and deployment

  • Manifests: parameterize by architecture (Kubernetes/Helm/Ansible/Terraform).
  • Scheduler: label nodes by ISA; define affinity/taints; ensure dual lanes (x86/Arm) for critical services.
  • ISA canaries: rotate canaries per architecture and measure errors/latencies separately.
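Parameterizing placement by ISA is mostly mechanical once the cluster labels exist. Kubernetes already labels every node with the well-known `kubernetes.io/arch` label (`amd64`/`arm64`); a sketch of generating the pod-spec fragment from one parameter (the taint key for dedicated node pools is a hypothetical convention, not a standard one):

```python
def arch_placement(arch: str, critical: bool = False) -> dict:
    """Build the pod-spec fragment that pins a workload to one ISA."""
    if arch not in ("amd64", "arm64"):
        raise ValueError(f"unknown arch: {arch}")
    # kubernetes.io/arch is set automatically by the kubelet on every node.
    spec = {"nodeSelector": {"kubernetes.io/arch": arch}}
    if critical:
        # Critical services get a toleration so they can also land on
        # ISA-dedicated (tainted) node pools. The key is illustrative.
        spec["tolerations"] = [{
            "key": "example.com/isa-dedicated",
            "operator": "Equal",
            "value": arch,
            "effect": "NoSchedule",
        }]
    return spec
```

In Helm or Ansible the equivalent is a single `arch` value threaded through templates, so adding a second lane is a values change, not a manifest rewrite.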

Carrero: “Many incidents weren’t CPU issues but config choreography: rollouts unaware of two variants, a daemon built for one ISA and the sidecar for another. You must design that dance.”

5) Observability and SRE

  • Labels: add architecture as a first-class label across logs, metrics, and traces.
  • SLOs per ISA: separate latencies and error budgets; alert on x86 vs Arm drift.
  • Auto-evacuation: borrow Google’s CHAMP idea—if a job crash-loops or tanks throughput on Arm, evict it and flag it for offline tuning.
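The “alert on drift” bullet can be reduced to one comparison once metrics carry an architecture label: compute the same percentile per ISA and page when the ratio leaves its band. A sketch (threshold, metric, and function names are illustrative; in production this would be a recording rule in your metrics backend keyed on the architecture label):

```python
from statistics import quantiles

def p99(latencies_ms):
    """99th-percentile latency from raw samples (needs >= 2 samples)."""
    return quantiles(latencies_ms, n=100)[98]

def isa_drift_alert(x86_latencies, arm_latencies, max_ratio=1.25):
    """True when Arm p99 drifts beyond `max_ratio` of x86 p99.

    Keeping separate sample streams per ISA is the point: a blended
    p99 would hide a regression on the smaller lane.
    """
    return p99(arm_latencies) > max_ratio * p99(x86_latencies)
```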

Cost, energy, and capacity (what the data says—and what you can do today)

Google claims up to 65% better price-performance and up to 60% better energy efficiency. If you run on-prem, those figures influence PUE, density, and rack planning. What can a team operating x86-only do today?

  • Logical mix planning: even if all physical compute is x86, prepare software for multi-ISA: multiarch images, ISA-aware tests, feature flags.
  • Realistic evaluation: test Arm in a lab or public cloud (when sensible) to compare SLOs and cost per transaction.
  • Future elasticity: once software is multiarch, you can place workloads where they’re most efficient later (x86 on Stackscale/private today, Arm elsewhere tomorrow) without rewriting applications.
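The “feature flags” bullet is worth making concrete: gating ISA-sensitive code paths behind per-architecture flags lets an Arm trial be cut over (and rolled back) without a rebuild. A minimal sketch, assuming a hypothetical in-process flag table (real deployments would back this with a flag service or config file):

```python
import platform

# Normalize the local machine name to OCI architecture terms.
ARCH = {"x86_64": "amd64", "aarch64": "arm64"}.get(
    platform.machine().lower(), platform.machine().lower())

# Hypothetical flag table: a feature can be enabled globally but rolled
# out per ISA, e.g. while a NEON code path is still being validated.
FLAGS = {
    "fast-compression": {"amd64": True, "arm64": False},
}

def flag_enabled(name: str, arch: str = ARCH) -> bool:
    """Default to off for unknown flags or unlisted architectures."""
    return FLAGS.get(name, {}).get(arch, False)
```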

Carrero: “At Stackscale we’re x86 today (VMware/Proxmox). Still, we recommend multiarch at the software layer: it prepares you to leverage Arm when it makes sense (lab, public cloud, future providers) without getting stuck in a single-ISA lock-in.”


Security and compliance

  • SBOM per ISA: produce architecture-specific component inventories.
  • Cryptography: align libraries and certifications by platform (FIPS if applicable).
  • Hardening parity: compare ASLR, seccomp, compiler hardening flags, and execution policies across ISAs.

If you operate VMware/Proxmox (x86-only) in a private cloud

The core of migration is application-layer. At the platform layer:

  • VMware/Proxmox: production-ready on x86; verify roadmaps and use cases before thinking about Arm at the hypervisor level.
  • Containers/Kubernetes: the cleanest way to prepare multiarch today—while deploying only on x86—and decide later where to run Arm.
  • Interoperability: orchestrate by ISA with labels; publish multiarch images; avoid tight coupling to legacy templates.

Carrero: “Our job today is to deliver robust, predictable x86 with VMware/Proxmox, and help customers get multiarch-ready at the software level. Arm trials can happen in a lab or public cloud; when the market offers private Arm options that fit, the heavy lifting further up the stack will already be done.”


A pragmatic 90-day roadmap (no Arm in your platform yet)

Day 0–15 — Discovery & technical baseline

  • Inventory services by criticality and dependencies (intrinsics, native binaries).
  • Enable arm64 builds in CI (matrix) and produce multiarch images.
  • Add architecture as a label across observability.

Day 15–45 — Controlled pilot on x86

  • Pick 3–5 services (CPU-bound, low coupling).
  • Deploy only on x86, but with multiarch artifacts and logical ISA canaries (CI + staging validation).
  • Fix tests/config that break; document runbooks.

Day 45–90 — Scale and external validation

  • Add automation: lint for multiarch Dockerfiles, Helm validations per ISA, evacuation rules (simulated).
  • If relevant, run spot tests in an Arm lab or public cloud to compare SLOs and cost/request.
  • Adopt “multiarch by default” for new services.
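The “lint for multiarch Dockerfiles” item can start as a few lines of CI: flag any `FROM` line that pins a literal platform, which silently defeats a `buildx` multiarch manifest. A hypothetical lint sketch (the rule set is illustrative; `$BUILDPLATFORM`/`$TARGETPLATFORM` are real BuildKit build arguments and are allowed through):

```python
import re

# Flag "FROM --platform=<literal>"; allow "--platform=$BUILDPLATFORM"-style
# build args, which are the correct way to express cross-builds.
HARDCODED_PLATFORM = re.compile(r"^FROM\s+--platform=(?!\$)", re.IGNORECASE)

def lint_dockerfile(text: str) -> list:
    """Return (line_number, line) pairs that pin FROM to a literal platform."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if HARDCODED_PLATFORM.match(line.strip()):
            findings.append((lineno, line.strip()))
    return findings
```

Wired into CI as a failing check, this enforces the earlier principle that a merge breaking Arm “fails like a lint.”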

What not to do

  • Equate “it compiles for Arm” with “it performs on Arm”: validate SLOs.
  • Lean on emulation in production: QEMU is for CI, not for benchmarks or capacity planning.
  • Ship without feature flags: architecture cutovers must be reversible.
  • Ignore third parties: confirm SDKs, drivers, and appliances on Arm (if you plan to use it elsewhere) or have alternatives.

Closing opinion

“The goal isn’t to ‘abandon x86,’ it’s to gain options,” Carrero concludes. “At Stackscale we provide x86 today with VMware and Proxmox, and we help customers prepare for multi-architecture at the software layer. When they decide where and when to run Arm, they won’t have to rewrite anything—it’ll just be another target in their pipelines.”


FAQs

Which workloads “win” first when you trial Arm outside your x86 platform?
CPU-bound services with minimal ISA-specific intrinsics; microservices in Go/Java/Python; stateless APIs; workers; and data components not reliant on AVX. Always validate SLOs and cost/request.

How do I publish container images for x86 and Arm without duplicating repos?
Use multiarch manifests (e.g., docker buildx build --platform linux/amd64,linux/arm64 --push) with a single tag per version; the runtime will pull the correct variant.

Do I need to rewrite my apps for Arm?
Usually no. You need to enable builds, fix tests/config, and abstract intrinsics/native calls. Where you use AVX/SSE, provide NEON/SVE paths or sane fallbacks.

If I’m x86-only today (Stackscale/private), what do I gain from preparing multiarch?
You gain technology independence and decision speed: you can move workloads to Arm in a lab or public cloud without rewriting software—and keep x86 where it makes sense. TCO improves when architecture stops limiting deployment choices.
