Linux filesystems have spent decades optimizing a familiar path: storage I/O → page cache → copies into user space. It’s reliable and battle-tested — but it’s also a tax you feel immediately when your priorities shift to minimum latency, maximum reuse across instances, and moving less data through the CPU.

That’s the space where DAXFS shows up as a bold experiment: a read-only filesystem built directly on the kernel’s DAX (Direct Access) infrastructure, designed to bypass the traditional I/O stack and the page cache to enable true zero-copy reads from a contiguous shared memory region.

Instead of treating “storage” like blocks, DAXFS treats it like memory you can mount.


The core idea: DAX, pushed to its logical extreme

DAX exists to access byte-addressable storage (PMEM, DAX-mapped regions, etc.) without funneling everything through the page cache. DAXFS builds on that premise and goes further:

  • No page cache duplication
  • No buffer heads
  • No extra CPU copies for reads
  • Reads resolve as direct memory loads from the mapped region

This makes DAXFS especially interesting when the “dataset” is something you want to share broadly — like a container root filesystem image or model weights — and you don’t want every instance to keep its own cached copy.
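DAXFS resolves reads inside the kernel, so this can only be gestured at from user space. As a loose analogy (not DAXFS code), a memory-mapped file shows the access model DAXFS generalizes: a read is an indexed memory load, not a `read()` syscall staging a copy into an application buffer. (Note that `mmap` on a regular file still goes through the page cache; DAX removes that layer too.)

```python
# Analogy only: a read as a memory load at an offset, with no read() call
# and no intermediate copy staged by the application.
import mmap
import os
import tempfile

# Create a small file to stand in for a DAX-backed region.
fd, path = tempfile.mkstemp()
os.write(fd, b"daxfs-demo-payload")
os.close(fd)

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ) as region:
        # "Reading" is just indexing into the mapping.
        payload = bytes(region[0:18])

os.remove(path)
print(payload)  # b'daxfs-demo-payload'
```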


How DAXFS works: a read-only base + copy-on-write branches

DAXFS combines:

  • A read-only base image shared by everyone
  • Copy-on-write branches, each maintaining a delta log for modifications
  • An in-memory index (rb-tree) over each branch’s delta log to keep lookups fast

Reads check deltas first, then fall through to parent branches and ultimately the shared base. Writes append to the branch log — fast and simple.
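The lookup order above can be sketched in a few lines. This is a hypothetical model, not the kernel implementation: a plain dict stands in for the in-kernel rb-tree index, and the `Branch` class and its methods are invented for illustration.

```python
# Sketch of DAXFS-style read fall-through: branch deltas first,
# then parent branches, then the shared read-only base.

class Branch:
    def __init__(self, base, parent=None):
        self.base = base          # shared read-only image: {path: bytes}
        self.parent = parent
        self.log = []             # append-only delta log
        self.index = {}           # path -> log position (rb-tree stand-in)

    def write(self, path, data):
        self.index[path] = len(self.log)   # index points at newest record
        self.log.append((path, data))      # writes just append: fast, simple

    def read(self, path):
        node = self
        while node is not None:            # check deltas up the branch chain
            if path in node.index:
                return node.log[node.index[path]][1]
            node = node.parent
        return self.base[path]             # fall through to the shared base

base = {"/etc/hostname": b"base\n"}
main = Branch(base)
feature = Branch(base, parent=main)
feature.write("/etc/hostname", b"feature\n")

print(feature.read("/etc/hostname"))  # b'feature\n'  (branch delta wins)
print(main.read("/etc/hostname"))     # b'base\n'     (parent untouched)
```

The design point the sketch captures: readers of `main` never see `feature`'s speculative writes, yet both share one physical base image.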

The “single winner” branch model

This is not “git for filesystems.” DAXFS explicitly models speculative workflows:

  • Commit: the current branch becomes the winner and sibling branches are discarded
  • Abort: throw away the current branch and return to the main line
  • No merges, no long-lived parallel histories — just speculative execution with one outcome

That matches real operational patterns in automation/agents: try changes, test, keep the successful path, discard the rest.
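The commit/abort semantics can be modeled compactly. Again a hypothetical sketch, not the real tooling: the `Tree` class and its method names are invented here purely to show the "single winner" outcome.

```python
# Sketch of single-winner branch semantics: commit keeps the current
# branch's deltas and discards siblings; abort returns to the main line.

class Tree:
    def __init__(self):
        self.branches = {"main": {}}   # branch name -> delta map
        self.current = "main"

    def create(self, name, parent="main"):
        # A new branch starts from a snapshot of its parent's deltas.
        self.branches[name] = dict(self.branches[parent])
        self.current = name            # creation implicitly switches to it

    def commit(self):
        # Winner takes all: the current branch becomes the new main line.
        winner = self.branches[self.current]
        self.branches = {"main": winner}
        self.current = "main"

    def abort(self):
        # Discard the speculative branch and fall back to main.
        if self.current != "main":
            del self.branches[self.current]
        self.current = "main"

t = Tree()
t.create("feature")
t.branches["feature"]["/app.conf"] = b"tuned\n"
t.commit()                              # 'feature' wins; siblings vanish
print(sorted(t.branches))               # ['main']
print(t.branches["main"]["/app.conf"])  # b'tuned\n'
```

No merge logic exists anywhere in the model: there is exactly one surviving history, which is what keeps the on-disk (on-memory) format simple.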


Why now: containers, multikernel, CXL — and accelerators

DAXFS was originally designed for multikernel environments, where multiple kernels share the same physical memory region. The flagship use case: booting container rootfs images from shared memory so that every kernel instance sees the same filesystem without network I/O or inter-kernel coordination.

But the same properties make it relevant to modern infra trends:

  • Container rootfs sharing: one shared base, writable branches per container, less overhead than overlay stacks in memory-heavy fleets
  • CXL memory pooling: place a DAXFS image into a shared CXL Type-3 region and mount it across hosts without network fetches
  • Persistent memory: a persistent read-only image that survives reboots (depending on the backing)
  • GPU/FPGA/SmartNIC data: DAXFS can be backed by device-accessible memory via dma-buf, enabling structured, mountable views over shared buffers (e.g., model weights, lookup tables)

Why dma-buf matters

dma-buf is Linux’s standard mechanism for cross-subsystem buffer sharing. DAXFS leans into this by supporting mounting from a dma-buf file descriptor using the new mount API (fsopen/fsconfig/fsmount). In practical terms: if a device driver can export a dma-buf, it can potentially host a DAXFS image — and the kernel can read it directly without intermediate copies.


How it differs from tmpfs, overlayfs, and EROFS

DAXFS is easiest to understand by contrast:

  • tmpfs/ramfs: fast, but per-instance and read-write; sharing the same rootfs across many containers typically means duplicated physical pages and “copy in first” population
  • overlayfs: good layering model, but you pay for copy-up, page cache overhead in upper layers, and increasing complexity with deep stacks
  • EROFS: excellent read-only performance, but branching would require major structural compromises (indirection, merging directory views, full write path), undermining what makes it fast

DAXFS tries to sit in a different niche: shared immutable base + CoW deltas, optimized for “many readers, controlled writers” over a memory-backed region.


Practical glimpse: mounting and branch ops

# Mount a daxfs image located at a physical address, with a writable branch
mount -t daxfs -o phys=0x100000000,size=0x10000000,branch=main,rw none /mnt

# Create a new branch from 'main' (implicitly switches to it)
daxfs-branch create feature -m /mnt -p main

# List branches
daxfs-branch list -m /mnt

# Commit or abort changes (commit discards sibling branches)
daxfs-branch commit -m /mnt
daxfs-branch abort  -m /mnt

That commit/abort semantic is the tell: DAXFS isn’t trying to replace ext4 or XFS — it’s trying to make shared images and speculative mutation cheap and operationally clean.


What sysadmins should take away

DAXFS is compelling precisely because it’s not a general-purpose filesystem. It’s an attempt to solve a very modern pain point:

If your “storage” is really a shared memory region — across kernels, containers, hosts, or devices — why pretend it’s blocks and pay the cache/copy tax?

Where it can shine:

  • Large fleets sharing the same rootfs/dataset
  • Memory pooling environments (CXL / DAX-mapped regions)
  • Accelerator-heavy pipelines where moving data hurts more than computing it

Where it’s not (yet) the answer:

  • Conventional server workloads that need mature tooling, recovery paths, and a traditional write model
  • Environments without controllable contiguous memory regions or without a clear shared-memory architecture

The real question isn’t whether DAXFS is “faster” in a microbenchmark — it’s whether the Linux ecosystem sees enough value in this memory-first model to push it toward upstream maturity.

Source: GitHub
