Do you want an immutable OS? You have to reboot to update it.


Author: Massimo Gollo
I like understanding why systems break, and building them so they don’t.
Why ostree-based systems like Fedora Silverblue and Fedora CoreOS trade Linux’s old habits for atomic, predictable updates - and what /sysroot is really doing under the hood.

I never imagined I’d have to reboot a system just to install a package.

That, after all, is part of why Linux quietly took over daily operations everywhere (cough, hello Windows, cough). Robust, simple, predictable. You install a package you’ve never heard of, run a command you copy-pasted from Stack Overflow in a hurry, and everything works. It isn’t magic: it’s the patient work of countless people who, over thirty years, have made an operating system genuinely solid.

Now imagine this: after years of Ubuntu LTS humming along on your servers, somebody walks up and tells you, “for a more robust system, you need to reboot.” Sounds backwards. And yet that’s exactly what an immutable OS asks of you.

It isn’t done out of love for ruining your uptime. Behind that request lies a very specific idea: separate what’s sitting on the disk from what the system is using right now. Once you grasp that trick, you also start to understand why df -h on a Fedora Silverblue install shows you weird things - like a mount point called /sysroot you’d never seen before.

This article is the explanation I wish I’d had on the afternoon I first stared at that /sysroot and started wondering what the hell was going on.

The Hello World Boot Process

To understand what changes in an immutable OS, it’s worth reviewing how a regular Linux works. Let’s consider a typical Linux distro on the disk, such as Ubuntu or Debian.

The disk has (at least) two partitions we care about: a small /boot partition with the kernel and bootloader, and a root partition with the actual filesystem: /usr, /etc, /var, and whatever chaos has accumulated in /home.

We hit the power button and this is what happens:

  1. The firmware (BIOS/UEFI) initializes the hardware and looks for a bootloader on the disk.
  2. The bootloader (GRUB, U-Boot, systemd-boot) loads two things from the /boot partition: the kernel and a file called the initramfs. Both go into RAM.
  3. The kernel starts. The initramfs is a compressed CPIO archive: the kernel extracts it into RAM and uses it as the first /. Inside there are just the bare essentials: busybox, init scripts, and the drivers needed to mount the real root partition.
  4. The initramfs has one job: identify the real root partition, mount it, and hand over control.
  5. When it succeeds, it performs a switch root: the system root becomes the partition on disk, the initramfs gets unmounted, its RAM freed.
  6. systemd starts and the boot proceeds normally.

The key point: after boot, / is the disk. One-to-one mapping. The initramfs served as a trampoline and no longer exists. If you cat /etc/hostname, you’re reading a file that physically sits on the root partition. No surprises.
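A quick way to check that one-to-one mapping is to compare device IDs. The sketch below is only an illustration: the helper name same_filesystem is made up for this article, and the paths in the demo loop are just common Linux locations that may or may not exist on your machine.

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True when both paths live on the same mounted filesystem."""
    # st_dev is the ID of the device (filesystem) that holds each path.
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Hypothetical demo paths: skip any that are missing on this machine.
for path in ("/etc/hostname", "/usr", "/proc"):
    if os.path.exists(path):
        print(path, "shares a filesystem with /:", same_filesystem("/", path))
```

On a conventional single-partition install you would expect /etc/hostname and /usr to share a filesystem with /, while /proc (a virtual filesystem) does not.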

The Problem: Updating Without Praying

How many times have you fired off apt upgrade on a server you actually cared about, and held your breath until it finished? Exactly.

Imagine being able to update the system atomically. Either you’re on the old version or the new one, never halfway. No more “the package manager got interrupted and now the system won’t boot.” And if the new version turns out broken, you want an instant rollback.

A naïve solution: keep two complete copies of the OS on disk, and pick one at boot. But now you have a representation problem. How does this present itself to userspace? If you have /usr-version-A and /usr-version-B on disk, programs don’t know to look there. They expect /usr and nothing else.

You need a layer of indirection.
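Here is a minimal sketch of what such a layer can look like, using nothing but a symlink and an atomic rename on a POSIX system. It illustrates the general idea only, not ostree's actual implementation; the names current, current.tmp and activate are invented for the example, and the two directories reuse the usr-version-A and usr-version-B names from above.

```python
import os
import tempfile


def activate(root: str, version: str) -> None:
    """Atomically repoint root/current at root/<version>."""
    tmp_link = os.path.join(root, "current.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Build the new symlink next to the old one (relative target)...
    os.symlink(version, tmp_link)
    # ...then rename it over the old one. The rename replaces the link
    # in a single step, so readers see either the old or the new target.
    os.replace(tmp_link, os.path.join(root, "current"))


def demo() -> list:
    """Switch to A, upgrade to B, then roll back to A."""
    seen = []
    with tempfile.TemporaryDirectory() as root:
        for version in ("usr-version-A", "usr-version-B"):
            os.mkdir(os.path.join(root, version))
            with open(os.path.join(root, version, "release"), "w") as fh:
                fh.write(version)
        for version in ("usr-version-A", "usr-version-B", "usr-version-A"):
            activate(root, version)
            # Programs only ever look at "current", never at A or B.
            with open(os.path.join(root, "current", "release")) as fh:
                seen.append(fh.read())
    return seen


if __name__ == "__main__":
    print(demo())
```

Programs keep reading one stable path while the thing behind it changes, and a rollback is just another switch.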

The Solution: Disk as Archive, / as Window

Immutable OSes (Fedora Silverblue, Fedora CoreOS, Fedora IoT, and others) use a technology called ostree that solves exactly this problem. Systems like openSUSE MicroOS take a conceptually similar approach with different technology (btrfs snapshots). The idea is simple and elegant.

On disk, the root partition no longer contains /usr, /etc, /var as plain directories. It contains a more elaborate structure:

(root partition, raw view)
├── boot/
└── ostree/
    ├── repo/                          ← content-addressed objects
    └── deploy/<distro>/
        ├── deploy/abc123.../          ← complete OS, version N
        │   ├── usr/
        │   ├── etc/
        │   └── ...
        ├── deploy/def456.../          ← complete OS, version N-1
        └── var/                       ← shared state

On disk we have N complete installations of the OS, each in its own directory called a “deployment.” They’re content-addressed: each directory is named after the SHA-256 checksum of the ostree commit it was checked out from. Think of each deploy/abc123/ as an unpacked container image on disk, because conceptually that’s exactly what it is.
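To make “content-addressed” concrete, here is a toy object store. It is a sketch under simplifying assumptions: the ObjectStore class, the file paths and the file contents are all invented, and ostree's real object format covers more than raw file bytes. It only shows why two OS versions that share most of their files do not cost twice the space.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store: the key is the SHA-256 of the data."""

    def __init__(self) -> None:
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Storing identical content twice lands on the same key.
        self.objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self.objects[key]
        # The name doubles as an integrity check.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("object is corrupted: " + key)
        return data


store = ObjectStore()

# Two hypothetical OS versions: one file unchanged, one file updated.
version_old = {
    "usr/bin/tool": store.put(b"same binary in both versions"),
    "usr/lib/os-release": store.put(b"VERSION=1"),
}
version_new = {
    "usr/bin/tool": store.put(b"same binary in both versions"),
    "usr/lib/os-release": store.put(b"VERSION=2"),
}

print("files referenced:", len(version_old) + len(version_new))
print("objects stored:", len(store.objects))
```

Four file entries end up backed by three stored objects, because the unchanged file is kept once and referenced from both versions.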

But then how does the system make a program believe /usr exists? This is where /sysroot enters the picture.

/sysroot: The Window onto the Warehouse

When an ostree system boots, it makes an elegant move. Instead of mounting the root partition on / like a regular Linux, it mounts it on /sysroot. Then it builds an artificial / made of bind mounts that reach into a specific deployment.

At runtime, from inside the system, we see this:

/                              ← artificial view, built at boot
├── usr                        ← bind mount → /sysroot/.../deploy/abc123/usr
├── etc                        ← /sysroot/.../deploy/abc123/etc (3-way merge)
├── var                        ← bind mount → /sysroot/.../var
└── sysroot/                   ← the actual disk, in its entirety
    └── ostree/...
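The “3-way merge” note next to /etc deserves a small illustration. The sketch below is one plausible reading of the usual three-way rule, not ostree's code: local changes win, and everything else follows the new defaults. The function merge_etc, the dictionaries and the file names are hypothetical stand-ins for real directory trees.

```python
def merge_etc(old_default: dict, current: dict, new_default: dict) -> dict:
    """Combine the new defaults with the admin's local changes.

    old_default: /etc as shipped by the previous OS version
    current:     /etc as it looks now, after local edits
    new_default: /etc as shipped by the new OS version
    """
    merged = dict(new_default)
    for path in old_default.keys() | current.keys():
        before = old_default.get(path)
        now = current.get(path)
        if now == before:
            continue  # untouched locally: follow the new default
        if now is None:
            merged.pop(path, None)  # deleted locally: keep it deleted
        else:
            merged[path] = now  # edited or added locally: keep it
    return merged


old_default = {"hostname": "localhost", "ssh_config": "v1", "motd": "hi"}
current = {"hostname": "web01", "ssh_config": "v1", "fstab": "custom"}
new_default = {"hostname": "localhost", "ssh_config": "v2", "motd": "hi",
               "new.conf": "x"}

merged = merge_etc(old_default, current, new_default)
for path in sorted(merged):
    print(path, "=", merged[path])
```

In this made-up example the edited hostname and the added fstab survive the update, the untouched ssh_config picks up the new default, and the locally deleted motd stays deleted.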

We get two views onto the same disk:

  • cd / shows us only the active deployment, presented as a normal OS. All programs see /usr, /etc, /var exactly where they expect them. Nothing broken, nothing weird for anything running on top.
  • cd /sysroot shows us the physical disk in its entirety, with all deployments, the ostree repo, the shared storage. It’s the “landlord’s” view.

The trick is all here: a layer of indirection between the disk and the apparent root.
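You can observe the indirection from a running system through /proc/self/mountinfo, where each line records which subdirectory of a filesystem (the “root” field) is attached at which mount point. The parser below is a small sketch; the sample text is invented for illustration (device name, mount IDs, the fedora directory and the abc123.0 deployment name are all assumptions), so expect different values on a real machine.

```python
# Hypothetical excerpt in /proc/self/mountinfo format. All values are made up.
SAMPLE = """\
36 1 8:3 /ostree/deploy/fedora/deploy/abc123.0 / rw,relatime shared:1 - ext4 /dev/sda3 rw
37 36 8:3 / /sysroot ro,relatime shared:2 - ext4 /dev/sda3 rw
38 36 8:3 /ostree/deploy/fedora/deploy/abc123.0/usr /usr ro,relatime shared:3 - ext4 /dev/sda3 rw
39 36 8:3 /ostree/deploy/fedora/var /var rw,relatime shared:4 - ext4 /dev/sda3 rw
"""


def parse_mountinfo(text: str) -> dict:
    """Map each mount point to (source device, path inside that filesystem)."""
    mounts = {}
    for line in text.strip().splitlines():
        # Fields before " - ": ID, parent ID, major:minor, root, mount point, ...
        # Fields after " - ":  filesystem type, mount source, super options.
        left, _, right = line.partition(" - ")
        fields = left.split()
        source = right.split()[1]
        mounts[fields[4]] = (source, fields[3])
    return mounts


mounts = parse_mountinfo(SAMPLE)
for mount_point, (source, inner_path) in mounts.items():
    print(mount_point, "<-", source, inner_path)
```

Every mount point in the sample comes from the same device. Only /sysroot exposes the filesystem from its top; the others expose a subdirectory of it.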

The initramfs in an Immutable System

There’s a natural question at this point: if / is a view built at boot, who builds it? The answer: the initramfs, with one extra binary called (in ostree) ostree-prepare-root.

The flow becomes:

  1. Bootloader loads kernel + initramfs. Identical to the normal case.
  2. Kernel starts, initramfs is extracted into RAM. Identical.
  3. ostree-prepare-root mounts the disk’s root partition on /sysroot. Different: previously it would have been mounted directly on /.
  4. It reads the ostree= argument that the selected bootloader entry put on the kernel command line to figure out which deployment is the active one.
  5. It builds the bind mounts for /usr, /var, /etc pointing inside that deployment.
  6. Switch root into the constructed view.
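The deployment choice reaches ostree-prepare-root as an ostree= argument on the kernel command line, written there by the bootloader entry that was selected. The sketch below shows that lookup in a simplified form. The helper kernel_arg is invented for this article, it ignores quoted arguments, and the sample command line (UUID, paths, checksum) is entirely made up.

```python
from typing import Optional

# Hypothetical kernel command line; on a live system it is in /proc/cmdline.
SAMPLE_CMDLINE = (
    "BOOT_IMAGE=/ostree/fedora-abc123/vmlinuz "
    "root=UUID=00000000-0000-0000-0000-000000000000 rw "
    "ostree=/ostree/boot.1/fedora/abc123/0"
)


def kernel_arg(cmdline: str, name: str) -> Optional[str]:
    """Return the value of name=value on the command line, or None."""
    for token in cmdline.split():
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


print("real root partition:", kernel_arg(SAMPLE_CMDLINE, "root"))
print("deployment to boot: ", kernel_arg(SAMPLE_CMDLINE, "ostree"))
```

The root= value tells the initramfs which partition to mount, and the ostree= value tells it which deployment inside that partition should become /.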

From here on, systemd starts and sees a “normal” /. It doesn’t know, and doesn’t care, that it’s artificial. The rest of the boot proceeds exactly like on any other Linux.

The Analogy That Makes It Click

If you’re familiar with containers, you already have the mental model: this is exactly how a container works, applied to the boot of the host itself.

| Container | Immutable OS |
| --- | --- |
| Image layers stored in the container runtime | Deployments stored under /sysroot/ostree/ |
| Container sees only its own rootfs | System sees only the active deployment as / |
| Host sees all images | From /sysroot you see all deployments |
| Switching containers = switching rootfs | Updating the OS = switching which deployment is active |

An immutable OS is, in a sense, a container that promoted itself to host system. The same conceptual primitive - separating the storage of “all possible views” from the “currently exposed view” - applied at boot rather than at process runtime.

Why You Should Care

Even if you’ll never run an immutable OS in production, understanding this model changes how you think about systems. Three intuitions I take away from the moment it clicked:

The filesystem isn’t the disk. We’ve always known this (through proc, tmpfs, bind mounts, containers) but immutable OSes make it the organizing principle. What you see mounted is an interpretation, not the ultimate truth.

Atomicity requires indirection. Want atomic updates, rollback, A/B testing of the OS? You need to be able to swap the entire view at once. That’s impossible if the view is the storage. You need a layer in between that decouples “what’s on disk” from “what’s currently exposed.”

Containers are Linux applied to itself. Container primitives - namespaces, mounts, overlayfs - were born to isolate processes. Immutable OSes show that the same primitives, applied at boot, give you transactionality across the entire system. It’s Linux rediscovering its own patterns in a new domain.

The next time you find yourself staring at df -h with a mysterious mount point called /sysroot, you’ll know what you’re looking at: a carefully constructed window onto a much bigger warehouse.