Ric Wheeler and Christoph Hellwig were quick to point out I was wrong about something: Linux now has a standard API for freezing or “quiescing” filesystems.
Quiescing a filesystem lets you take a consistent snapshot or backup at the block device level. If your server uses SAN storage, your SAN probably lets you take snapshots of the SCSI LUNs at any time. But if you do this while the server is under load, at best you’ll get a “crash consistent” snapshot, where the journal has to be replayed when the copy of the filesystem is mounted, and at worst you’ll get data corruption, particularly with ext3 defaults.
Quiescing tells the filesystem to make things consistent at the disk / block device level. The journal won’t need to be replayed, and the superblock is marked as if you’d unmounted the device. A snapshot taken at this stage will be consistent, at least at the filesystem level (applications don’t know what is happening, so you could still see things like half-written transactions in databases).
The downside to quiescing a filesystem is that writes to it block for as long as it stays frozen, and as more processes pile up waiting to write, the whole system eventually grinds to a halt. SAN snapshots can be done very quickly though, so the time between a “freeze” and “thaw” operation is usually brief.
Very recent versions of util-linux-ng have an fsfreeze command that lets you freeze or thaw filesystems at the command line. Use with care!
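Because a forgotten thaw leaves writers blocked, it can help to wrap the freeze, snapshot and thaw steps together. The following is only a minimal sketch: the function name with_frozen_fs and the FSFREEZE override variable are my own inventions for illustration, and the snapshot command is whatever your SAN or storage tooling provides. It uses only the -f (freeze) and -u (unfreeze) options of fsfreeze.

```shell
#!/bin/sh
# Sketch only: with_frozen_fs and FSFREEZE are hypothetical names,
# not part of util-linux-ng.
# FSFREEZE can be overridden (for example for a dry run).
FSFREEZE=${FSFREEZE:-fsfreeze}

# with_frozen_fs MOUNTPOINT COMMAND [ARGS...]
# Freeze MOUNTPOINT, run COMMAND, then thaw even if COMMAND failed.
with_frozen_fs() {
    mnt=$1
    shift

    # If the freeze itself fails, do not run the snapshot command.
    "$FSFREEZE" -f "$mnt" || return 1

    # Run the snapshot command and remember its exit status.
    status=0
    "$@" || status=$?

    # Always thaw, so writers are not left blocked.
    "$FSFREEZE" -u "$mnt" || return 1

    return "$status"
}
```

You would call it as something like with_frozen_fs /srv/data your-snapshot-command, where both the mount point and the command are placeholders. It needs root, and the same warning applies: if the snapshot command hangs, the filesystem stays frozen.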
Freezing filesystems also has an application for virtual machines. Our new guest agent will support freezing filesystems so that you can coordinate a consistent backup or snapshot from outside the guest.
If you have Rawhide and the most recent virt-rescue you can play with freezing filesystems without breaking anything:
$ rm -f test.img
$ truncate -s 1G test.img
$ virt-rescue test.img
><rescue> mkfs.ext4 /dev/vda
><rescue> mount /dev/vda /sysroot
From another window you can see that the image is not consistent. If you were to snapshot the image now the filesystem would at least require journal recovery when mounted:
$ file test.img
test.img: Linux rev 1.0 ext4 filesystem data (needs journal recovery) (extents) (large files) (huge files)
But by issuing fsfreeze in the guest we can make it consistent, which we can confirm by running file again from the other window:
><rescue> fsfreeze -f /sysroot
$ file test.img
test.img: Linux rev 1.0 ext4 filesystem data (extents) (large files) (huge files)
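... allowing us to take a snapshot or copy of the block device (test.img) in a consistent state. Once the copy is made, fsfreeze -u /sysroot in the guest thaws the filesystem so that writes can resume.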
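If you want to script the check shown above rather than eyeballing it, one rough approach is to look for the “needs journal recovery” string in the output of file. This is only a sketch: the function names are made up for illustration, and it relies on file printing that exact phrase, as it does in the output above.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Return 0 if the file(1) description passed in $1 reports a journal
# that still has to be replayed, 1 otherwise.
description_needs_recovery() {
    case $1 in
        *"needs journal recovery"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 if the image at path $1 needs journal recovery.
image_needs_recovery() {
    description_needs_recovery "$(file "$1")"
}
```

For example, image_needs_recovery test.img should succeed before the freeze and fail afterwards.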