I wrote an out-of-tree patch to qemu that gathers read and write traces when a virtual machine accesses its virtual hard disk.
First we use the guestfish prepared disk feature to partition a disk and create an ext2 filesystem on it. The command under test is:
$ guestfish -N fs:ext2:10M
and this is what the disk access looks like:
The whole box represents the entire disk (10 MB), and each cell represents one sector (512 bytes in this case).
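To make the picture concrete, here is a rough sketch of how a trace entry could be mapped onto cells of such a box. It is not the code used to draw the diagrams. The grid width of 160 cells per row and the helper names are assumptions made for illustration, and the disk size is taken to be 10 × 1024 × 1024 bytes.

```python
SECTOR_SIZE = 512               # bytes per sector, as in the diagrams
DISK_SIZE = 10 * 1024 * 1024    # 10 MB disk, assumed to mean 10 MiB
GRID_WIDTH = 160                # assumed cells per row (hypothetical)

def total_sectors(disk_size=DISK_SIZE, sector_size=SECTOR_SIZE):
    """Number of cells needed to draw the whole disk."""
    return disk_size // sector_size

def sector_to_cell(sector, width=GRID_WIDTH):
    """Return (row, column) of a sector's cell, filling rows left to right."""
    return divmod(sector, width)

def sectors_touched(offset, length, sector_size=SECTOR_SIZE):
    """Sectors covered by an access of `length` bytes at byte `offset`."""
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return range(first, last + 1)

if __name__ == "__main__":
    print(total_sectors())                      # cells in the whole box
    for s in sectors_touched(1024, 1024):       # a 1 KB access at byte 1024
        print(s, sector_to_cell(s))
```

With these assumptions the box holds 20480 cells, and a single 1 KB access colours two of them.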
The large number of unaligned writes in fact points to a mistake in the guestfs part-disk operation, which creates an unaligned whole-disk partition. With a (non-upstream) patch that makes it create an aligned partition, the results look a little better:
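Why one unaligned partition makes so many writes unaligned can be shown with a little arithmetic. The sketch below is illustrative only: the start sectors 63 and 64, the 1024-byte filesystem block size and the 1024-byte alignment boundary are all assumptions, not values taken from the traces above.

```python
SECTOR_SIZE = 512  # bytes per sector

def fs_block_disk_offset(partition_start_sector, fs_block, fs_block_size=1024):
    """Byte offset on the whole disk of a filesystem block in the partition."""
    return partition_start_sector * SECTOR_SIZE + fs_block * fs_block_size

def is_aligned(offset, alignment=1024):
    """True if the byte offset falls on an `alignment`-byte boundary."""
    return offset % alignment == 0

if __name__ == "__main__":
    # Hypothetical partition start sectors, chosen only for illustration.
    for start in (63, 64):
        aligned = [is_aligned(fs_block_disk_offset(start, n)) for n in range(8)]
        print(start, aligned)
```

Because every filesystem block is shifted by the same partition offset, a start sector that is not a multiple of the block size leaves every block unaligned, and moving the start by one sector fixes them all at once.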
The second test is to take this disk image and simply mount it up. The command under test is:
$ guestfish -a test1.img -m /dev/sda1
and the access pattern looks like this:
As expected this is a mostly-read operation, but the act of mounting performs some writes to the ext2 superblock (shown in red).
Finally let’s see what it looks like to create a file on this empty filesystem:
$ guestfish -a test1.img -m /dev/sda1 \
    write /hello "hello, world."
That diagram includes the mount operation, so you have to mentally subtract it to see just the file and metadata writes.
An obvious area of improvement here is to have libguestfs signal down to qemu to start and stop the trace, so that we can trace single operations (like just creating the file, not mounting the filesystem).
Another area of investigation is to add an LVM layer, and to experiment with the filesystem blocksize and other tunables.
It would also be good to start identifying various filesystem metadata areas by name, such as the inode table, block free bitmap and superblock.
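As a starting point for naming areas, here is a minimal sketch that labels sectors from a table of known regions. The only ext2 layout facts it relies on are that the first 1024 bytes of the filesystem are reserved for boot code and that the primary superblock occupies the next 1024 bytes. The partition start sector (64) and the function names are hypothetical, and the inode table and bitmaps are left out because their positions depend on the particular filesystem.

```python
SECTOR_SIZE = 512  # bytes per sector

def make_regions(partition_start_sector):
    """Return (first_sector, end_sector_exclusive, name) for known areas."""
    p = partition_start_sector
    return [
        (0, p, "before partition"),
        (p, p + 2, "reserved boot area"),   # filesystem bytes 0..1023
        (p + 2, p + 4, "superblock"),       # filesystem bytes 1024..2047
    ]

def name_for_sector(sector, regions):
    """Name of the region containing `sector`, or 'unidentified'."""
    for first, end, name in regions:
        if first <= sector < end:
            return name
    return "unidentified"

if __name__ == "__main__":
    regions = make_regions(64)  # hypothetical partition start sector
    for sector in (0, 64, 66, 67, 1000):
        print(sector, name_for_sector(sector, regions))
```

Further entries for the block group descriptors, bitmaps and inode table could be appended to the table once their locations have been read from the filesystem itself.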