Wouldn’t it be cool if you could watch a virtual machine booting, and at the same time see what files it is accessing on disk:
reading /dev/sda1 master boot record
reading /dev/sda1 /grub2/i386-pc/boot.img
reading /dev/sda1 /grub2/i386-pc/ext2.mod
reading /dev/sda1 /vmlinuz
You can already observe what disk blocks it is accessing pretty easily. There are several methods, but a quick one would be to use nbdkit’s file plugin with the
-f -v flags (foreground and verbose). The problem is how to map disk blocks to the files and other interesting objects that exist in the disk image.
How do you map between files and disk blocks? For simple filesystems like ext4 you can use the FIBMAP ioctl, and perhaps adjust the answer by adding the offset of the start of the partition. However as you get further into the boot process you’ll probably encounter complexities like LVM. There may not even be a 1-1 mapping since RAID means that multiple blocks can store a single file block, and tail packing and deduplication mean that a block can belong to multiple files. And of course there are things other than plain files: directories, swap partitions, master boot records, and boot loaders, that live in and between filesystems.
To solve this I have written a tool called virt-bmap. It takes a disk image and outputs a block map. To do this it uses libguestfs (patched) to control an nbdkit instance, reading each file and recording what blocks in the disk image are accessed. (It sounds complicated, but virt-bmap wraps it up in a simple command line tool.) The beauty of this is that the kernel takes care of the mapping for us, and it works no matter how many layers of filesystem/LVM/RAID are between the file and the underlying device. This doesn’t quite solve the “RAID problem” since the RAID layers in Linux are free to only read a single copy of the file, but is generally accurate for everything else.
$ virt-bmap fedora-20.img
virt-bmap: examining /dev/sda1 ...
virt-bmap: examining /dev/sda2 ...
virt-bmap: examining /dev/sda3 ...
virt-bmap: examining filesystem on /dev/sda1 (ext4) ...
virt-bmap: examining filesystem on /dev/sda3 (ext4) ...
virt-bmap: writing /home/rjones/d/virt-bmap/bmap
virt-bmap: successfully examined 3 partitions, 0 logical volumes,
2 filesystems, 3346 directories, 20585 files
virt-bmap: output written to /home/rjones/d/virt-bmap/bmap
bmap file is a straightforward map from disk byte offset to file / files / object occupying that space:
1 541000 541400 d /dev/sda1 /
1 541400 544400 d /dev/sda1 /lost+found
1 941000 941400 f /dev/sda1 /.vmlinuz-3.11.10-301.fc20.x86_64.hmac
1 941400 961800 f /dev/sda1 /config-3.11.10-301.fc20.x86_64
1 961800 995400 f /dev/sda1 /initrd-plymouth.img
1 b00400 ef1c00 f /dev/sda1 /grub2/themes/system/background.png
1 f00400 12f1c00 f /dev/sda1 /grub2/themes/system/fireworks.png
1 1300400 1590400 f /dev/sda1 /System.map-3.11.10-301.fc20.x86_64
1 that appears in the first column means “first disk”. Unfortunately virt-bmap can only map single disk virtual machines at present.]
The second part of this, which I’m still writing, will be another nbdkit plugin which takes these maps and produces a nice log of accesses as the machine boots.