Half-bakery is a great website for amusing ideas.
I’m going to put my half-baked ideas under a special Ideas tag on this site.
My first idea is a “cluster” version of ext4 for the special case where there is one writer and multiple readers.
Virt newbies commonly think this should “just work” – create an ext4 filesystem in the host and export it read-only to all the guests. What could possibly go wrong?
In fact this doesn’t work, and guests will see corrupt data, more or less often depending on how often the filesystem gets updated. The reason is that the guest kernel caches old parts of the filesystem, and it can also be reading metadata which the host is simultaneously updating.
Another problem is that a guest process could open a file which the host then deletes (reusing the blocks). Really the host should be aware of which files and directories the guests have open, and keep those around.
So it doesn’t work, but can it be made to work with some small changes to ext4 in the kernel?
You obviously need a communications path from the guests back to the host. Guests could use this to “reserve” or mark their interest in files, which the host would treat as if a local process had opened them. (In fact if the hypervisor is qemu, it could open(2) these files on the host side.)
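To make the “reserve” idea a bit more concrete, here is a rough sketch of what the host side might keep track of. Everything in it is made up for illustration: the HostReservations class, its reserve/release methods and the guest IDs are not part of qemu or ext4. It only relies on the ordinary POSIX behaviour that a deleted file's data stays around for as long as some process holds it open, and it says nothing about the guest-side caching problem.

```python
import os
import tempfile


class HostReservations:
    """Hypothetical host-side table of files that guests have reserved."""

    def __init__(self):
        # Maps (guest_id, path) -> file descriptor held open on the host.
        self._fds = {}

    def reserve(self, guest_id, path):
        # Opening the file pins its inode and blocks, just as if a
        # local process had opened it.
        key = (guest_id, path)
        if key not in self._fds:
            self._fds[key] = os.open(path, os.O_RDONLY)
        return self._fds[key]

    def release(self, guest_id, path):
        # Closing the descriptor lets the host free the blocks if the
        # file has been deleted in the meantime.
        fd = self._fds.pop((guest_id, path), None)
        if fd is not None:
            os.close(fd)


def demo():
    table = HostReservations()
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "shared.txt")
        with open(path, "wb") as f:
            f.write(b"hello guests")

        fd = table.reserve("guest1", path)   # guest marks its interest
        os.unlink(path)                      # host deletes the file
        data = os.pread(fd, 100, 0)          # contents are still readable
        exists = os.path.exists(path)        # but the name is gone
        table.release("guest1", path)
    return data, exists


if __name__ == "__main__":
    print(demo())
```

On a Unix host, the deleted file's contents remain readable through the reserved descriptor until release() is called, which is the “keep those around” behaviour described above.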
This would be a really useful feature for virtualization. Uses include exporting any read-only data to guests (documents, data), an updateable shared /usr for guests, and mirrors of yum/apt repositories.
Related: virtio filesystem (what happened to that plan9 fs for KVM?), paravirt filesystems.
If qemu is going to open files on the host side, why not go the whole way and implement a paravirtualized filesystem? It wouldn’t need to be limited to just ext4 on the host side.
But how would we present it on the guest side? Presenting a read-only ext2 filesystem on the guest side is tempting, but not feasible. The problem again is what to do when files disappear on the host side: there is no way to communicate this to the guest except to give fake EIO errors, which is hardly ideal. In any case qemu can already export directories as “virtual FAT filesystems”. I don’t know anyone who has a good word to say about this (mis-)feature.
So it looks like however it is done, there is a requirement for the guest to communicate its intentions to the host, even though the guest still would not be able to write.