There seems to be a lot of confusion about the “sVirt” feature that we added in Fedora 11. Really it’s very simple, and in this post I hope to explain it in plain terms.
Let’s start with the problem that sVirt tries to solve: If you run lots of KVM virtual machines on a single host, then probably all those qemu processes are running as the same user. They could all be running as root (very bad!). Better, they might all be running as a separate qemu.qemu user/group. Also, any disks they are using are probably chowned to qemu.qemu too.
The problem is that the interface between a virtual machine and its containing qemu process is very complicated and hacked together in C. It’s very likely that this boundary is full of undiscovered security holes that allow a user in the virtual machine to take over the qemu process; in other words, to escape from the confinement provided by the hypervisor. (Xen and other hypervisors have similar problems; this is not something special to KVM.)
If all your qemu processes are running as the same user, there is literally no protection between the virtual machines if one is compromised like this. Two processes running as the same user can send each other signals, inject data into each other using ptrace, and lots more. If one qemu process “goes bad”, you’ve lost control of all the other qemu processes on the host. Furthermore, because all the disk images (and other resources) are accessible to the single qemu.qemu user, you’ve lost all of those too.
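To make the “same user” problem concrete, here is a minimal sketch in Python. It has nothing to do with qemu itself: it just starts a harmless stand-in process under the current UID and terminates it with a signal. The function name is made up for this illustration. The kernel allows the signal because the UIDs match, which is the same check that would let one compromised qemu process kill its neighbours.

```python
import os
import signal
import subprocess
import sys


def signal_same_uid_process() -> int:
    """Start a stand-in 'victim' process and terminate it with SIGTERM.

    The victim is just a sleeping Python interpreter, not a real qemu.
    It happens to be our child, but the kernel's permission check for
    kill() is based on matching UIDs, not on parentage.
    """
    victim = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # No special privilege is needed: both processes run as the same user.
    os.kill(victim.pid, signal.SIGTERM)
    # On POSIX, a negative return code means "killed by that signal number".
    return victim.wait(timeout=10)


if __name__ == "__main__":
    code = signal_same_uid_process()
    print("victim exit status:", code)
```

Nothing here needed root. Ordinary Unix permissions do not distinguish between two processes owned by the same user.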
What’s the solution? Well, sVirt of course. One thing you could do is run each qemu process as a different user, but that’s not very convenient because it would mean reserving a block of hundreds of UIDs and GIDs. Step forward SELinux: without needing to reserve anything, you can give each qemu process a different SELinux label. This prevents a compromised qemu from attacking other processes, and it also lets you label the precise set of resources that each process can access, so a compromised qemu can only attack its own disk images.
You can see an example of sVirt labels on this page.
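If you want a feel for how per-VM labels keep guests apart, here is a rough sketch in Python. It assumes the usual sVirt layout, where qemu processes run as `svirt_t` and disk images are labelled `svirt_image_t`, each with its own pair of MCS categories. The label values and the `parse_label` and `may_access` helpers are made up for illustration. The check is a deliberate simplification: real SELinux decisions also involve type enforcement rules, and category ranges such as `c0.c1023` are not handled here.

```python
from typing import FrozenSet, NamedTuple


class Label(NamedTuple):
    user: str
    role: str
    type: str
    level: str
    categories: FrozenSet[str]


def parse_label(text: str) -> Label:
    """Parse 'user:role:type:level[:cat,cat]' into a Label (hypothetical helper)."""
    parts = text.split(":", 4)
    if len(parts) < 4:
        raise ValueError(f"not an SELinux context: {text!r}")
    user, role, type_, level = parts[:4]
    cats = frozenset(parts[4].split(",")) if len(parts) == 5 and parts[4] else frozenset()
    return Label(user, role, type_, level, cats)


def may_access(process: Label, resource: Label) -> bool:
    """Simplified MCS check: the process must hold every category on the resource."""
    return resource.categories <= process.categories


# Illustrative labels only; real category numbers are chosen per VM at start-up.
vm1_proc = parse_label("system_u:system_r:svirt_t:s0:c10,c20")
vm1_disk = parse_label("system_u:object_r:svirt_image_t:s0:c10,c20")
vm2_disk = parse_label("system_u:object_r:svirt_image_t:s0:c30,c40")

if __name__ == "__main__":
    print("vm1 -> own disk:  ", may_access(vm1_proc, vm1_disk))
    print("vm1 -> vm2's disk:", may_access(vm1_proc, vm2_disk))
```

The point of the sketch is that both qemu processes can run as the same Unix user and still be separated, because the categories differ per virtual machine.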
What happens if you turn off SELinux (or use a distribution that doesn’t have libvirt, SELinux or sVirt in the first place)? You’re trusting that huge, hacked-together boundary between the VM and the hypervisor to keep you safe. Good luck.