It just doesn’t work. Nothing in the documentation hints at this, but knowing it up front will save you a lot of time: you have to use NFSv3 for nfsroot.
[virt] ioengine=libaio iodepth=4 rw=randrw bs=64k direct=1 size=1g numjobs=4
It has 4 large “disks” (size=1g) and 4 large “qemu processes” (numjobs=4) running in parallel. Each test thread can have up to 4 IOs in flight (iodepth=4), and the size of IOs is 64K, which matches the qcow2 default cluster size. I enabled O_DIRECT (direct=1) because we normally use qemu cache=none so that live migration works.
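For reference, the one-line invocation above is a plain transcription of this fio job file (same parameters, just in job-file form):

```
[virt]
ioengine=libaio
iodepth=4
rw=randrw
bs=64k
direct=1
size=1g
numjobs=4
```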
Performance of the RAID 1 array of hard disks
The raw performance of the RAID 1 array (this includes the filesystem) is fairly dismal:
Performance of the SSD
The SSD in contrast does a lot better:
However you look at the details, the fact is that the test runs 11 times faster on the SSD.
The effect of NFS
What about when we NFS-mount the RAID array or the SSD on another node? This should tell us the effect of NFS.
For the NFS-mounted RAID array: NFS makes this test run 3 times slower.
For the NFS-mounted SSD:
NFS makes this test run 4.5 times slower.
The effect of virtualization
By running the virtual machine on the first node (with the disks) it should be possible to see just the effect of virtualization. Since this is backed by the RAID 1 hard disk array, not SSDs or LVM cache, it should be compared only to the RAID 1 performance.
The effect of virtualization (virtio-scsi in this case) is about an 8% drop in performance, which is not something I’m going to worry about.
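For reference, a virtio-scsi disk in the libvirt domain XML looks like this (the image path is hypothetical; cache='none' matches the O_DIRECT setting used in the tests above):

```xml
<controller type='scsi' index='0' model='virtio-scsi'/>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/var/lib/libvirt/images/test.qcow2'/>
  <target dev='sda' bus='scsi'/>
</disk>
```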
- The gains from the SSD (ie. using LVM cache) could outweigh the losses from having to use NFS to share the disk images.
- It’s worth looking at alternate high bandwidth, low-latency interconnects (instead of 1 gigE) to make NFS perform better. I’m going to investigate using Infiniband soon.
These are just the baseline measurements without LVM cache.
I’ve included links to the full test results. fio gives a huge amount of detail, and it’s helpful to keep the HOWTO open so you can understand all the figures it is producing.
virtlockd is a lock manager implementation for libvirt. It’s designed to prevent you from starting two virtual machines (eg. on different nodes in your cluster) which are backed by the same writable disk image, something which can cause disk corruption. It uses plain fcntl-based file locking, so it is ideal for use when you are using NFS to share your disk images.
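To see why this works, here is a minimal Python sketch of the fcntl semantics virtlockd relies on (illustrative only, not virtlockd’s actual code). Note that POSIX fcntl locks are owned by a process, so the conflicting attempt has to come from a separate (here, forked) process; the same process would simply be granted the lock again:

```python
# Sketch of fcntl-style exclusive locking on a disk image (illustrative only).
import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".qcow2")  # stand-in for a disk image
os.close(fd)

img = os.open(path, os.O_RDWR)
fcntl.lockf(img, fcntl.LOCK_EX | fcntl.LOCK_NB)  # first "VM" takes the lock

pid = os.fork()
if pid == 0:
    # Second "VM" in a separate process: its exclusive lock must be refused.
    img2 = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(img2, fcntl.LOCK_EX | fcntl.LOCK_NB)
        os._exit(1)  # lock granted -- protection failed
    except OSError:  # EAGAIN/EACCES: image already locked
        os._exit(0)

_, status = os.waitpid(pid, 0)
result = "refused" if os.WEXITSTATUS(status) == 0 else "granted"
print("second lock was", result)  # prints "second lock was refused"
os.unlink(path)
```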
Since documentation is rather lacking, this post summarises how to set up virtlockd. I am using NFS to share /var/lib/libvirt/images across all the nodes in my virtualization cluster.
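For completeness, the NFS side looks something like this (the network range and export options are illustrative, not a recommendation; no_root_squash is typically needed because libvirt may access images as root):

```
# /etc/exports on the node holding the images
/var/lib/libvirt/images  192.168.0.0/24(rw,no_root_squash,sync)

# /etc/fstab on every other node
server:/var/lib/libvirt/images  /var/lib/libvirt/images  nfs  defaults  0 0
```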
Firstly it is not clear from the documentation, but virtlockd runs alongside libvirtd on every node. The reason for this is so that libvirtd can be killed without having it drop all the locks, which would leave all your VMs unprotected. (You can restart virtlockd independently when it is safe to do so). I guess the other reason is because POSIX file locking is so fscking crazy unless you use it from an independent process.
Another thing which is not clear from the documentation: virtlockd doesn’t listen on any TCP ports, so you don’t need to open up the firewall. The local libvirtd and virtlockd processes communicate over a private Unix domain socket and virtlockd doesn’t need to communicate with anything else.
There are two ways that virtlockd can work: It can either lock the images directly (this is contrary to what the current documentation says, but Dan told me this so it must be true).
Or you can set up a separate lock file directory, where virtlockd will create zero-sized lock files. This lock file directory must be shared with all nodes over NFS. The lock directory is only needed if you’re not using disk image files (eg. you’re using iSCSI LUNs or something). The reason is that you can’t lock things like devices using fcntl. If you want to go down this route, apart from setting up the shared lock directory somewhere, exporting it from your NFS server, and mounting it on all nodes, you will also have to edit /etc/libvirt/qemu-lockd.conf. The comments are fairly self-explanatory.
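If you do go this route, the key setting is the lockspace directory; from memory it looks something like this (the path is an example, and you should check the comments in your version of the file):

```
# /etc/libvirt/qemu-lockd.conf
file_lockspace_dir = "/var/lib/libvirt/lockd/files"
```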
However I’m using image files, so I’m going to opt for locking the files directly. This is easy to set up because there’s hardly any configuration at all: as long as virtlockd is running, it will just lock the image files. All you have to do is make sure the virtlockd service is installed on every node (it is socket-activated, so you don’t need to enable it), and tell libvirt’s qemu driver to use it:
--- /etc/libvirt/qemu.conf ---
lock_manager = "lockd"