Tag Archives: virtualization

Setting up virtlockd on NFS

virtlockd is a lock manager implementation for libvirt. It’s designed to prevent you from starting two virtual machines (eg. on different nodes in your cluster) which are backed by the same writable disk image, something which can cause disk corruption. It uses plain fcntl-based file locking, so it is ideal for use when you are using NFS to share your disk images.

Since documentation is rather lacking, this post summarises how to set up virtlockd. I am using NFS to share /var/lib/libvirt/images across all the nodes in my virtualization cluster.
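
For completeness, here is roughly what the NFS side looks like. This is only a sketch of my setup: the server name (nfs-server) and the subnet are placeholders, and whether you want no_root_squash depends on how libvirt/qemu owns the image files on your systems.

--- /etc/exports (on the NFS server) ---
/var/lib/libvirt/images 192.168.0.0/24(rw,sync,no_root_squash)

--- /etc/fstab (on each node) ---
nfs-server:/var/lib/libvirt/images  /var/lib/libvirt/images  nfs  defaults  0 0

Run exportfs -ra on the server after editing /etc/exports.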

Firstly it is not clear from the documentation, but virtlockd runs alongside libvirtd on every node. The reason for this is so that libvirtd can be killed without having it drop all the locks, which would leave all your VMs unprotected. (You can restart virtlockd independently when it is safe to do so). I guess the other reason is that POSIX file locking is so fscking crazy unless you use it from an independent process.

Another thing which is not clear from the documentation: virtlockd doesn’t listen on any TCP ports, so you don’t need to open up the firewall. The local libvirtd and virtlockd processes communicate over a private Unix domain socket and virtlockd doesn’t need to communicate with anything else.

There are two ways that virtlockd can work. It can either lock the images directly (this is contrary to what the current documentation says, but Dan told me this so it must be true).

Or you can set up a separate lock file directory, where virtlockd will create zero-sized lock files. This lock file directory must be shared with all nodes over NFS. The lock directory is only needed if you’re not using disk image files (eg. you’re using iSCSI LUNs or something). The reason is that you can’t lock things like devices using fcntl. If you want to go down this route, apart from setting up the shared lock directory somewhere, exporting it from your NFS server, and mounting it on all nodes, you will also have to edit /etc/libvirt/qemu-lockd.conf. The comments are fairly self-explanatory.
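
For what it’s worth, on my version of libvirt the relevant setting in qemu-lockd.conf is file_lockspace_dir (there are also lvm_lockspace_dir and scsi_lockspace_dir for LVM and SCSI/iSCSI volumes). Treat the path below as an example only; it just has to resolve to the same NFS-shared directory on every node:

--- /etc/libvirt/qemu-lockd.conf ---
# Create zero-length lock files in a shared directory instead of
# locking the disk images directly.
file_lockspace_dir = "/var/lib/libvirt/lockd/files"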

However I’m using image files, so I’m going to opt for locking the files directly. This is easy to set up because there’s hardly any configuration at all: as long as virtlockd is running, it will just lock the image files. All you have to do is make sure the virtlockd service is installed on every node (it is socket-activated, so you don’t need to enable it), and tell libvirt’s qemu driver to use it:

--- /etc/libvirt/qemu.conf ---
lock_manager = "lockd"
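
Remember to restart libvirtd after editing qemu.conf. If you want to convince yourself that locking is really happening, something like this works (lslocks is part of util-linux, and guest1 is just a placeholder for any guest backed by an image file):

$ sudo systemctl restart libvirtd
$ virsh start guest1
$ sudo lslocks | grep virtlockd

You should see virtlockd holding a POSIX lock on the guest’s disk image.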


Caseless virtualization cluster: remote libvirt

Now to the question of how to manage the VMs on my virtualization cluster.

I don’t have a good answer yet, but two things are true:

  1. libvirt will be used to manage the VMs
  2. ssh is used for remote logins

It’s simple to set up ssh to allow remote logins as root using ssh-agent:

ham3$ sudo bash
ham3# cd /root
ham3# mkdir .ssh
ham3# cp /mnt/scratch/authorized_keys .ssh/
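
Depending on your distro’s sshd defaults you may also need to fix up permissions and allow key-based root logins; this part is from memory, so check sshd_config(5):

ham3# chmod 700 .ssh
ham3# chmod 600 .ssh/authorized_keys

--- /etc/ssh/sshd_config ---
PermitRootLogin without-password

(Newer OpenSSH releases spell that value prohibit-password.)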

From a remote host, remote virsh commands now work:

$ virsh -c qemu+ssh://root@ham3/system list
 Id    Name                           State
----------------------------------------------------

Using libvirt URI aliases (thanks Kashyap) I can make these commands much shorter:

$ cat .config/libvirt/libvirt.conf
uri_aliases = [
  "ham0=qemu+ssh://root@ham0/system",
  "ham1=qemu+ssh://root@ham1/system",
  "ham2=qemu+ssh://root@ham2/system",
  "ham3=qemu+ssh://root@ham3/system",
]
$ virsh -c ham0 list
 Id    Name                           State
----------------------------------------------------

However my bash history contains a lot of commands like these, which don’t make me happy:

$ for i in 0 1 2 3 ; do ./bin/wol-ham$i; done
$ for i in 0 1 2 3 ; do virsh -c ham$i list; done
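
A tiny wrapper script would at least hide the loop. This is only a sketch of what I have in mind (the script name and location are made up), reusing the URI aliases defined above:

--- ~/bin/hams ---
#!/bin/sh
# Run the same virsh command against every host in the cluster.
for i in 0 1 2 3; do
    echo "=== ham$i ==="
    virsh -c ham$i "$@"
done

$ chmod +x ~/bin/hams
$ hams list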


Caseless virtualization cluster: Wake On LAN

Wake On LAN (WOL) is a feature where you can send a specially formatted network packet to a machine to wake it up when it is switched off. The on-board network ports of my virtualization cluster should support WOL, and the always-useful ArchLinux Wiki has a guide for how to enable WOL.

$ sudo ethtool p6p1 | grep Wake
	Supports Wake-on: pumbg
	Wake-on: g
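
On these NICs "g" (magic packet) was already selected, but if it isn’t on yours, ethtool can turn it on. Note that this setting does not necessarily survive a reboot; the ArchLinux Wiki describes ways to make it persistent:

$ sudo ethtool -s p6p1 wol g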

By the way, the BIOS option for WOL on these Gigabyte motherboards is implausibly called “Onboard LAN Boot ROM”, which you have to set to Enabled.

It basically just works once the BIOS option is enabled.

$ sudo wol -i 192.168.0.193 74:d4:35:51:ab:86
$ sudo wol 74:d4:35:51:ab:86
Waking up 74:d4:35:51:ab:86...

Edit: Don’t use the -i option. You want your WOL packets to be broadcast on your LAN.


Caseless virtualization cluster: power usage

I have a handy power meter which is useful for measuring the power consumed by my virtualization cluster under various loads.

[Photo: the power meter]

Note that in the figures below I’m only including the four cluster hosts. Not included: the NFS server (which is my development server, so I have it switched on all the time anyway) and the network switch.

With all four hosts idling, power usage of the total cluster was around 174-177 watts. To give you an idea of how much that costs to run, it would be about £263 p.a. at a typical UK rate for electricity.
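
For anyone checking my arithmetic, that figure assumes a rate of roughly 17p/kWh, which was a typical UK domestic price at the time:

0.175 kW × 24 h × 365 days ≈ 1533 kWh per year
1533 kWh × £0.17/kWh ≈ £260 per year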

Interestingly, with only one host running, power usage was 50W (I would have expected it to be exactly one quarter of the idle figure, ie. about 44W), and with no hosts running, something still consumes 7W. I suspect the PSUs leak a bit even when nominally off, but with the hardware switch still in the ON position.

Under some moderately heavy loads (parallelized compiles in a loop) I was able to drive power usage up to a maximum of 584W (£869 p.a.). Note that was a one-off; most of the time it bounced around wildly in the 300-500W range (unfortunately my simple power meter doesn’t do running averages).

So low power ARM this ain’t! One thing we can do to mitigate this (apart from turning it off) is to migrate VMs to a single node when load is low, and turn the other nodes off. That should save a fairly considerable amount of power.
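
Consolidating by hand is already possible because the disk images are on shared NFS storage, so guests can be live-migrated between nodes. A rough sketch (guest and host names are just examples, and in practice you would also want flags like --persistent and --undefinesource):

$ virsh -c ham1 migrate --live guest1 qemu+ssh://root@ham0/system
$ virsh -c ham1 list
$ ssh root@ham1 poweroff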

I also installed lm_sensors so I could monitor temperatures and fan speeds, but there’s not really anything interesting to say. Under load the fans of course spin like mad (typically 3000 RPM), but the temperatures stay at a cool 40°C or so. Even though there is only one fan per motherboard (the CPU fan) there don’t appear to be any cooling issues.


Caseless virtualization cluster: one machine is 43% faster than the others

Edit: Mystery solved — see the end

The hosts are identical — motherboards, memory, processors — yet one machine is 43% faster than the others, consistently, on a simple compile benchmark with everything on a ramdisk.

dmidecode and cpuinfo data is here. That should cover all questions about the hardware and processor configuration, I think.

Kernels are identical (all 3.11.10-301.fc20.x86_64).

I’ve no idea, but I’m starting to doubt my own sanity now!

Edit: Mystery solved:

It turned out to be a difference in the OCaml compiler versions installed. On the “fast” machine I had installed the new compiler, which it turns out is quite a bit faster.


Caseless virtualization cluster, part 6

Holy crap: you can connect PCIe ports across motherboards!

[Image: external PCIe cabling]

Unfortunately the cables for this sort of thing seem to be a bit expensive, but they run at 20 Gbps!


Caseless virtualization cluster, part 5

My caseless virtualization cluster is now complete. 32 cores (arguably), 64 GB of RAM, for about £1300:

[Photo: the completed cluster]

The power supplies cause a real wiring nightmare! It would be great to have a better solution for delivering power:

[Photos: power supply wiring]

It runs almost silently.

I plan to encase the whole thing in a metal case, firstly to make it more portable, and secondly to reduce the amount of RF given off. You would probably not be able to legally run this in a commercial environment because of EMC regulations. You definitely would not be allowed to sell it.

The next problem is management software. While it’s certainly possible to log in to each of the four individual hosts and run virsh commands, that’s going to get tedious rather quickly.

The problem is that all “solutions” to this are rather heavyweight. I could manage the hosts using Puppet and install OpenStack, but it would probably take longer to set that up than the time saved. There’s a lot of cloud software out there, but not much that nicely manages 4 hosts without requiring huge dependencies. What I really want is a small command line tool that uses libvirt remotely so I don’t have to install anything on the hosts.
