Cluster performance: baseline testing

I’m using fio (as recommended by Linus!) to baseline test my virtualization cluster. My fio job file is meant to approximate the IO pattern of a qemu process:

[virt]
ioengine=libaio
iodepth=4
rw=randrw
bs=64k
direct=1
size=1g
numjobs=4

It has 4 large “disks” (size=1g) and 4 “qemu processes” (numjobs=4) running in parallel. Each test thread can have up to 4 IOs in flight (iodepth=4), and each IO is 64K, which matches the qcow2 default cluster size. I enabled O_DIRECT (direct=1) because we normally use qemu cache=none so that live migration works.
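As a rough sanity check on what this job file asks for, here is a minimal Python sketch that parses the job above and works out the total data footprint and the maximum amount of IO in flight. The helper names (parse_size, footprint) are made up for illustration, and the sketch assumes fio's default of treating k/m/g as powers of 1024 and of giving each job its own 1g of data.

```python
import configparser

# The job file from the post, copied verbatim.
JOB = """
[virt]
ioengine=libaio
iodepth=4
rw=randrw
bs=64k
direct=1
size=1g
numjobs=4
"""

# Assumption: fio's default unit base, where k/m/g are powers of 1024.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text):
    """Convert an fio-style size such as '64k' or '1g' to bytes."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def footprint(job_text, section="virt"):
    """Return the aggregate load implied by one section of a job file."""
    cfg = configparser.ConfigParser()
    cfg.read_string(job_text)
    job = cfg[section]
    numjobs = int(job["numjobs"])
    iodepth = int(job["iodepth"])
    return {
        # Each of the numjobs threads works on its own `size` of data.
        "total_bytes": numjobs * parse_size(job["size"]),
        # Each thread may keep up to `iodepth` IOs outstanding.
        "max_ios_in_flight": numjobs * iodepth,
        "max_bytes_in_flight": numjobs * iodepth * parse_size(job["bs"]),
    }


summary = footprint(JOB)
print(summary)
```

Under those assumptions the test touches 4 GiB of data in total, with at most 16 IOs (1 MiB) outstanding at any moment.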

The first node now has a RAID 1 array of spinning rust (hard disks) and a smaller SSD, and the plan is to use LVM cache so that the SSD acts as a cache in front of the RAID array.

Performance of the RAID 1 array of hard disks

The raw performance of the RAID 1 array (this includes the filesystem) is fairly dismal:

virt-ham0-raid1.txt

Performance of the SSD

The SSD in contrast does a lot better:

virt-ham0-ssd.txt

However you want to look at the details, the fact is that the test runs 11 times faster on the SSD.

The effect of NFS

What about when we NFS-mount the RAID array or the SSD on another node? This should tell us the effect of NFS.

virt-ham1-raid1-nfs.txt

NFS makes this test run 3 times slower.

For the NFS-mounted SSD:

virt-ham1-ssd-nfs.txt

NFS makes this test run 4.5 times slower.

The effect of virtualization

By running the virtual machine on the first node (with the disks) it should be possible to see just the effect of virtualization. Since this is backed by the RAID 1 hard disk array, not SSDs or LVM cache, it should be compared only to the RAID 1 performance.

virt-vm-on-ham0.txt

The effect of virtualization (virtio-scsi in this case) is about an 8% drop in performance, which is not something I’m going to worry about.

Conclusions

  • The gains from the SSD (i.e. using LVM cache) could outweigh the losses from having to use NFS to share the disk images.
  • It’s worth looking at alternative high-bandwidth, low-latency interconnects (instead of 1 gigE) to make NFS perform better. I’m going to investigate using InfiniBand soon.

These are just the baseline measurements without LVM cache.
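To see why the SSD gains could outweigh the NFS losses, the rounded factors quoted above can be combined into relative run times. This is only a back-of-the-envelope sketch: it assumes "N times faster/slower" refers to the run time of the test, that the factors multiply cleanly, and that an LVM cache would behave like the bare SSD, which it will not in general. The names are mine, not from the test results.

```python
# Rounded factors taken from the text above.
SSD_SPEEDUP = 11.0        # local SSD vs local RAID 1
NFS_SLOWDOWN_RAID = 3.0   # NFS-mounted RAID 1 vs local RAID 1
NFS_SLOWDOWN_SSD = 4.5    # NFS-mounted SSD vs local SSD


def relative_run_times():
    """Run time of each setup, with local RAID 1 normalised to 1.0."""
    local_raid = 1.0
    local_ssd = local_raid / SSD_SPEEDUP
    return {
        "local_raid": local_raid,
        "local_ssd": local_ssd,
        "nfs_raid": local_raid * NFS_SLOWDOWN_RAID,
        "nfs_ssd": local_ssd * NFS_SLOWDOWN_SSD,
    }


times = relative_run_times()
for name, t in times.items():
    print(f"{name:<10} {t:.2f}")
```

On these numbers the NFS-mounted SSD needs about 0.41 of the local RAID 1 run time, so it is still roughly 2.4 times faster than the local hard disks even after paying the NFS penalty.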

I’ve included links to the full test results. fio gives a huge amount of detail, and it’s helpful to keep the HOWTO open so you can understand all the figures it is producing.
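If the amount of detail is overwhelming, one option is to have fio write JSON (--output-format=json) and reduce it to a few totals. The sketch below assumes that output layout: a top-level "jobs" list whose entries have "read" and "write" objects with "bw" (KiB/s) and "iops" fields. The linked results are in fio's normal text format, and the SAMPLE numbers are placeholders, not measurements from this cluster.

```python
import json


def summarize(fio_result):
    """Sum bandwidth (KiB/s) and IOPS over all jobs in a parsed fio JSON result."""
    totals = {"read_kib_s": 0, "write_kib_s": 0, "read_iops": 0.0, "write_iops": 0.0}
    for job in fio_result["jobs"]:
        for direction in ("read", "write"):
            totals[f"{direction}_kib_s"] += job[direction]["bw"]
            totals[f"{direction}_iops"] += job[direction]["iops"]
    return totals


# Placeholder data in the assumed layout; these are NOT real results.
SAMPLE = json.loads("""
{
  "jobs": [
    {"jobname": "virt",
     "read":  {"bw": 100, "iops": 1.5},
     "write": {"bw": 50,  "iops": 0.5}},
    {"jobname": "virt",
     "read":  {"bw": 200, "iops": 2.5},
     "write": {"bw": 70,  "iops": 1.0}}
  ]
}
""")

summary = summarize(SAMPLE)
print(summary)
```

With numjobs=4 and no group_reporting there would be four entries in the list, one per test thread, which is why the sketch adds them up.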

One response to “Cluster performance: baseline testing”

  1. mkowp

    Another nice writeup! I’ve currently spun down my IB with recent requirements in my lab. Looking forward to your posts.

    Would it be possible to link to the firmwares and/or models of the storage components?
