virt-v2v: better living through new technology

If you ever used the old version of virt-v2v, our software that converts guests to run on KVM, then you probably found it slow, but worse still it was slow and could fail at the end of the conversion (after possibly an hour or more). No one liked that, least of all the developers and support people who had to help people use it.

A V2V conversion is intrinsically going to take a long time, because it always involves copying huge disk images around. These can be gigabytes or even terabytes in size.

My main aim with the rewrite was to do all the work up front (and if the conversion is going to fail, then fail early), and leave the huge copy to the last step. The second aim was to work much harder to minimize the amount of data that we need to copy, so the copy is quicker. I achieved both of these aims using a lot of new technology that we developed for qemu in RHEL 7.

Virt-v2v works (now) by putting an overlay on top of the source disk. This overlay protects the source disk from being modified. All the writes done to the source disk during conversion (eg. modifying config files and adding device drivers) are saved into the overlay. Then we qemu-img convert the overlay to the final target. Although this sounds simple and possibly obvious, none of this could have been done when we wrote old virt-v2v. It is possible now because:

  • qcow2 overlays can now have virtual backing files that come from HTTPS or SSH sources. This allows us to place the overlay on top of (eg) a VMware vCenter Server source without having to copy the whole disk from the source first.
  • qcow2 overlays can perform copy-on-read. This means you only need to read each block of data from the source once, and then it is cached in the overlay, making things much faster.
  • qemu now has excellent discard and trim support. To minimize the amount of data that we copy, we first fstrim the filesystems. This causes the overlay to remember which bits of the filesystem are used and only copy those bits.
  • I added support for fstrim to ntfs-3g so this works for Windows guests too.
  • libguestfs has support for remote storage, cachemode, discard, copy-on-read and more, meaning we can use all these features in virt-v2v.
  • We use OCaml — not C, and not type-unsafe languages — to ensure that the compiler is helping us to find bugs in the code that we write, and also to ensure that we end up with an optimized, standalone binary that requires no runtime support/interpreters and can be shipped everywhere.

10 Comments

Filed under Uncategorized