Tag Archives: virt-v2v

Tip: Read guest disks from VMware vCenter using libguestfs

virt-v2v can import guests directly from vCenter. It uses all sorts of tricks to make this fast and efficient, but the basic technique uses plain https range requests.
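To make the range-request idea concrete, here is a minimal Python sketch. Python is only my choice for illustration; virt-v2v itself relies on qemu’s https driver. The sketch builds, but does not send, a request for a single 512-byte sector. The URL is a placeholder, and `range_header` and `sector_request` are hypothetical helper names.

```python
import urllib.request


def range_header(offset, length):
    """Return an HTTP Range header value covering `length` bytes from `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends.
    return "bytes={}-{}".format(offset, offset + length - 1)


def sector_request(url, sector, sector_size=512):
    """Build (but do not send) a request for one sector of a remote disk image."""
    req = urllib.request.Request(url)
    req.add_header("Range", range_header(sector * sector_size, sector_size))
    return req


# Placeholder URL, not a real server.
req = sector_request("https://vcenter.example/placeholder-flat.vmdk", 2048)
print(req.get_header("Range"))  # bytes=1048576-1049087
```

A server that honours the header replies with only the requested bytes, which is what lets a disk image be read at random offsets without downloading all of it.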

Making it all work was not so easy and involved a lot of experimentation and bug fixing, and I don’t think it has been documented up to now. So this post describes how we do it. As usual the code is the ultimate repository of our knowledge so you may want to consult that after reading this introduction.

Note this is read-only access. Write access is possible, but you’ll have to use ssh instead.

The VMware ESXi hypervisor has a web server, but it doesn’t support range requests. So although you can download an entire disk image in one go from the ESXi hypervisor, to random-access the image using libguestfs you will need VMware vCenter. You should check that virsh dumpxml works against your vCenter instance by following these instructions. If that doesn’t work, it’s unlikely that the rest of the instructions will work either.

You will need to know:

  1. the hostname or IP address of your vCenter server,
  2. the username and password for vCenter,
  3. the name of your datacenter (probably Datacenter),
  4. the name of the datastore containing your guest (could be datastore1),
  5. and, of course, the name of your guest.

Tricky step 1 is to construct the vCenter https URL of your guest.

This looks like:


https://root:password@vcenter/folder/guest/guest-flat.vmdk?dcPath=Datacenter&dsName=datastore1

where:

root:password
    username and password
vcenter
    vCenter hostname or IP address
guest
    guest name (it appears twice in the URL)
Datacenter
    datacenter name
datastore1
    datastore name
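If you would rather build the URL programmatically, the following Python sketch assembles it from the five pieces of information listed above. It assumes, as in the example URL, that the guest’s directory and flat VMDK are both named after the guest. Percent-encoding of names that contain spaces or other special characters is my own addition and has not been checked against vCenter; `vcenter_disk_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def vcenter_disk_url(host, guest, datacenter, datastore,
                     user=None, password=None):
    """Build the https URL of a guest's flat VMDK as served by vCenter."""
    auth = ""
    if user is not None:
        auth = quote(user, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    # Assumption: directory and disk are both named after the guest.
    name = quote(guest, safe="")
    path = "/folder/{0}/{0}-flat.vmdk".format(name)
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"dcPath": datacenter, "dsName": datastore},
                      quote_via=quote)
    return "https://{}{}{}?{}".format(auth, host, path, query)


url = vcenter_disk_url("vcenter", "guest", "Datacenter", "datastore1",
                       user="root", password="password")
print(url)
```

With the example values this prints the same URL as shown above. If the disk is not named after the guest, the path has to be adjusted by hand.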

Once you’ve got a URL that looks right, try to fetch the headers using curl. This step is important, not just because it checks that the URL is good, but because it gives us a cookie. Without the cookie, vCenter will break under the load when we start to access it for real.

$ curl --insecure -I https://....
HTTP/1.1 200 OK
Date: Wed, 5 Nov 2014 19:38:32 GMT
Set-Cookie: vmware_soap_session="52a3a513-7fba-ef0e-5b36-c18d88d71b14"; Path=/; HttpOnly; Secure; 
Accept-Ranges: bytes
Connection: Keep-Alive
Content-Type: application/octet-stream
Content-Length: 8589934592

The cookie is the vmware_soap_session=... part including the quotes.
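Picking the cookie out of the curl output by hand is error-prone, so here is a small Python sketch that extracts it from the header text. It is an illustration only: `extract_session_cookie` is a hypothetical helper, and the sample headers are the ones shown above.

```python
def extract_session_cookie(headers_text, name="vmware_soap_session"):
    """Return 'name=value' (quotes included) from a Set-Cookie header, or None."""
    for line in headers_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != "set-cookie":
            continue
        # The cookie itself is everything before the first ';' attribute.
        first = value.split(";", 1)[0].strip()
        if first.startswith(name + "="):
            return first
    return None


sample = "\n".join([
    "HTTP/1.1 200 OK",
    "Date: Wed, 5 Nov 2014 19:38:32 GMT",
    'Set-Cookie: vmware_soap_session="52a3a513-7fba-ef0e-5b36-c18d88d71b14"; Path=/; HttpOnly; Secure; ',
    "Accept-Ranges: bytes",
])
cookie = extract_session_cookie(sample)
print(cookie)
```

The returned string keeps the double quotes around the session ID, since they are part of the cookie.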

Now let’s make a qcow2 overlay which encodes our https URL and the cookie as the backing file. This requires a reasonably recent qemu, probably 2.1 or above.

$ qemu-img create -f qcow2 /tmp/overlay.qcow2 \
    -b 'json: { "file.driver":"https",
                "file.url":"https://..",
                "file.cookie":"vmware_soap_session=\"...\"",
                "file.sslverify":"off",
                "file.timeout":1000 }'

You don’t need to include the password in the URL here, since the cookie acts as your authentication. You might also want to play with the "file.readahead" parameter. We found it makes a big difference to throughput.
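Getting the nested quoting of the cookie right inside the JSON string is fiddly. One way around it, sketched below in Python, is to let a JSON library do the escaping and then print a shell-safe command line. The option names are the ones used in the command above; the URL and cookie values are placeholders, and `backing_spec` and `qemu_img_command` are hypothetical helper names.

```python
import json
import shlex


def backing_spec(url, cookie, timeout=1000, readahead=None):
    """Return the 'json:...' backing file string for qemu's https driver."""
    opts = {
        "file.driver": "https",
        "file.url": url,
        "file.cookie": cookie,
        "file.sslverify": "off",
        "file.timeout": timeout,
    }
    if readahead is not None:
        opts["file.readahead"] = readahead
    # json.dumps escapes the double quotes embedded in the cookie.
    return "json:" + json.dumps(opts)


def qemu_img_command(overlay, spec):
    """Return the argument list for creating the qcow2 overlay."""
    return ["qemu-img", "create", "-f", "qcow2", "-b", spec, overlay]


# Placeholder values: substitute your own URL and cookie.
example_cookie = 'vmware_soap_session="PLACEHOLDER"'
spec = backing_spec("https://vcenter.example/placeholder", example_cookie)
cmd = qemu_img_command("/tmp/overlay.qcow2", spec)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed line can be pasted into a shell. Newer qemu releases may additionally want the backing format stated explicitly (the `-F` option of qemu-img create), so check the qemu-img documentation for your version.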

Now you can open the overlay file in guestfish as usual:

$ export LIBGUESTFS_BACKEND=direct
$ guestfish
><fs> add /tmp/overlay.qcow2 copyonread:true
><fs> run
><fs> list-filesystems
/dev/sda1: ext4
><fs> mount /dev/sda1 /

and so on.

Filed under Uncategorized

libguestfs 1.28 released

The new stable version of libguestfs — a C library and tools for accessing and modifying virtual machine disk images — has been released.

There is one brand new tool, virt-log, and I rewrote the virt-v2v and virt-p2v tools. These tools convert VMware and Xen guests, and physical machines, to run on KVM. They are now much faster and better than before.

As well as that there are hundreds of other improvements and bug fixes. For a full list, see the release notes.

Libguestfs 1.28 will be available shortly in Fedora 21, Debian/experimental, RHEL and CentOS 7, and elsewhere.

Odd/scary RHEL 5 bug

Yesterday my colleague gave me a RHEL 5 VM disk image which failed to boot after converting it using the latest virt-v2v. Because it booted before conversion but not afterwards, the finger naturally pointed at something that we were doing during the conversion process, which is not unusual as v2v conversion is highly complex.

[Screenshot: the “GRUB _” prompt after conversion]

The thing is that we don’t reinstall grub during conversion, but we do edit a few grub configuration files. Could editing grub configuration cause this error?

I wanted to understand what the grub-legacy “GRUB _” prompt means. There are lots and lots and lots of people reporting this bug (eg), but as is often the case I could find no coherent explanation anywhere of what grub-legacy means when it gets into this state. Lots of the blind leading the blind, and random suggestions about how people had rescued such machines (probably coincidentally), but no hard data anywhere. So I had to go back to first principles and debug qemu to find out what’s happening just before the message is printed.

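Tip: To breakpoint qemu when the Master Boot Record (first sector) is loaded, start qemu with the -s -S options (gdb server on TCP port 1234, CPU stopped at startup), then in gdb do: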

target remote tcp::1234
set architecture i8086
b *0x7c00
cont

After an evening of debugging, I found that it’s the first sector (known in grub-legacy as “stage 1”) which prints the GRUB<space> message. (The same happens to be true of grub2.) The stage 1 boot sector has, written into it at a fixed offset, the location of the /boot/grub/stage2 file, ie. the literal disk start sector and length of this file. It sends BIOS int $0x13 commands to load those sectors into memory at address 0x8000, and jumps there to start stage 2 of grub. The boot sector is only 512 bytes, so there’s no luxury to do anything except print 5 characters. It’s only after the stage2 file has been loaded that all the nice graphical stuff happens.
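The same check can be approximated offline. The Python sketch below reads the stage 2 start sector out of a boot sector and reports whether that sector of the image is all zeroes. It is a rough illustration, not the debugging method described above: the offset 0x44 (a 32-bit little-endian sector number) is an assumption taken from my reading of grub-legacy’s stage1.h and should be verified against your grub version, only the first stage 2 sector is examined, and the function names are hypothetical. The demo runs on a synthetic in-memory image, not a real guest.

```python
import struct

SECTOR = 512
# Assumption: stage 2 start sector is a 32-bit LE value at this offset.
STAGE2_SECTOR_OFFSET = 0x44


def stage2_first_sector(image):
    """Return the sector number that the boot sector will load stage 2 from."""
    mbr = image[:SECTOR]
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no valid boot sector signature")
    (lba,) = struct.unpack_from("<I", mbr, STAGE2_SECTOR_OFFSET)
    return lba


def stage2_looks_zeroed(image):
    """True if the sector referenced by the boot sector contains only zeroes."""
    lba = stage2_first_sector(image)
    data = image[lba * SECTOR:(lba + 1) * SECTOR]
    if len(data) < SECTOR:
        raise ValueError("stage 2 sector lies beyond the end of the image")
    return data.count(0) == SECTOR


# Synthetic four-sector image whose boot sector points at sector 2.
image = bytearray(SECTOR * 4)
image[510:512] = b"\x55\xaa"
struct.pack_into("<I", image, STAGE2_SECTOR_OFFSET, 2)
zeroed_before = stage2_looks_zeroed(bytes(image))

image[2 * SECTOR] = 0xEB  # pretend some stage 2 code is present
zeroed_after = stage2_looks_zeroed(bytes(image))
print(zeroed_before, zeroed_after)
```

On a real guest you would pass in the first few megabytes of the disk image instead of the synthetic bytes.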

Unfortunately in the image after conversion, the stage2 data loaded into memory was all zeroes, and that’s why the boot fails and you see GRUB<space><cursor> and then the VM crashes.

The mystery was how conversion could be changing the location of the /boot/grub/stage2 file so that it could no longer be loaded at the fixed offset encoded in the boot sector.

This morning it dawned on me what was really happening …

The new virt-v2v tries very hard to avoid copying any unused data from the guest, just to save time. No point wasting time copying deleted files and empty space. This makes virt-v2v very fast, but it has an unusual side-effect: If a file is deleted on the source, the contents of the file are not copied over to the target, and turn into zeroes.

It turns out if you take the source disk image and simply zero all of the empty space in /boot, then the source doesn’t boot either, even though virt-v2v is not involved. Yikes … this could be a bug in RHEL 5. Grub is generating a bootloader that references a deleted file.

This is where we are right now with this bug. It appears that a valid sequence of steps can make a RHEL 5 bootloader that references a deleted file, but still works as long as you never overwrite the sectors used by that file.

I have written a simple test script that you can download to find out if your RHEL ≤ 6 virtual machines could be affected by this problem. I’m interested if anyone else sees this. I ran the test over a selection of RHEL 3 – 5 guests, and could not find any which had the problem, but my collection is not very extensive, and there are likely to be common modes in how they were created.

The next steps will likely be to test a lot more RHEL 5 installs to see if this bug is really common or a strange one-off. I will also probably add a workaround to virt-v2v so it doesn’t trim the boot partition — the reason is that we cannot go back and fix old RHEL 5 installs, we have to work with them if they are broken. If it turns out to be a real bug in RHEL 5 then we will need to issue a fix for that.

virt-v2v preview packages for RHEL and CentOS 7.1 are available

virt-v2v is a small program for converting guests from VMware or Xen, to run on KVM, RHEV-M or OpenStack. For RHEL 7.1, I am rewriting and enhancing virt-v2v, so it’s much faster and easier to use.

To install, follow the instructions here for setting up the yum repository, and then you can do:

yum install virt-v2v

To use it, start with the manual here that has lots of examples and the full reference documentation.

virt-v2v: better living through new technology

If you ever used the old version of virt-v2v, our software that converts guests to run on KVM, then you probably found it slow. Worse still, it could fail right at the end of the conversion (after possibly an hour or more). No one liked that, least of all the developers and support people who had to help people use it.

A V2V conversion is intrinsically going to take a long time, because it always involves copying huge disk images around. These can be gigabytes or even terabytes in size.

My main aim with the rewrite was to do all the work up front (and if the conversion is going to fail, then fail early), and leave the huge copy to the last step. The second aim was to work much harder to minimize the amount of data that we need to copy, so the copy is quicker. I achieved both of these aims using a lot of new technology that we developed for qemu in RHEL 7.

Virt-v2v works (now) by putting an overlay on top of the source disk. This overlay protects the source disk from being modified. All the writes done to the source disk during conversion (eg. modifying config files and adding device drivers) are saved into the overlay. Then we qemu-img convert the overlay to the final target. Although this sounds simple and possibly obvious, none of this could have been done when we wrote old virt-v2v. It is possible now because:

  • qcow2 overlays can now have virtual backing files that come from HTTPS or SSH sources. This allows us to place the overlay on top of (eg) a VMware vCenter Server source without having to copy the whole disk from the source first.
  • qcow2 overlays can perform copy-on-read. This means you only need to read each block of data from the source once, and then it is cached in the overlay, making things much faster.
  • qemu now has excellent discard and trim support. To minimize the amount of data that we copy, we first fstrim the filesystems. This causes the overlay to remember which bits of the filesystem are used, so that only those bits are copied.
  • I added support for fstrim to ntfs-3g so this works for Windows guests too.
  • libguestfs has support for remote storage, cachemode, discard, copy-on-read and more, meaning we can use all these features in virt-v2v.
  • We use OCaml — not C, and not type-unsafe languages — to ensure that the compiler is helping us to find bugs in the code that we write, and also to ensure that we end up with an optimized, standalone binary that requires no runtime support/interpreters and can be shipped everywhere.
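The interplay of overlay, copy-on-read and discard described in the list above can be illustrated with a toy model in Python. This is emphatically not how qcow2 or virt-v2v are implemented (virt-v2v is written in OCaml and the real work happens inside qemu); `ToyOverlay` and all of its methods are made up for this sketch.

```python
class ToyOverlay:
    """Toy block overlay: writes stay local, reads are cached, discards are remembered."""

    def __init__(self, read_backing, block_size=4):
        self.read_backing = read_backing  # function: block number -> bytes
        self.block_size = block_size
        self.blocks = {}          # blocks held in the overlay
        self.discarded = set()    # blocks trimmed by the guest filesystem
        self.backing_reads = 0    # how often we had to go to the source

    def read(self, n):
        if n in self.discarded:
            return b"\0" * self.block_size
        if n not in self.blocks:
            # Copy-on-read: fetch from the source once, then keep it.
            self.backing_reads += 1
            self.blocks[n] = self.read_backing(n)
        return self.blocks[n]

    def write(self, n, data):
        # Writes never touch the source disk.
        self.discarded.discard(n)
        self.blocks[n] = data

    def discard(self, n):
        self.blocks.pop(n, None)
        self.discarded.add(n)

    def convert(self, nblocks):
        """Yield (block number, data) for every block that needs copying."""
        for n in range(nblocks):
            if n not in self.discarded:
                yield n, self.read(n)


source = [b"boot", b"conf", b"junk", b"data"]
ov = ToyOverlay(lambda n: source[n])
ov.read(1)
ov.read(1)               # second read is served from the overlay
ov.write(1, b"CONF")     # conversion edits a config file
ov.discard(2)            # fstrim reports block 2 as unused
copied = dict(ov.convert(4))
print(copied, ov.backing_reads)
```

In this model the source list is never modified, the trimmed block is never fetched or copied, and each remaining block is read from the source at most once.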

New in libguestfs 1.27.34 – virt-v2v and virt-p2v

There haven’t been too many updates around here for a while, and that’s for a very good reason: I’ve been “heads down” writing the new versions of virt-v2v and virt-p2v, our tools for converting VMware and Xen virtual machines, or physical machines, to run on KVM.

The new virt-v2v [manual page] can slurp in a guest from a local disk image, local Xen, VMware vCenter, or (soon) an OVA file — convert it to run on KVM — and write it out to RHEV-M, OpenStack Glance, local libvirt or as a plain disk image.

It’s easy to use too. Unlike the old virt-v2v there are no hairy configuration files to edit or complicated preparations. You simply do:

$ virt-v2v -i disk xen_disk.img -o local -os /tmp

That command (which doesn’t need root, naturally) takes the Xen disk image, which could be any supported Windows or Enterprise Linux distro, converts it to run on KVM (eg. installing virtio drivers, adjusting dozens of configuration files), and writes it out to /tmp.

To connect to a VMware vCenter server, change the -i options to:

$ virt-v2v -ic vpx://vcenter/Datacenter/esxi "esx guest name" [-o ...]

To output the converted disk image to OpenStack glance, change the -o options to:

$ virt-v2v [-i ...] -o glance [-on glance_image_name]

Coming up: The new technology we’ve used to make virt-v2v much faster.

Please don’t do this (v2v and p2v requests that are wrong)

Using v2p to get around Oracle support contracts

Problem: Oracle won’t support the database in a virtualized environment. If you report a bug, they’ll ask you to reproduce it on a supported (ie. physical) machine.

Wrong solution: We’ll run Oracle in a VM. When we run into trouble, we’ll use a V2P tool to convert the virtual machine to a physical machine!

Why this is wrong: Conversion involves copying the disks, ripping out device drivers, adding new device drivers, fiddling with configuration files, doing resize ops on filesystems, and reinstalling the boot loader. These are (a) slow, (b) very intrusive, and (c) liable to break. This is all a recipe for turning a small disaster (ie. my database is down) into a very big disaster (my database is still down and the hairy support “solution” took 6 hours and didn’t work).

Good solution: Oracle are probably right that you shouldn’t try to run your database virtualized. But assuming you want to ignore that advice, put your database and its files onto a separate SAN LUN. When you need support, detach the LUN from the virtual machine and reattach it to a physical machine. This operation should be instantaneous and doesn’t involve any modification of the data.

Using p2v and v2p to test upgrades

Problem: It’s not easy to test an upgrade on a production physical machine.

Wrong solution: Virtual machines let you snapshot, test your upgrade on the snapshot, and if it’s bad you just throw away the snapshot. Therefore to test our upgrade, we’ll convert the physical machine to virtual (P2V), do the test, and if it works we’ll convert it back to a physical machine (V2P)!

Why this is wrong: Conversion involves a slow disk copy and a very intrusive set of modifications to the configuration. P2V followed by V2P is not a symmetric operation that leaves you with an identical machine. More than likely it’ll simply break the machine, and if it doesn’t, then drivers could be less than optimal after the conversion. Plus (unlike with virtualized environments) your physical machine is a one-of-a-kind system, and if you break it with a hairy set of P2V and V2P operations you can’t just roll back to a previous snapshot.

Good solution: Virtualize your workloads! If you don’t want to do that, use a filesystem like btrfs/ZFS that lets you do cheap snapshots, or use the snapshot feature of your SAN. In any case, always arrange your production environment so that you have a staging mirror on which to do tests before you deploy anything to production, and have a tested back-out plan.

Using multiple v2v steps

Problem: We don’t have a conversion tool that can do (eg.) Citrix Xen to KVM in one step.

Wrong solution: We found something on the web that can do Citrix to VMware, and Red Hat have a great tool for doing VMware to KVM, so we’ll just run one after the other!

Why this is wrong: Conversion involves a large set of intrusive changes on the guest such as installing device drivers for the particular target hypervisor. Doing this in two steps means you go through two rounds of intrusive changes to your guest, and it’s unlikely that anyone has tested both together. Most likely it’ll break, or leave your guest with conflicting device drivers.

Good solution: Sorry, at the moment there isn’t a good solution, but that doesn’t mean you should use the bad one. Your best bet may be to reinstall the guest from scratch on the target VM.
