Tag Archives: v2v

Red Hat Summit video online

I’m working to get this into a more usable format, but at the moment, if you have Flash, you can watch Matt and me presenting virt-v2v and libguestfs at the URL below.

Click “Red Hat Summit”, then “V2V VMWare/Xen”.

Matt does the first 30 minutes, then I take over at almost exactly the 30’00 mark in the video.

http://www.redhat.com/promo/summit/2010/highlights/


18 Comments

Filed under Uncategorized

Outline for V2V session at the Red Hat Summit 2010

Posted here …

“In this session, Rich Jones and Matt Booth, senior software engineers at Red Hat, will introduce the virtualization V2V tool. They will also demonstrate how users can easily convert VMware and Xen virtual machines from the native format into KVM images for use in Red Hat Enterprise Linux and Red Hat Enterprise Virtualization. Rich and Matt will detail the step-by-step instructions for converting existing virtual machines, and provide an in-depth look into the technologies, including libguestfs, used to build the V2V tool.”

4 Comments


What things make P2V/V2V conversion hard?

(In case anyone is confused by the title, P2V means turning a physical system into a virtual machine, and V2V means converting one type of virtual machine into another, e.g. a Xen VM into a VMware VM. Usually what we are talking about is making the conversion completely or mostly automatic.)

There are some things that Linux distros do which make P2V and V2V conversions harder than they really need to be. In this article I hope to collect a few of these things. If you can think of any more, post them as comments and I’ll incorporate them here.

1. Your disk partitions are on hard-coded device names like /dev/hda1

When we virtualize a disk, we usually want to install alternate (“paravirt”) drivers, but these drivers cause the disk names to change. Xen paravirt drivers use /dev/xvda and the standard Linux virtio drivers use /dev/vda.

Your Linux distro may have scattered references to /dev/hd* and /dev/sd* in /etc/fstab, inside the boot initramfs, and maybe in some scripts which are used to decrypt the hard drive at boot time. virt-v2v has to perform intimate surgery on the disk to find and fix all these.

Luckily there is an easy way to avoid this: every major filesystem and swap type for Linux supports either labels or UUIDs. These are preserved transparently when converting, so distros should use them scrupulously and never use device names.

Update: LVM partition names are OK too. (Thanks Matt Booth).
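
To make the labels-and-UUIDs point concrete, here is a minimal sketch of the kind of rewrite involved, assuming you already have a device-to-UUID mapping from somewhere (for instance from blkid output). The function name, the regular expression and the UUID in the usage note are made up for illustration; this is not how virt-v2v does it.

```python
import re

# Hard-coded whole-disk or partition names such as /dev/hda1, /dev/sdb2,
# /dev/vda or /dev/xvda1. LABEL=, UUID= and LVM paths do not match.
DEVICE_RE = re.compile(r"^/dev/(hd|sd|vd|xvd)[a-z]+\d*$")


def rewrite_fstab(fstab_text, uuid_by_device):
    """Replace hard-coded device names in fstab text with UUID= specs.

    uuid_by_device is a hypothetical mapping of device name -> UUID.
    Entries whose device has no known UUID are left untouched.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Keep blank lines and comments exactly as they are.
        if not fields or fields[0].startswith("#"):
            out.append(line)
            continue
        spec = fields[0]
        if DEVICE_RE.match(spec) and spec in uuid_by_device:
            # The spec is the first field, so replacing the first
            # occurrence preserves the rest of the line and its spacing.
            line = line.replace(spec, "UUID=" + uuid_by_device[spec], 1)
        out.append(line)
    return "\n".join(out) + "\n"
```

Calling rewrite_fstab(text, {"/dev/hda1": "11111111-2222-3333-4444-555555555555"}) would rewrite only the /dev/hda1 entry. An fstab written this way keeps working whether the disk later shows up as /dev/hda, /dev/xvda or /dev/vda. The initramfs and any boot scripts would still need the same treatment.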

2. You didn’t expect the network hardware would change

When the OS gets converted, it is highly likely that the network device will appear to change, but mysteriously it will still have the same MAC address.

That’s not supposed to happen with real hardware: MAC address blocks are allocated to manufacturers by the IEEE, and no two manufacturers should have the same ones. Virtual hardware doesn’t play by the same rules.

What we’d want you to do is to keep track of the MAC addresses of each interface and when you see the same MAC, give it the same ethX device name.

Update: See comments.
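
The MAC bookkeeping described above might look something like this. It is only a sketch: the function name is hypothetical, and the "known" mapping stands in for whatever persistent state the distro keeps between boots.

```python
def assign_names(seen_macs, known=None):
    """Give each MAC address a stable ethX name.

    known maps MAC -> name as remembered from earlier boots (hypothetical
    persistent state). A MAC that was seen before keeps its old name, even
    if the device behind it now looks like different hardware. A new MAC
    gets the lowest unused ethN.
    """
    # Normalize case so that 52:54:00:AA:... and 52:54:00:aa:... compare equal.
    known = {mac.lower(): name for mac, name in (known or {}).items()}
    used = set(known.values())
    for mac in seen_macs:
        mac = mac.lower()
        if mac in known:
            continue
        n = 0
        while "eth%d" % n in used:
            n += 1
        known[mac] = "eth%d" % n
        used.add(known[mac])
    return known
```

The important property is that the name is keyed on the MAC address alone, not on the driver or the bus position, both of which change during conversion.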

3. Your kernel doesn’t support virtual devices

Some Linux distros don’t contain drivers for virtual (virtio) devices. There are a variety of reasons for this, not all of them solvable: the kernel might predate virtio, the drivers might be closed source (e.g. for older VMware), or they might have been compiled out. In some cases a whole different kernel is required (as was true for earlier versions of Xen).

In any case, installing a new kernel inside the guest and making it bootable is major surgery, and we’d rather not do that.

4. X needs reconfiguration

Pretty much the same as for disks and network devices, the display device will also change after conversion (either to a generic Cirrus Logic 54xx or to one of the new accelerated paravirt devices).

Ideally X would just deal with this and bring up a display on the first device it finds, and it should probe all possible devices, but by no means all distros work this way.

5. Making assumptions about the environment around the machine

Static IP addresses and static DNS resolution are a pain when moving a machine. Although not necessarily the fault of the distro as such, it’s better if machines are configured to pick up their network details from a DHCP server.

6. Don’t assume CPU extensions like SSE3 will always be there

After conversion, the new virtual CPU and motherboard you see won’t look very much like the old ones. CPU extensions such as SSE, vector instructions, or NX might come or go. The apparent CPU model and manufacturer might have changed. Therefore if you spent any time optimizing the installed packages for one particular CPU, those optimizations are wasted and might even cause programs to crash.

It’s better to make your packages generic and have programs detect the runtime environment and optimizations needed each time they start up. (Actually, it’s worse than this when you also consider live migration, because the processor might change while a program is running — no one has really solved this one satisfactorily yet).
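
As a rough illustration of detecting the environment at startup, the sketch below reads the feature flags from /proc/cpuinfo-style text and picks a code path accordingly. It assumes x86 Linux, where the flags appear on a "flags" line and SSE3 is reported as "pni". The implementation names it returns are placeholders.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text.

    Only the first "flags" line is used; an empty set is returned if the
    text has no such line.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_implementation(flags):
    """Choose a code path from what the current CPU actually offers.

    The returned names are hypothetical placeholders for real code paths.
    """
    if "pni" in flags:  # Linux reports SSE3 as "pni"
        return "sse3"
    if "sse2" in flags:
        return "sse2"
    return "generic"
```

A program would call pick_implementation(cpu_flags(open("/proc/cpuinfo").read())) each time it starts, rather than baking the answer in at build or install time. This does nothing for the live migration case, where the flags can change after the check has run.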


As you can see, P2V and V2V conversions are like major hardware changes. Your distro should be able to handle just about all the hardware changing underneath it. It should probe everything at boot, hard-code nothing, and be designed to cope gracefully with abrupt changes.

Update: A thread on fedora-devel-list about this issue.

4 Comments


Stuck

Some days I just spend the time going round in circles, and this was one of those days. Not really for lack of work, but for lack of knowing the Right Thing to do. Below I’ll describe the problem I had today.

If you look at a virtual machine from the point of view of libvirt, it looks like CPU resources, network interfaces, and one or two massive, opaque blobs – the block devices (virtual hard drives). From the point of view of libguestfs we can squint a bit harder and resolve those opaque block devices as partitions and filesystems and files.

But that’s not the whole story either, because we can ask: what is the virtual machine? People typically describe the virtual machine by what it is and can do:

“It’s our mail server, running on RHEL 5.2”

“That’s my Windows XP VM that I use to run Office”

libvirt lists out the VMs, but doesn’t see anything beyond their virtual hardware. libguestfs can analyze the virtual hard drives looking for filesystems. Using both libvirt and libguestfs I have written what can only be described as a very long, very hairy Perl script that tries to answer the meta-question of what the virtual machine is.

Operating system(s): Linux, Windows, … and the distribution and version of each
Filesystems: mount-point => device, and a great deal of information about each filesystem
Applications installed: multiple applications
Kernel and device drivers: type of ethernet card, virtio drivers, Xen PV drivers, etc.

The way the Perl script works is best described as horrific. It probes each filesystem it finds, tries to mount it, looks for “characteristic” files (like /etc/redhat-release, /grub/grub.conf and Program Files), and parses /etc/fstab to try to work out the relationship between devices, labels, UUIDs and actual mount-points. Does it have a root filesystem? Does it have multiple root filesystems? Perhaps that means it’s multi-boot? It’s an inelegant, special-case nightmare, and I’m only testing this on 14 sample guests. When it hits real world usage, it will undoubtedly grow whiskers and legs.
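
To give a flavour of the heuristic, here is a heavily simplified sketch in Python rather than Perl. It is not the real script: it assumes something else has already mounted each filesystem and collected the paths found on it, and the category names are invented for the example.

```python
def classify_filesystem(paths):
    """Guess what a filesystem holds from the "characteristic" files on it.

    paths is the set of paths found on the mounted filesystem. The returned
    labels are made up for this sketch.
    """
    if "/etc/redhat-release" in paths:
        return "linux-root"
    if "/grub/grub.conf" in paths:
        return "linux-boot"
    if "/Program Files" in paths:
        return "windows-root"
    return "unknown"


def summarize_guest(filesystems):
    """Summarize a guest from a mapping of device name -> set of paths."""
    kinds = {dev: classify_filesystem(paths) for dev, paths in filesystems.items()}
    roots = sorted(dev for dev, kind in kinds.items() if kind.endswith("-root"))
    # More than one root filesystem is taken as a hint that the guest is
    # multi-boot; this is a guess, not a certainty.
    return {"filesystems": kinds, "roots": roots, "multi_boot": len(roots) > 1}
```

Even this toy version shows where the special cases creep in: every new distro or layout means another "characteristic" file and another branch.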

And this, oddly enough, is not the problem that perplexed me all day today. Rather the problem I have is how to package up and distribute this code.

Others will clearly want to reuse this code, since the answers to the meta-question are useful in many situations. So it should be a library of some sort.

Do I keep it as a Perl library, perhaps exporting specialized Perl structures? That treats Perl preferentially, and why should Perl be treated this way, and not any of the other languages we support?

Do I rewrite it in C? That’s difficult for a couple of reasons. First and by no means least is that this code uses a lot of string handling and regular expressions, which C is hardly suitable for. Secondly the code exports a large, recursive data structure which is likely to change and be extended and refined over time, and a C API is generally not flexible enough to handle such a thing easily. Then to complete libguestfs, we’d have to write the code to convert that recursive structure into all the language bindings, which is time-consuming at best.

Should I have a stand-alone program exporting some intermediate representation? XML seems suitable. But strange though it may seem, XML is not the preferred choice of data representation in many languages. Perl and OCaml, for example, don’t really get along with XML being respectively too unstructured and too loosely typed.
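
For what the XML option might look like, here is a small sketch that turns a nested structure of dicts, lists and scalars into XML using Python’s standard xml.etree.ElementTree module. The element names, including the generic "item" wrapper for list entries, are assumptions made for the example and not a proposed schema.

```python
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Convert nested dicts, lists and scalars into an XML element.

    Dict keys become child element names (so they must be valid XML names,
    which this sketch does not check). List entries are wrapped in <item>.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        # Sort the keys so that the output is stable from run to run.
        for key in sorted(value):
            elem.append(to_xml(key, value[key]))
    elif isinstance(value, (list, tuple)):
        for item in value:
            elem.append(to_xml("item", item))
    else:
        elem.text = str(value)
    return elem
```

ET.tostring(to_xml("guest", data), encoding="unicode") then gives a string that any language with an XML parser can read back. Note that everything comes out as text, so the consumer has to know which fields were numbers or lists, which is part of why XML sits awkwardly with typed languages.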

So I’m left at an impasse. I have code, but it doesn’t “belong” anywhere. It’s generally useful, but not usefully general. What to do?

5 Comments
