Tag Archives: v2v


Somedays I just spend the time going round in circles, and this was one of those days. Not really for lack of work, but lack of knowing the Right Thing to do. Below I’ll describe the problem I had today.

If you look at a virtual machine from the point of view of libvirt, it looks like CPU resources, network interfaces, and one or two massive, opaque blobs – the block devices (virtual hard drives). From the point of view of libguestfs we can squint a bit harder and resolve those opaque block devices as partitions and filesystems and files.

But that’s not the whole story either, because we can ask what is the virtual machine? People typically describe the virtual machine by what it is and can do:

“It’s our mail server, running on RHEL 5.2”

“That’s my Windows XP VM that I use to run Office”

libvirt lists out the VMs, but doesn’t see anything beyond their virtual hardware. libguestfs can analyze the virtual hard drives looking for filesystems. Using both libvirt and libguestfs I have written what can only be described as a very long, very hairy Perl script that tries to answer the meta-question of what the virtual machine is.

Operating system(s) Filesystems Applications installed Kernel and Device drivers
Linux, Windows, …
And the distribution and version of each
mount-point => device
And a great deal of information about each filesystem
Multiple applications Type of ethernet card, virtio drivers, Xen PV drivers, etc etc

The way the Perl script works is best described as horrific. It probes each filesystem it finds, tries to mount it, looks for “characteristic” files (like /etc/redhat-release, /grub/grub.conf and Program Files), parses /etc/fstab to try to work out the relationship between devices, labels, UUIDs and actual mount-points. Does it have a root filesystem? Does it have multiple root filesystems? Perhaps that means it’s multi-boot? It’s an inelegant, special-case, nightmare, and I’m only testing this on 14 sample guests. When it hits real world usage, it will undoubtedly grow whiskers and legs.

And this, oddly enough, is not the problem that perplexed me all day today. Rather the problem I have is how to package up and distribute this code.

It’s obviously going to be useful for others to reuse this code, since the answers to the meta-question are obviously useful in many situations. So it should be a library of some sort.

Do I keep it as a Perl library, perhaps exporting specialized Perl structures? That treats Perl preferentially, and why should Perl be treated this way, and not any of the other languages we support?

Do I rewrite it in C? That’s difficult for a couple of reasons. First and by no means least is that this code uses a lot of string handling and regular expressions, which C is hardly suitable for. Secondly the code exports a large, recursive data structure which is likely to change and be extended and refined over time, and a C API is generally not flexible enough to handle such a thing easily. Then to complete libguestfs, we’d have to write the code to convert that recursive structure into all the language bindings, which is time-consuming at best.

Should I have a stand-alone program exporting some intermediate representation? XML seems suitable. But strange though it may seem, XML is not the preferred choice of data representation in many languages. Perl and OCaml, for example, don’t really get along with XML being respectively too unstructured and too loosely typed.

So I’m left at an impasse. I have code, but it doesn’t “belong” anywhere. It’s generally useful, but not usefully general. What to do?


Filed under Uncategorized