GLLUG talk on libguestfs (18th March 2010)

Back in 2008 we faced a pressing problem with virtualization. How do we look at what’s going on inside a virtual machine?

Let’s step back: what is a virtual machine? In nuts and bolts terms, it’s a big file or partition containing a disk image, and when it’s running, it’s a complicated emulation of CPUs, memory, and virtual devices like network cards. It’s interesting and necessary to be able to look inside all of those things. (“How many packets are coming out of the virtual network card?” “How is the virtual CPU coping with the load?”). But for the purpose of this talk I’m just going to talk about looking inside that disk image.

That large (multi-gigabyte) disk image file has a rich internal structure: a Master Boot Record; a boot partition; LVM, which has its own internal structures. Then it contains filesystems and those contain directories and files and more besides.
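
To make the first layer of that structure concrete, here is a minimal Python sketch that decodes the partition table from the first sector of a raw disk image. It only illustrates the classic MBR layout and is not how libguestfs works internally. The function name read_mbr_partitions, the dictionary keys and the 512-byte sector size are assumptions made for this example.

```python
import struct

SECTOR_SIZE = 512              # assumed sector size
TABLE_OFFSET = 446             # the partition table starts here in the MBR
ENTRY_SIZE = 16                # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR


def read_mbr_partitions(sector):
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + ENTRY_SIZE * index
        entry = sector[start:start + ENTRY_SIZE]
        # Fields: status, (CHS start, skipped), type, (CHS end, skipped),
        # first sector (LBA), number of sectors; all little-endian.
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:         # type 0 marks an unused slot
            continue
        partitions.append({
            "number": index + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "start_bytes": first_lba * SECTOR_SIZE,
            "size_bytes": count * SECTOR_SIZE,
        })
    return partitions
```

Given a raw image, read_mbr_partitions(open("disk.img", "rb").read(512)) would list the primary partitions. The sketch knows nothing about extended partitions, GPT, LVM or the filesystems inside them.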

What might we want to do with the disk image if we could look inside it at this rich internal structure? Clone the machine, changing a few config files like the hostname. Edit grub.conf in a VM which isn’t booting. Audit a VM to find out what licensed software is installed. Is the VM running out of disk space? Offline resizing or backups. Make a new virtual machine from scratch …

In 2008 (and now) you could look inside the disk image. First of all you’d need to be root. Then you could run a command line tool called kpartx which splits the disk image partitions into device mapper devices (this is why you need to be root). These are actually global devices on your host, visible to everyone. If you’re lucky, LVM on the host might find the volume groups located in the disk image, but you might have to adjust the global host LVM configuration to get that to work. If you’re unlucky, those could conflict with volume groups already in your host.

So if you are root, you should usually be able to mount a guest disk in the host. If your program crashes, of course, it will leave unattached device mapper devices, loopback devices and mount points on the host system.

It’s not clear from a security point of view if mounting untrusted guest devices on the host as root is a good idea.

That said, kpartx is a useful tool if: you are already root on the host, you just want to mount a partition, it’s ad hoc (no scripting), you can clean up if you make a mistake, and if you can trust the guests.

So we considered how we could improve this process and provide more features.

You shouldn’t need to be root: If you have a word-processor document, you don’t need to be root to edit that document. If you have a JPEG file, you don’t need to run GIMP as root to crop it. So why are disk image files any different? You should be able to modify disk images from CGI scripts, or from shell scripts. You shouldn’t have to clean up afterwards. There should be no gotchas or corner cases where it doesn’t work.

What is libguestfs? An API for creating, accessing, manipulating and modifying filesystems and disk images. Access from many different programming languages, or the command line. A set of useful tools. And applications built on top.

Today is going to be mainly a demonstration of what can be done with libguestfs and the tools we’ve built around this.

[Demonstration of guestfish]

“Guestfish” is the “guest filesystem interactive shell”, and you can just run it on any disk image you happen to find. You don’t need to be root, unless you need root to access that particular image. In this case, the image is just a local file so I don’t need root.

$ guestfish -a disk.img
><fs> run
><fs> mount /dev/vg_f12x32/lv_root /

You can see this image is a Linux virtual machine of some sort.

><fs> cat /etc/fstab
[the fstab from an unidentified Linux machine is shown ...]

We can use the “cat” command to look for some identification:

><fs> cat /etc/motd
><fs> cat /etc/redhat-release
><fs> cat /etc/debian_version
[this shows that it is a Fedora 12 VM]
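
The same probing can be scripted. What follows is a minimal sketch, not part of the talk: it assumes a callable that returns a file’s contents and raises RuntimeError when the file is missing, which is how the libguestfs Python bindings report errors. The names identify_guest and RELEASE_FILES are invented for this example.

```python
# The two release files tried by hand above.
RELEASE_FILES = ["/etc/redhat-release", "/etc/debian_version"]


def identify_guest(cat):
    """Return (path, contents) for the first release file that can be read.

    `cat` is any callable that takes a path and returns the file contents
    as a string, raising RuntimeError if the file cannot be read.
    Returns None when none of the candidate files exists.
    """
    for path in RELEASE_FILES:
        try:
            contents = cat(path)
        except RuntimeError:
            continue            # not this kind of distro, try the next file
        return path, contents.strip()
    return None
```

With a launched handle g that has the guest’s root filesystem mounted, the call would be identify_guest(g.cat).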

We can also edit files. For example, we can edit the /etc/issue file to change the console login message:

><fs> vi /etc/issue

Guestfish is the shell-scripting interface to the libguestfs API. It exposes the entire API, and as you can see that’s quite large:

><fs> help

(The full list of commands is in the guestfish man page.)

Since the API is quite daunting, the man page offers an overview of the whole API, so I won’t go through that here.

[Demonstration of using the API from Perl and Python]

This is the Perl example. Notice the use of the Augeas configuration API to pull out the list of NTP servers:

#!/usr/bin/perl -w

use strict;

use Sys::Guestfs;

my $g = Sys::Guestfs->new ();
$g->add_drive_ro ("disk.img");
$g->launch ();

my @logvols = $g->lvs ();
print "logical volumes: ", join (", ", @logvols), "\n\n";

$g->mount_ro ("/dev/vg_f12x32/lv_root", "/");
print "----- ISSUE file: -----\n";
print ($g->cat ("/etc/issue"));
print "----- end of ISSUE file -----\n\n";

# Use Augeas to list the NTP servers.
# Flag 16 is AUG_SAVE_NOOP: nothing is written back to the guest.
$g->aug_init ("/", 16);
my @nodes = $g->aug_match ("/files/etc/ntp.conf/server");
my @ntp_servers = map { $g->aug_get ($_) } @nodes;
print "NTP servers: ", join (", ", @ntp_servers), "\n\n";

This was the Python example:

#!/usr/bin/python

import guestfs
g = guestfs.GuestFS ()
g.add_drive_ro ("disk.img")
g.launch ()

parts = g.list_partitions ()
print ("disk partitions: %s" % (", ".join (parts)))
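
For comparison, the Augeas step from the Perl example could be written in Python roughly as follows. This is a sketch rather than code shown in the talk: the function name ntp_servers is invented here, and g is assumed to be a launched handle with the guest’s root filesystem already mounted on "/".

```python
AUG_SAVE_NOOP = 16  # Augeas flag: never write changes back to the guest


def ntp_servers(g):
    """List the NTP servers configured in the guest's /etc/ntp.conf.

    `g` is assumed to be a launched guestfs handle with the guest's root
    filesystem already mounted on "/".
    """
    g.aug_init("/", AUG_SAVE_NOOP)
    try:
        # Each match is the Augeas path of one "server" line.
        nodes = g.aug_match("/files/etc/ntp.conf/server")
        return [g.aug_get(node) for node in nodes]
    finally:
        g.aug_close()   # always release the Augeas handle
```

Because the function only relies on the four aug_* methods, it can be exercised with a small stand-in object as well as with a real handle.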

[Demonstration of guestfish on a Fedora live CD]

We showed unpacking a Fedora live CD, as covered earlier on this blog.

[Demonstration of virt-df]

You can see examples of virt-df output similar to what was demonstrated in the talk.
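
The idea behind this kind of report can be sketched through the Python API. This is a simplified illustration and not virt-df’s actual implementation: the function names percent_used and report_usage are invented here, and the sketch assumes reasonably recent Python bindings that accept python_return_dict=True.

```python
def percent_used(stat):
    """Percentage of blocks in use, from a statvfs-style dictionary."""
    total = stat["blocks"]
    if total == 0:
        return 0.0
    return 100.0 * (total - stat["bfree"]) / total


def report_usage(image_path):
    """Print one line per mountable filesystem found in a disk image."""
    import guestfs  # libguestfs Python bindings, imported only when used

    g = guestfs.GuestFS(python_return_dict=True)
    g.add_drive_opts(image_path, readonly=1)
    g.launch()
    for device, fstype in sorted(g.list_filesystems().items()):
        try:
            g.mount_ro(device, "/")
        except RuntimeError:
            continue        # swap, unknown or otherwise unmountable
        stat = g.statvfs("/")
        print("%s (%s): %.1f%% used" % (device, fstype, percent_used(stat)))
        g.umount_all()
    g.close()
```

Calling report_usage("disk.img") needs read access to the image file but not root.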

[Demonstration of virt-inspector]

We demonstrated virt-inspector. You can see earlier examples elsewhere on this blog.

[Demonstration of virt-win-reg]

You can see examples of using virt-win-reg on this blog, as well as the infamous “Windows Registry sucks” posting.

[Demonstration of guestmount]

See also, on this blog, the posts on FUSE support for libguestfs and some screenshots showing a Debian guest being mounted on the host.

[Demonstration of guestmount and hivexsh]

An example of using hivexsh can be found earlier on this blog.

This was the first talk given using Tech Talk PSE. You can download Tech Talk PSE from the git repository.

7 responses to “GLLUG talk on libguestfs (18th March 2010)”

  1. Baris

    What about the security considerations of these standard APIs? I don’t see any authentication. Wouldn’t it be better to add authentication credentials to virtual hard drives in future versions, if this is going to be included in upcoming distro releases?

    • rich

      Not sure what you mean. As far as libguestfs is concerned, a disk image is just a specially formatted file.

      • Baris

        As the owner of disk.img you can freely alter the file content. I can understand that and it sounds logical. But this kind of API would let hackers easily manipulate systems offline, which is riskier than changing the file system while it’s running. It’s problematic because it makes it possible to automate this process and makes the job of worm writers easier.

        On the other hand, disk.img files may have password protection against this kind of offline manipulation. What I was suggesting is to add encryption to the file header so that libguestfs has to authenticate in order to change the content. It would not be necessary to encrypt the whole disk.img file (that could be done by other tools), only the header. (Ignore me if .img files do not have any header, just raw data that the VM registers as a hard drive; if so, it’s the job of the filesystem to protect its integrity.)

  2. Why does PSE not stand for Pony Supreme Edition!?
