Tag Archives: python

New in nbdkit: Run nbdkit as a captive process

As of nbdkit 1.1.6, you can run nbdkit as a “captive process” under external programs such as qemu or guestfish. nbdkit runs for as long as qemu or guestfish is running; when that program exits, nbdkit cleans up and exits too.

Here is a rather involved way to boot a Fedora 20 guest:

$ virt-builder fedora-20
$ nbdkit file file=fedora-20.img \
    --run 'qemu-kvm -m 1024 -drive file=$nbd,if=virtio'

The --run parameter is what tells nbdkit to run as a captive under qemu-kvm. $nbd on the qemu command line is substituted automatically with the right nbd: URL for the port or socket that nbdkit listens on. As soon as qemu-kvm exits, nbdkit is killed and cleaned up.
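If you want to drive this from a script rather than a shell, here is a minimal Python sketch. It only uses the command line shown above; the helper names captive_argv and run_captive are made up for illustration. The point it shows is that the whole --run command travels as a single argument, so $nbd reaches nbdkit unexpanded (in the shell version, the single quotes do that job).

```python
import subprocess

def captive_argv(image, command):
    """Build the argv that runs `command` with nbdkit as a captive server.

    `command` stays one argument, so "$nbd" is left for nbdkit to
    substitute. No shell is involved on the Python side.
    """
    return ["nbdkit", "file", "file=" + image, "--run", command]

def run_captive(image, command):
    """Run nbdkit plus the captive command and wait for them to finish."""
    return subprocess.call(captive_argv(image, command))

# Example (needs nbdkit and guestfish installed):
# run_captive("fedora-20.img", "guestfish --format=raw -a $nbd -i")
```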

Here is another example using guestfish:

$ nbdkit file file=fedora-20.img \
    --run 'guestfish --format=raw -a $nbd -i'

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

Operating system: Fedora release 20 (Heisenbug)
/dev/sda3 mounted on /
/dev/sda1 mounted on /boot


The main use for this is not the nbdkit file plugin shown here, but the Perl and Python plugins, which let people easily open and edit OpenStack Glance/Cinder and other unconventional disk images.


Filed under Uncategorized

New in nbdkit: Write plugins in Python

nbdkit is a permissively licensed Network Block Device server that lets you export “unusual” disk sources to qemu and libguestfs.

New in nbdkit 1.1.5, you can write plugins using Python. Here is an example.
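To give a flavour of what such a plugin looks like, here is a minimal sketch of a 1 MiB RAM disk. It assumes the original callback signatures of the Python plugin (open, get_size, pread and pwrite taking the arguments shown); later nbdkit releases also offer a newer variant of these signatures, so check nbdkit-python-plugin(3) for the version you have.

```python
# example.py: a 1 MiB RAM disk for the nbdkit Python plugin.
# Assumption: the original (version 1) callback signatures.

disk = bytearray(1024 * 1024)

def open(readonly):
    # Called when a client connects. The return value is the handle
    # passed to the other callbacks; this plugin keeps no per-client state.
    return 1

def get_size(h):
    # Size of the exported disk in bytes.
    return len(disk)

def pread(h, count, offset):
    # Return `count` bytes starting at `offset`.
    return disk[offset:offset + count]

def pwrite(h, buf, offset):
    # Store `buf` at `offset`.
    disk[offset:offset + len(buf)] = buf
```

At the time, a script like this was loaded with something along the lines of nbdkit python script=example.py; the exact invocation depends on your nbdkit version.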


Get kernel and initramfs from a disk image

This script, a response to the insecure and over-complex disk-image-get-kernel in OpenStack, shows how to use libguestfs to safely and easily get the kernel and initramfs from a disk image so you can boot it using an external kernel.

# Get latest kernel & initramfs safely from a disk image.

# Note this will overwrite /tmp/kernel & /tmp/initramfs which you
# wouldn't want to do in production.

import sys
import guestfs

assert (len (sys.argv) == 2)
disk = sys.argv[1]

g = guestfs.GuestFS (python_return_dict=True)

# To enable tracing, uncomment the next line.
#g.trace (1)

# Attach the disk image read-only to libguestfs.
g.add_drive_opts (disk, readonly=1)

# Run the libguestfs back-end.
g.launch ()

# Ask libguestfs to inspect for operating systems.
roots = g.inspect_os ()
if len (roots) == 0:
    raise RuntimeError ("no operating systems found")
if len (roots) > 1:
    raise RuntimeError ("dual/multi-boot images are not supported")

root = roots[0]

# Mount up the disks, like guestfish -i.
# Sort keys by length, shortest first, so that we end up
# mounting the filesystems in the correct order.
# mps maps each mountpoint (eg. "/boot") to its device.
mps = g.inspect_get_mountpoints (root)
for mountpoint in sorted (mps.keys (), key=len):
    try:
        g.mount_ro (mps[mountpoint], mountpoint)
    except RuntimeError as msg:
        print ("%s (ignored)" % msg)

# For debugging:
print ("/boot directory of this guest:")
print (g.ll ("/boot"))

# Get all kernels & initramfses.
kernels = g.glob_expand ("/boot/vmlinuz-*")
initramfses = g.glob_expand ("/boot/initramfs-*")

# Old RHEL:
if len (initramfses) == 0:
    initramfses = g.glob_expand ("/boot/initrd-*")

# Debian/Ubuntu:
if len (initramfses) == 0:
    initramfses = g.glob_expand ("/boot/initrd.img-*")

if len (kernels) == 0:
    raise RuntimeError ("no kernel found in this disk image")
if len (initramfses) == 0:
    raise RuntimeError ("no initramfs found in this disk image")

# Sort by version so we get the latest.
from distutils.version import LooseVersion
kernels.sort (key=LooseVersion)
initramfses.sort (key=LooseVersion)

# Download the latest.
print ("downloading %s -> /tmp/kernel" % kernels[-1])
g.download (kernels[-1], "/tmp/kernel")
print ("downloading %s -> /tmp/initramfs" % initramfses[-1])
g.download (initramfses[-1], "/tmp/initramfs")

# Shutdown.
g.shutdown ()
g.close ()
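One caveat about the "sort by version" step: distutils was deprecated by PEP 632 and removed in Python 3.12, so LooseVersion is not available there. Here is a minimal sketch of a replacement sort key. The name version_key and the sample kernel paths are made up for illustration, and it is not a drop-in clone of LooseVersion, just a numeric-aware ordering that is good enough for picking the newest file.

```python
import re

def version_key(path):
    """Sort key that compares runs of digits numerically.

    Each chunk becomes (0, int) or (1, str), so integers and strings
    are never compared directly with each other.
    """
    chunks = re.findall(r"[0-9]+|[^0-9]+", path)
    return [(0, int(c)) if "0" <= c[0] <= "9" else (1, c) for c in chunks]

# Hypothetical file names, as glob_expand might return them.
kernels = [
    "/boot/vmlinuz-3.11.10-301.fc20.x86_64",
    "/boot/vmlinuz-3.9.5-301.fc19.x86_64",
    "/boot/vmlinuz-3.11.2-201.fc19.x86_64",
]
kernels.sort(key=version_key)
latest = kernels[-1]
print(latest)
```

A plain string sort would put 3.9.5 last here, which is exactly the mistake the version-aware key avoids.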


Using libguestfs remotely with Python and rpyc

libguestfs has high-quality Python bindings. Using rpyc, you can make a remote libguestfs server with almost no effort at all.

Firstly start an rpyc server:

$ /usr/lib/python2.7/site-packages/rpyc/servers/classic_server.py
[SLAVE      INFO       13:21:17 tid=140019939981120] server started on
[SLAVE      INFO       13:21:17 tid=140019784894208] started background auto-register thread (interval = 60)
[REGCLNT    INFO       13:21:17] registering on
[REGCLNT    WARNING    13:21:19] no registry acknowledged

Now, possibly from the same machine or some other machine, you can connect to this server and use Python objects remotely as if they were local:

$ python
Python 2.7.3 (default, Aug  9 2012, 17:23:57) 
[GCC 4.7.1 20120720 (Red Hat 4.7.1-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpyc
>>> c = rpyc.classic.connect('localhost')

You can now create a libguestfs handle, following the example here.

>>> g = c.modules.guestfs.GuestFS()
>>> g.version()
{'release': 36L, 'major': 1L, 'minor': 21L, 'extra': 'fedora=20,release=1.fc20,libvirt'}
>>> g.add_drive('/dev/fedora/f18x64', readonly=True)
>>> g.launch()
>>> roots = g.inspect_os()
>>> g.inspect_get_product_name(roots[0])
'Fedora release 18 (Spherical Cow)'
>>> g.inspect_get_mountpoints(roots[0])
[('/', '/dev/mapper/fedora-root'), ('/boot', '/dev/sda1')]

As you can see, the g object is transparently remoted without you needing to do anything.
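The same session can be wrapped up as a function. This is only a sketch: remote_product_names is a made-up name, and conn is assumed to be an rpyc classic connection such as the c created above. Every guestfs call below runs on the server; only the resulting strings come back.

```python
def remote_product_names(conn, disk):
    """Inspect `disk` on the machine at the other end of `conn`.

    `conn` is assumed to be an rpyc classic connection, e.g. the
    result of rpyc.classic.connect('localhost').
    """
    g = conn.modules.guestfs.GuestFS()   # handle lives on the server
    try:
        g.add_drive(disk, readonly=True)
        g.launch()
        return [g.inspect_get_product_name(root)
                for root in g.inspect_os()]
    finally:
        g.close()                        # always release the remote handle

# Example (needs a running rpyc classic server with guestfs installed):
# import rpyc
# c = rpyc.classic.connect('localhost')
# print(remote_product_names(c, '/dev/fedora/f18x64'))
```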


Which foreign function interface is the best?

I’ve written libguestfs language bindings for Perl, Python, Ruby, Java, OCaml, PHP, Haskell, Erlang and C#. But which of these is the best? Which is the easiest? What makes this hard? Grubbing around in the internals of a language reveals mistakes made by the language designers, but what are the worst mistakes?

Note: There is source that goes with this. Download libguestfs-1.13.13.tar.gz and look in the respective directories.

The best

It’s going to be a controversial choice, but in my opinion: C#. You just add some simple annotations to your functions and structs, and you can call into shared libraries (or “DllImport”s as Microsoft insisted on calling them) directly. It’s just about as easy as directly calling C and that is no simple achievement considering how the underlying runtime of C# is very different from C.

Example: a C struct:

[StructLayout (LayoutKind.Sequential)]
public class _int_bool {
  int i;
  int b;
}

The worst

There are two languages in the doghouse: Haskell and PHP. PHP first because their method of binding is just very broken. For example, 64 bit types aren’t possible on a 32 bit platform. It requires a very complex autoconf setup. And the quality of their implementation is very poor verging on broken — it makes me wonder if the rest of PHP can be this bad.

Haskell: even though I’m an experienced functional programmer and have done a fair bit of Haskell programming in the past, the FFI is deeply strange and very poorly documented. I simply could not work out how to return anything other than integers from my functions. You end up with bindings that look like this:

write_file h path content size = do
  r <- withCString path $ \path -> withCString content $ \content -> withForeignPtr h (\p -> c_write_file p path content (fromIntegral size))
  if (r == -1)
    then do
      err <- last_error h
      fail err
    else return ()

The middle tier

There’s not a lot to choose between OCaml, Ruby, Java and Erlang. For all of them: you write bindings in C, there’s good documentation, it’s a bit tedious but basically mechanical, and in 3 out of 4 you’re dealing with a reasonable garbage collector so you have to be aware of GC issues.

Erlang is slightly peculiar because the method I chose (out of many possible) is to write an external process that talks to Erlang over stdin/stdout. But I can’t fault their documentation, and the rest of it is sensible.

Example: Here is a function binding in OCaml, but with mechanical changes this could be Ruby, Java or Erlang too:

CAMLprim value
ocaml_guestfs_add_drive_ro (value gv, value filenamev)
{
  CAMLparam2 (gv, filenamev);
  CAMLlocal1 (rv);

  guestfs_h *g = Guestfs_val (gv);
  if (g == NULL)
    ocaml_guestfs_raise_closed ("add_drive_ro");

  char *filename = guestfs_safe_strdup (g, String_val (filenamev));
  int r;

  caml_enter_blocking_section ();
  r = guestfs_add_drive_ro (g, filename);
  caml_leave_blocking_section ();
  free (filename);
  if (r == -1)
    ocaml_guestfs_raise_error (g, "add_drive_ro");

  rv = Val_unit;
  CAMLreturn (rv);
}

The ugly

Perl: Get reading. You’d better start with perlxs because Perl uses its own language — C with bizarre macros on top so your code looks like this:

SV *
is_config (g)
      guestfs_h *g;
PREINIT:
      int r;
   CODE:
      r = guestfs_is_config (g);
      if (r == -1)
        croak ("%s", guestfs_last_error (g));
      RETVAL = newSViv (r);
 OUTPUT:
      RETVAL

After that, get familiar with perlguts. Perl has only 3 structures and you’ll be using them a lot. There are some brilliant things about Perl which shouldn’t be overlooked, including POD which libguestfs uses to make effortless manual pages.

Python: Best described as half arsed. Rather like the language itself.

Python, Ruby, Erlang: If your language depends on “int”, “long”, “long long” without defining what those mean, and differing based on your C compiler and platform, then you’ve made a big mistake that will unfortunately dog you throughout the runtime, FFIs and the language itself. It’s better either to define them precisely (like Java) or to just use int32 and int64 (like OCaml).
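You can see the problem from Python itself with the standard ctypes module. This small sketch just prints the width of each C integer type on whatever machine runs it. For instance, "long" is 8 bytes on 64-bit Linux but 4 bytes on 64-bit Windows, whereas the fixed-width types never change.

```python
import ctypes

# Width in bytes of each C integer type on the machine running this.
platform_dependent = {
    name: ctypes.sizeof(ctype)
    for name, ctype in [("int", ctypes.c_int),
                        ("long", ctypes.c_long),
                        ("long long", ctypes.c_longlong)]
}

# Fixed-width types are the same everywhere.
fixed_width = {
    "int32": ctypes.sizeof(ctypes.c_int32),   # always 4
    "int64": ctypes.sizeof(ctypes.c_int64),   # always 8
}

print(platform_dependent)
print(fixed_width)
```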

And finally, reference counting (Perl, Python). It’s tremendously easy to make mistakes that are fiendishly difficult to track down. It’s a poor way to do GC and it indicates to me that the language designer didn’t know any better.
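To make the bookkeeping concrete, here is a small sketch that watches a reference count from inside Python. It assumes a standard CPython build and only looks at differences between counts, since the absolute numbers are an implementation detail. In a C binding, every one of these increments and decrements is a Py_INCREF or Py_DECREF you have to write by hand, and missing one gives you either a leak or a crash somewhere far from the actual mistake.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

alias = obj                      # one more reference to the same object
with_alias = sys.getrefcount(obj)

del alias                        # that reference is dropped again
after = sys.getrefcount(obj)

# Differences relative to the starting count.
print(with_alias - before, after - before)
```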



What I learned about AMQP

I’m playing with AMQP at the moment. I thought I’d start with RabbitMQ.

The good:

  • It works.
  • OCaml and Python programs can talk to each other.
  • It works across remote hosts. You need to open port 5672/tcp on the firewall.

The bad:

  • RabbitMQ and Apache Qpid use different versions of AMQP and are not interoperable! Good summary of the mess here. This might be resolved when everyone gets around to supporting AMQP 1-0, but even though that standard has been published, no one is expecting interop to happen for at least a year.
  • You can’t cluster different versions of the RabbitMQ broker together.
  • Even if all your hosts are at the same RabbitMQ version, you have to open more firewall ports and make changes to the start-up scripts. (Dynamic ports? Really? Did we learn nothing from NFS?)
  • Long, obscure Erlang error messages which don’t point to the problem. For example, you’ll get a good 25 lines of error message if another process is already bound to a port.
  • Possibly just a Fedora packaging problem: I managed to get my host into some state where it’s impossible to stop the RabbitMQ server except by kill -9, and after that I can’t start or stop it.



Tip: Using libguestfs from Perl

I translated the standard libguestfs examples (already available in C/C++, OCaml, Python, Ruby) into Perl.

If you want to call libguestfs from Perl, you have to use Sys::Guestfs.

All 300+ libguestfs API calls are available to all language bindings equally because we generate the bindings.
