Tag Archives: programming

How to edit a qcow2 file from C

Suppose you want to edit or read or write the data inside a qcow2 file? One way is to use libguestfs, and that’s the recommended way if you need to mount a filesystem inside the file.

But for accessing the data blocks alone, you can now use the libnbd API and qemu-nbd together and this has a couple of advantages: It’s faster and you can open snapshots (which libguestfs cannot do).

We start by creating a libnbd handle and connecting it to a qemu-nbd instance. The qemu-nbd instance is linked with qemu’s internal drivers that know how to read and write qcow2.

  const char *filename;
  struct nbd_handle *nbd;

  nbd = nbd_create ();
  if (nbd == NULL) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);

  char *args[] = {
    "qemu-nbd", "-f", "qcow2",
    // "-s", snapshot,
    (char *) filename,
  if (nbd_connect_systemd_socket_activation (nbd, args) == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);

Now you can get the virtual size:

  int64_t size = nbd_get_size (nbd);
  printf ("virtual size = %" PRIi64 "\n", size);

Or read and write sectors from the file:

  if (nbd_pread (nbd, buf, sizeof buf, 0, 0) == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);

Once you’re done with the file, call nbd_close on the handle which will also shut down the qemu-nbd process.

A complete example can be found here.

1 Comment

Filed under Uncategorized

NBD’s state machine


Eric and I are writing a Linux NBD client library. There were lots of requirements but the central one for this post is it has to be a library callable from programs written in C and other programming languages (Python, OCaml and Rust being important), and we don’t control those programs so they may be single or multithreaded, or may use non-blocking main loops like gio and glib.

An NBD command involves sending a request over a socket to a remote server and receiving a reply. You can also have multiple requests “in flight” and the reply can be received in multiple parts. On top of this the “fixed newstyle” NBD protocol has a complex multi-step initial handshake. Complicating it further we might be using a TLS transport which has its own handshake.

It’s complicated and we mustn’t block the caller.

There are a few ways to deal with this in my experience — one is to ignore the problem and insist that the main program uses a thread for each NBD connection, but that pushes complexity onto someone else. Another way is to use some variation of coroutines or call/cc — if we get to a place where we would block then we save the stack, return to the caller, and have some way to restore the stack later. However this doesn’t necessarily work well with non-C programming languages. It likely won’t work with either OCaml or Ruby’s garbage collectors since they both involve stack walking to find GC roots. I’d generally want to avoid “tricksy” stuff in a library.

The final way that I know about is to implement a state machine. However large state machines are hellishly difficult to write. Our state machine has 75 states (so far — it’s nowhere near finished). So we need a lot of help.

I came up with a slightly nicer way to write state machines.

The first idea is that states in a large state machine could be grouped. You can consider each group like a mini state machine — it has its own namespace, lives in a single file (example), and may only be entered via a single START state (so you don’t need to consider what happens if another part of the state machine jumps into the middle of the group).

Secondly groups can be hierarchical. This lets us organise the groups logically, so for example there is a group for “fixed newstyle handshaking” and within that there are separate groups for negotiating each newstyle option. States can refer to each other using either relative or absolute paths in this hierarchy.

Thirdly all states and transitions are defined and checked in a single place, allowing us to enforce rules about what transitions are permitted.

Fourthly the final C code that implements the state machine is generated (mostly). This lets us generate helper functions to (eg) turn state transitions into debug messages, or return whether the connection is in a mode where it’s expecting to read or write from the socket (making it easier to integrate with main loops).

The final code looks like this and generates currently 173K lines of C (although as it’s mostly large switch statements it compiles down to a reasonably small size).

Has anyone else implemented a large state machine in a similar way?


Filed under Uncategorized

Writing a Planner to solve a tricky programming optimization problem

Suppose a monkey is in one corner of a room, a box is in another corner of a room, and a banana is hanging from the ceiling in the middle of the room. The monkey can’t reach the banana without standing on the box, but he first has to move the box under the banana. The problem of how to get a computer to work out that the monkey has to move the box first, then climb on the box second, was solved by Nils Nilsson’s STRIPS system in 1971. STRIPS is now an A.I. standard, and is used in game A.I. and elsewhere.

Suppose you have a disk image template that you want to uncompress, convert to another format, and resize. You can run xzcat, followed by qemu-img convert followed by virt-resize. But virt-resize can also do format conversion, so you don’t need to run qemu-img convert. Unless the user was happy with the original size, in which case qemu-img convert is faster than virt-resize. But what if the original template isn’t compressed and is already in the right format and size? You can just run cp.

How can a computer work out the right sequence of steps to convert the disk image most efficiently? Virt-builder has exactly this problem, and it solves it using a STRIPS-inspired planner.

The STRIPS planner in virt-builder is only 50 lines of code, was easy to write, finds the near optimal plan for almost any user input, and is a useful technique that can be applied to many programming problems. This article will explain how it works. I have changed some of the academic terms and simplified things to make this easier to understand.

First of all I’ll introduce tags on the original template. These define the state of that template:

Input tags: ✚xz ✚template ✚size=4G ✚format=raw

Secondly I’ll set up my goal state:

Goal tags: ❌xz ❌template ✚size=4G ✚format=qcow2

where means the tag MUST NOT exist in the final state.

I want my planner to find me the best path from my input state to my goal. As it can’t go straight from the input to the goal in one step, I have to tell the planner what transitions are possible, using a function:

transitions (input_tags) {
  if ✚xz then {
    you could run 'xzcat'
        which will ❌xz and ❌template;
  else /* no xz tag */ {
    you could run 'virt-resize'
       which will change ✚format and ✚size, and ❌template;
    you could run 'qemu-img convert'
       which will change ✚format, and ❌template;

  you could run 'cp'
      which will ❌template;

Notice that the transitions function returns a list of all possible transitions from the input state. It’s not judgemental about which one should be taken, although it won’t return impossible transitions (for example, running virt-resize is not possible on xz-compressed files). The actual transitions function also returns a weight for each transition, so that the planner can choose the least expensive plan if there are several plans possible.

The ✚template tag may appear a bit mysterious. It’s there to make sure that the planner always copies the original template, even if the original template already has the desired goal size and format. Since xzcat, virt-resize and qemu-img convert always copy the disk image, they drop the template tag (❌template).

The transitions function in virt-builder can be found here.

The planner does a breadth-first search over the tree of transitions, starting with the input state, finishing when it finds any branch that satisfies the output goals, or when it reaches a maximum depth in which case it gives up (and the user sees an error message).

The planner in virt-builder (50 lines of code) can be found here.

If the planner finds several paths that satisfy the goals, the planner chooses the one with the smallest weight. However my planner is not clever enough to look deeper in the tree to see if a longer path might have a smaller weight (it’s not very likely in virt-builder).

Also my planner is not smart enough to prune bogus paths. For example, if a path runs cp in adjacent steps, then that path should be pruned.

Nevertheless the planner always gets the right result, and it is considerably simpler than the original hand-written code. The old code had become unmaintainable and wasn’t even efficient: it sometimes made unnecessary copies in order to make the code simpler, wasting end-user time. Because of the ease of maintenance I was able to add new functionality: virt-builder can now run qemu-img resize to expand a disk by < 256 MB, a case where virt-resize doesn’t work (previously the user would have got an error).

Applying old academic techniques like this one doesn’t need to be hard and can help with real world problems. I hope this technique helps others with similar optimization problems.

Edit: The Hacker News discussion includes links to alternative solving tools.


Filed under Uncategorized

Arrrgghh writing GUI programs is tedious

Last week I thought it would be a good idea to write a small GUI front end to virt-resize.

After two days, I nearly have the first tab (out of four) working.

Granted, maybe the first tab is the hardest one:

The job of the first tab is to ask the user for the source (disk image or libvirt guest). It then fires off a background job to open that guest and inspect it for operating systems. Based on the outcome of that (opened OK, found OSes, no OSes found, etc) it has to update and enable the other tabs.

Also the user can preselect a guest on the command line. We also have to deal with connecting to libvirt asynchronously to read the list of guests (and this might also fail in a number of ways).

So far, 1600 lines of code, and the first tab is by no means complete.

One part of the problem is there’s a certain “impedance mismatch” between functional programming in OCaml and writing Gtk. Gtk is heavily mutable and object based. OCaml prefers (but does not require) immutability, and objects in OCaml are obscure and not widely used, and the Gtk bindings are written in a very hard-core object OCaml style.

Another part is just that it’s tedious. It would be tedious if I was doing this in C or Python too. You’ve got asynchronous actions going off here and there which update bits of state. Every control or input could potentially affect every other control or output, resulting in a kind of O(n2) mess of wiring up signals and callbacks.

Is there an easier way? I don’t know …


Filed under Uncategorized

Excellent (anti-)C++ rant

“If you’ve allocated some memory for an object, and then you throw an exception, you don’t have that pointer anymore, and because C++ doesn’t have garbage collection you’ve just leaked memory. The only way around this is to implement garbage collection yourself; C++ weenies call this ‘RAII’ and if they’re really far down the rabbit hole they sometimes don’t even realize that it’s just them implementing shitty reference-counting garbage collection.”


Filed under Uncategorized

Learning: CRM114

Read this introduction to CRM114 (a programming / scripting language). It’s an interesting take on string matching and document classification.

Edit: CRM114 Revealed book (PDF).

Leave a comment

Filed under Uncategorized

JONESFORTH git repository

A few years ago I wrote a literate FORTH compiler and tutorial called JONESFORTH. It’s a good way, I think, to understand the power and limitations of FORTH, and a good way to learn a completely different and mind-blowing programming language.

If you’ve not heard of FORTH before, cogitate on this: It is possible to write a FORTH program in 2,000 lines. A program which will boot and provide an entire development environment (inc. editor, compiler etc) on bare hardware.

Anyhow, I just uploaded my semi-private CVS repository to git. You can find it here:


The original tutorial is in two parts:


Filed under Uncategorized

Xavierbot 0.9

Xavierbot is an OCaml IRC bot that we sometimes use on the Freenode #ocaml channel for teaching OCaml to new users. I say “sometimes” because some channel users have objected to having a bot in the channel, to which I say they should send the xavierbot, shut up command.

OCaml is of course the fantastic expressive/safe programming language which is the main reason why I’m very cynical in person about programming as a profession. You’re still using C/C++/Perl/Python/Ruby/Java/…? Why have we not left the programming equivalent of the 1970s yet?

There’s a new version of Xavierbot here: http://people.redhat.com/~rjones/xavierbot/


Filed under Uncategorized

Half-baked ideas: View source button for Fedora

For more half-baked ideas, see my ideas tag.

I’ll mention first that this isn’t my idea, and it’s not new or original. OLPC already implemented a View Source button.

Why can’t we have the same for Fedora? This is how it would work …

The View Source button would hover in your task bar. When pressed it opens up this dialog:

View source dialog

Like xkill and xwininfo, pressing the “point at a window” button changes your mouse so you click on the program you want to view the source of. X sort of makes it possible to find out (with a bit of effort) which binary is behind each program (see for example the xprop command).

You do an rpm -qf on this binary (or use a yum search) to locate the source.

Use yumdownloader --source to download the source. Unpack it into a standard rpmbuild location, and open up the user’s preferred editor.

With experience, and many custom rules and heuristics, you can extend this idea. For example, if they pointed at a dialog box, search the source for strings from the dialog box to try to locate the exact lines of code.

Or have some debuginfo-like metadata packages which are generated when packages are built, to allow very precise file/line locations to be determined.

Combine the whole thing with LXR so we can browse source intuitively.

This is a great way to encourage contributions to Fedora and Free software in general, because I think it would really make code much more accessible to casual programmers, tinkerers and children. Even experienced programmers would find it useful when tracking down bugs in random applications.


Filed under Uncategorized