Tag Archives: c++

Which foreign function interface is the best?

I’ve written libguestfs language bindings for Perl, Python, Ruby, Java, OCaml, PHP, Haskell, Erlang and C#. But which of these is the best? Which is the easiest? What makes this hard? Grubbing around in the internals of a language reveals mistakes made by the language designers, but what are the worst mistakes?

Note: There is source that goes with this. Download libguestfs-1.13.13.tar.gz and look in the respective directories.

The best

It’s going to be a controversial choice, but in my opinion: C#. You just add some simple annotations to your functions and structs, and you can call into shared libraries (or “DllImport”s as Microsoft insisted on calling them) directly. It’s just about as easy as directly calling C and that is no simple achievement considering how the underlying runtime of C# is very different from C.

Example: a C struct:

[StructLayout (LayoutKind.Sequential)]
public class _int_bool {
  int i;
  int b;
}

The worst

There are two languages in the doghouse: Haskell and PHP. PHP first because their method of binding is just very broken. For example, 64 bit types aren’t possible on a 32 bit platform. It requires a very complex autoconf setup. And the quality of their implementation is very poor verging on broken — it makes me wonder if the rest of PHP can be this bad.

Haskell: even though I’m an experienced functional programmer and have done a fair bit of Haskell programming in the past, the FFI is deeply strange and very poorly documented. I simply could not work out how to return anything other than integers from my functions. You end up with bindings that look like this:

write_file h path content size = do
  r <- withCString path $ \path -> withCString content $ \content -> withForeignPtr h (\p -> c_write_file p path content (fromIntegral size))
  if (r == -1)
    then do
      err <- last_error h
      fail err
    else return ()

The middle tier

There’s not a lot to choose between OCaml, Ruby, Java and Erlang. For all of them: you write bindings in C, there’s good documentation, it’s a bit tedious but basically mechanical, and in 3 out of 4 you’re dealing with a reasonable garbage collector so you have to be aware of GC issues.

Erlang is slightly peculiar because the method I chose (out of many possible) is to write an external process that talks to the Erlang over stdin/stdout. But I can’t fault their documentation, and the rest of it is sensible.

Example: Here is a function binding in OCaml, but with mechanical changes this could be Ruby, Java or Erlang too:

CAMLprim value
ocaml_guestfs_add_drive_ro (value gv, value filenamev)
{
  CAMLparam2 (gv, filenamev);
  CAMLlocal1 (rv);

  guestfs_h *g = Guestfs_val (gv);
  if (g == NULL)
    ocaml_guestfs_raise_closed ("add_drive_ro");

  char *filename = guestfs_safe_strdup (g, String_val (filenamev));
  int r;

  caml_enter_blocking_section ();
  r = guestfs_add_drive_ro (g, filename);
  caml_leave_blocking_section ();
  free (filename);
  if (r == -1)
    ocaml_guestfs_raise_error (g, "add_drive_ro");

  rv = Val_unit;
  CAMLreturn (rv);
}

The ugly

Perl: Get reading. You’d better start with perlxs because Perl uses its own language — C with bizarre macros on top so your code looks like this:

SV *
is_config (g)
      guestfs_h *g;
PREINIT:
      int r;
   CODE:
      r = guestfs_is_config (g);
      if (r == -1)
        croak ("%s", guestfs_last_error (g));
      RETVAL = newSViv (r);
 OUTPUT:
      RETVAL

After that, get familiar with perlguts. Perl has only 3 structures and you’ll be using them a lot. There are some brilliant things about Perl which shouldn’t be overlooked, including POD which libguestfs uses to make effortless manual pages.

Python: Best described as half arsed. Rather like the language itself.

Python, Ruby, Erlang: If your language depends on “int”, “long”, “long long” without defining what those mean, and differing based on your C compiler and platform, then you’ve made a big mistake that will unfortunately dog you throughout the runtime, FFIs and the language itself. It’s better either to define them precisely (like Java) or to just use int32 and int64 (like OCaml).

And finally, reference counting (Perl, Python). It’s tremendously easy to make mistakes that are fiendishly difficult to track down. It’s a poor way to do GC and it indicates to me that the language designer didn’t know any better.

16 Comments

Filed under Uncategorized

Half-baked ideas: C: please let me free (static data)

For more half-baked ideas, see the ideas tag.

In modern C, you can free (NULL) safely. I’d like to suggest a small extension: let it be safe to free static data too.

This would avoid many cases where we end up with a “useless” strdup, as in the pseudo-code below:

freeable char *
type_to_string (union type u)
{
  switch (u.type) {
  case INT: return "int"; // strdup avoided
  case CHAR: return "char"; // strdup avoided
  case STRUCT:
    asprintf (&str, "struct %s",
              type_to_string (u.strct.t));
    return str;
  }
}

Notice that

free (type_to_string (u));

would work, whether the type u was INT, CHAR or STRUCT.

This feature is inspired by OCaml. In OCaml, the garbage collector keeps track of pages allocated on the OCaml heap versus everything else, in a hash table. It uses the hash table to avoid following pointers outside the OCaml heap. This gives you great flexibility: an OCaml “pointer” can point to arbitrary stuff if you want (eg. C-allocated structures, static data), and the GC just ignores it. This makes clever stuff like OCaml ancient possible.

A C implementation of this would also need to track which pages come from the C heap, and which pages are static data. It’s quite possible that most C malloc implementations can or do have this already. glibc certainly seems to be able to detect when you free up a non-heap pointer.

Eagle-eyed reviewers will have noted this would break “const correctness” (which is a very woolly and broken concept in C anyway, compared to functional languages). It needs a new class of pointers which can be freed but cannot be written to. I used freeable char * in the pseudo-code above.

3 Comments

Filed under Uncategorized

Tip: Code for getting DHCP address from a virtual machine disk image

Previously (1) and previously (2) I showed there are many different ways to get the IP address from a virtual machine.

The example below shows one way to use libguestfs and hivex from a C program (virt-dhcp-address) to get the DHCP address that a virtual machine has picked up.

Continue reading

7 Comments

Filed under Uncategorized

Tip: Using libguestfs from Perl

I translated the standard libguestfs examples (already available in C/C++, OCaml, Python, Ruby) into Perl.

If you want to call libguestfs from Perl, you have to use Sys::Guestfs.

All 300+ libguestfs API calls are available to all language bindings equally because we generate the bindings.

Leave a comment

Filed under Uncategorized

Changes ahead for libguestfs RHEL 6.1 package

I previously said that libguestfs in RHEL 6.1 would be based on the recent upstream 1.6 release.

This plans have had to change slightly. It looks like we’ll rebase to 1.7.16 (a development version).

The reason is simply that to get into the next release of RHEV we had to remove the Perl dependencies on a number of key programs, because the tiny RHEV-H hypervisor [PDF] doesn’t have space to include Perl. Several programs like virt-inspector and virt-df had to be rewritten in C. We could backport all of the changes but they amount to nearly every change since 1.6 anyway.

What I do have to do is to meticulously check each C program precisely matches the old Perl version, in terms of output, command line arguments and so on, so that scripts written against RHEL 6.0 won’t break. But that’s what you pay Red Hat for.

Preview packages will be available here.

3 Comments

Filed under Uncategorized

libguestfs examples in C, OCaml, Python and Ruby

I took two example C programs (one appeared on this blog earlier) and translated them precisely into OCaml, Python and Ruby:

  1. C/C++ example
  2. OCaml example
  3. Python example
  4. Ruby example

Leave a comment

Filed under Uncategorized

Example: using the libguestfs API from C

I spent about the last 18 hours contributing a section on libguestfs to the RHEL 6 Virtualization Guide. There is precisely zero in the guide at the moment on this subject, but hopefully there will be lots when RHEL 6.1 is released.

You might have forgotten that libguestfs is really a C library that just happens to have a lot of high level wrappers around it. But you can use the C API directly and (for a C API) it’s not too hard.

Below is the example that will appear in the Virtualization Guide next year. You can compile and run the program by saving it to a file test.c and typing:

gcc -Wall test.c -o test -lguestfs
./test

After you run it, there will be a disk image in the current directory called disk.img which you can view using guestfish:

guestfish -a disk.img -m /dev/sda1

For more information about the libguestfs C API, read the API overview section of the documentation.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <guestfs.h>

int                                                                            
main (int argc, char *argv[])
{
  guestfs_h *g;
  size_t i;

  g = guestfs_create ();
  if (g == NULL) {
    perror ("failed to create libguestfs handle");
    exit (EXIT_FAILURE);
 }

  /* Create a raw-format sparse disk image, 512 MB in size. */
  int fd = open ("disk.img", O_CREAT|O_WRONLY|O_TRUNC|O_NOCTTY, 0666);
  if (fd == -1) {
    perror ("disk.img");
    exit (EXIT_FAILURE);
  }
  if (ftruncate (fd, 512 * 1024 * 1024) == -1) {
    perror ("disk.img: truncate");
    exit (EXIT_FAILURE);
  }
  if (close (fd) == -1) {
    perror ("disk.img: close");
    exit (EXIT_FAILURE);
  }

  /* Set the trace flag so that we can see each libguestfs call. */
  guestfs_set_trace (g, 1);

  /* Set the autosync flag so that the disk will be synchronized
   * automatically when the libguestfs handle is closed.
   */
  guestfs_set_autosync (g, 1);

  /* Add the disk image to libguestfs. */
  if (guestfs_add_drive_opts (g, "disk.img",
        GUESTFS_ADD_DRIVE_OPTS_FORMAT, "raw", /* raw format */
        GUESTFS_ADD_DRIVE_OPTS_READONLY, 0, /* for write */
        -1) /* this marks end of optional arguments */
      == -1)
    exit (EXIT_FAILURE);

  /* Run the libguestfs back-end. */
  if (guestfs_launch (g) == -1)
    exit (EXIT_FAILURE);

  /* Get the list of devices.  Because we only added one drive
   * above, we expect that this list should contain a single
   * element.
   */
  char **devices = guestfs_list_devices (g);
  if (devices == NULL)
    exit (EXIT_FAILURE);
  if (devices[0] == NULL || devices[1] != NULL) {
    fprintf (stderr, "error: expected a single device from list-devices\n");
    exit (EXIT_FAILURE);
  }

  /* Partition the disk as one single MBR partition. */
  if (guestfs_part_disk (g, devices[0], "mbr") == -1)
    exit (EXIT_FAILURE);

  /* Get the list of partitions.  We expect a single element, which
   * is the partition we have just created.
   */
  char **partitions = guestfs_list_partitions (g);
  if (partitions == NULL)
    exit (EXIT_FAILURE);
  if (partitions[0] == NULL || partitions[1] != NULL) {
    fprintf (stderr, "error: expected a single partition from list-partitions\n");
    exit (EXIT_FAILURE);
  }

  /* Create a filesystem on the partition. */
  if (guestfs_mkfs (g, "ext4", partitions[0]) == -1)
    exit (EXIT_FAILURE);

  /* Now mount the filesystem so that we can add files. */
  if (guestfs_mount_options (g, "", partitions[0], "/") == -1)
    exit (EXIT_FAILURE);

  /* Create some files and directories. */
  if (guestfs_touch (g, "/empty") == -1)
    exit (EXIT_FAILURE);
  const char *message = "Hello, world\n";
  if (guestfs_write (g, "/hello", message, strlen (message)) == -1)
    exit (EXIT_FAILURE);
  if (guestfs_mkdir (g, "/foo") == -1)
    exit (EXIT_FAILURE);

  /* This one uploads the local file /etc/resolv.conf into the disk image. */
  if (guestfs_upload (g, "/etc/resolv.conf", "/foo/resolv.conf") == -1)
    exit (EXIT_FAILURE);

  /* Because 'autosync' was set (above) we can just close the handle
   * and the disk contents will be synchronized.  You can also do
   * this manually by calling guestfs_umount_all and guestfs_sync.
   */
  guestfs_close (g);

  /* Free up the lists. */
  for (i = 0; devices[i] != NULL; ++i)
    free (devices[i]);
  free (devices);
  for (i = 0; partitions[i] != NULL; ++i)
    free (partitions[i]);
  free (partitions);

  exit (EXIT_SUCCESS);
}

Leave a comment

Filed under Uncategorized