NBD is a high-performance protocol for exporting disks between processes and machines. We use it as a kind of “universal connector” for connecting hypervisors with data sources, and Eric Blake and I previously wrote a general-purpose NBD server called nbdkit. (If you’re interested in the topic of nbdkit as a universal connector, watch my FOSDEM talk.)
Up until now our NBD client has been qemu or one of the qemu tools like qemu-img. That was fine if you wanted to expose a disk source as a running virtual machine (i.e. running it with qemu), or if you wanted to perform one of the limited copying operations that qemu-img convert can do, but there were many cases where it would have been nice to have a general client library.
For example, I started to add NBD support to Jens Axboe’s fio. Lacking a client library, I synthesized NBD request packets as C structs and sent them on the wire using low-level socket commands. The performance was, to put it bluntly, crap.
Although NBD is a very simple protocol and you can write it by hand, it would be nicer to have a library wrap the low-level stuff, and that’s why we have written libnbd (downloads).
Getting reasonable performance from NBD requires a few tricks:
- You must issue as many commands as possible “in flight” (the server may reply to them out of order, but requests and replies are tied together by a unique ID).
- You may need to open multiple connections to the server, but doing that requires attention to the special MULTI_CONN flag which the server will use to indicate that this is safe.
- Most crucially you must disable Nagle’s algorithm.
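For a hand-written client, the last point comes down to a single socket option. Here is a minimal sketch using Python's standard socket module; the helper name `make_low_latency_socket` is made up for illustration, and this is not libnbd code:

```python
import socket

def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small writes immediately
    # instead of coalescing them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

sock = make_low_latency_socket()
# Read the option back to confirm that it took effect (non-zero means enabled).
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
```

NBD requests are small packets, so without this option each one can sit in the kernel waiting to be batched, which adds latency to every command.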
This isn’t an exhaustive list. In fact, while writing libnbd over about 3 weeks we improved performance by a factor of more than 15, just by paying attention to system calls, maximizing parallelism and minimizing latency. One advantage of libnbd is that it encodes all this knowledge in an easy-to-use library so NBD clients won’t have to reinvent it in future.
The library has a simple high-level synchronous API which works how you would expect (but doesn’t get the best performance). A typical program might look like:
```c
struct nbd_handle *nbd;
int64_t exportsize;
char buf[512];

nbd = nbd_create ();
if (!nbd) goto error;
if (nbd_connect_tcp (nbd, "localhost", "nbd") == -1)
  goto error;
exportsize = nbd_get_size (nbd);
if (nbd_pread (nbd, buf, sizeof buf, 0, 0) == -1) {
error:
  fprintf (stderr, "%s\n", nbd_get_error ());
}
```
To get the best performance you have to use the more low-level asynchronous API which allows you to queue up commands and bring your own main loop.
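The idea of queueing commands and matching replies by ID can be sketched without any NBD code at all. The following is a toy model only: `ToyServer`, `read_all` and the `max_in_flight` parameter are invented for illustration and are not part of libnbd's API. It keeps several reads outstanding at once and uses each reply's ID to work out where the data belongs, whatever order the replies arrive in.

```python
import random

class ToyServer:
    """Hypothetical stand-in for a server that answers requests out of order."""
    def __init__(self, data):
        self.data = data
        self.pending = []

    def submit(self, cookie, offset, length):
        # Queue a read request tagged with the caller's unique ID.
        self.pending.append((cookie, offset, length))

    def next_reply(self):
        # Answer an arbitrary pending request, not necessarily the oldest.
        i = random.randrange(len(self.pending))
        cookie, offset, length = self.pending.pop(i)
        return cookie, self.data[offset:offset + length]

def read_all(server, size, block=4, max_in_flight=3):
    """Read `size` bytes, keeping up to `max_in_flight` requests outstanding."""
    in_flight = {}              # unique ID -> offset of that request
    out = bytearray(size)
    next_offset = 0
    next_cookie = 1
    while next_offset < size or in_flight:
        # Top up the queue before waiting for any reply.
        while next_offset < size and len(in_flight) < max_in_flight:
            length = min(block, size - next_offset)
            server.submit(next_cookie, next_offset, length)
            in_flight[next_cookie] = next_offset
            next_cookie += 1
            next_offset += length
        # Handle one reply; its ID tells us which request it answers.
        cookie, payload = server.next_reply()
        offset = in_flight.pop(cookie)
        out[offset:offset + len(payload)] = payload
    return bytes(out)

disk = bytes(range(26))
result = read_all(ToyServer(disk), len(disk))
```

A real client would do the same bookkeeping over a socket inside its own main loop, which is the part the asynchronous API is designed to plug into.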
There are also bindings in OCaml and Python (and Rust, soon). There’s also a nice little shell written in Python so you can access NBD servers interactively:
```
$ nbdsh
nbd> h.connect_command (["nbdkit", "-s", "memory", "1M"])
nbd> print ("%r" % h.get_size ())
1048576
nbd> h.pwrite (b"12345", 0)
nbd> h.pread (5, 0)
b'12345'
```
libnbd and the shell, nbdsh, are available now in Fedora 29 and above.