Loop mount an S3 or Ceph object

This is a fun, small nbdkit Python plugin using the Boto3 AWS SDK:

#!/usr/sbin/nbdkit python

import nbdkit
import boto3
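from contextlib import closing

# Use version 2 of the Python plugin API, in which pread() is passed a
# buffer to fill.
API_VERSION = 2

# Plugin parameters, set from the command line by config().
access_key = None
secret_key = None
endpoint_url = None
bucket_name = None
key_name = None
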
def thread_model():
    return nbdkit.THREAD_MODEL_PARALLEL

def config(key, value):
    global access_key, secret_key, endpoint_url, bucket_name, key_name

    if key == "access-key" or key == "access_key":
        access_key = value
    elif key == "secret-key" or key == "secret_key":
        secret_key = value
    elif key == "endpoint-url" or key == "endpoint_url":
        endpoint_url = value
    elif key == "bucket":
        bucket_name = value
    elif key == "key":
        key_name = value
    else:
        raise Exception("unknown parameter %s" % key)

def open(readonly):
    global access_key, secret_key, endpoint_url

    s3 = boto3.client("s3",
                      aws_access_key_id = access_key,
                      aws_secret_access_key = secret_key,
                      endpoint_url = endpoint_url)
    if s3 is None:
        raise Exception("could not connect to S3")
    return s3

def get_size(s3):
    global bucket_name, key_name

    # HEAD the object: this returns its length without opening a body stream.
    resp = s3.head_object(Bucket = bucket_name, Key = key_name)
    return resp['ContentLength']

def pread(s3, buf, offset, flags):
    global bucket_name, key_name

    size = len(buf)
    rnge = 'bytes=%d-%d' % (offset, offset+size-1)
    resp = s3.get_object(Bucket = bucket_name, Key = key_name, Range = rnge)
    body = resp['Body']
    with closing(body):
        buf[:] = body.read(size)

This lets you loop mount a single object (file):

$ ./nbdkit-S3-plugin -f -v -U /tmp/sock \
  access_key="XYZ" secret_key="XYZ" \
  bucket="my_files" key="fedora-28.iso"
$ sudo nbd-client -b 2048 -unix /tmp/sock /dev/nbd0
Negotiation: ..size = 583MB
$ ls /dev/nbd0*
 nbd0    nbd0p1  nbd0p2  
$ sudo mount -o ro /dev/nbd0p1 /tmp/mnt
$ ls -l /tmp/mnt
 total 11
 dr-xr-xr-x. 3 root root 2048 Apr 25  2018 EFI
 -rw-r--r--. 1 root root 2532 Apr 23  2018 Fedora-Legal-README.txt
 dr-xr-xr-x. 3 root root 2048 Apr 25  2018 images
 drwxrwxr-x. 2 root root 2048 Apr 25  2018 isolinux
 -rw-r--r--. 1 root root 1063 Apr 21  2018 LICENSE
 -r--r--r--. 1 root root  454 Apr 25  2018 TRANS.TBL

I should note this is a bit different from s3fs, which is a FUSE driver that mounts all the files in a bucket.
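
The one fiddly detail in pread above is the Range header: HTTP byte ranges are inclusive at both ends, which is why the plugin subtracts 1 from the end offset. Here is a minimal sketch of just that arithmetic. The helper name http_range is hypothetical and is not part of the plugin or of Boto3.

```python
def http_range(offset, size):
    """Return the HTTP Range header value covering `size` bytes from `offset`.

    HTTP byte ranges are inclusive at both ends, so the last byte
    requested is offset + size - 1, not offset + size.
    (Hypothetical helper, for illustration only.)
    """
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    return "bytes=%d-%d" % (offset, offset + size - 1)


# A 512-byte read at the start of the object asks for bytes 0..511.
print(http_range(0, 512))
# A 2048-byte read at offset 4096 asks for bytes 4096..6143.
print(http_range(4096, 2048))
```

Forgetting the "- 1" would ask for one byte more than the buffer can hold, so the slice assignment in pread would no longer fit.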


Filed under Uncategorized

Ridiculously big “files”

In the last post I showed how you can combine nbdfuse with nbdkit’s RAM disk to mount a RAM disk as a local file. In a talk I gave at FOSDEM last year I described creating these absurdly large RAM-backed filesystems and you can do the same thing now to create ridiculously big “files”. Here’s a 7 exabyte file:

$ touch /var/tmp/disk.img
$ nbdfuse /var/tmp/disk.img --command nbdkit -s memory 7E &
$ ll /var/tmp/disk.img 
 -rw-rw-rw-. 1 rjones rjones 8070450532247928832 Nov  4 13:37 /var/tmp/disk.img
$ ls -lh /var/tmp/disk.img 
 -rw-rw-rw-. 1 rjones rjones 7.0E Nov  4 13:37 /var/tmp/disk.img
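
The byte count that ls prints can be checked by hand. This is a small sketch, assuming the "E" suffix means 2**60 bytes (which is what the number above works out to):

```python
# Assumption: the "E" size suffix is a power of two, 2**60 bytes.
EXBI = 2 ** 60

size = 7 * EXBI
print(size)

# The size still fits in a signed 64-bit file offset ...
print(size < 2 ** 63)
# ... but one more exbibyte would not.
print(8 * EXBI > 2 ** 63 - 1)
```

So 7E is about as big as a whole number of "E" units can get while still being representable as a signed 64-bit offset.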

What can you actually do with this file, and more importantly does anything break? As in the talk, creating a Btrfs filesystem boringly just works. mkfs.ext4 spins using 100% of CPU. I let it go for 15 minutes but it seemed no closer to either succeeding or crashing. Emacs said:

File disk.img is large (7 EiB), really open? (y)es or (n)o or (l)iterally

and I was too chicken to find out what it would do if I really opened it.

I do wonder if there’s a DoS attack here if I leave this seemingly massive regular file lying around in a public directory.



FUSE mounting on top of a file

Our tool nbdfuse lets you mount an NBD block device as a file, using Linux FUSE. For example you could create a directory with a single file in it (called nbd) which contains the contents of the NBD export:

$ mkdir /var/tmp/test
$ nbdfuse /var/tmp/test --command nbdkit -s memory 1G &
$ ls -l /var/tmp/test/
total 0
 -rw-rw-rw-. 1 rjones rjones 1073741824 Nov  4 13:25 nbd
$ fusermount -u /var/tmp/test

This is cool, but wouldn’t it be nice to get rid of the directory and create the file anywhere? Recently Max Reitz found out you can mount a FUSE filesystem over a regular file.

It works! (After a few adjustments to the nbdfuse code)

$ touch /var/tmp/disk.img
$ nbdfuse /var/tmp/disk.img --command nbdkit -s memory 1G &
$ ls -l /var/tmp/disk.img
 -rw-rw-rw-. 1 rjones rjones 1073741824 Nov  4 13:29 /var/tmp/disk.img
$ fusermount -u /var/tmp/disk.img 
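
While it is mounted, the file should behave like any other regular file, so ordinary file I/O is all you need to read from it. Below is a rough sketch of that idea. The helper read_at is hypothetical, and a temporary file stands in for /var/tmp/disk.img so the example runs without nbdfuse.

```python
import os
import tempfile


def read_at(path, offset, length):
    """Read up to `length` bytes starting at `offset`, using plain file I/O."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


# Stand-in for /var/tmp/disk.img (an assumption, so this runs anywhere):
# a 1 MiB file of zeroes with a marker written at offset 4096.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(1024 * 1024)
    f.seek(4096)
    f.write(b"hello")
    path = f.name

print(os.path.getsize(path))     # size in bytes, as ls -l would report it
print(read_at(path, 4096, 5))    # the marker bytes
os.unlink(path)
```

With a real nbdfuse mount you would pass the mounted path instead of the temporary file.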


Notes to self on frama-c

Frama-C is a giant modular system for writing formal proofs of C code. For months I’ve been on-and-off trying to see if we could use it to do useful proofs for any parts of the projects we write, like qemu, libvirt, libguestfs, nbdkit etc. I got side-tracked at first with this frama-c tutorial which is fine, but I got stuck trying to make the GUI work.

Yesterday I discovered this set of 3 short command-line based tutorials:

https://maniagnosis.crsr.net/2017/06/AFL-brute-force-search.html
https://maniagnosis.crsr.net/2017/06/AFL-bug-in-quicksearch.html
https://maniagnosis.crsr.net/2017/07/AFL-correctness-of-quicksearch.html

I thought I’d start by trying to apply this to a small section of qemu code, the fairly self-contained range functions.

The first problem is how to invoke frama-c:

frama-c -wp -wp-rte -wp-print util/range.c -cpp-extra-args=" -I include -I build -I /usr/include -DQEMU_WARN_UNUSED_RESULT= "

You have to give all the include directories and define out some qemu-isms.

The first time you run it, this won’t work for “reasons”. You have to initialize the why3 verifier using:

why3 config --full-config

Really frama-c should just do this for you, or at least tell you what you need to do in the obscure error message it prints.

This still won’t work because util/range.c includes glib headers which use GCC attributes and builtins and frama-c simply cannot parse any of that. So I ended up hacking on the source to replace the headers with standard C headers and remove the one glib-based function in the file.

At this point it does compile and the Frama-C WP plugin runs. Of course without having added any annotations it simply produces a long list of problems. Also it takes a fair bit of time to run, which is interesting. I wonder if it will get faster with annotations?

That’s as far as I’ve got for the moment. I’ll come back later and try to add annotations.




September 30, 2020 · 9:12 am

Raspberry Pi 4 running Fedora 32

I got Fedora 32 installed on an RPi 4 8GB, booting off USB, with UEFI and ACPI. I followed Robert Grimm’s instructions here, and had an additional set of complications summarised here. There’s not much to say except that it was fiendishly complicated. But it works beautifully now, and is reasonably quick too especially when you consider how little it cost.

So let’s talk about costs (all include tax and delivery):

Raspberry Pi 4 8GB                  £77.33
SanDisk 500GB SSD x 2               £149.98
small SD card needed for booting    free

Only one of the SSDs is actually used, but if you follow Robert’s instructions you will need two. I didn’t have any external USB SSDs that were both USB 3 and not spinning hard disks, so I had to buy these, but I’ll be able to reuse one in a future project. The SD card is required to work around a bug in the UEFI firmware, but I happened to have one lying around.



nbdkit Windows port contd.

We ported nbdkit to Windows. That port is now upstream and should appear in the next stable release (1.24). There is also a new native file plugin for Windows which supports Windows files and volumes, hole punching for sparse files, querying file sparseness, and efficient zeroing.



nbdkit now ported to Windows

This week I ported nbdkit, our high performance plugin-based Network Block Device server, to Windows. Currently it’s not upstream but you can download the Windows branch from here.

There were several possible ways we could have done this including Cygwin which might have been easier, but in the end I did a port to the raw Win32/Winsock APIs. You can compile it on Fedora using the mingw-w64-based Fedora Windows Cross Compiler and run it using Wine. Familiar commands like this work:

$ ./nbdkit.exe -f -v memory 1G

Windows is such a trash pile of awful APIs it’s a wonder how anyone can use it. Like what, and huh and WTF having to split a 64 bit int across two fields in a struct?? Not to mention the whole mess which is HANDLEs vs SOCKETs vs file descriptors and errno handling in Winsock. But I got there in the end.
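
To make the 64 bit complaint concrete: several Win32 structures carry a 64 bit offset as two 32 bit fields (OVERLAPPED, with its Offset and OffsetHigh members, is one example, though it may not be the struct meant above). This sketch shows only the arithmetic of splitting and rejoining the value. The function names split64 and join64 are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split64(value):
    """Split an unsigned 64-bit integer into (low, high) 32-bit halves."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value does not fit in 64 bits")
    return value & MASK32, value >> 32


def join64(low, high):
    """Rebuild the 64-bit integer from its (low, high) 32-bit halves."""
    return (high << 32) | low


# An offset of 8 GiB does not fit in the low half at all.
low, high = split64(8 * 1024 ** 3)
print(low, high)
print(join64(low, high) == 8 * 1024 ** 3)
```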

I got many existing plugins and filters compiled. Not all of those listed will be working, but the main features are fine. You can also write your own plugins to the same API as Linux ones. I’m hoping that someone can write a Windows block device plugin (especially one which integrates with features like VSS).

$ find \( -name '*.exe' -o -name '*.dll' \) -a -printf "%f\n" | sort -u



AMD Zen 2 laptop

AMD Zen 2 laptops are a thing, and they’re blazingly fast.

I just bought the HP Envy x360 which has a 6 core AMD Ryzen 5 4500U. Measuring some real world compiles it’s comfortably two and a half times faster than my year-old Intel-based Thinkpad T480s (which has 4 cores but 8 threads, and cost at least twice as much).



nbdkit tar filter

nbdkit is our high performance liberally licensed Network Block Device server, and OVA files are a common pseudo-standard for exporting virtual machines including their disk images.

A .ova file is really an uncompressed tar file:

$ tar tf rhel.ova

Since tar files usually store their content unmangled, this opens an interesting possibility for reading (or even writing) the embedded disk image without needing to unpack the tar. You just have to work out the offset of the disk image within the tar file. virt-v2v has used this trick to save a copy when importing OVAs for years.
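
Working out that offset is easy to try with Python's tarfile module, which records where each member's data starts. This is only a sketch of the idea, not how virt-v2v or nbdkit do it. The function name member_extent and the file name disk.img are made up for the example.

```python
import io
import tarfile


def member_extent(tar_bytes, name):
    """Return (offset, size) of member `name`'s data in an uncompressed tar."""
    # mode "r:" insists on an uncompressed tar, since offsets into a
    # compressed stream would be meaningless.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        info = tar.getmember(name)
        return info.offset_data, info.size


# Build a small uncompressed tar in memory to play with.
payload = b"pretend this is a disk image"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="disk.img")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
tar_bytes = buf.getvalue()

offset, size = member_extent(tar_bytes, "disk.img")
# The member's bytes sit unmangled at that offset in the raw tar.
print(tar_bytes[offset:offset + size] == payload)
```

Once you have the offset and size, the disk image is just a byte range of the tar file, and you can read it in place.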

nbdkit has also included a tar plugin which can access a file inside a local tar file, but what if the tar file doesn’t happen to be a local file (e.g. it’s on a web server)? Or what if it’s compressed?

To fix this I’ve turned the plugin into a filter. Using nbdkit-tar-filter you can unpack even non-local compressed tar files:

$ nbdkit curl http://example.com/qcow2.tar.xz \
         --filter=tar --filter=xz tar-entry=disk.qcow2

(To understand how filters are stacked, see my FOSDEM talk from last year). Because in this example the disk inside the tarball is a qcow2 file, it appears as qcow2 on the wire, so:

$ guestfish --ro --format=qcow2 -a nbd://localhost

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
      ‘man’ to read the manual
      ‘quit’ to quit the shell

><fs> run
><fs> list-filesystems 
/dev/sda1: ext2
><fs> mount /dev/sda1 /
><fs> ll /
total 19
drwxr-xr-x   3 root root  1024 Jul  6 20:03 .
drwxr-xr-x  19 root root  4096 Jul  9 11:01 ..
-rw-rw-r--.  1 1000 1000    11 Jul  6 20:03 hello.txt
drwx------   2 root root 12288 Jul  6 20:03 lost+found
