nbdkit is a pluggable NBD server with a filter system that you can layer over plugins to transform block devices. One of the filters is the error filter, which lets you inject errors. We can use this to find out how well filesystems cope with errors and recover from them.
$ rm -f /tmp/inject
$ nbdkit -fv --filter=error memory size=$(( 2**32 )) \
    error-rate=100% error-file=/tmp/inject
# nbd-client localhost /dev/nbd0
We can create a filesystem normally:
# sgdisk -n 1 /dev/nbd0
# gdisk -l /dev/nbd0
Number  Start (sector)    End (sector)  Size       Code  Name
   1            1024         4194286   4.0 GiB     8300
# mkfs.ext4 /dev/nbd0p1
# mount /dev/nbd0p1 /mnt
It’s very interesting watching the verbose output of nbdkit -fv because you can see the lazy metadata creation which the Linux ext4 kernel driver carries out in the background after you mount the filesystem the first time.
So far we have not injected any errors. To do that we create the error-file (/tmp/inject). The error filter notices the file and responds by injecting EIO errors until we remove it:
# touch /tmp/inject
# ls /mnt
ls: reading directory '/mnt': Input/output error
# rm /tmp/inject
# ls /mnt
lost+found
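The touch/rm pair above can be wrapped in a small helper if you want errors injected for a fixed window. This is only a sketch: the function name inject_errors_for is made up for illustration, and it assumes the path you pass is the same one given to error-file= on the nbdkit command line.

```shell
# Hypothetical helper (not part of nbdkit): inject errors for a fixed time.
#   $1 = trigger file, assumed to match nbdkit's error-file= setting
#   $2 = how long to keep injecting, in seconds
inject_errors_for() {
    trigger=$1
    seconds=$2
    # Creating the file switches error injection on.
    touch "$trigger" || return 1
    sleep "$seconds"
    # Removing the file switches error injection off again.
    rm -f "$trigger"
}

# Example (commented out so that sourcing this file has no side effects):
# inject_errors_for /tmp/inject 5
```

Keeping the window short makes it easier to match the injected errors against the nbdkit -fv output.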
Ext4 recovered once we stopped injecting errors, but …
# touch /mnt/hello
touch: cannot touch '/mnt/hello': Read-only file system
… it responded to the error by remounting the filesystem read-only. Interestingly I was not able to simply remount the filesystem read-write. Ext4 forced me to unmount the filesystem and run e2fsck before I could mount it again.
e2fsck also said:
e2fsck: Unknown code ____ 251 while recovering journal of /dev/nbd0p1
which I guess is a bug (already found upstream).
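If you script this kind of test, you need a way to notice that the filesystem has gone read-only without attempting a write. One possible approach, sketched here, is to look for the ro option in the mount table. The function name is_mounted_readonly is invented for this example, and it assumes a Linux-style /proc/mounts layout (device, mount point, type, options).

```shell
# Hypothetical helper: check whether a mount point is currently read-only.
#   $1 = mount point, e.g. /mnt
#   $2 = mount table to read (assumption: defaults to Linux /proc/mounts)
# Exit status: 0 = read-only, 1 = read-write, 2 = mount point not found.
is_mounted_readonly() {
    mountpoint=$1
    table=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        # Remember the options of the last entry for this mount point,
        # since later entries shadow earlier ones.
        $2 == mp { opts = $4; found = 1 }
        END {
            if (!found) exit 2
            n = split(opts, o, ",")
            # Match "ro" exactly, so errors=remount-ro does not count.
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }' "$table"
}

# Example (commented out):
# is_mounted_readonly /mnt && echo "/mnt is read-only"
```

Comparing each option exactly matters here, because ext4 mounts often carry errors=remount-ro, which contains the string "ro" but does not mean the filesystem is read-only.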