There was a big discussion last week about whether zram swap should be the default in a future version of Fedora.
This led me to think about the RAM disk implementation in nbdkit. Up to version 1.20, nbdkit supports giant virtual disks of up to 8 exabytes using a sparse array implemented with a 2-level page table. However it’s still a RAM disk, so you can’t actually store more real data in these disks than you have available RAM (plus swap).
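To make the idea of a sparse array backed by a 2-level page table concrete, here is a minimal sketch in Python. It is not nbdkit's actual C code: the class name `SparseArray`, the page size and the directory size are all assumptions chosen for illustration. The point is only that pages are allocated on first write, and unwritten regions read back as zeros without using memory.

```python
PAGE_SIZE = 32 * 1024   # bytes per page (assumed value, for illustration only)
L2_ENTRIES = 4096       # page slots per second-level table (assumed value)


class SparseArray:
    """Hypothetical sparse byte array using a 2-level page table."""

    def __init__(self):
        # First level: directory index -> second-level table (a list of slots).
        # Each slot is either None (hole) or a bytearray of PAGE_SIZE bytes.
        self.l1 = {}

    def _lookup(self, offset, create):
        """Return (page or None, offset within that page)."""
        page_no, in_page = divmod(offset, PAGE_SIZE)
        l1_idx, l2_idx = divmod(page_no, L2_ENTRIES)
        l2 = self.l1.get(l1_idx)
        if l2 is None:
            if not create:
                return None, in_page
            l2 = self.l1[l1_idx] = [None] * L2_ENTRIES
        if l2[l2_idx] is None and create:
            l2[l2_idx] = bytearray(PAGE_SIZE)
        return l2[l2_idx], in_page

    def write(self, offset, data):
        pos = 0
        while pos < len(data):
            page, in_page = self._lookup(offset + pos, create=True)
            n = min(PAGE_SIZE - in_page, len(data) - pos)
            page[in_page:in_page + n] = data[pos:pos + n]
            pos += n

    def read(self, offset, count):
        out = bytearray()
        while len(out) < count:
            page, in_page = self._lookup(offset + len(out), create=False)
            n = min(PAGE_SIZE - in_page, count - len(out))
            # Holes (pages that were never written) read back as zeros.
            out += page[in_page:in_page + n] if page is not None else bytes(n)
        return bytes(out)

    def allocated_bytes(self):
        """Memory actually used by data pages (ignores table overhead)."""
        return sum(
            PAGE_SIZE for l2 in self.l1.values() for p in l2 if p is not None
        )
```

Writing a few bytes at an offset of several exabytes allocates only the one or two pages touched, which is why the virtual size can be enormous while the real memory use tracks the data stored.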
But what if we compressed the data? There are some fine, very fast compression libraries around nowadays — I’m using Facebook’s Zstandard — so the overhead of compression can be quite small, and this lets you make limited RAM go further.
So I implemented allocators for nbdkit ≥ 1.22, including:
$ nbdkit memory 1T allocator=zstd
Compression ratios can be really good. I tested this by creating a RAM disk and filling it with a filesystem containing text and source files, and was getting 10:1 compression. (Note that filesystems start with very regular, easily compressible metadata, so you’d expect this ratio to drop quickly if you filled the filesystem up with a lot of files.)
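The general shape of a compressing allocator can be sketched as follows. This is only an illustration under stated assumptions, not how nbdkit implements allocator=zstd: the class `CompressedPages` and the page size are hypothetical, and it uses `zlib` from the Python standard library as a stand-in for Zstandard so that the example needs no third-party modules. Each page is compressed on write and decompressed on read, and all-zero pages are not stored at all.

```python
import zlib

PAGE_SIZE = 32 * 1024  # bytes per page (assumed value, for illustration only)


class CompressedPages:
    """Hypothetical page store that keeps each page compressed in memory."""

    def __init__(self):
        self.pages = {}  # page number -> compressed bytes

    def write_page(self, page_no, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        if not any(data):
            # An all-zero page is a hole: drop it rather than store it.
            self.pages.pop(page_no, None)
        else:
            # zlib level 1 stands in for a fast Zstandard setting.
            self.pages[page_no] = zlib.compress(data, 1)

    def read_page(self, page_no):
        blob = self.pages.get(page_no)
        if blob is None:
            return bytes(PAGE_SIZE)  # holes read back as zeros
        return zlib.decompress(blob)

    def stored_bytes(self):
        """Total size of the compressed data currently held."""
        return sum(len(blob) for blob in self.pages.values())
```

Comparing `stored_bytes()` with the number of pages written times `PAGE_SIZE` gives a rough compression ratio for whatever data you feed in. Highly repetitive input will look far better than a realistic mix of files.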
The compression overhead is small, although the current nbdkit-memory-plugin isn’t very smart about locking so it has rather poor performance under multi-threaded loads anyway. (A fun little project to fix that for someone who loves pthread and C.)
I also implemented allocator=malloc, which is a non-sparse direct-mapped RAM disk. This is simpler and a bit faster, but has rather obvious limitations compared to using the sparse allocator.
Could we also get dynamic, pre-emptive swapping?
It is really annoying that running out of physical RAM is like hitting a tarpit. It would be nice if the system slowly swapped out earlier, before it was really needed, especially if the disk is idle.
E.g.
  if free_pct less than 50% {
    if ioidle greater than free_pct {
      swap some pages out
    }
  }
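Written out as runnable code, the rule above might look like this. It is just a sketch of the heuristic as I read it: the function name `should_swap_out` is made up, and both arguments are assumed to be percentages between 0 and 100 (free RAM and disk I/O idle time respectively). How those numbers would be measured is left open.

```python
def should_swap_out(free_pct: float, io_idle_pct: float) -> bool:
    """Decide whether to swap some pages out ahead of time.

    Swap early only when less than half of RAM is free and the disk is
    idler (in percent) than RAM is free, so the lower free memory gets,
    the less idle the disk has to be before swapping starts.
    """
    if not (0 <= free_pct <= 100 and 0 <= io_idle_pct <= 100):
        raise ValueError("percentages must be between 0 and 100")
    return free_pct < 50 and io_idle_pct > free_pct
```

For example, with 40% of RAM free the function returns True when the disk is 90% idle and False when it is only 30% idle. With 60% of RAM free it always returns False.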
It might also be good if the system always swapped out and in using chunks of 10 MB: reading and writing sequentially is pretty fast, but seeking is slow, even on SSDs.
I guess this is more a question about zram than nbdkit, and I can’t speak for zram. But for nbdkit, how plugins are written is very flexible, so even if nbdkit-memory-plugin doesn’t do exactly what you want, I’m sure you could write your own. Also nbdkit may be used to back loop-mounted swap files, even from the local machine.