Tag Archives: zstd

Creating a modifiable gzipped disk image

Dusty Mabe set me a challenge yesterday. He wants to create several compressed disk images that have slightly different content, but are otherwise mostly the same. The disk images are large and compressing them takes a long time (30 minutes each, apparently), so ideally what we’d want to do is compress the disk image just the once and then do the updates on the gzipped image.

Modifying a file which has already been compressed is not usually possible.

However if we make some relatively uncontroversial assumptions and accept a few limitations then we can create a compressed disk image which is modifiable in this way, certainly for gzip and xz (I need to investigate zstd).

Firstly let’s assume we’re using some kind of block-based compression with fixed size blocks. This is true for gzip and xz already. Secondly let’s assume that we want to only modify a single, small partition of the image (Dusty only wants to modify the /boot partition). Lastly we’ll assume that the partition boundaries are aligned to the compression blocks. Since partitions can be placed under control of whoever creates the disk image this last one is pretty easy to arrange.

The trick is to use uncompressed blocks for the part of the input covering the partition you want to modify, and compress the rest of the file normally. Both gzip and xz have an uncompressed block type. (In fact, just about any reasonable compression algorithm works like this – if the input data doesn’t become smaller after trying to compress it, the algorithm will emit the data uncompressed since that takes less space.)

Normal tools won’t let you create files like this, but I wrote one for gzip here, and I’m confident that one could be written for xz (exercise for readers! … or me if we decide to productise this).

I created a normal Fedora 36 image using virt-builder and used guestfish to find the partition boundaries:

$ guestfish --ro -a /var/tmp/fedora-36.img -i part-list /dev/sda
[0] = {
  part_num: 1
  part_start: 1048576
  part_end: 2097151
  part_size: 1048576
}
[1] = {
  part_num: 2
  part_start: 2097152
  part_end: 1075838975
  part_size: 1073741824
}
[2] = {
  part_num: 3
  part_start: 1075838976
  part_end: 6441402367
  part_size: 5365563392
}

We will compress partition #3 while leaving partitions #1 and #2 uncompressed so they can be modified in place later. The tool is a bit crude, but:

$ ./partial-deflate /var/tmp/fedora-36.img /var/tmp/fedora-36.img.gz 0 1075838976 6441402368 

The output is a regular gzip file (albeit rather large because the first 1GB is uncompressed – if I was doing this for real I’d make that boot partition smaller):

$ ll /var/tmp/fedora-36.img*
-rw-r--r--. 1 rjones rjones 6442450944 Nov 30 17:03 /var/tmp/fedora-36.img
-rw-r--r--. 1 rjones rjones 1698849910 Dec  1 16:23 /var/tmp/fedora-36.img.gz
$ zcat /var/tmp/fedora-36.img.gz | md5sum 
57cbd5ebcbe59613378c8aee7ad9e40d  -
$ md5sum /var/tmp/fedora-36.img
57cbd5ebcbe59613378c8aee7ad9e40d  /var/tmp/fedora-36.img

Then I went ahead and modified some known content inside the gzip file (but not compressed). I used a hex editor for this, but you could play around with guestfish + nbdkit-offset-filter for something more supportable:

And the result is a gzipped file with the modifications:

$ zcat /var/tmp/fedora-36.img.gz > /var/tmp/fedora-36.img-modified
gzip: /var/tmp/fedora-36.img.gz: invalid compressed data--crc error
$ guestfish --ro -a /var/tmp/fedora-36.img-modified -i cat /boot/mydata.txt 
helloworld------

…and a CRC error. That’s to be expected, as I haven’t yet worked out how to update CRC-32 after making changes, but it should be easy to solve (with brute force if necessary).

1 Comment

Filed under Uncategorized

Compressed RAM disks

There was a big discussion last week about whether zram swap should be the default in a future version of Fedora.

This lead me to think about the RAM disk implementation in nbdkit. In nbdkit up to 1.20 it supports giant virtual disks up to 8 exabytes using a sparse array implemented with a 2-level page table. However it’s still a RAM disk and so you can’t actually store more real data in these disks than you have available RAM (plus swap).

But what if we compressed the data? There are some fine, very fast compression libraries around nowadays — I’m using Facebook’s Zstandard — so the overhead of compression can be quite small, and this lets you make limited RAM go further.

So I implemented allocators for nbdkit ≥ 1.22, including:

$ nbdkit memory 1T allocator=zstd

Compression ratios can be really good. I tested this by creating a RAM disk and filling it with a filesystem containing text and source files, and was getting 10:1 compression. (Note that filesystems start with very regular, easily compressible metadata, so you’d expect this ratio to quickly drop if you filled the filesystem up with a lot of files).

The compression overhead is small, although the current nbdkit-memory-plugin isn’t very smart about locking so it has rather poor performance under multi-threaded loads anyway. (A fun little project to fix that for someone who loves pthread and C.)

I also implemented allocator=malloc which is a non-sparse direct-mapped RAM disk. This is simpler and a bit faster, but has rather obvious limitations compared to using the sparse allocator.

2 Comments

Filed under Uncategorized