Using virt-sparsify and xz, we can really compress VMs for storage

17G	debian5x64.img.orig

The original image has about 4.7 GB of data, plus a large swap partition, according to virt-df:

$ virt-df -a debian5x64.img.orig -h
Filesystem                        Size       Used  Available  Use%
debian5x64.img.orig:/dev/sda1     322M        66M       239M   21%
debian5x64.img.orig:/dev/debian5x64.home.annexia.org/home
                                  3.4G       359M       2.9G   11%
debian5x64.img.orig:/dev/debian5x64.home.annexia.org/root
                                  320M       301M       2.4M   95%
debian5x64.img.orig:/dev/debian5x64.home.annexia.org/tmp
                                  300M       8.2M       276M    3%
debian5x64.img.orig:/dev/debian5x64.home.annexia.org/usr
                                  3.4G       2.0G       1.2G   60%
debian5x64.img.orig:/dev/debian5x64.home.annexia.org/var
                                  2.6G       2.0G       536M   76%

virt-sparsify makes all unused space in the image sparse. The most recent version can sparsify swap partitions too:

4.6G	debian5x64.img
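
The post does not show the commands behind these size listings, so what follows is only a sketch under assumptions. It uses the basic `virt-sparsify INPUT OUTPUT` form with the file names above, and GNU `du` to compare a file's apparent size with the space it actually occupies on disk. The helper names `sizes_kib` and `sparsify_image` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from the original post.

# Print "<apparent KiB> <allocated KiB>" for a file.
# Assumes GNU du (for --apparent-size); a sparse file shows allocated < apparent.
sizes_kib() {
    apparent=$(du --apparent-size -k "$1" | cut -f1)
    allocated=$(du -k "$1" | cut -f1)
    echo "$apparent $allocated"
}

# Write a sparsified copy of image $1 to $2.
# Prints a note on stderr and does nothing when virt-sparsify is not installed.
sparsify_image() {
    if command -v virt-sparsify >/dev/null 2>&1; then
        virt-sparsify "$1" "$2"
    else
        echo "virt-sparsify not found, skipping" >&2
    fi
}

# Example with the file names from the post:
#   sparsify_image debian5x64.img.orig debian5x64.img
#   sizes_kib debian5x64.img
```

In this form virt-sparsify writes a new image rather than modifying the input, so there has to be enough free space for the output file.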

xz --best -T 0 reduces the final image to under a gigabyte:

971M	debian5x64.img.xz
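
As a rough sketch of the compression step: the flags are the ones named above (`--best`, `-T 0`), while `-k` (keep the input file) and the `compress_image` helper name are additions for illustration, not something the post shows.

```shell
#!/bin/sh
# Sketch only: compress_image is a hypothetical helper.

# Compress image $1 to $1.xz, keeping the original (-k).
# -T 0 asks xz to use as many threads as there are CPU cores
# (multi-threading needs xz 5.2 or later).
compress_image() {
    if command -v xz >/dev/null 2>&1; then
        xz --best -T 0 -k "$1"
    else
        echo "xz not found, skipping" >&2
    fi
}

# Example with the file name from the post:
#   compress_image debian5x64.img
#   du -h debian5x64.img.xz
```

xz reads the holes in a sparse file as runs of zeros, which compress to almost nothing. When decompressing into a regular file, xz by default tries to make the output sparse again.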

5 responses to “Using virt-sparsify and xz, we can really compress VMs for storage”

  1. Amadeus

    Would it still be able to compress the image so well, if you had done
    dd if=/dev/random of=/t bs=1000 count=1000000
    rm -f /t

    ?

    • rich

      If you run virt-sparsify afterwards, then yes. virt-sparsify turns all unused space sparse, even if it’s filled with random data.

  2. sdowdle

    So, that method works for storage… and of course distribution too… so that if you want to share a disk image with others, they have less to download.

    Let’s pretend that is the scenario. Could you please give the opposite set of instructions, whereby one would get that image and try to make it as performant as possible… where performance is more important than disk space used? I would assume one would turn it back into a raw partition?

    • rich

      Some rules of thumb below.

      Use the fastest disk format possible: It used to be a logical volume or partition containing a raw guest image, but … (a) filesystems have got faster, so now you don’t necessarily need to use LVs, and (b) qcow2 got a lot faster (thanks kwolf and friends!) so qcow2 is now a reasonable alternative to raw. If you do use qcow2, preallocate metadata.

      Don’t use a sparse format on the host (since it slows down writes).

      Make sure the partitions in the guest are aligned: virt-alignment-scan, possibly followed by virt-resize.

      Take usual steps to ensure your host disks are fast, perhaps using RAID 10 or a SAN.

      Test everything to see what works and what doesn’t.

      (And those are just for the storage)
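
A few of the storage rules of thumb above could be sketched as shell helpers. This is an illustration under assumptions, not the commenter's own procedure: the helper names and `guest.raw` / `guest.qcow2` are hypothetical, `cp --sparse=never` is GNU cp, and the tool-specific steps are skipped when `qemu-img` or `virt-alignment-scan` is not installed.

```shell
#!/bin/sh
# Sketch only: helper and file names are hypothetical.

# Copy image $1 to $2 without holes, so the host file is fully allocated.
# Relies on GNU cp's --sparse=never.
make_non_sparse() {
    cp --sparse=never "$1" "$2"
}

# Convert raw image $1 to qcow2 image $2 with preallocated metadata.
to_qcow2_prealloc() {
    if command -v qemu-img >/dev/null 2>&1; then
        qemu-img convert -f raw -O qcow2 -o preallocation=metadata "$1" "$2"
    else
        echo "qemu-img not found, skipping" >&2
    fi
}

# Report how the partitions inside guest image $1 are aligned.
check_alignment() {
    if command -v virt-alignment-scan >/dev/null 2>&1; then
        virt-alignment-scan -a "$1"
    else
        echo "virt-alignment-scan not found, skipping" >&2
    fi
}

# Example, starting from the compressed image in the post:
#   xz -d -k debian5x64.img.xz
#   make_non_sparse debian5x64.img guest.raw     # raw, fully allocated
#   to_qcow2_prealloc debian5x64.img guest.qcow2 # or qcow2 instead
#   check_alignment guest.raw
```

If the scan reports misaligned partitions, virt-resize is the follow-up step mentioned above. Whether raw or qcow2 is faster on a given host is something to measure rather than assume.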

  3. Pingback: virt-resize from an NBD source | Richard WM Jones
