Tag Archives: qcow2

How to edit a qcow2 file from C

Suppose you want to read, write or edit the data inside a qcow2 file. One way is to use libguestfs, and that’s the recommended way if you need to mount a filesystem inside the file.

But for accessing the data blocks alone, you can now use the libnbd API together with qemu-nbd. This has a couple of advantages: it’s faster, and you can open snapshots (which libguestfs cannot do).

We start by creating a libnbd handle and connecting it to a qemu-nbd instance. The qemu-nbd instance is linked with qemu’s internal drivers that know how to read and write qcow2.

  const char *filename;  /* path to the qcow2 file, set elsewhere */
  struct nbd_handle *nbd;

  nbd = nbd_create ();
  if (nbd == NULL) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

  char *args[] = {
    "qemu-nbd", "-f", "qcow2",
    // "-l", snapshot,   (qemu-nbd --load-snapshot: open a named internal snapshot)
    (char *) filename,
    NULL
  };
  if (nbd_connect_systemd_socket_activation (nbd, args) == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }

Now you can get the virtual size:

  int64_t size = nbd_get_size (nbd);
  printf ("virtual size = %" PRIi64 "\n", size);

Or read and write sectors from the file:

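  char buf[512];  /* one 512-byte sector */

  /* Read sizeof buf bytes starting at offset 0 (the final 0 is flags). */
  if (nbd_pread (nbd, buf, sizeof buf, 0, 0) == -1) {
    fprintf (stderr, "%s\n", nbd_get_error ());
    exit (EXIT_FAILURE);
  }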

Once you’re done with the file, call nbd_close on the handle, which will also shut down the qemu-nbd process.

A complete example can be found here.


Filed under Uncategorized

Testing exabyte-sized filesystems using qcow2 and guestfish

You can use qcow2 backing files as a convenient way to test what happens when you try to create exabyte-sized filesystems. Just to remind you, 1 exabyte is a million terabytes, or a pile of ordinary hard disks stacked 8 miles high.

There is a bug in qemu that prevents you from creating very large disks unless you adjust the cluster_size option (thanks Kevin Wolf):

$ qemu-img create -f qcow2 huge.qcow2 \
      $((1024*1024))T -o cluster_size=2M
Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=2097152 lazy_refcounts=off 

After that you can just attach the disk to guestfish and start playing with huge filesystems.

[I should note that virt-rescue is probably a better choice of tool here, especially for people who need to experiment with unusual filesystem or LVM options]

$ guestfish -a huge.qcow2

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

><fs> run
><fs> blockdev-getsize64 /dev/sda
1152921504606846976
><fs> part-disk /dev/sda gpt

Ext4 (according to Wikipedia) is supposed to support 1 exabyte disks, but I couldn’t get that to work, possibly because there was not enough RAM:

><fs> mkfs ext4 /dev/sda1
libguestfs: error: mkfs: ext4: /dev/sda1: mke2fs 1.42.5 (29-Jul-2012)
/dev/sda1: Not enough space to build proposed filesystem while setting up superblock

XFS could create a filesystem, but I didn’t let it run to completion because it would need about 5 petabytes to store the filesystem metadata:

><fs> mkfs xfs /dev/sda1
[ disks churn for many minutes while qcow2 file grows
and grows and grows ... ]

LVM2 PVs are possible, but creating a VG requires us to adjust the extent size:

><fs> pvcreate /dev/sda1
><fs> vgcreate VG /dev/sda1
libguestfs: error: vgcreate:   PV /dev/sda1 too large for extent size 4.00 MiB.
  Format-specific setup of physical volume '/dev/sda1' failed.
  Unable to add physical volume '/dev/sda1' to volume group 'VG'.
><fs> debug sh "vgcreate -s 1G VG /dev/sda1"
  Volume group "VG" successfully created
><fs> lvcreate LV VG 1000000000
><fs> lvs-full
[0] = {
  lv_name: LV
[...]
  lv_size: 1048576536870912
}

Previously …


Maximum qcow2 disk size

I don’t see this documented anywhere obvious, so I examined the source of qemu and performed some tests.

qemu stores the size in a 64 bit unsigned integer. However, the qemu-img command line parsing code refuses to parse any number larger than 2^63-513, and the disk it then creates is 9223372036854774784 bytes (2^63-1024), so that appears to be the largest disk size you can create:

$ qemu-img create -f qcow2 test1.img $((2**63-513))
Formatting 'test1.img', fmt=qcow2 size=9223372036854774784 encryption=off cluster_size=65536 
$ ll -h test1.img
-rw-r--r--. 1 rjones rjones 192K Oct  3 17:27 test1.img
$ guestfish -a test1.img run : blockdev-getsize64 /dev/sda
9223372036854774784

Interestingly, things go horribly wrong as soon as you try to partition this:

$ virt-rescue test1.img
><rescue> parted /dev/vda print
Error: /dev/vda: unrecognised disk label                                  
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 9223372TB
Sector size (logical/physical): 512B/512B
Partition Table: unknown
><rescue> parted /dev/vda mklabel gpt
$ # bang! back to the prompt!

https://bugs.launchpad.net/qemu/+bug/865518


In development: QEMU Enhanced Disk format (QED)

As I wrote on the virt-tools site, you presently need to make a number of choices about how to store disk images: file or LV? allocated or sparse? raw or qcow2?

If you’ve decided to go with using a host file, raw is the obvious choice, but qcow2 offers more features, like compression, snapshots, and backing files. Unfortunately the flexibility of qcow2 makes it somewhat slower, and so Stefan Hajnoczi at IBM has proposed a simplified qcow-like format called QED.

Stefan’s description summarises the problems and the proposed solution well:

QEMU Enhanced Disk format is a disk image format that forgoes features found in qcow2 in favor of better levels of performance and data integrity. Due to its simpler on-disk layout, it is possible to safely perform metadata updates more efficiently.

Installations, suspend-to-disk, and other allocation-heavy I/O workloads will see increased performance due to fewer I/Os and syncs. Workloads that do not cause new clusters to be allocated will perform similar to raw images due to in-memory metadata caching.

The format supports sparse disk images. It does not rely on the host filesystem holes feature, making it a good choice for sparse disk images that need to be transferred over channels where holes are not supported.

QED won’t contain compression or encryption, which Stefan argues (rightly IMHO) are already done better at other levels in the stack. A new format is a chance to think about what has changed in filesystems and block devices since last time, notably support for TRIM, the popularity of extent-based filesystems, and encryption now being found in most guest OSes (so not needed in the block device). We can also learn from what went wrong with qcow2, such as its propensity towards disk corruption when used in some configurations.

See this mailing list posting for the full description and patch.

Of course libguestfs will get QED support for free, since it will come with qemu.


Tip: compress raw disk images using qcow2

$ qemu-img convert -c -f raw -O qcow2 win.img winq.img
$ ls -lh win*
-rw-r--r--. 1 root   root    10G May 18 14:34 win.img
-rw-r--r--. 1 rjones rjones 6.5G May 18 14:59 winq.img

Of course the degree of compression you get depends on the amount of zeroed free space in the image, and the amount by which qcow2 is able to compress the other blocks containing data.

qcow2 uses zlib for compression, so the compression won’t be that spectacular. It’s better to keep the filesystems “sparse” in the first place, by ensuring unused disk blocks are zeroed.

For ext2/3 filesystems, Fedora ships a utility called zerofree, which you can either run inside the guest, or run offline from guestfish. This turns unused filesystem blocks into zeroes, which will make outside compression (e.g. with qcow2) much more efficient. For other filesystems, the usual trick is to create a large file of all zeroes until you fill up the free space, then delete it.

As far as the libguestfs tools are concerned, qcow2 files are completely interchangeable with raw disk images:

$ virt-df -h win.img
Filesystem                                Size       Used  Available  Use%
win.img:/dev/vda1                       100.0M      24.1M      75.9M   25%
win.img:/dev/vda2                         9.9G       7.4G       2.5G   75%
$ virt-df -h winq.img
Filesystem                                Size       Used  Available  Use%
winq.img:/dev/vda1                      100.0M      24.1M      75.9M   25%
winq.img:/dev/vda2                        9.9G       7.4G       2.5G   75%
