Tag Archives: libguestfs

Caseless virtualization cluster, part 4

AMD supports nested virtualization a bit more reliably than Intel, which was one of the reasons to go for AMD processors in my virtualization cluster. (The other reason is that they are much cheaper.)

But how well does it perform? Not too badly as it happens.

I tested this by creating a Fedora 20 guest (the L1 guest). I could create a nested (L2) guest inside that, but a simpler way is to use guestfish to carry out some baseline performance measurements. Since libguestfs is creating a short-lived KVM appliance, it benefits from hardware virt acceleration when available. And since libguestfs ≥ 1.26, there is a new option that lets you force software emulation so you can easily test the effect with & without hardware acceleration.

L1 performance

Let’s start on the host (L0), measuring L1 performance. Note that you have to run the commands shown at least twice, both because supermin will build and cache the appliance the first time and because it’s a fairer test of hardware acceleration if everything is cached in memory.

This AMD hardware turns out to be pretty good:

$ time guestfish -a /dev/null run
real	0m2.585s

(2.6 seconds is the time taken to launch a virtual machine, all its userspace and a daemon, then shut it down. I’m using libvirt to manage the appliance.)

Forcing software emulation (disabling hardware acceleration):

$ time LIBGUESTFS_BACKEND_SETTINGS=force_tcg guestfish -a /dev/null run
real	0m9.995s
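The two measurements above can be wrapped in a small helper so that the warm-up run is not forgotten. This is only a sketch: the `launch` function and the `GUESTFISH` override are names invented for this example, while the `LIBGUESTFS_BACKEND_SETTINGS=force_tcg` setting is the one used in the command above.

```shell
#!/bin/bash
# Hypothetical helper: launch the libguestfs appliance once.
#   launch kvm  -> default settings (hardware acceleration if available)
#   launch tcg  -> force software emulation, as in the command above
# GUESTFISH is an illustrative override so the helper can be tried
# without libguestfs installed; it defaults to the real guestfish.
launch() {
    case "$1" in
        kvm) "${GUESTFISH:-guestfish}" -a /dev/null run ;;
        tcg) LIBGUESTFS_BACKEND_SETTINGS=force_tcg \
                 "${GUESTFISH:-guestfish}" -a /dev/null run ;;
        *)   echo "usage: launch kvm|tcg" >&2; return 2 ;;
    esac
}

# Typical use (needs libguestfs): warm the caches, then time the second run.
#   launch kvm; time launch kvm
#   launch tcg; time launch tcg
```

Running each mode once before timing it keeps the supermin build and the cold page cache out of the measurement.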

L2 performance

Inside the L1 Fedora guest, we run the same tests. Note this is testing L2 performance (the libguestfs appliance running on top of an L1 guest), i.e. nested virt:

$ time guestfish -a /dev/null run
real	0m5.750s

Forcing software emulation:

$ time LIBGUESTFS_BACKEND_SETTINGS=force_tcg guestfish -a /dev/null run
real	0m9.949s

Conclusions

These are just some simple tests. I’ll be doing something more comprehensive later. However:

  1. First level hardware virtualization performance on these AMD chips is excellent.
  2. Nested virt is about 45% of non-nested speed (2.6 seconds versus 5.8 seconds to launch the appliance).
  3. TCG performance is slower as expected, but shows that hardware virt is being used and is beneficial even in the nested case.
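The ratios behind these conclusions can be recomputed from the four timings above. A minimal sketch, where `pct` is a helper name made up for this example: dividing the faster launch time by the slower one gives the relative speed as a percentage.

```shell
# pct FAST SLOW: print FAST/SLOW as a whole-number percentage.
pct() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.0f\n", 100 * fast / slow }'
}

pct 2.585 5.750   # L2 relative to L1, both with hardware virt: prints 45
pct 2.585 9.995   # TCG relative to hardware virt on the host:  prints 26
pct 5.750 9.949   # TCG relative to hardware virt inside L1:    prints 58
```

These are launch-time ratios only, so they say nothing about steady-state workloads.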

Other data

The host has 8 cores and 16 GB of RAM. /proc/cpuinfo for one of the host cores is:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 2
model name	: AMD FX(tm)-8320 Eight-Core Processor
stepping	: 0
microcode	: 0x6000822
cpu MHz		: 1400.000
cache size	: 2048 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold bmi1
bogomips	: 7031.39
TLB size	: 1536 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

The L1 guest has 1 vCPU and 4 GB of RAM. /proc/cpuinfo in the guest:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 21
model		: 2
model name	: AMD Opteron 63xx class CPU
stepping	: 0
microcode	: 0x1000065
cpu MHz		: 3515.548
cache size	: 512 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx pdpe1gb lm rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c hypervisor lahf_lm svm abm sse4a misalignsse 3dnowprefetch xop fma4 tbm arat
bogomips	: 7031.09
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

Update

As part of the discussion in the comments about whether this has 4 or 8 physical cores, here is the lstopo output:

[Image: lstopo output]

7 Comments

Filed under Uncategorized

Tip: Use virt-builder to install Fedora packages from updates-testing

Virt-builder ≥ 1.26 now lets you flexibly edit configuration files before you install packages. (1.24 didn’t). So finally you can enable the Fedora updates-testing repository and build a guest with packages from that:

$ virt-builder fedora-20 \
  --edit '/etc/yum.repos.d/fedora-updates-testing.repo:
            s/enabled=0/enabled=1/' \
  --install git,emacs,yum-utils,net-tools,libguestfs
[   0.0] Downloading: http://libguestfs.org/download/builder/fedora-20.xz
[   1.0] Planning how to build this image
[   1.0] Uncompressing
[  11.0] Opening the new disk
[  16.0] Setting a random seed
[  16.0] Updating core packages
[ 269.0] Editing: /etc/yum.repos.d/fedora-updates-testing.repo
[ 269.0] Installing packages: git emacs yum-utils net-tools libguestfs
[ 349.0] Setting passwords
Setting random password of root to ***
[ 349.0] Finishing off
Output: fedora-20.img
Output size: 4.0G
Output format: raw
Total usable space: 5.2G
Free space: 3.7G (71%)
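To see what the substitution given to --edit does, it can be tried locally on a stand-in file. This is a sketch under assumptions: the sample content below is abbreviated and illustrative rather than copied from a real Fedora system, and sed is used here only as a local analogue (virt-builder itself evaluates the expression with Perl).

```shell
# Abbreviated stand-in for /etc/yum.repos.d/fedora-updates-testing.repo
# (illustrative content, not copied from a real system).
sample_repo() {
    cat <<'EOF'
[updates-testing]
name=Fedora $releasever - $basearch - Test Updates
enabled=0
gpgcheck=1

[updates-testing-debuginfo]
name=Fedora $releasever - $basearch - Test Updates Debug
enabled=0
gpgcheck=1
EOF
}

# Apply the same substitution that --edit was given above.
sample_repo | sed 's/enabled=0/enabled=1/'
```

The substitution is applied to every line, so in this stand-in both sections end up enabled, not just the first.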

a-fedora-appliance updated for supermin 5

a-fedora-appliance is a supermin demonstration Fedora appliance.

I have scratch-built a Fedora RPM here which is just a 235K download but contains (by magic!) a fully bootable Fedora appliance. After installing the RPM in Fedora 20, do the following to boot the virtual machine:

# boot-a-fedora-appliance

[Screenshot]

Note that the scratch build will only last in Koji for a few days. After that you’ll have to follow the README file included in the source.

libguestfs 1.26 released

Yesterday the new stable version of libguestfs (1.26) was released. There are many new features and you can find the release notes here.

On this blog I’ve covered a lot of the new features already:

Using virt-customize to make custom guests with a single backing file

Yesterday I hinted that virt-customize could be used to make custom guests sharing a single backing file. Here is how you do that.

Firstly download a cloud image, or use virt-builder to create one:

$ virt-builder fedora-20 -o backing.img
[   0.0] Downloading: http://libguestfs.org/download/builder/fedora-20.xz
[   1.0] Planning how to build this image
[   1.0] Uncompressing
[  11.0] Opening the new disk
[  15.0] Setting a random seed
[  15.0] Setting passwords
Setting random password of root to 8obMIvmrWe6CCkAv
[  16.0] Finishing off

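Now use qemu-img to create overlays for each guest. Recent versions of qemu-img require the backing file format to be given explicitly with -F (virt-builder writes raw images by default):

$ qemu-img create -b backing.img -F raw -f qcow2 guest1.img
$ qemu-img create -b backing.img -F raw -f qcow2 guest2.img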

You must leave the backing file untouched. In particular, don’t try to customize it, or you’ll corrupt all the guests using that backing file.
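For more than a couple of guests, the overlay creation can be put in a loop. A minimal sketch, assuming the backing file is a raw image (hence -F raw); `run`, `make_overlays` and the `DRY_RUN` switch are names invented for this example. Removing write permission from the backing file is a cheap guard against accidental edits, not a guarantee, since root can still write to it.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# make_overlays BACKING NAME...: create NAME.img as a qcow2 overlay of
# BACKING for each guest name given.
make_overlays() {
    backing=$1; shift
    run chmod a-w "$backing" || return 1
    for name in "$@"; do
        run qemu-img create -f qcow2 -b "$backing" -F raw "$name.img" || return 1
    done
}

# Print what would be done for three guests, without touching any files.
DRY_RUN=1 make_overlays backing.img guest1 guest2 guest3
```

Dropping `DRY_RUN=1` runs the commands for real, which needs qemu-img and an existing backing.img.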

Now you can customize each guest overlay:

$ virt-customize -a guest1.img \
    --hostname guest1 --timezone Europe/London \
    --install gcc
[   0.0] Examining the guest ...
[   3.0] Setting a random seed
[   3.0] Setting the hostname: guest1
[   3.0] Setting the timezone: Europe/London
[   3.0] Installing packages: gcc

$ virt-customize -a guest2.img \
  --hostname guest2 --install /usr/bin/soffice
[   0.0] Examining the guest ...
[   5.0] Setting a random seed
[   5.0] Setting the hostname: guest2
[   5.0] Installing packages: /usr/bin/soffice

As expected, each guest overlay uses a different amount of space depending on what has been installed:

$ ls -lh guest?.img
-rw-r--r--. 1 rjones rjones 613M Mar 26 15:49 guest1.img
-rw-r--r--. 1 rjones rjones 924M Mar 26 16:04 guest2.img

New tool: virt-customize

The final big feature of libguestfs 1.26 has arrived. Virt-customize is the customization part of virt-builder, split out into a separate program. This lets you take any virtual machine and install packages, edit configuration files, run scripts, set passwords and so on.

One of the most requested features for virt-builder is the ability to customize templates while keeping a shared backing file, and virt-customize lets you do this.

Here’s how to use virt-customize:

$ virt-customize -a fedora-20.img \
    --update --install gcc
[   0.0] Examining the guest ...
[  37.0] Setting a random seed
[  37.0] Updating core packages
[ 238.0] Installing packages: gcc

virt-inspector has a way to list out the packages installed in a virtual machine disk image, and we can use it to show that gcc was installed:

$ virt-inspector -a fedora-20.img |
    xmlstarlet sel -t -c '//application[name="gcc"]'
<application>
        <name>gcc</name>
        <version>4.8.2</version>
        <release>7.fc20</release>
        <arch>x86_64</arch>
</application>

Analysis of the size of libguestfs dependencies

In libguestfs ≥ 1.26 we are going to start splitting the package up into smaller dependencies. The full libguestfs package has lots of dependencies because it has to be able to process lots of obscure filesystems, so the question is how best to split them up. We could split off, say, XFS support into a subpackage, but how do we know if that will save any space?

Given the set of dependencies, we want to know the incremental cost of adding another dependency.

We can get an exact measure of this by using supermin to build a chroot containing the set of dependencies, and a second chroot containing the set of dependencies + the additional package. Then we simply compare the sizes of the two chroots. The advantage of using supermin is that the exact same script [see end of posting] will work for Fedora and Debian/Ubuntu since supermin hides the complexity of dealing with the different package managers through its package manager abstraction.

The results of this, using the libguestfs appliance dependencies, on Fedora 20, sorted by dependency size, with my comments added:

  1. gdisk adds 25420 KB

    This is a surprising result to find in first place, since gdisk is a fairly small, unassuming C++ program (only ~11KLoC). My initial thought was it must be something to do with being written in C++, but I tested that and it’s not true. The real problem is that gdisk depends on libicu (a Unicode library) which adds 24.6 MB to the appliance. [Note: this issue has been fixed in Rawhide.]

  2. lvm2 adds 19432 KB

    The default disk layout of many Linux distros uses LVM so this and similar dependencies have to stay in base libguestfs.

  3. binutils adds 16604 KB

    This is a sorry tale. The one file we use from binutils is /usr/bin/strings (33KB). Unfortunately this single binary pulls in a huge dependency (even worse, it’s a development package, and this causes problems on production systems). I don’t really understand why strings is included in binutils.

  4. gfs2-utils adds 9648 KB
  5. zfs-fuse adds 5208 KB

    Split off in the proposed reorganization.

  6. ntfsprogs adds 4572 KB
  7. e2fsprogs adds 4312 KB

    Most Linux distros use ext4, and we want to support Windows out of the box, so these are included in base libguestfs.

  8. xfsprogs adds 3532 KB

    Split off in the proposed reorganization.

  9. iproute adds 3180 KB

    We use /sbin/ip to set up the network card inside the appliance. It’s a shame this “better” replacement for ifconfig is so large.

  10. tar adds 2896 KB
  11. btrfs-progs adds 2800 KB
  12. openssh-clients adds 2428 KB
  13. parted adds 2420 KB
  14. jfsutils adds 1668 KB
  15. genisoimage adds 1644 KB
  16. syslinux-extlinux adds 1420 KB
  17. augeas-libs adds 1404 KB
  18. iputils adds 1128 KB
  19. reiserfs-utils adds 1076 KB
  20. mdadm adds 1032 KB
  21. strace adds 976 KB
  22. lsof adds 972 KB
  23. vim-minimal adds 912 KB
  24. rsync adds 812 KB
  25. libldm adds 616 KB
  26. psmisc adds 592 KB
  27. nilfs-utils adds 520 KB
  28. hfsplus-tools adds 480 KB

The test script used to produce these results:

#!/bin/bash -

# NB: For this program to work, you must have the following
# packages (or as many as possible) installed locally.
pkgs='acl attr augeas-libs bash binutils bsdmainutils btrfs-progs
bzip2 coreutils cpio cryptsetup cryptsetup-luks diffutils dosfstools
e2fsprogs extlinux file findutils gawk gdisk genisoimage gfs2-utils
grep grub grub-pc gzip hfsplus hfsplus-tools hivex iproute iputils
jfsutils kernel kmod less libaugeas0 libcap libcap2 libhivex0 libldm
libpcre3 libselinux libsystemd-id128-0 libsystemd-journal0 libxml2
libyajl2 linux-image lsof lsscsi lvm2 lzop mdadm module-init-tools
mtools nilfs-utils ntfs-3g ntfsprogs openssh-clients parted pcre
procps procps-ng psmisc reiserfs-utils reiserfsprogs rsync scrub sed
strace syslinux syslinux-extlinux systemd sysvinit tar udev ufsutils
util-linux util-linux-ng vim-minimal vim-tiny xfsprogs xz xz-utils
yajl zerofree zfs-fuse'

# These are the packages (from the above list) that we want to test.
testpkgs="$pkgs"

# Helper function to construct an appliance and see how big it is.
function appliance_size
{
    set -e
    supermin --prepare -o /tmp/supermin.d "$@" >&/dev/null
    supermin --build -f chroot -o /tmp/appliance.d \
      /tmp/supermin.d >&/dev/null
    du -s /tmp/appliance.d | awk '{print $1}'
}

# Construct entire appliance to see how big that would be.
# ($pkgs is deliberately unquoted so it splits into one word per package.)
totalsize=$(appliance_size $pkgs)

# Remove each package from the list in turn, and find out
# how much extra that package contributes.
for p in $testpkgs; do
    opkgs=
    for o in $pkgs; do
        if [ "$o" != "$p" ]; then opkgs="$opkgs $o"; fi
    done
    size=$(appliance_size $opkgs)
    extra=$((totalsize - size))

    echo "$p adds $extra KB"
done
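The script prints packages in the order they appear in the list, whereas the results above are sorted by size. One way to do the sorting is to pipe the output through sort on the third field. A small sketch, where `sort_by_size` is a name chosen for this example and the sample lines are taken from the results above:

```shell
# Sort "PKG adds N KB" lines, largest first (field 3 is the size).
sort_by_size() {
    sort -k3,3nr
}

printf '%s\n' 'tar adds 2896 KB' 'gdisk adds 25420 KB' 'lvm2 adds 19432 KB' |
    sort_by_size
```

With the three sample lines this lists gdisk first, then lvm2, then tar.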

Transactions with guestfish

I was asked a few days ago if libguestfs has a way to apply a group of changes to an image together. The question was really about transaction support — applying a group of changes and then committing them or doing a rollback, with the final image either containing all the changes or none of them.

Although libguestfs doesn’t support this, you can do it using libguestfs and the qemu-img tool together. This post shows you how.

First I use virt-builder to quickly get a test image that I can play with:

$ virt-builder fedora-20

We create an overlay which will store the changes until we decide to commit or roll back (the -F option tells qemu-img that the backing file is raw, which recent versions require):

$ qemu-img create -f qcow2 -b fedora-20.img -F raw overlay.img

Now open the overlay and make your changes:

$ guestfish -a overlay.img -i

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

Operating system: Fedora release 20 (Heisenbug)
/dev/sda3 mounted on /
/dev/sda1 mounted on /boot

><fs> write-append /etc/issue.net \
    "THIS IS A CHANGE TO ISSUE.NET\n"
><fs> cat /etc/issue.net
Fedora release 20 (Heisenbug)
Kernel \r on an \m (\l)
THIS IS A CHANGE TO ISSUE.NET
><fs> exit

The base image (fedora-20.img) is untouched, and the overlay contains the changes we made:

$ virt-cat -a fedora-20.img /etc/issue.net
Fedora release 20 (Heisenbug)
Kernel \r on an \m (\l)
$ virt-cat -a overlay.img /etc/issue.net
Fedora release 20 (Heisenbug)
Kernel \r on an \m (\l)
THIS IS A CHANGE TO ISSUE.NET

Rollback

Rollback is pretty simple!

$ rm overlay.img

Commit

The more interesting one is how to commit the changes back to the original file. Using qemu-img you just do:

$ qemu-img commit overlay.img
Image committed.
$ rm overlay.img

The changes are now contained in the original image file:

$ virt-cat -a fedora-20.img /etc/issue.net
Fedora release 20 (Heisenbug)
Kernel \r on an \m (\l)
THIS IS A CHANGE TO ISSUE.NET
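The create, change, then commit-or-rollback sequence above can be collected into one wrapper. This is a sketch under stated assumptions, not a definitive implementation: `image_transaction` and the `QEMU_IMG` override are names invented here, the base image is assumed to be raw and to live in the current directory, and the change command receives the overlay path as its last argument.

```shell
# image_transaction BASE CMD [ARGS...]
# Hypothetical wrapper: run "CMD ARGS... OVERLAY" against a temporary
# overlay of BASE, then commit on success or discard on failure.
# QEMU_IMG is an illustrative override; it defaults to qemu-img.
image_transaction() {
    base=$1; shift
    overlay="$base.txn.$$"
    "${QEMU_IMG:-qemu-img}" create -f qcow2 -b "$base" -F raw "$overlay" \
        >/dev/null || return 1
    if ! "$@" "$overlay"; then
        rm -f "$overlay"    # rollback: the base image was not modified
        return 1
    fi
    # If the commit itself fails, keep the overlay so it can be retried.
    "${QEMU_IMG:-qemu-img}" commit "$overlay" >/dev/null || return 1
    sync                    # ask the kernel to flush the committed data
    rm -f "$overlay"
}

# Example (needs libguestfs and qemu-img; /transaction-test is a made-up path):
#   change() { guestfish -a "$1" -i touch /transaction-test; }
#   image_transaction fedora-20.img change
```

The overlay is deleted only after a successful commit or a failed change, so a failed commit leaves the changes available for another attempt.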

ACID

Have we discovered the ACID properties of disk images? Not quite.

Although the change is atomic (A) [1], the disk image is consistent (C) before and after the change, and the change is durable (D) [2], the final property is not satisfied.

There is no isolation (I). Because it is infeasible to resolve conflicts at the block layer where qemu-img operates, corruption is guaranteed if you try this technique in parallel on the same disk image. The only way to make it work reliably is to serialize every operation on the disk image with a mutex.

[1] The change is only atomic if you don’t look at the backing file for the short time that qemu-img commit runs.

[2] Strictly speaking, you must call sync or fsync after the qemu-img commit in order for the change to be durable.
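One way to get the mutex mentioned above is an advisory file lock taken with flock(1) from util-linux. A minimal sketch: `with_image_lock` and the ".lock" file naming are choices made for this example, and the lock only protects against other processes that go through the same wrapper.

```shell
# with_image_lock IMAGE CMD [ARGS...]
# Run CMD while holding an exclusive lock on IMAGE.lock, so that only one
# overlay/commit cycle works on IMAGE at a time. Because the lock is held
# on a file descriptor of a subshell, CMD may be a shell function.
with_image_lock() {
    image=$1; shift
    (
        flock 9 || exit 1   # blocks until the lock is free
        "$@"
    ) 9>"$image.lock"
}

# Example: serialize a whole create/change/commit cycle on one image.
#   with_image_lock fedora-20.img sh -c '
#       qemu-img create -f qcow2 -b fedora-20.img -F raw overlay.img &&
#       guestfish -a overlay.img -i touch /transaction-test &&
#       qemu-img commit overlay.img && rm overlay.img'
```

The lock is released automatically when the subshell exits, including when the command fails.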

My 10 minute lightning talk on virt-builder from FOSDEM 2014


My 10 minute lightning talk about virt-builder is available to download now (video).

Since there are a few sound problems early on in the talk, I have also created a subtitles file: Advanced_disk_image_management_with_libguestfs.srt. With VLC you can just drop this file into the same directory as the video file, and VLC will automatically display the subs. With other players you might need to load the subs separately.

New in virt-sparsify: In place sparsification

With virt-sparsify ≥ 1.25.44 you can sparsify disk images without copying them, so-called in-place sparsification.

It’s easy to use:

$ virt-sparsify --in-place fedora.img
Trimming /dev/sda1 ...
Clearing Linux swap on /dev/sda2 ...
Trimming /dev/sda3 ...

Sparsify in-place operation completed with no errors.

… and much faster. However it does require very recent kernel and qemu support.
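To check what in-place sparsification actually saved, compare the space allocated on the host with the apparent size of the image before and after the run. A small sketch, where `image_usage` is a helper name invented for this example:

```shell
# image_usage FILE: print "allocated_KB apparent_KB" for FILE.
# du reports the blocks actually allocated on disk; wc -c reports the
# apparent size in bytes, which is rounded up to whole KB here.
image_usage() {
    allocated=$(du -k "$1" | awk '{ print $1 }')
    bytes=$(wc -c < "$1")
    apparent=$(( (bytes + 1023) / 1024 ))
    echo "$allocated $apparent"
}

# Example (fedora.img as in the command above):
#   image_usage fedora.img
#   virt-sparsify --in-place fedora.img
#   image_usage fedora.img
```

The apparent size should stay the same across the run, while the allocated figure is the one expected to drop.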

Thanks: Paolo Bonzini, Eric Sandeen & Kevin Wolf for implementing discard support and patiently helping out when we started to test and use it.
