What is the overhead of qemu/KVM?

To clarify: what is the memory overhead, or in other words how many guests can you cram onto a single host? Memory is the typical limiting factor when you virtualize.

This was the question someone asked at work today. I don’t know the answer either, but the small program I wrote (below) aims to find out. If you believe the numbers below from qemu 1.2.2 running on Fedora 18, then the overhead is around 150 MB per qemu process that cannot be shared, plus around 200 MB per host (that is, shared between all qemu processes).

guest size 256 MB:
Shared memory backed by a file: 201.41 MB
Anonymous memory (eg. malloc, COW, stack), not shared: 404.20 MB
Shared writable memory: 0.03 MB

guest size 512 MB:
Shared memory backed by a file: 201.41 MB
Anonymous memory (eg. malloc, COW, stack), not shared: 643.76 MB
Shared writable memory: 0.03 MB

guest size 1024 MB:
Shared memory backed by a file: 201.41 MB
Anonymous memory (eg. malloc, COW, stack), not shared: 1172.38 MB
Shared writable memory: 0.03 MB

guest size 2048 MB:
Shared memory backed by a file: 201.41 MB
Anonymous memory (eg. malloc, COW, stack), not shared: 2237.16 MB
Shared writable memory: 0.03 MB

guest size 4096 MB:
Shared memory backed by a file: 201.41 MB
Anonymous memory (eg. malloc, COW, stack), not shared: 4245.13 MB
Shared writable memory: 0.03 MB

The number to pay attention to is “Anonymous memory” since that is what cannot be shared between guests (unless you have KSM enabled and your guests are similar enough for KSM to be effective).
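
As a rough worked example (the 64 GB host below is just an illustration, not something I measured): subtracting the guest RAM from the anonymous figure gives the per-process overhead, eg. 4245.13 - 4096 ≈ 149 MB and 1172.38 - 1024 ≈ 148 MB, which is where the “around 150 MB” figure above comes from. So on a host with, say, 64 GB of RAM free for guests, and ignoring KSM, you could expect to fit roughly (65536 - 200) / (2048 + 150) ≈ 29 guests of 2 GB each.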

There are some known shortcomings with my testing methodology that I summarise below. You may be able to see others.

  1. We’re testing a libguestfs appliance. A libguestfs appliance does not have the full range of normal qemu devices that a real guest would have, and so the overhead of a real guest is likely to be higher. The main difference is probably lack of a video device (so no video RAM is evident).
  2. This uses virtio-scsi. Real guests use IDE, virtio-blk, etc., which may have quite different characteristics.
  3. This guest has one user network device (ie. SLIRP), which could be quite different from a real network device.
  4. During the test, the guest only runs for a few seconds. A normal, long-running guest would experience qemu memory growth or even memory leaks. You could fix this relatively easily by adding some libguestfs busy-work after the launch (see the sketch after this list).
  5. The guest does not do any significant writes, so during the test qemu won’t be storing any cached or in-flight data blocks.
  6. It only accounts for memory used by qemu in userspace, not memory used by the host kernel on behalf of qemu.
  7. The effectiveness or otherwise of KSM is not tested. It’s likely that KSM depends heavily on your workload, so it wouldn’t be fair to publish any KSM figures.
  8. The script uses /proc/PID/maps but it would be better to use smaps so that we can see how much of the file-backed copy-on-write segments has actually been copied. Currently the script overestimates these by assuming that (eg) all the data pages from a library would be dirtied by qemu. (A sketch of the smaps approach follows this list.)
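
On point 8, here is a minimal, untested sketch of how the smaps-based accounting might look. It sums the Pss and Private_Dirty fields so that copy-on-write pages are only counted once they have actually been dirtied (the function name and the choice of fields are mine, not something the script below uses):

sub smaps_usage {
    # Sum the Pss (proportional set size) and Private_Dirty fields
    # from /proc/PID/smaps.  Pss divides shared pages between the
    # processes sharing them; Private_Dirty counts only pages that
    # this process has actually written to.
    my $pid = shift;
    my ($pss_kb, $private_dirty_kb) = (0, 0);
    open SMAPS, "/proc/$pid/smaps" or die "cannot open smaps of pid $pid";
    while (<SMAPS>) {
        $pss_kb += $1 if /^Pss:\s+(\d+)\s+kB/;
        $private_dirty_kb += $1 if /^Private_Dirty:\s+(\d+)\s+kB/;
    }
    close SMAPS;
    return ($pss_kb, $private_dirty_kb); # both in KB
}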

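And on point 4, the busy-work could look something like the following sketch (again untested and only illustrative: it assumes you add a real scratch disk, eg. a 1 GB sparse file, instead of /dev/null before launch, and the file count and sizes are arbitrary):

# Before launch, replace add_drive ("/dev/null") with a sparse scratch disk:
#   open my $fh, ">", "/tmp/scratch.img" or die;
#   truncate $fh, 1024*1024*1024;
#   close $fh;
#   $g->add_drive ("/tmp/scratch.img");
# Then after $g->launch (), make the guest do some work:
$g->mkfs ("ext4", "/dev/sda");
$g->mount ("/dev/sda", "/");
$g->write ("/busywork.$_", "x" x (1024*1024)) for 1 .. 100;
$g->sync ();
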
Another interesting question would be whether qemu is getting better or worse over time.

#!/usr/bin/perl -w

# Estimate memory usage of qemu-kvm at different guest RAM sizes.
# By Richard W.M. Jones <rjones@redhat.com>

use strict;
use Sys::Guestfs;
no warnings "portable"; # 64 bit platform required.

# Loop over different guest RAM sizes.
my $mbytes;
for $mbytes (256, 512, 1024, 2048, 4096) {
    print "guest size ", $mbytes, " MB:\n";

    my $g = Sys::Guestfs->new;

    # Ensure we're using the direct qemu launch backend, otherwise
    # libvirt stops us from finding the qemu PID.
    $g->set_attach_method ("appliance");

    # Set guest memory size.
    $g->set_memsize ($mbytes);

    # Enable user networking just to be more like a "real" guest.
    $g->set_network (1);

    # Launch guest with one dummy disk.
    $g->add_drive ("/dev/null");
    $g->launch ();

    # Get process ID of qemu.
    my $pid = $g->get_pid ();
    die unless $pid > 0;

    # Read the memory maps of the guest.
    open MAPS, "/proc/$pid/maps" or die "cannot open memory map of pid $pid";
    my @maps = <MAPS>;
    close MAPS;

    # Kill qemu.
    $g->close ();

    # Parse the memory maps.
    my $shared_file_backed = 0;
    my $anonymous = 0;
    my $shared_writable = 0;

    my $map;
    foreach $map (@maps) {
        chomp $map;

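        # A typical /proc/PID/maps line looks like this (address range,
        # permissions, offset, device, inode, optional pathname):
        #   7f1234560000-7f1234580000 r-xp 00000000 fd:01 123456   /usr/lib64/libc-2.16.so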
        if ($map =~ m/
                     ^([0-9a-f]+)-([0-9a-f]+) \s
                     (....) \s
                     [0-9a-f]+ \s ..:.. \s (\d+) \s+ (\S+)?
                    /x) {
            my ($start, $end) = (hex $1, hex $2);
            my $size = $end - $start;
            my $mode = $3;
            my $inode = $4;
            my $filename = $5; # could also be "[heap]", "[vdso]", etc.

            # Shared file-backed text: r-xp, r--p, etc. with a file backing.
            if ($inode != 0 &&
                ($mode eq "r-xp" || $mode eq "r--p" || $mode eq "---p")) {
                $shared_file_backed += $size;
            }

            # Anonymous memory: rw-p.
            elsif ($mode eq "rw-p") {
                $anonymous += $size;
            }

            # Writable and shared.  Not sure what this is ...
            elsif ($mode eq "rw-s") {
                $shared_writable += $size;
            }

            # Ignore [vdso], [vsyscall].
            elsif (defined $filename &&
                   ($filename eq "[vdso]" || $filename eq "[vsyscall]")) {
            }

            # Ignore ---p with no file.  What's this?
            elsif ($inode == 0 && $mode eq "---p") {
            }

            # Ignore kvm-vcpu.
            elsif (defined $filename && $filename eq "anon_inode:kvm-vcpu") {
            }

            else {
                warn "warning: could not parse '$map'\n";
            }
        }
        else {
            die "incorrect maps format: '$map'";
        }
    }

    printf("Shared memory backed by a file: %.2f MB\n",
           $shared_file_backed / 1024.0 / 1024.0);
    printf("Anonymous memory (eg. malloc, COW, stack), not shared: %.2f MB\n",
           $anonymous / 1024.0 / 1024.0);
    printf("Shared writable memory: %.2f MB\n",
           $shared_writable / 1024.0 / 1024.0);

    print "\n";
}
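
To reproduce this, the script only needs the libguestfs Perl bindings (the Sys::Guestfs module). Save it under any name, eg. qemu-mem-overhead.pl, and run it with perl; it prints the three figures shown above for each guest size.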

