Containers are the future! and lots of misinformed talk.
The truth here is more boring. Both full-fat virtualization (KVM), and containers (LXC) are the future, and which you use will depend on what you want to do. You’ll probably use both. You’re probably using both now, but you don’t know it.
The first thing to say is that full virt and containers are used for different things, although there is a little bit of overlap. Containers are only useful when the kernel of all your VMs is identical. That means you can run 1000 copies of Fedora 18 as containers, probably not something you could do with full virt, but you can’t run Windows or FreeBSD or possibly not even Debian in one of those containers. Nor can your users install their own kernels.
The second important point is that containers in Linux are not secure at all. It’s moderately easy to escape from a container and take control of the host or other containers. I also doubt they’ll ever be secure because any local kernel exploit in the enormous syscall API is a host exploit when you’re using containers. This is in contrast with full virt on Fedora and RHEL where the qemu process is protected by multiple layers: process isolation, Unix permissions, SELinux, seccomp, and, yes, a container.
So containers are useless? Not at all. If you need to run lots and lots of VMs and you don’t care about multiple kernels or security, containers are just about the only game in town (although it turns out that KVM is no slouch these days).
Also containers (or the cgroups technology behind them) are being used in very innovative places. They are used in systemd for resource control, in libvirt to isolate qemu, and for sandboxing.
Paolo Bonzini discovered that you can issue SCSI ioctls to virtio devices which are passed down to the host.
The very unfortunate part about this is it easily allows guests to read and write parts of host devices that they are not supposed to. For example, if a guest was confined to host device
/dev/sda3, it could read or write other partitions or the boot sector on
In your guest, try this command which reads the host boot sector:
sg_dd if=/dev/vda blk_sgio=1 bs=512 count=1 of=output
of arguments around to exploit the host.
Here’s Paolo’s write-up on LKML.
Here is the libguestfs mitigation patch. The libvirt mitigation patch.
The latest version of hivex — the library for extracting and modifying Windows Registry hive files has been released. You can get the source from here.
I spent a lot of time examining real hive files from Windows machines and running the library under the awesome valgrind tool, and found one or two places where a corrupt hive file could cause hivex to read uninitialized memory. It’s not clear to me if these are security issues — I think they are not — but everyone is advised to upgrade to this version anyway.
hivex would be a great candidate for fuzz testing if anyone wants to try that.
The script after the fold searches for setuid and setgid files on a disk image or in a virtual machine. When you run it you will see output like this:
# chmod +x findsuid.ml
# ./findsuid.ml RHEL60
/dev/vg_rhel6brewx64/lv_root:/sbin/mount.nfs is setuid file
/dev/vg_rhel6brewx64/lv_root:/sbin/netreport is setgid file
/dev/vg_rhel6brewx64/lv_root:/sbin/pam_timestamp_check is setuid file
/dev/vg_rhel6brewx64/lv_root:/sbin/unix_chkpwd is setuid file
/dev/vg_rhel6brewx64/lv_root:/usr/bin/Xorg is setuid file
/dev/vg_rhel6brewx64/lv_root:/usr/bin/at is setuid file
/dev/vg_rhel6brewx64/lv_root:/usr/bin/chage is setuid file
/dev/vg_rhel6brewx64/lv_root:/usr/bin/chfn is setuid file
/dev/vg_rhel6brewx64/lv_root:/usr/bin/chsh is setuid file
You could make simple adaptions to this script to audit for all sorts of things of interest: public writeable directories, unusual SELinux labels, hard links to setuid files, over-sized files. With more work you could look for files with unusual/changed checksums, infected files and so on.
One comment at the test day described virt-cat as “awesome but dangerous”.
This user was surprised that he could do:
virt-cat aguest /etc/shadow
and read the shadow password file from a guest. “Is there”, I was asked, “a security model for this?”
Here’s the news: You can already look at the shadow password file in any disk image using a hex editor. libguestfs, guestfish and virt-cat just make it easier.
Could you encrypt the virtual disk? That will protect the VM while it is (virtually) switched off, but as soon as you boot it up, the encryption key is stored somewhere in guest memory, and the host administrator can read that too.
No security model can help you here. You need to own and manage the hardware yourself, or you need to trust your cloud provider. If your data is at all personally or commercially sensitive, keep it on hardware you physically control.
On Windows, the file C:\windows\system32\config\SAM contains the users and passwords known to the local machine. hivex can process this file to reveal the usernames and password (hashes):
$ virt-win-reg WinGuest HKLM\\SAM > sam.reg
For each local user you’ll see a key like this:
With typical technical brilliance Microsoft developers have written a zero-length key with the type field (0x3e9) overloaded as a key to use in another part of the registry:
(Apparently the number 0x3e9 is called the “RID” in Microsoft parlance).
My password hint is the “usual”. The “F” key is a dumped C structure containing the last login date amongst other things. The “V” key is another C structure containing my full name, home directory, the password hash and a bunch of other stuff.
With a bit of effort it looks like you could read and even modify these entries.