Tag Archives: kvm

virt-install + nbdkit live install

This seems to be completely undocumented which is why I’m writing this … It is possible to boot a Linux guest (Fedora in this case) from a live CD on a website without downloading it. I’m using our favourite flexible NBD server, nbdkit and virt-install.

First of all we’ll run nbdkit and attach it to the Fedora 29 live workstation ISO. To make this work more efficiently I’m going to place a couple of filters on top — one is the readahead (prefetch) filter recently added to nbdkit 1.12, and the other is the cache filter. In combination these filters should reduce the load on the website and improve local performance.

$ rm /tmp/socket
$ nbdkit -f -U /tmp/socket --filter=readahead --filter=cache \
    curl https://download.fedoraproject.org/pub/fedora/linux/releases/29/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-29-1.2.iso

I actually replaced that URL with a UK-based mirror to make the process a little faster.

Now comes the undocumented virt-install command:

$ virt-install --name test --ram 2048 \
    --disk /var/tmp/disk.img,size=10 
    --disk device=cdrom,source_protocol=nbd,source_host_transport=unix,source_host_socket=/tmp/socket \
    --os-variant fedora29

After a bit of grinding that should boot into Fedora 29, and you never (not explicitly at least) had to download the ISO.

Screenshot_2019-04-13_10-30-00

To be fair qemu does also have a curl driver which virt-install could use, but nbdkit is better with the filters and plugins system giving you ultimate flexibility — check out my video about it.

1 Comment

Filed under Uncategorized

New in nbdkit: Create a virtual floppy disk

nbdkit is our flexible, plug-in based Network Block Device server.

While I was visiting the KVM Forum last week, one of the most respected members of the QEMU development team mentioned to me that he wanted to think about deprecating QEMU’s VVFAT driver. This QEMU driver is a bit of an oddity — it lets you point QEMU to a directory of files, and inside the guest it will see a virtual floppy containing those files:

$ qemu -drive file=fat:/some/directory

That’s not the odd thing. The odd thing is that it also lets you make the drive writable, and the VVFAT driver then turns those writes back into modifications to the host filesystem (remember that these are writes happening to raw FAT32 data structures, the driver has to infer from just seeing the writes what is happening at the filesystem level). Which is both amazing and crazy (and also buggy).

Anyway I have implemented the read-only part of this in nbdkit. I didn’t implement the write stuff because that’s very ambitious, although if you were going to implement that, doing it in nbdkit would be better than qemu since the only thing that can crash is nbdkit, not the whole hypervisor.

Usage is very simple:

$ nbdkit floppy /some/directory

This gives you an NBD source which you can connect straight to a qemu virtual machine:

$ qemu -drive nbd:localhost:10809

or examine with guestfish:

$ guestfish --ro --format=raw -a nbd://localhost -m /dev/sda1
Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: ‘help’ for help on commands
      ‘man’ to read the manual
      ‘quit’ to quit the shell

> ll /
total 2420
drwxr-xr-x 14 root root  16384 Jan  1  1970 .
drwxr-xr-x 19 root root   4096 Oct 28 10:07 ..
-rwxr-xr-x  1 root root     40 Sep 17 21:23 .dir-locals.el
-rwxr-xr-x  1 root root    879 Oct 27 21:10 .gdb_history
drwxr-xr-x  8 root root  16384 Oct 28 10:05 .git
-rwxr-xr-x  1 root root   1383 Sep 17 21:23 .gitignore
-rwxr-xr-x  1 root root   1453 Sep 17 21:23 LICENSE
-rwxr-xr-x  1 root root  34182 Oct 28 10:04 Makefile
-rwxr-xr-x  1 root root   2568 Oct 27 22:17 Makefile.am
-rwxr-xr-x  1 root root  32085 Oct 27 22:18 Makefile.in
-rwxr-xr-x  1 root root    620 Sep 17 21:23 OTHER_PLUGINS
-rwxr-xr-x  1 root root   4628 Oct 16 22:36 README
-rwxr-xr-x  1 root root   4007 Sep 17 21:23 TODO
-rwxr-xr-x  1 root root  54733 Oct 27 22:18 aclocal.m4
drwxr-xr-x  2 root root  16384 Oct 27 22:18 autom4te.cache
drwxr-xr-x  2 root root  16384 Oct 28 10:04 bash
drwxr-xr-x  5 root root  16384 Oct 27 18:07 common
[etc]

Previously … create ISO images on the fly in nbdkit

Leave a comment

Filed under Uncategorized

Supernested on the QEMU Advent Calendar

screenshot_2016-12-13_08-51-04

I wrote supernested a few years ago to see if I could break nested KVM. It works by repeatedly nesting KVM guests until either something breaks or the whole thing grinds to a halt. Even on my very fastest machine I can only get to an L4 guest (L0 = host, L1 = normal guest).

Kashyap and Thomas Huth resurrected the QEMU Advent Calendar this year, and today (day 13) supernested is featured.

Please note that supernested should only be run on idle machines which aren’t doing anything else, and it can crash the machine.

1 Comment

Filed under Uncategorized

Tip: Updating RHEL 7.1 cloud images using virt-customize and subscription-manager

Red Hat provide RHEL KVM guest and cloud images. At time of writing, the last one was built in Feb 2015, and so undoubtedly contains packages which are out of date or insecure.

You can use virt-customize to update the packages in the cloud image. This requires the libguestfs subscription-manager feature which will only be available in RHEL 7.3, but see here for RHEL 7.3 preview packages. Alternatively you can use Fedora ≥ 22.

$ virt-customize \
  -a rhel-guest-image-7.1-20150224.0.x86_64.qcow2 \
  --sm-credentials 'USERNAME:password:PASSWORD' \
  --sm-register --sm-attach auto \
  --update
[   0.0] Examining the guest ...
[  17.2] Setting a random seed
[  17.2] Registering with subscription-manager
[  28.8] Attaching to compatible subscriptions
[  61.3] Updating core packages
[ 976.8] Finishing off
  1. You should probably use --sm-credentials USERNAME:file:FILENAME to specify your password using a file, rather than having it exposed on the command line.
  2. The command above will leave the image template registered to RHN. To unregister it, add --sm-unregister at the end.

3 Comments

Filed under Uncategorized

My KVM Forum 2015 talk: New qemu technology used in virt-v2v

All KVM Forum talks can be found here.

1 Comment

Filed under Uncategorized

KVM Forum 2015

Assuming HMG can get my passport back to me in time, I am speaking at the KVM Forum 2015 in Seattle USA (full schedule of talks here).

I’m going to be talking about virt-v2v and new features of qemu/KVM that made it possible for virt-v2v to be faster and more reliable than ever.

Leave a comment

Filed under Uncategorized

Super-nested KVM

Regular readers of this blog will of course be familiar with the joys of virtualization. One of those joys is nested virtualization — running a virtual machine in a virtual machine. Nested KVM is a thing too — that is, emulating the virtualization extensions in the CPU so that the second level guest gets at least some of the acceleration benefits that a normal first level guest would get.

My question is: How deeply can you nest KVM?

This is not so easy to test at the moment, so I’ve created a small project / disk image which when booted on KVM will launch a nested guest, which launches a nested guest, and so on until (usually) the host crashes, or you run out of memory, or your patience is exhausted by the poor performance of nested KVM.

The answer, by the way, is just 3 levels [on AMD hardware], which is rather disappointing. Hopefully this will encourage the developers to take a closer look at the bugs in nested virt.

Git repo: http://git.annexia.org/?p=supernested.git;a=summary
Binary images: http://oirase.annexia.org/supernested/

How does this work?

Building a simple appliance is easy. I’m using supermin to do that.

The problem is how does the appliance run another appliance? How do you put the same appliance inside the appliance? Obviously that’s impossible (right?)

The way it works is inside the Lx hypervisor it runs the L(x+1) qemu on /dev/sda, with a protective overlay stored in memory so we don’t disrupt the Lx hypervisor. Since /dev/sda literally is the appliance disk image, this all kinda works.

3 Comments

Filed under Uncategorized

Notes on getting VMware ESXi to run under KVM

This is mostly adapted from this long thread on the VMware community site.

I got VMware ESXi 5.5.0 running on upstream KVM today.

First I had to disable the “VMware backdoor”. When VMware runs, it detects that qemu underneath is emulating this port and tries to use it to query the machine (instead of using CPUID and so on). Unfortunately qemu’s emulation of the VMware backdoor is very half-assed. There’s no way to disable it except to patch qemu:

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index eaf3e61..ca1c422 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -204,7 +204,7 @@ static void pc_init1(QEMUMachineInitArgs *args,
     pc_vga_init(isa_bus, pci_enabled ? pci_bus : NULL);
 
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy, xen_enabled(),
+    pc_basic_device_init(isa_bus, gsi, &rtc_state, &floppy, 1,
         0x4);
 
     pc_nic_init(isa_bus, pci_bus);

It would be nice if this was configurable in qemu. This is now being fixed upstream.

Secondly I had to turn off MSR emulation. This is, unfortunately, a machine-wide setting:

# echo 1 > /sys/module/kvm/parameters/ignore_msrs
# cat /sys/module/kvm/parameters/ignore_msrs
Y

Thirdly I had to give the ESXi virtual machine an IDE disk and an e1000 vmxnet3 network card. Note also that ESXi requires ≥ 2 vCPUs and at least 2 GB of RAM.

Screenshot - 190514 - 16:04:38

Screenshot - 190514 - 16:21:09

9 Comments

Filed under Uncategorized

KVM working on the Cubietruck

ctg

I managed to get KVM working on the Cubietruck last week. It’s not exactly simple, but this post describes in overview how to do it.

(1) You will need a Cubietruck, a CP2102 serial cable, a micro SDHC card, a card reader for your host computer, and a network patch cable (the board supports wifi but it doesn’t work with the newer kernel we’ll be using). Optional: 2.5″ SATA HDD or SSD.

(2) Start with Hans De Goede’s AllWinner remix of Fedora 19, and get that working. It’s important to read his README file carefully.

(3) Build this upstream kernel with this configuration:

make oldconfig
make menuconfig

In menuconfig, enable Large Page Address Extension (LPAE), and then enable KVM in the Virtualization menu.

LOADADDR=0x40008000 make uImage dtbs
make modules
sudo cp arch/arm/boot/uImage /boot/uImage.sunxi-test
sudo cp arch/arm/boot/dts/sun7i-a20-cubietruck.dtb /boot/sun7i-a20-cubietruck.dtb.sunxi-test
sudo make modules_install

Reboot, interrupt u-boot (using the serial console), and type the following commands to load the new kernel:

setenv bootargs console=ttyS0,115200 loglevel=9 earlyprintk ro rootwait root=/dev/mmcblk0p3
ext2load mmc 0 0x46000000 uImage.sunxi-test
ext2load mmc 0 0x4b000000 sun7i-a20-cubietruck.dtb.sunxi-test
env set fdt_high ffffffff
bootm 0x46000000 - 0x4b000000

(4) Build this modified u-boot which supports Hyp mode.

make cubietruck_config
make
sudo dd if=u-boot-sunxi-with-spl.bin of=/dev/YOURSDCARD bs=1024 seek=8

Reboot again, use the commands above to boot into the upstream kernel, and if everything worked you should see:

Brought up 2 CPUs
SMP: Total of 2 processors activated.
CPU: All CPU(s) started in HYP mode.
CPU: Virtualization extensions available.

Also /dev/kvm should exist.

(5) Hack QEMU to create Cortex-A7 CPUs using this one-line patch.

Edit: dgilmore tells me this is no longer necessary. Instead make sure you use the qemu -cpu host option.

Then you should be able to create VMs using libvirt. Note if using libguestfs you will need to use the direct backend (LIBGUESTFS_BACKEND=direct) because of this libvirt bug.

7 Comments

Filed under Uncategorized

Creating a cloud-init config disk for non-cloud boots

There are lots of cloud disk images floating around. They are designed to run in clouds where there is a boot-time network service called cloud-init available that provides initial configuration. If that’s not present, or you’re just trying to boot these images in KVM/libvirt directly without any cloud, then things can go wrong.

Luckily it’s fairly easy to create a config disk (aka “seed disk”) which you attach to the guest and then let cloud-init in the guest get its configuration from there. No cloud, or even network, required.

I’m going to use a tool called virt-make-fs to make the config disk, as it’s easy to use and doesn’t require root. There are other tools around, eg. make-seed-disk which do a similar job. (NB: You might hit this bug in virt-make-fs, which should be fixed in the latest version).

I’m also using a cloud image downloaded from the Fedora project, but any cloud image should work.

First I create my cloud-init metadata. This consists of two files. meta-data contains host and network configuration:

instance-id: iid-123456
local-hostname: cloudy

user-data contains other custom configuration (note #cloud-config is
not a comment, it’s a directive to tell cloud-init the format of the file):

#cloud-config
password: 123456
runcmd:
 - [ useradd, -m, -p, "", rjones ]
 - [ chage, -d, 0, rjones ]

(The idea behind this split is probably not obvious, but apparently it’s because the meta-data is meant to be supplied by the Cloud, and the user-data is meant to be supplied by the Cloud’s customer. In this case, no cloud, so we’re going to supply both!)

I put these two files into a directory, and run virt-make-fs to create the config disk:

$ ls
meta-data  user-data
$ virt-make-fs --type=msdos --label=cidata . /tmp/seed.img
$ virt-filesystems -a /tmp/seed.img --all --long -h
Name      Type        VFS   Label   MBR  Size  Parent
/dev/sda  filesystem  vfat  cidata  -    286K  -
/dev/sda  device      -     -       -    286K  -

Now I need to pass some kernel options when booting the Fedora cloud image, and the only way to do that is if I boot from an external kernel & initrd. This is not as complicated as it sounds, and virt-builder has an option to get the kernel and initrd that I’m going to need:

$ virt-builder --get-kernel Fedora-cloud.raw
download: /boot/vmlinuz-3.9.5-301.fc19.x86_64 -> ./vmlinuz-3.9.5-301.fc19.x86_64
download: /boot/initramfs-3.9.5-301.fc19.x86_64.img -> ./initramfs-3.9.5-301.fc19.x86_64.img

Finally I’m going to boot the guest using KVM (you could also use libvirt with a little extra effort):

$ qemu-kvm -m 1024 \
    -drive file=Fedora-cloud.raw,if=virtio \
    -drive file=seed.img,if=virtio \
    -kernel ./vmlinuz-3.9.5-301.fc19.x86_64 \
    -initrd ./initramfs-3.9.5-301.fc19.x86_64.img \
    -append 'root=/dev/vda1 ro ds=nocloud-net'

You’ll be able to log in either as fedora/123456 or rjones (no password), and you should see that the hostname has been set to cloudy.

5 Comments

Filed under Uncategorized