Tag Archives: qemu

Run Linux on RISC-V in your browser

http://riscv.org/angel/

Previously, running Linux/RISC-V on qemu.

Filed under Uncategorized

Booting RISC-V Linux with qemu

There are various open source ISAs and chip designs. I’ve previously run OpenRISC 1200 on an FPGA. Another effort is the RISC-V (“RISC Five”) project, which is developing an open, patent-free 64-bit ISA. Its sister project, lowRISC, aims to produce a synthesizable RISC-V FPGA design “in 6 months” and to tape out by the end of this year (I’m a little skeptical of the timeframes).

RISC-V has added support to a fork of qemu. From an existing qemu git checkout:

$ git remote add riscv https://github.com/riscv/riscv-qemu
$ git fetch riscv
$ git checkout -b riscv-master --track riscv/master
$ ./configure --target-list="riscv-softmmu"
$ make
$ ./riscv-softmmu/qemu-system-riscv -cpu \?
RISCV 'riscv-generic'
$ ./riscv-softmmu/qemu-system-riscv -machine \?
Supported machines are:
board                RISCV Board (default)
none                 empty machine

To save yourself a world of pain, download a RISC-V Linux kernel binary and root image from here.

$ file ~/vmlinux
/home/rjones/vmlinux: ELF 64-bit LSB executable, UCB RISC-V, version 1 (SYSV), statically linked, BuildID[sha1]=d0a6d680362018e0f3b9208a7ea7f79b2b403f7c, not stripped

Then you can boot the image in the usual way:

$ ./riscv-softmmu/qemu-system-riscv \
    -display none \
    -kernel ~/vmlinux \
    -hda ~/root.bin \
    -serial stdio

The root filesystem is very sparse:

# uname -a
Linux ucbvax 3.14.15-g4073e84-dirty #4 Sun Jan 11 07:17:06 PST 2015 riscv GNU/Linux
# ls /bin
ash       chgrp     dd        ln        mv        rmdir     touch
base64    chmod     df        ls        nice      sleep     true
busybox   chown     echo      mkdir     printenv  stat      uname
cat       cp        false     mknod     pwd       stty      usleep
catv      date      fsync     mount     rm        sync
# ls /sbin
init
# ls /usr/bin
[          dirname    groups     mkfifo     sha1sum    tac        uniq
[[         dos2unix   head       nohup      sha256sum  tail       unix2dos
basename   du         hostid     od         sha3sum    tee        uudecode
cal        env        id         printf     sha512sum  test       uuencode
cksum      expand     install    readlink   sort       tr         wc
comm       expr       logname    realpath   split      tty        whoami
cut        fold       md5sum     seq        sum        unexpand   yes

Obligatory comic strip

Edit UEFI varstores

See end of post for an important update

UEFI firmware has a concept of persistent variables. They are used to control the boot order amongst other things. They are stored in non-volatile RAM on the system board, or for virtual machines in a host file.

When a UEFI machine is running you can edit these variables using various tools, such as Peter Jones’s efivar library, or the efibootmgr program.

These programs don’t actually edit the varstore directly. They access the kernel /sys/firmware/efi interface, but even the kernel doesn’t edit the varstore. It just redirects to the UEFI runtime “Variable Services”, so what is really running is UEFI code (possibly proprietary, but more usually from the open source TianoCore project).

So how can you edit varstores offline? The NVRAM file format is peculiar to say the least, and the only real specification is the TianoCore code that writes it. So somehow you must reuse that code. To make it more complicated, the varstore NVRAM format is tied to the specific firmware that uses it, so varstores used on aarch64 aren’t compatible with those on x86-64, nor are SecureBoot varstores compatible with normal ones.

virt-efivars is an attempt to do that. It’s rather “meta”. You write a small editor program (an example is included), and virt-efivars compiles it into a tiny appliance. You then boot the appliance with the right combination of qemu, UEFI firmware and varstore; the editor program runs and edits the varstore using the UEFI code.

It works, at least on aarch64, which is the only convenient machine I have with virtualized UEFI.

Git repo: http://git.annexia.org/?p=virt-efivars.git;a=summary

Update:

After studying this problem some more, Laszlo Ersek came up with a different and better plan:

  1. Boot qemu with only the OVMF code & varstore attached. No OS or appliance.
  2. This should drop you into a UEFI shell which is accessible over qemu’s serial port.
  3. Send appropriate setvar commands to update the variables. Using expect this should be automatable.
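
As a rough sketch of step 1, the function below only builds and prints a qemu command line with the firmware code and varstore attached as pflash drives and the serial port on stdio. The x86-64 qemu binary, the function name uefi_shell_cmd and both firmware paths are assumptions for illustration, not something from Laszlo’s plan:

```shell
#!/bin/bash
# Sketch only: the qemu binary and the firmware paths are assumptions.
# Prints a qemu command line with just the firmware code and the varstore
# attached, and the serial port connected to stdio.
uefi_shell_cmd () {
  local code=$1 vars=$2
  printf '%s ' qemu-system-x86_64 \
    -nodefaults -display none -serial stdio \
    -drive "if=pflash,format=raw,readonly=on,file=$code" \
    -drive "if=pflash,format=raw,file=$vars"
  echo
}

# Print the command rather than running it, so it can be inspected first.
uefi_shell_cmd /path/to/OVMF_CODE.fd /path/to/my_VARS.fd
```

The varstore drive is deliberately left writable, since the whole point is for the firmware to update it.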

Half-baked ideas: qemu -M container

For more half-baked ideas, see the ideas tag.

Containers offer a way to do limited virtualization with fewer resources. But a lot of people have belatedly realized that containers aren’t secure, and so there’s a trend for putting containers into real virtual machines.

Unfortunately qemu is not very well suited to just running a single instance of the Linux kernel, as we in the libguestfs community have long known. There are at least a couple of problems:

  1. You have to allocate a fixed amount of RAM to the VM. This is basically a guess. Do you guess too large and have memory wasted in guest kernel structures, or do you guess too small and have the VM fail at random?
  2. There’s a large amount of overhead — firmware, legacy device emulation and other nonsense — which is essentially irrelevant to the special case of running a Linux appliance in a VM.

Here’s the half-baked idea: Let’s make a qemu “container mode/machine” which is better for this one task.

Unlike other proposals in this area, I’m not suggesting that we throw away or rewrite qemu. That’s stupid, as qemu gives us lots of useful abilities.

Instead the right way to do this is to implement two things: a special virtio-ram device, so that the guest kernel can start off with a very tiny amount of RAM and request more memory on demand, and an empty machine type which is just for running appliances (qemu on ARM already has this: mach-virt).

Libguestfs people and container people, all happy. What’s not to like?

Streaming NBD server

The command:

qemu-img convert input output

does not work if the output is a pipe.

It’d sure be nice if it did though! For one thing, we could use this in virt-v2v to stream images into OpenStack Glance (instead of having to spool them into a temporary file).

I mentioned this to Paolo Bonzini yesterday and he suggested a simple workaround. Just replace the output with:

qemu-img convert -n input nbd:...

and write an NBD server that turns the sequence of writes from qemu-img into a stream that gets written to a pipe. Assuming the output is raw, then qemu-img convert will write, starting at disk offset 0, linearly through to the end of the disk image.

How to write such an NBD server easily? nbdkit is a project I started to make it easy to write NBD servers.

So I wrote a streaming plugin which does exactly that, in 243 lines of code.

Using a feature called captive nbdkit, you can rewrite the above command as:

nbdkit -U - streaming pipe=/tmp/output --run '
  qemu-img convert -n input -O raw $nbd
'

(This command will “hang” when you run it — you have to attach some process to read from the pipe, eg: md5sum < /tmp/output)
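
The “hang” is ordinary named-pipe behaviour: opening a FIFO for writing blocks until something opens it for reading. Here is a minimal sketch of attaching the reader first, with printf standing in for the nbdkit writer and a temporary directory instead of /tmp/output (both are assumptions for illustration):

```shell
#!/bin/bash
# Sketch only: printf stands in for nbdkit writing the disk image.
set -e
tmpdir=$(mktemp -d)
fifo="$tmpdir/output"
mkfifo "$fifo"

# Attach the reader first, in the background. It blocks until a writer
# opens the other end of the pipe.
md5sum < "$fifo" > "$tmpdir/sum" &
reader=$!

# The writer can now open the pipe and stream data through it.
printf 'disk image bytes\n' > "$fifo"

# The reader sees end-of-file when the writer closes the pipe.
wait "$reader"
checksum=$(cut -d' ' -f1 "$tmpdir/sum")
echo "streamed data md5: $checksum"
rm -r "$tmpdir"
```

In the real case the printf line would be the captive nbdkit command, and the reader could be anything that consumes a stream.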

Further work

The streaming plugin would be a lot more generally useful if it supported a sliding window, allowing limited reverse seeking and reading. So there’s a nice little project for a motivated person. See here

virt-v2v: better living through new technology

If you ever used the old version of virt-v2v, our software that converts guests to run on KVM, then you probably found it slow. Worse still, it could fail right at the end of a conversion (after possibly an hour or more). No one liked that, least of all the developers and support people who had to help people use it.

A V2V conversion is intrinsically going to take a long time, because it always involves copying huge disk images around. These can be gigabytes or even terabytes in size.

My main aim with the rewrite was to do all the work up front (and if the conversion is going to fail, then fail early), and leave the huge copy to the last step. The second aim was to work much harder to minimize the amount of data that we need to copy, so the copy is quicker. I achieved both of these aims using a lot of new technology that we developed for qemu in RHEL 7.

Virt-v2v works (now) by putting an overlay on top of the source disk. This overlay protects the source disk from being modified. All the writes done to the source disk during conversion (eg. modifying config files and adding device drivers) are saved into the overlay. Then we qemu-img convert the overlay to the final target. Although this sounds simple and possibly obvious, none of this could have been done when we wrote old virt-v2v. It is possible now because:

  • qcow2 overlays can now have virtual backing files that come from HTTPS or SSH sources. This allows us to place the overlay on top of (eg) a VMware vCenter Server source without having to copy the whole disk from the source first.
  • qcow2 overlays can perform copy-on-read. This means you only need to read each block of data from the source once, and then it is cached in the overlay, making things much faster.
  • qemu now has excellent discard and trim support. To minimize the amount of data that we copy, we first fstrim the filesystems. This causes the overlay to remember which bits of the filesystem are used, so that only those bits are copied.
  • I added support for fstrim to ntfs-3g so this works for Windows guests too.
  • libguestfs has support for remote storage, cachemode, discard, copy-on-read and more, meaning we can use all these features in virt-v2v.
  • We use OCaml — not C, and not type-unsafe languages — to ensure that the compiler is helping us to find bugs in the code that we write, and also to ensure that we end up with an optimized, standalone binary that requires no runtime support/interpreters and can be shipped everywhere.
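
The overlay idea can be tried by hand with qemu-img. This is only a sketch of the principle, not what virt-v2v actually runs: a small local raw file stands in for the remote source disk, qemu-io stands in for the writes made during conversion, and all file names are made up for the example:

```shell
#!/bin/bash
# Sketch only: a local raw file stands in for the (normally remote) source,
# and qemu-io stands in for the changes virt-v2v makes during conversion.
if command -v qemu-img >/dev/null && command -v qemu-io >/dev/null; then
  workdir=$(mktemp -d)
  truncate -s 1M "$workdir/source.img"
  before=$(md5sum < "$workdir/source.img")

  # Put a qcow2 overlay on top of the source disk.
  qemu-img create -q -f qcow2 -b "$workdir/source.img" -F raw \
    "$workdir/overlay.qcow2"

  # Writes are saved in the overlay, never in the source.
  qemu-io -f qcow2 -c 'write -P 0xab 0 64k' "$workdir/overlay.qcow2" >/dev/null

  # Flatten overlay plus source into the final target.
  qemu-img convert -f qcow2 -O raw "$workdir/overlay.qcow2" "$workdir/target.img"

  after=$(md5sum < "$workdir/source.img")
  [ "$before" = "$after" ] && echo "source disk unchanged"
else
  echo "qemu-img and qemu-io are needed for this sketch" >&2
fi
```

The files are left in the temporary directory so the overlay and target can be inspected afterwards with qemu-img info.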

Tip: Use gdbserver to debug qemu running under libguestfs

If qemu crashes or fails when run under libguestfs, it can be a bit hard to debug things. However a small qemu wrapper and gdbserver can help.

Create a file called qemu-wrapper, make it executable (chmod +x), and put the following in it:

#!/bin/bash -

if ! echo "$@" | grep -sqE -- '-help|-version|-device \?' ; then
  gdbserver="gdbserver :1234"
fi

exec $gdbserver /usr/bin/qemu-system-x86_64 "$@"
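
The grep test is there so that short probe invocations of qemu (with options such as -help or -version) run directly, and only the real launch goes through gdbserver. The sketch below pulls the same test into a function so it can be tried by hand. The function names and the sample argument lists are made up for illustration:

```shell
#!/bin/bash
# Same test as the wrapper, as a function.
# Succeeds when the arguments do NOT look like a feature probe,
# i.e. when the wrapper would start qemu under gdbserver.
wants_gdbserver () {
  ! echo "$@" | grep -sqE -- '-help|-version|-device \?'
}

# Report what the wrapper would do for one argument list.
show () {
  if wants_gdbserver "$@"; then
    echo "gdbserver: $*"
  else
    echo "direct:    $*"
  fi
}

# Illustrative argument lists only.
show -display none -help
show -machine accel=kvm:tcg -device '?'
show -m 500 -kernel /tmp/kernel
```

Only the last of these would be started under gdbserver.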

Set your environment variables so libguestfs will use the qemu wrapper instead of running qemu directly:

$ export LIBGUESTFS_BACKEND=direct
$ export LIBGUESTFS_HV=/path/to/qemu-wrapper

Now we run guestfish or another virt tool as normal:

$ guestfish -a /dev/null -v -x run

When qemu starts up, gdbserver will run and halt the process, printing:

Listening on port 1234

At this point you can connect gdb:

$ gdb
(gdb) file /usr/bin/qemu-system-x86_64
(gdb) target remote tcp::1234
set breakpoints etc here
(gdb) cont
