Tag Archives: openstack
Of course I could use OpenStack RDO but OpenStack is a vast box of somewhat working bits and pieces. I think for a small cluster like mine you can get the essential functionality of OpenStack a lot more simply — in 1300 lines of code as it turns out.
The first thing that small cluster management software doesn’t need is any permanent daemon running on the nodes. The reason is that we already have sshd (for secure management access) and libvirtd (to manage the guests) out of the box. That’s quite sufficient to manage all the state we care about. My Mini Cloud/Cluster software just goes out and queries each node for that information whenever it needs it (in parallel of course). Nodes that are switched off are handled by ignoring them.
The second thing is that for a small cloud we can toss features that aren’t needed at all: multi-user/multi-tenant, failover, VLANs, a nice GUI.
The old mclu (Mini Cluster) v1.0 was written in Python and used Ansible to query nodes. If you’re not familiar with Ansible, it’s basically parallel ssh on steroids. This was convenient to get the implementation working, but I ended up rewriting this essential feature of Ansible in ~ 60 lines of code.
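The real mclu v2.0 is OCaml, but the essential feature being reimplemented — fan a command out to every node over ssh in parallel, collect the answers, and silently skip nodes that don’t respond — is small in any language. Here is a hedged Python sketch of the idea (the node names and ssh options are illustrative, not mclu’s actual code):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def query_node(node, command, ssh=("ssh", "-o", "BatchMode=yes")):
    """Run a command on one node; return (node, output), or (node, None)
    if the node is unreachable (e.g. switched off)."""
    try:
        out = subprocess.check_output(list(ssh) + [node] + list(command),
                                      stderr=subprocess.DEVNULL)
        return node, out.decode()
    except (subprocess.CalledProcessError, OSError):
        return node, None

def query_all(nodes, command, **kw):
    """Query every node in parallel; nodes that are off are ignored."""
    with ThreadPoolExecutor(max_workers=max(len(nodes), 1)) as ex:
        results = ex.map(lambda n: query_node(n, command, **kw), nodes)
    return {node: out for node, out in results if out is not None}
```

Everything else (parsing the libvirt and free-memory output, formatting the status table) is layered on top of a loop like this.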
The huge downside of Python is that even such a small program has loads of hidden bugs, because there’s no safety at all. The rewrite (in OCaml) is 1,300 lines of code, so a fraction larger, but I have far higher confidence that it is mostly bug-free.
I also changed around the way the software works to make it more “cloud like” (and hence the name change from “Mini Cluster” to “Mini Cloud”). Guests are now created from templates using virt-builder, and are stateless “cattle” (although you can mix in “pets” and mclu will manage those perfectly well because all it’s doing is remote libvirt-over-ssh commands).
$ mclu status
ham0 on
        total: 8pcpus 15.2G
        used: 8vcpus 8.0G by 2 guest(s)
        free: 6.2G
ham1 on
        total: 8pcpus 15.2G
        free: 14.2G
ham2 on
        total: 8pcpus 30.9G
        free: 29.9G
ham3 off
You can grab mclu v2.0 from the git repository.
OpenStack can now be installed using Fedora 21 or Rawhide, on aarch64 hardware.
You have to use the packstack --allinone install method. Ceilometer doesn’t work because we don’t have mongodb on aarch64 yet, and there is a selection of bugs that you need to work around until they are fixed.
The big problem is I don’t have a convenient set of aarch64 cloud images to run on it yet 😦
Happy holidays everyone 🙂
qemu-img convert input output
does not work if the output is a pipe.
It’d sure be nice if it did though! For one thing, we could use this in virt-v2v to stream images into OpenStack Glance (instead of having to spool them into a temporary file).
I mentioned this to Paolo Bonzini yesterday and he suggested a simple workaround. Just replace the output with:
qemu-img convert -n input nbd:...
and write an NBD server that turns the sequence of writes from qemu-img into a stream that gets written to a pipe. Assuming the output is raw, then
qemu-img convert will write, starting at disk offset 0, linearly through to the end of the disk image.
How to write such an NBD server easily? nbdkit is a project I started to make it easy to write NBD servers.
So I wrote a streaming plugin which does exactly that, in 243 lines of code.
Using a feature called captive nbdkit, you can rewrite the above command as:
nbdkit -U - streaming pipe=/tmp/output --run ' qemu-img convert -n input -O raw $nbd '
(This command will “hang” when you run it — you have to attach some process to read from the pipe, eg:
md5sum < /tmp/output)
The streaming plugin would be a lot more generally useful if it supported a sliding window, allowing limited reverse seeking and reading. So there’s a nice little project for a motivated person. See here.
There haven’t been too many updates around here for a while, and that’s for a very good reason: I’ve been “heads down” writing the new versions of virt-v2v and virt-p2v, our tools for converting VMware and Xen virtual machines, or physical machines, to run on KVM.
The new virt-v2v [manual page] can slurp in a guest from a local disk image, local Xen, VMware vCenter, or (soon) an OVA file — convert it to run on KVM — and write it out to RHEV-M, OpenStack Glance, local libvirt or as a plain disk image.
It’s easy to use too. Unlike the old virt-v2v there are no hairy configuration files to edit or complicated preparations. You simply do:
$ virt-v2v -i disk xen_disk.img -o local -os /tmp
That command (which doesn’t need root, naturally) takes the Xen disk image, which could be any supported Windows or Enterprise Linux distro, converts it to run on KVM (eg. installing virtio drivers, adjusting dozens of configuration files), and writes it out to /tmp.
To connect to a VMware vCenter server, change the -i options to:
$ virt-v2v -ic vpx://vcenter/Datacenter/esxi "esx guest name" [-o ...]
To output the converted disk image to OpenStack Glance, change the -o options to:
$ virt-v2v [-i ...] -o glance [-on glance_image_name]
New in nbdkit ≥ 1.1.6, you can run nbdkit as a “captive process” under external programs like qemu or guestfish. This means that nbdkit runs for as long as qemu/guestfish is running, and when they exit it cleans up and exits too.
Here is a rather involved way to boot a Fedora 20 guest:
$ virt-builder fedora-20
$ nbdkit file file=fedora-20.img \
    --run 'qemu-kvm -m 1024 -drive file=$nbd,if=virtio'
The --run parameter is what tells nbdkit to run as a captive under qemu-kvm. $nbd on the qemu command line is substituted automatically with the right nbd: URL for the port or socket that nbdkit listens on. As soon as qemu-kvm exits, nbdkit is killed and cleaned up.
Here is another example using guestfish:
$ nbdkit file file=fedora-20.img \
    --run 'guestfish --format=raw -a $nbd -i'

Welcome to guestfish, the guest filesystem shell for
editing virtual machine filesystems and disk images.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

Operating system: Fedora release 20 (Heisenbug)
/dev/sda3 mounted on /
/dev/sda1 mounted on /boot

><fs>
The main use for this is not to run the nbdkit file plugin like this, but in conjunction with Perl and Python plugins, to let people easily open and edit OpenStack Glance/Cinder and other unconventional disk images.
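An nbdkit Python plugin is just a file of plain functions that nbdkit loads and calls. The sketch below is a tiny RAM disk in the style of the early plugin API (open/get_size/pread/pwrite callbacks); it is illustrative only — check nbdkit-python-plugin(3) for the exact callback set in your version:

```python
# Sketch of an nbdkit Python plugin: a 1 MB RAM disk.
# nbdkit loads this file and calls these functions directly.
disk_size = 1024 * 1024
disk = bytearray(disk_size)

def open(readonly):
    # Return an opaque per-connection handle (unused here).
    return 1

def get_size(h):
    return disk_size

def pread(h, count, offset):
    return bytes(disk[offset:offset+count])

def pwrite(h, buf, offset):
    disk[offset:offset+len(buf)] = buf
```

A Glance or Cinder plugin would have the same shape, with pread/pwrite talking to the storage API instead of a bytearray.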
$ virt-builder fedora-19 --size 20G --install nmap
[   0.0] Downloading: http://libguestfs.org/download/builder/fedora-19.xz
[   2.0] Uncompressing: http://libguestfs.org/download/builder/fedora-19.xz
[  25.0] Running virt-resize to expand the disk to 20.0G
[  74.0] Opening the new disk
[  78.0] Random root password: RCuMKJ4NPak0ptJQ [did you mean to use --root-password?]
[  78.0] Installing packages: nmap
[  93.0] Finishing off
Some notable features:
- Fast: As you can see above, once it has downloaded and cached the template the first time, it can churn out new guests in around 90 seconds.
- Install packages.
- Set the hostname.
- Generate a random seed for the guest.
- Upload files.
- Set passwords, create user accounts.
- Run custom scripts.
- Install firstboot scripts.
- Fetch packages from private repos and ISOs.
- Secure: Everything is assembled in a container (using SELinux if available).
- Guest templates are PGP-signed.
- No root or privileged access needed at all (no setuid, no sudo).
- Fully scriptable.
- Can be used in locked-down no-network scenarios.
- Can use UML as a backend (good for use in a cloud).
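“Fully scriptable” means you can drive virt-builder from any language just by assembling a command line. A minimal sketch in Python (the --size, --install and --hostname options are real virt-builder options; the helper function is mine, not part of libguestfs):

```python
def builder_argv(template, size=None, install=(), hostname=None):
    """Assemble a virt-builder command line for subprocess.call()."""
    argv = ["virt-builder", template]
    if size:
        argv += ["--size", size]
    for pkg in install:
        argv += ["--install", pkg]
    if hostname:
        argv += ["--hostname", hostname]
    return argv

# e.g. subprocess.call(builder_argv("fedora-19", size="20G",
#                                   install=["nmap"]))
```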
This script, a response to the insecure and over-complex disk-image-get-kernel in OpenStack, shows how to use libguestfs to safely and easily get the kernel and initramfs from a disk image so you can boot it using an external kernel.
#!/usr/bin/python
# Get the latest kernel & initramfs safely from a disk image.
# Note this will overwrite /tmp/kernel & /tmp/initramfs which you
# wouldn't want to do in production.

import sys
import guestfs

assert len(sys.argv) == 2
disk = sys.argv[1]

g = guestfs.GuestFS(python_return_dict=True)

# To enable tracing, uncomment the next line.
#g.trace(1)

# Attach the disk image read-only to libguestfs.
g.add_drive_opts(disk, readonly=1)

# Run the libguestfs back-end.
g.launch()

# Ask libguestfs to inspect for operating systems.
roots = g.inspect_os()
if len(roots) == 0:
    raise RuntimeError("no operating systems found")
if len(roots) > 1:
    raise RuntimeError("dual/multi-boot images are not supported")
root = roots[0]

# Mount up the disks, like guestfish -i.
#
# Sort mountpoints by length, shortest first, so that we end up
# mounting the filesystems in the correct order.
mps = g.inspect_get_mountpoints(root)
for mp in sorted(mps.keys(), key=len):
    try:
        g.mount_ro(mps[mp], mp)
    except RuntimeError as msg:
        print("%s (ignored)" % msg)

# For debugging:
print("/boot directory of this guest:")
print(g.ll("/boot"))

# Get all kernels & initramfses.
kernels = g.glob_expand("/boot/vmlinuz-*")
initramfses = g.glob_expand("/boot/initramfs-*")
# Old RHEL:
if len(initramfses) == 0:
    initramfses = g.glob_expand("/boot/initrd-*")
# Debian/Ubuntu:
if len(initramfses) == 0:
    initramfses = g.glob_expand("/boot/initrd.img-*")
if len(kernels) == 0:
    raise RuntimeError("no kernel found in this disk image")
if len(initramfses) == 0:
    raise RuntimeError("no initramfs found in this disk image")

# Sort by version so we get the latest.
from distutils.version import LooseVersion
kernels.sort(key=LooseVersion)
initramfses.sort(key=LooseVersion)

# Download the latest.
print("downloading %s -> /tmp/kernel" % kernels[-1])
g.download(kernels[-1], "/tmp/kernel")
print("downloading %s -> /tmp/initramfs" % initramfses[-1])
g.download(initramfses[-1], "/tmp/initramfs")

# Shutdown.
g.shutdown()
g.close()
Brian is Red Hat’s CTO, and hence my boss’s boss’s boss (or something like that). This is a pretty good (and honest) talk about Red Hat’s plans for OpenStack.
Edit: By the way, the thumbnail (the one I see at any rate) is not Brian.