Why can’t you live migrate from newer to older versions of qemu/KVM?

NB: While the following was true in 2012 and RHEL 6, more modern versions of qemu and RHEL ≥ 7 do now support some sorts of forwards and backwards live migration. Read the article for historical context only.

I answered a question on a mailing list about live migration versus copying guests between different versions of KVM on RHEL. The complainant observed that you can’t live migrate from RHEL 6.2 to RHEL 6.1. But you can shut down a guest, copy it from RHEL 6.2 to 6.1, and boot it.

Why is there this difference? It comes down to how live migration is implemented.

Live migration is completely different from shutting down and copying a guest. During live migration we must send the complete state of system RAM, virtual CPUs, and all virtual devices, over to the remote side. In qemu this is done by sending “VMState” structures over the wire, one struct for each device that the guest is using. These structures are mostly a memory dump, but so that you don’t need byte-for-byte compatible versions of qemu when live migrating, each struct is preceded by a version ID.

The receiving qemu checks that it can handle that version of the struct. In some (but not all) cases, qemu knows how to “upgrade”, say, a version 1 struct into a version 2 struct. Downgrades are never possible, and some upgrades are also rejected (e.g. if version 2 is a complete rewrite of version 1, a device may refuse to deal with version 1 structs at all).

Downgrades are not possible, and that’s the basic reason why live migration doesn’t work from a newer to an older version of qemu.
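
To make the version check concrete, here is a minimal sketch in Python. It is a toy model, not qemu's actual code: the device name is hypothetical, and the fields `version_id` and `minimum_version_id` are assumptions loosely modelled on the idea that each device declares the newest struct layout it knows and the oldest one it is still willing to upgrade.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceFormat:
    """Struct versions of one device that a given qemu build understands (toy model)."""
    name: str
    version_id: int          # newest layout this build knows, and the one it sends
    minimum_version_id: int  # oldest layout this build can still upgrade


def can_load(receiver: DeviceFormat, incoming_version: int) -> bool:
    """Return True if the receiving side accepts a struct of incoming_version."""
    if incoming_version > receiver.version_id:
        return False  # sender is newer: this would be a downgrade
    if incoming_version < receiver.minimum_version_id:
        return False  # too old: the receiver refuses to upgrade it
    return True


# Hypothetical device as seen by an older and a newer qemu build.
old_qemu = DeviceFormat("hypothetical-nic", version_id=1, minimum_version_id=1)
new_qemu = DeviceFormat("hypothetical-nic", version_id=2, minimum_version_id=1)

print(can_load(new_qemu, old_qemu.version_id))  # old -> new: True
print(can_load(old_qemu, new_qemu.version_id))  # new -> old: False
```

In this model, a device whose version 2 was a complete rewrite would set `minimum_version_id=2`, so even the old-to-new direction would be refused for that device.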

Why does copying work? When a VM is shut down, there is no RAM, vCPU or device state. All the state that remains is the contents of the hard disk. If the hard disk is booted on an older qemu, then the guest kernel, during boot, will probe the available CPUs, devices, etc. and adjust itself, exactly the same as if you took a physical hard disk and transplanted it between real machines.

Indirectly related to all this is the qemu machine type. If you created guests on RHEL 6.0, then you may notice the libvirt XML contains:

<type arch='x86_64' machine='rhel6.0.0'>hvm</type>

This machine type stays with the guest even when you update the host.

The machine type controls what devices and PCI slots we present to the guest at boot, and it’s mainly there so that Windows doesn’t try to reactivate itself when you upgrade your host. The newer qemu presents the old devices and PCI assignments, so Windows doesn’t “notice” the updated hypervisor.

For Linux guests this is usually not a problem you have to worry about and you can go ahead and change the machine type at will.
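
If you want to check which machine type a guest is pinned to, one option is to parse the libvirt domain XML (for example, the output of `virsh dumpxml`). The following is a minimal sketch using only the Python standard library. The sample XML and the guest name in it are made up for illustration, and the sketch assumes the `<type>` element sits under `<os>`, as it does in libvirt domain XML.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, made-up domain XML; a real guest has many more elements.
DOMAIN_XML = """
<domain type='kvm'>
  <name>example-guest</name>
  <os>
    <type arch='x86_64' machine='rhel6.0.0'>hvm</type>
  </os>
</domain>
"""


def machine_type(domain_xml):
    """Return the machine attribute of <os><type>, or None if it is absent."""
    root = ET.fromstring(domain_xml)
    os_type = root.find("./os/type")
    return None if os_type is None else os_type.get("machine")


print(machine_type(DOMAIN_XML))  # rhel6.0.0
```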
