Watchdog, by Emmanuel Tabard, used with permission from Flickr
In a physical server, a watchdog is a simple piece of hardware which is supposed to restart the server if it hangs without needing any administrator intervention. Watchdogs used to come as separate cards, but nowadays the feature is found in many chipsets, often integrated with other useful bits of server / remote access functionality like remote serial port, wake-on-LAN, hardware event monitoring etc.
So how does the watchdog work? “Is the machine hung?” is a tricky question to answer if you just look at the hardware level. In a “hung” machine, the CPU is most likely still running, and the kernel might still be up and responding to pings.
Hardware watchdogs instead rely on a piece of software running on the machine which must “tickle” a particular port, say every 10 seconds. If the watchdog hasn’t been “tickled” for, say, 60 seconds (so 6 missed events), then it asserts the RESET line, which results in a hard reboot. (Of course, how often you must tickle the hardware varies from watchdog device to watchdog device, and is usually configurable. Some hardware watchdogs have more elaborate states: for example, they can deliver a “second chance” interrupt shortly before they deal the final death blow to the machine. In practice hardly anyone uses anything but the basic tickle/reset function.)
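To make the timing concrete, here is a toy simulation of the tickle/reset logic using the example figures above (tickle every 10 seconds, reset after 60 seconds of silence). It is only a sketch: the class and function names are invented for illustration, and real hardware obviously does this with a counter, not Python.

```python
class WatchdogTimer:
    """Toy model of a hardware watchdog (hypothetical, for illustration only)."""

    def __init__(self, timeout=60, now=0):
        self.timeout = timeout        # seconds of silence before RESET
        self.last_tickle = now        # time of the most recent tickle
        self.reset_asserted = False

    def tickle(self, now):
        # A tickle restarts the countdown, unless RESET has already fired.
        if not self.reset_asserted:
            self.last_tickle = now

    def poll(self, now):
        # The "hardware" side: assert RESET once the deadline has passed.
        if now - self.last_tickle >= self.timeout:
            self.reset_asserted = True
        return self.reset_asserted


def simulate(hang_at, tickle_every=10, timeout=60, until=200):
    """Return the second at which RESET fires, or None if it never does.

    The software tickles every `tickle_every` seconds until it hangs
    at time `hang_at`; after that nothing tickles the timer.
    """
    dog = WatchdogTimer(timeout=timeout)
    for now in range(until + 1):
        software_alive = now < hang_at
        if software_alive and now % tickle_every == 0:
            dog.tickle(now)
        if dog.poll(now):
            return now
    return None
```

With these numbers, software that hangs at second 35 last tickled at second 30, so `simulate(hang_at=35)` reports the reset at second 90. Software that never hangs within the simulated window never triggers a reset.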
So the hardware defers to some software which has to keep tickling the hardware, or else face reboot. But how does this software work? In Linux we use the venerable watchdog daemon project. This is an unusual case where you actually want the software to do lots of “useless” work. So the daemon will typically ping a remote network address, do a process listing, maybe write something to disk and access the service, and run some custom scripts, and only if all those succeed will it tickle the watchdog port. Every 10 seconds.
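The "only tickle if everything succeeded" idea can be sketched in a few lines. This is not the watchdog daemon's actual code, just an illustration of the control flow described above; `checks` and `tickle` are hypothetical callables supplied by the caller, and the `sleep` and `max_rounds` parameters exist only to make the sketch easy to try out.

```python
import time


def run_checks(checks):
    """Return True only if every check succeeds.

    A check that raises an exception counts as a failure. A check that
    never returns (for example, one stuck on a dead disk) blocks this
    function, which also means the watchdog is not tickled.
    """
    for check in checks:
        try:
            if not check():
                return False
        except Exception:
            return False
    return True


def watchdog_loop(checks, tickle, interval=10, sleep=time.sleep, max_rounds=None):
    """Every `interval` seconds, run all checks and tickle only if they pass."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        if run_checks(checks):
            tickle()
        sleep(interval)
        rounds += 1
```

The point of the design is that silence is the failure signal: nothing here ever asks for a reboot, the loop simply stops tickling and leaves the rest to the hardware.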
If you think about some ways in which a server can “hang” you can see why this works.
Example (1): Hard disk drive stops sending back interrupts. Processes begin to go into the uninterruptible “D” state, and don’t come out. Watchdog daemon itself enters the D state when it writes to the disk, hence the watchdog port is never tickled and the server gets rebooted.
Example (2): Web server parent process segfaults. Existing and some new requests are still being serviced, so the server appears to be working from the outside, but less reliably, at least for a little while. The watchdog daemon lists out the processes in the system and notices that the web server parent process is gone (because of a custom test). As a result, it doesn’t tickle the watchdog port, and so the machine is rebooted.
Example (3): A large SQL request results in an important database table getting locked. The watchdog daemon periodically fetches a database-backed web page from the web server. The watchdog daemon’s request hangs. The watchdog port isn’t tickled. The web server reboots.
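A custom test like the one in example (2) might look roughly like the sketch below, which scans /proc for a process with a given command name. The function name and the `proc_root` parameter are made up for this illustration, and the check is deliberately crude: it matches any process with that name, not specifically the parent, so a real test would more likely check a PID file.

```python
import os


def process_running(name, proc_root="/proc"):
    """Return True if some process under proc_root has command name `name`.

    Reads /proc/<pid>/comm, which holds the command name (truncated by
    the kernel to 15 characters).
    """
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only the numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # the process exited while we were looking
    return False
```

A check such as `lambda: process_running("httpd")` (the process name is just an example) returns False once the web server is gone, and the daemon then stops tickling the watchdog.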
What does this have to do with virtualization? Virtual machines can also hang in various unpredictable ways, and for the same reasons I outlined above, it’s hard to know whether a VM is hanging, overloaded or just slow. And for all the same reasons, you might want to reboot a wedged VM without administrator intervention. For this reason I wrote a virtual watchdog device for qemu and KVM. It’s simple to configure the watchdog using libvirt:
# virsh edit domname
Then add <watchdog model='i6300esb'/> to the devices section of the XML.
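If you would rather script the change than edit by hand, the same element can be added with Python's standard XML library. This is a minimal sketch that operates on a domain XML string (for example, the output of `virsh dumpxml`); the function name is hypothetical, and feeding the result back to libvirt is left out.

```python
import xml.etree.ElementTree as ET


def add_watchdog(domain_xml, model="i6300esb"):
    """Return domain_xml with a <watchdog> element inside <devices>.

    Does nothing if the domain already has a watchdog device.
    """
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        # Unusual, but create the section rather than fail.
        devices = ET.SubElement(root, "devices")
    if devices.find("watchdog") is None:
        ET.SubElement(devices, "watchdog", {"model": model})
    return ET.tostring(root, encoding="unicode")
```

The element ends up as a direct child of devices, alongside the disks and network interfaces, which is where libvirt expects it.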
That will create a virtual Intel 6300 ESB (just the watchdog part of this multi-function Intel chipset). You’ll see this PCI device appear when the VM boots:
$ dmesg | grep 6300
i6300ESB timer: initialized (0xffffc20000016000). heartbeat=30 sec (nowayout=0)
$ /sbin/lspci | grep 6300
00:05.0 System peripheral: Intel Corporation 6300ESB Watchdog Timer
In the guest, install the watchdog daemon. Linux contains a driver already for the i6300ESB, and for Windows you can download drivers from Intel’s website.
Edit /etc/watchdog.conf and perhaps write a few custom tests. Make sure the watchdog service is set to start at boot. Once the watchdog service is started, it will tickle the (virtual) hardware watchdog. If qemu / KVM notices that the software is no longer tickling the virtual port, it will hard reboot the VM.
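For the curious, "tickling" on Linux boils down to writing to the watchdog device node. The sketch below shows the idea, assuming the device appears as /dev/watchdog in the guest; the function names are invented here, and this is not how the watchdog daemon itself is written. Be careful when trying it: opening the device arms the timer, so the machine will be reset if the tickles stop without the device being disarmed.

```python
import time


def tickle(device):
    """Any write to the watchdog device counts as a keepalive."""
    device.write(b"\0")
    device.flush()


def disarm(device):
    """Write the 'magic close' character so that closing the device
    stops the timer. Drivers loaded with nowayout=1 ignore this."""
    device.write(b"V")
    device.flush()


def keep_alive(path="/dev/watchdog", interval=10, rounds=3):
    """Arm the watchdog, tickle it a few times, then disarm it cleanly."""
    with open(path, "wb", buffering=0) as dev:
        for _ in range(rounds):
            tickle(dev)
            time.sleep(interval)
        disarm(dev)
```

The nowayout=0 in the dmesg output above is what makes the clean disarm possible on this device.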
12 responses to “What is a watchdog?”
This is awesome!
I ran into one glitch here with it however:
My virtual machines were all made with qemu-0.11, and thus my virt xml had:
machine='pc-0.11' in them.
Simply adding the watchdog didn’t do anything, I had to edit that to ‘0.12’ and restart libvirt in order for it to show up. ;(
Great work though. This will be very handy for my rawhide virt instance. 😉
One other glitch:
Boxer dogs are pretty much absolute sweethearts. I wouldn’t be afraid of one as a watch dog.
Is there any possibility to do the same with xen?
No. Xen had a PV watchdog proposed but it was never integrated. You need KVM or QEMU to use my work.
Thanks for this Richard! Intel’s drivers for the 6300esb watchdog can be found at: http://downloadcenter.intel.com/Detail_Desc.aspx?ProductID=1706&DwnldID=7246&lang=eng
Although only for 32 bit Windows. There never was a 64 bit version of this driver AFAICT.
It’s worth noting that most of the Intel driver packs for Windows contain an INF file which installs NO driver for this device! Windows also helpfully displays a “driver date” and a “driver version” for the device in Device Manager.
Make sure the device is using the WDTDRVR.sys driver file by clicking on the “Driver Details” button. Without the real driver installed this just gives “No driver files are required or have been loaded for this device”.
Also we found a huge bug in the Intel 6300 Windows driver: It simply assumes that the watchdog is the first PCI device. It doesn’t look at the PCI configuration at all. In practice this means this driver is useless. The only solution will be for someone to write an open source i6300 Windows driver.
Maybe a stupid question: do I need a hardware watchdog in the hypervisor in order for this to work? Or is this a pure software solution?
It’s pure software.
One question: does qemu bring down the VM (I have configured the action as poweroff), or does the i6300esb driver in the VM bring it down? If the driver brings it down, how do we catch cases where the VM is stuck in the BIOS (basically, scenarios before the driver is loaded)?