Getting the libguestfs appliance boot time down to 1.2s

libguestfs can securely mount any disk image, but to do this it requires a small appliance to be run. The appliance is a very cut down Linux distro, but it still takes time to boot. For a while that time has floated around 3-5 seconds. This excludes libguestfs from some important use cases — one being the ability to monitor 1000s of VMs every few minutes (simple maths: 1000×3 > 5×60, so you cannot monitor 1000 VMs every 5 minutes without using a lot of parallel appliances).

Last year Intel announced Clear Containers. You may be forgiven for being confused (unclear?) by what Clear Containers actually is, but Intel’s demo is quite neat. (You can run these commands as non-root, and at time of writing they won’t damage your machine.)

$ wget https://download.clearlinux.org/demos/containers/clear-containers-demo.tar.xz
$ mv clear-containers-demo.tar.xz clear-containers-demo.tar.bz2
$ bunzip clear-containers-demo.tar.bz2
$ cd containers
$ bash ./boot.sh

It’s a complete Linux guest that boots in a fraction of a second. I take that as a challenge!

The first step is to have a good idea what all the parts are doing and what is taking the time. Booting an appliance involves several actors — qemu, BIOS, the guest kernel — and without being able to measure how much time each one spends doing things, it’s rather hard to say what needs work or if we’re making improvements. This was why I spent last week unsuccessfully looking at QEMU tracing. I have now settled on a simpler approach which is to time boot messages. The new boot analysis program produces quite clear output:

Warming up the libguestfs cache ...                                                                              
Running the tests in 5 passes ...                                                                                
    pass 1: 798 events collected in 1347184178 ns                                                                
    pass 2: 798 events collected in 1324153548 ns                                                                
    pass 3: 798 events collected in 1342126721 ns                                                                
    pass 4: 798 events collected in 1279500931 ns                                                                
    pass 5: 798 events collected in 1317457653 ns                                                                
Analyzing the results ...                                                                                        
                                                                                                                 
0.000000s: ▲ run mean:1.321973s ±24.0ms (100.0%)                                                              
0.000065s: │ ▲ supermin:build mean:0.010523s ±0.1ms (0.8%)                                                  
           │ │                                                                                               
0.010588s: │ ▼                                                                                               
0.010612s: │ ▲ qemu:feature-detect mean:0.149075s ±4.2ms (11.3%)                                            
           │ │                                                                                               
0.159687s: │ ▼                                                                                               
           │                                                                                                   
0.161412s: │ ▲ ▲ qemu mean:1.160562s ±22.6ms (87.8%) qemu:overhead mean:0.123142s ±4.5ms (9.3%)          
           │ │ │                                                                                           
0.263153s: │ │ │ ▲ seabios mean:0.241488s ±2.8ms (18.3%)                                                
           │ │ │ │                                                                                       
0.284554s: │ │ ▼ │                                                                                       
0.284554s: │ │   │ ▲ bios:overhead mean:0.220087s ±2.8ms (16.6%)                                        
           │ │   │ │                                                                                     
0.504641s: │ │   ▼ ▼                                                                                     
0.504641s: │ │ ▲ ▲ kernel mean:0.817332s ±21.4ms (61.8%) kernel:overhead mean:0.374896s ±5.2ms (28.4%) 
           │ │ │ │                                                                                       
0.879537s: │ │ │ ▼                                                                                       
0.879537s: │ │ │ ▲ supermin:mini-initrd mean:0.086014s ±7.9ms (6.5%)                                    
           │ │ │ │                                                                                       
0.881863s: │ │ │ │ ▲ supermin: internal insmod crc32-pclmul.ko mean:0.001399s ±0.1ms (0.1%)           
           │ │ │ │ │                                                                                   
0.883262s: │ │ │ │ ▼                                                                                   
0.883262s: │ │ │ │ ▲ supermin: internal insmod crc32c-intel.ko mean:0.000226s ±0.5ms (0.0%)           
0.883488s: │ │ │ │ ▼                                                                                   
0.883488s: │ │ │ │ ▲ supermin: internal insmod crct10dif-pclmul.ko mean:0.000882s ±0.4ms (0.1%)       
0.884370s: │ │ │ │ ▼                                                                                   
0.884370s: │ │ │ │ ▲ supermin: internal insmod crc32.ko mean:0.001121s ±0.0ms (0.1%)                  
           │ │ │ │ │                                                                                   
0.885490s: │ │ │ │ ▼                                                                                   
0.885490s: │ │ │ │ ▲ supermin: internal insmod virtio.ko mean:0.001634s ±0.5ms (0.1%)                 
           │ │ │ │ │                                                                                   
0.887124s: │ │ │ │ ▼                                                                                   
0.887124s: │ │ │ │ ▲ supermin: internal insmod virtio_ring.ko mean:0.000581s ±0.7ms (0.0%)            
0.887706s: │ │ │ │ ▼                                                                                   
0.887706s: │ │ │ │ ▲ supermin: internal insmod virtio_blk.ko mean:0.001115s ±0.0ms (0.1%)             
           │ │ │ │ │                                                                                   
0.888821s: │ │ │ │ ▼                                                                                   
0.888821s: │ │ │ │ ▲ supermin: internal insmod virtio-rng.ko mean:0.000884s ±0.4ms (0.1%)             
0.889705s: │ │ │ │ ▼                                                                                   
0.889705s: │ │ │ │ ▲ supermin: internal insmod virtio_console.ko mean:0.001923s ±0.4ms (0.1%)         
           │ │ │ │ │                                                                                   
0.891627s: │ │ │ │ ▼                                                                                   
0.891627s: │ │ │ │ ▲ supermin: internal insmod virtio_net.ko mean:0.001483s ±0.4ms (0.1%)             
           │ │ │ │ │                                                                                   
0.893111s: │ │ │ │ ▼                                                                                   
0.893111s: │ │ │ │ ▲ supermin: internal insmod virtio_scsi.ko mean:0.000686s ±0.6ms (0.1%)            
0.893797s: │ │ │ │ ▼                                                                                   
0.893797s: │ │ │ │ ▲ supermin: internal insmod virtio_balloon.ko mean:0.000663s ±0.5ms (0.1%)         
0.894460s: │ │ │ │ ▼                                                                                   
0.894460s: │ │ │ │ ▲ supermin: internal insmod virtio_input.ko mean:0.000875s ±0.4ms (0.1%)           
0.895336s: │ │ │ │ ▼                                                                                   
0.895336s: │ │ │ │ ▲ supermin: internal insmod virtio_mmio.ko mean:0.001097s ±0.0ms (0.1%)            
           │ │ │ │ │                                                                                   
0.896433s: │ │ │ │ ▼                                                                                   
0.896433s: │ │ │ │ ▲ supermin: internal insmod virtio_pci.ko mean:0.050700s ±7.8ms (3.8%)             
           │ │ │ │ │                                                                                   
0.947133s: │ │ │ │ ▼                                                                                   
0.947133s: │ │ │ │ ▲ supermin: internal insmod crc-ccitt.ko mean:0.001144s ±0.6ms (0.1%)              
           │ │ │ │ │                                                                                   
0.948277s: │ │ │ │ ▼                                                                                   
0.948277s: │ │ │ │ ▲ supermin: internal insmod crc-itu-t.ko mean:0.000001s ±0.0ms (0.0%)              
0.948278s: │ │ │ │ ▼                                                                                   
0.948278s: │ │ │ │ ▲ supermin: internal insmod crc8.ko mean:0.001368s ±0.3ms (0.1%)                   
           │ │ │ │ │                                                                                   
0.949646s: │ │ │ │ ▼                                                                                   
0.949646s: │ │ │ │ ▲ supermin: internal insmod libcrc32c.ko mean:0.001043s ±0.9ms (0.1%)              
           │ │ │ │ │                                                                                   
0.950689s: │ │ │ │ ▼                                                                                   
           │ │ │ │                                                                                       
0.965551s: │ │ │ ▼                                                                                       
0.965551s: │ │ │ ▲ ▲ /init mean:0.318045s ±18.0ms (24.1%) bash:overhead mean:0.015855s ±3.1ms (1.2%) 
           │ │ │ │ │                                                                                   
0.981407s: │ │ │ │ ▼                                                                                   
           │ │ │ │                                                                                       
1.283597s: │ │ │ ▼                                                                                       
1.283597s: │ │ │ ▲ guestfsd mean:0.019151s ±1.9ms (1.4%)                                                
           │ │ │ │                                                                                       
1.294818s: │ │ │ │ ▲ shutdown mean:0.027156s ±4.1ms (2.1%)                                            
           │ │ │ │ │                                                                                   
1.302747s: │ │ │ ▼ │                                                                                   
           │ │ │   │                                                                                     
1.321973s: │ │ ▼   │                                                                                     
1.321973s: ▼ ▼     ▼                                                                                       

Armed with this analysis I made a good start on reducing the boot time. It’s now down to 1.2s (on my laptop) and there is scope for sub-second boots.

Some of the things I’ve changed to get to 1.2s:

Some of the things that may reduce boot times further:

  • Stop SeaBIOS from probing the entire PCI space looking for a boot device it will never use.
  • Implement DAX so that the appliance can execute files directly from backing disk instead of loading them into RAM.
  • A much more detailed look at the qemu and kernel startup process, taking a knife to anything that unnecessarily sleeps or wastes time.

By the way: Even if you never use libguestfs, but you do use virtualized linux, this benefits you too.

4 Comments

Filed under Uncategorized

4 responses to “Getting the libguestfs appliance boot time down to 1.2s

    • rich

      kvmtool doesn’t support lots of features we need, primarily the ability to read and write qcow2 (or indeed any format except raw).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s