The final two questions that I posed last time had to do with constructing a timeline of what the guest spends its time on.
We can easily see system calls in the trace log, and we can also see when a kernel function is entered for the first time (indicating that a new part of the kernel is now running). I wrote a Perl script to analyze that; a simplified sketch of it appears below. It gave me a 115K-line log file, from which I did the rest of the analysis by hand to generate a timeline.
I would reproduce the timeline here, but the results aren’t very enlightening. In particular, I doubt it’s any more interesting than what you can get by reading the kernel printks from a boot log.
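For illustration, here is a minimal sketch of the idea. It is not the actual script, and it assumes each trace line is simply a timestamp followed by the name of the kernel function being entered (adjust the regex to match your trace format):

```perl
#!/usr/bin/perl
# Print the first time each kernel function appears in a trace log.
# Assumed input format (hypothetical): "<timestamp> <kernel-function>"
# e.g. "1.234567 do_sys_open"
use strict;
use warnings;

my %seen;   # kernel functions we have already encountered

while (<>) {
    next unless /^\s*([\d.]+)\s+(\S+)/;
    my ($time, $fn) = ($1, $2);
    # Only the first entry into a function is interesting: it marks
    # a new part of the kernel starting to run.
    next if $seen{$fn}++;
    printf "%12s  first call to %s\n", $time, $fn;
}
```

The gaps between successive first calls are what the by-hand analysis then turns into a timeline.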
So what are my conclusions after using these techniques to analyze a booting guest?
- The process is clunky and undocumented. Hopefully this series helps a little.
- It would be much more powerful with stack traces. It should be possible to get them from QEMU, at least in theory, but it’s a lot of work to do so.
- It would be much more powerful if we could extend the analysis into kernel modules and userspace.
- More tooling around this might make it more bearable.
Something I’ve been meaning to explore for some time is libvmi (https://github.com/libvmi/libvmi). It seems Xen-specific, but it may be useful for obtaining the kind of information you’re looking for in a lightweight manner, perhaps even under nested virt?
There was a nice presentation at 31c3 about it: https://youtu.be/ElggombHA8E
Anyway, thanks for the very interesting and helpful series of posts!
Also, maybe we could reuse the Linux kernel’s GDB Python scripts to parse some kernel memory. One day, one day.
Just wanted to tell you that I read your paper on Optimizing QEMU Boot Time. It’s one of the coolest serious engineering papers on kernel engineering I’ve seen in years. I wish I could be there with you helping with that work. Keep it up!