Tag Archives: profiling

Tracing QEMU guest execution part 2

After an overnight 12+ hour run of my Perl scripts I now have a 52 million line file that consists of timestamps, kernel symbols, and other untranslated linear addresses. The only possible way to analyze this will be with yet more scripts, but already there are lots of interesting things.

Here is the kernel code entering userspace:

11409434.8589973 prepare_exit_to_usermode in section .text
11409435.5189973 retint_user + 8 in section .text
11409436.4899973 7fb95ab3cd20
11409447.6899973 7fb95ab4c930
11409469.2169973 7fb95ab577f0

Userspace symbols cannot be decoded because we don’t know which process is being run. More importantly, code in kernel modules cannot be decoded, so we only see core kernel functions.

Handling a timer interrupt:

18000723.5261105 apic_timer_interrupt in section .text
18000725.2241105 smp_apic_timer_interrupt in section .text
18000726.7681105 native_apic_mem_write in section .text
18000729.2691105 smp_apic_timer_interrupt + 46 in section .text
18000729.8361105 irq_enter in section .text
18000730.3941105 rcu_irq_enter in section .text
18000731.1401105 rcu_irq_enter + 92 in section .text
18000731.7111105 irq_enter + 14 in section .text
18000732.3171105 smp_apic_timer_interrupt + 51 in section .text
18000740.9941105 exit_idle in section .text
18000741.5481105 smp_apic_timer_interrupt + 56 in section .text
18000742.0881105 local_apic_timer_interrupt in section .text
18000744.0341105 tick_handle_periodic in section .text
18000744.6341105 _raw_spin_lock in section .text
18000745.9291105 tick_periodic + 67 in section .text
18000747.3941105 do_timer in section .text

Userspace loading a kernel module:

7806760.57896065 40087d
7806765.09696065 4442b0
7806765.65396065 entry_SYSCALL_64 in section .text
7806766.14496065 entry_SYSCALL_64 + 32 in section .text
7806766.46296065 entry_SYSCALL_64 + 36 in section .text
7806788.75296065 sys_init_module in section .text
7806796.76296065 sys_init_module + 62 in section .text
7806797.28296065 sys_init_module + 62 in section .text
7806800.64896065 sys_init_module + 65 in section .text
7806801.94496065 capable in section .text
7806802.91196065 security_capable in section .text
7806804.30796065 cap_capable in section .text
7806804.87796065 security_capable + 72 in section .text
7806805.43596065 ns_capable + 41 in section .text
7806805.92096065 capable + 23 in section .text
7806810.46796065 sys_init_module + 75 in section .text
7806815.59796065 sys_init_module + 86 in section .text
7806821.10196065 sys_init_module + 96 in section .text
7806827.28496065 sys_init_module + 109 in section .text
7806831.23396065 sys_init_module + 129 in section .text
7806839.75396065 security_kernel_module_from_file in section .text

What am I interested in knowing? My overall goal is to find areas in the kernel and userspace that we can optimize to make boot faster. Specifically it seems interesting at first to look at two questions:

  1. Which kernel functions consume the most time? This is “profiling 101”. It may indicate kernel functions which are particularly slow (eg. they contain sleeps).
  2. The time elapsed before each kernel function is called the first time. Because studying the whole trace is impossible (by hand), finding the first time each kernel function is called may give me an insight into major timeline events — eg. the first time one of the ext4 mount functions is called, we know that userspace has got to the point where it mounts the root filesystem.
  3. When does userspace make system calls? This is related to the previous point. It’s also the easiest type of analysis to do because system call entry is easy to detect from the trace.

More to follow …


Filed under Uncategorized