“make” and queuing theory

One directory in the libguestfs sources has 101 source files with a wide range of sizes:

$ ls -gGS *.c
-r--r--r--. 1 498686 Aug  4 20:01 stubs.c
-rw-rw-r--. 2 203439 Sep 18 14:52 guestfs_protocol.c
-rw-rw-r--. 1  51723 Jul 28 14:15 btrfs.c
-rw-rw-r--. 1  36644 Jul 28 14:15 guestfsd.c
-rw-rw-r--. 1  32477 Jul 28 14:15 ext2.c
                 ...
-rw-rw-r--. 1   1120 Feb 14  2015 rename.c
-rw-rw-r--. 1   1073 Feb 14  2015 sleep.c
-rw-rw-r--. 1   1065 Feb 14  2015 echo-daemon.c
-rw-rw-r--. 1    961 Feb 14  2015 pingdaemon.c

If we take file size as a proxy for compilation time, stubs.c is probably going to take the longest time to compile.

The current Makefile builds them in alphabetical order. Unfortunately because stubs.c is near the end of the list, this means the final link step has to wait for stubs.c to finish. While waiting, only a single core is being used and all the other cores are idle.

Can we organize builds to get an overall faster compile?

Simple queuing theory suggests two (obvious) possibilities: starting the builds from the shortest first to the longest last will minimize the amount of time we have to wait for any job to complete.

But what we care about is overall compile time, so starting the longest jobs first should be better, since that way the final link shouldn’t need to wait for a late-started long compile.

Alphabetical 10.5 s
Shortest (smallest) job first 10.3 s
Longest (largest) job first 8.5 s
A random order 8.5 s

So my conclusion is that make could do better by having some way to reorder the list of input files by file size. Even randomly reordering the files could improve some compiles (although that would depend on the particular output of your PRNG on the day you ran the compile).

GNU make has no $(sort_by_file_size) function, but we can get the same effect by using $(shell ls -1S list of source files).

Unfortunately using GNU make functions is incompatible with automake. Grrrrrr. This is the partial solution I added to libguestfs.

An earlier version of this post had the wrong times in the table, leading to the wrong conclusion.

4 Comments

Filed under Uncategorized

4 responses to ““make” and queuing theory

  1. So you have a patch for gnu make to modify the order? Could you share it?

    And you run on multi-core/processor witk -j N. What other info can Make provide, apart from the total time?

    • rich

      No patch. All of the above tests were done with -j5. With no parallelism, there’s no problem — everything is slow.

      • John Graham-Cumming

        What linker are you using? I’m mildly surprised that you can start linking before you’ve made all the objects.

      • rich

        Ordinary GNU linker. I didn’t explain it well: it’s make that is waiting for all the compiles to finish before starting the linker. The effect is exactly the same as if the linker itself was waiting for its inputs.

        Be nice if we could start doing a partial link before all the object files had been compiled, but I’m not aware of any linker that can do that.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.