Tip: Tracking down OCaml heap corruptors

If you use Obj.magic, or if you have buggy libraries which use Obj.magic (cough Extlib cough), then sometimes you’ll end up corrupting the OCaml heap. An easy way to track these problems down is to add this checkpoint function near the top of your code:

let checkpoint p =
  Gc.compact ();
  prerr_endline ("checkpoint at position " ^ p)

then place calls to checkpoint "A"; (checkpoint "B" etc) around suspect code.

The checkpoint function does two things: Gc.compact () does a full major round of garbage collection and compacts the heap. This is the most aggressive form of GC available, and I’ve found that it’s highly likely to segfault if the heap is corrupted. The second statement, prerr_endline, prints a message to stderr and crucially also flushes stderr, so you’ll see the message printed immediately.
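As a rough sketch of those same two steps written out explicitly, here is a variant that takes a printf-style format, which can be handy inside loops. The name checkpointf is hypothetical (it is not from the post above); it relies on the standard Printf.ksprintf function and on the "%!" format directive, which flushes the output channel.

```ocaml
(* Hypothetical variant of checkpoint that accepts a format string.
   The name checkpointf is made up for this illustration. *)
let checkpointf fmt =
  Printf.ksprintf
    (fun msg ->
      (* Step 1: full major collection plus heap compaction. *)
      Gc.compact ();
      (* Step 2: print to stderr; "%!" flushes the channel so the
         message appears before any later crash. *)
      Printf.eprintf "checkpoint at position %s\n%!" msg)
    fmt

let () =
  checkpointf "loop iteration %d" 3
```

If the flush were left out, a later segfault could lose a buffered message and make the corruption look earlier than it really is.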

So the effect is that if the checkpoint function prints something, you can be very sure that your heap is not corrupt at that point in the program.

By placing these in and around suspect code, you can quickly narrow down the place where corruption happens.
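A minimal sketch of what this bracketing might look like. The two suspect_step functions are hypothetical stand-ins for whatever code you distrust; they are harmless here, so every checkpoint is printed.

```ocaml
(* The checkpoint helper from the post. *)
let checkpoint p =
  Gc.compact ();
  prerr_endline ("checkpoint at position " ^ p)

(* Hypothetical stand-ins for the code under suspicion. *)
let suspect_step_one () = List.init 1000 string_of_int
let suspect_step_two xs =
  List.fold_left (fun acc s -> acc + String.length s) 0 xs

let () =
  checkpoint "A";
  let xs = suspect_step_one () in
  checkpoint "B";
  let total = suspect_step_two xs in
  checkpoint "C";
  Printf.printf "total = %d\n" total
```

If the last message on stderr were "checkpoint at position B" followed by a segfault, the corruption would most likely have happened in suspect_step_two, and you could then move the checkpoints inside that function to narrow it down further.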

4 responses to “Tip: Tracking down OCaml heap corruptors”

  1. I also used this approach to verify C bindings. It’s true that the slightest error tends to segfault the whole program, which is very practical.

    Note that use of Obj.magic is one of the reasons I never used extlib (the other being that I never felt the need to use it). For many tasks OCaml is fast enough that there is no need to trade type-safety for speed. Heap corruption bugs are not the kind of heisenbugs I’m interested in chasing.

    Except for bindings to C I look down on any module that uses Obj (which doesn’t mean I’m never *tempted* to use it).

  2. If you are willing to recompile OCaml to debug your heap corruption problem, defining DEBUG when compiling the run-time activates caml_heap_check, which checks even more thoroughly that the heap is consistent at each cycle transition. (Some of these problems do not come from improper use of Obj.magic, even in early 2010. I don’t want to ruin Damien Doligez’s story for him, so no details, but it’s a good story.)

    Grep for caml_heap_check in byterun/ for details.

  3. Damien Doligez

    I don’t even remember which bug we found in early 2010, but anyway, I want to add to Pascal’s advice that, starting with OCaml 4.00.0, there is an easy way to activate the debug version of the runtime:
    1. configure OCaml with “-with-debug-runtime”
    2. compile and install OCaml
    3. compile your program with “-runtime-variant d”

    This will compile your program with a version of the runtime which has assertions all over the place, and which does a thorough check of the heap structure at each major GC (and at each compaction). If you then follow Rich’s advice, you should be able to narrow down the source of the heap corruption quite easily.
