Tag Archives: debugging

Half-baked ideas: strace visualizer

For more half-baked ideas, see the ideas tag

When you strace -f a collection of processes you end up with pretty frustrating output: the system calls of every process are interleaved in a single log, each line tagged with a PID. It’s almost impossible to keep track without pen and paper and plenty of time.

The half-baked idea is a visualization tool that takes the log file and constructs a picture — changing over time — of what processes are related to what other processes, what binaries they correspond to, what sockets are connected to each process (and connected between processes), what each program wrote to stderr, and so on.

There would be some sort of timeline slider that lets you watch this picture evolving over time, or lets you jump to, say, the place where program X exited with an error.
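As a very rough starting point, the log-parsing half of this idea might be sketched in Python as below. It assumes the log was written with strace -f -o FILE, so that every line starts with a PID. The function name parse_strace_log, the regular expressions and the sample log are all made up for illustration. Calls split into "unfinished"/"resumed" pairs are only partly handled, and threads created with clone are counted as children.

```python
import re
from collections import defaultdict

# "PID syscall(args) = result" or "PID <... syscall resumed> args) = result"
CALL_RE = re.compile(r'^(\d+)\s+(?:<\.\.\. (\w+) resumed>|(\w+)\()(.*)\s=\s(-?\d+)')
# "PID +++ exited with N +++"
EXIT_RE = re.compile(r'^(\d+)\s+\+\+\+ exited with (\d+) \+\+\+')
FORK_CALLS = {"fork", "vfork", "clone", "clone3"}

def parse_strace_log(lines):
    """Return (children, binaries, exit_codes) gathered from strace -f output."""
    children = defaultdict(list)   # parent PID -> list of child PIDs
    binaries = {}                  # PID -> path of the last successful execve
    exit_codes = {}                # PID -> exit status
    for line in lines:
        m = EXIT_RE.match(line)
        if m:
            exit_codes[int(m.group(1))] = int(m.group(2))
            continue
        m = CALL_RE.match(line)
        if not m:
            continue               # unfinished calls, signals, etc. are skipped
        pid = int(m.group(1))
        call = m.group(2) or m.group(3)
        args = m.group(4)
        ret = int(m.group(5))
        if call in FORK_CALLS and ret > 0:
            children[pid].append(ret)      # return value is the new PID
        elif call == "execve" and ret == 0:
            path = re.match(r'"([^"]*)"', args)
            if path:
                binaries[pid] = path.group(1)
    return dict(children), binaries, exit_codes

# Hypothetical sample log, invented for this sketch.
SAMPLE = r'''
100 execve("/usr/bin/make", ["make", "check"], 0x7ffc0 /* 30 vars */) = 0
100 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|SIGCHLD <unfinished ...>
101 execve("./prog", ["./prog", "-l"], 0x5581 /* 30 vars */) = 0
100 <... clone resumed>, child_tidptr=0x7f01) = 101
101 exit_group(1) = ?
101 +++ exited with 1 +++
'''.strip().splitlines()

children, binaries, exit_codes = parse_strace_log(SAMPLE)
print(children, binaries, exit_codes)
```

The three dictionaries are the raw material a visualizer would need. Keeping the line number of each event alongside them would be one way to drive the timeline slider.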

Make it happen, lazyweb!


Filed under Uncategorized

Half-baked idea: Enable strace/gdb on just one subprocess

For more half-baked ideas, see the ideas tag.

Why isn’t it possible to run a deeply nested set of programs (e.g. a large build) and have strace or gdb trigger only on one particular program? For example, you could do a regular expression match on the command line:

exectrace --run=strace --cmd="\./prog.*-l" -- make check

would (given this theoretical exectrace tool) trigger strace whenever the command line of any child process of make check matches the regexp \./prog.*-l.

Or perhaps you could trigger on current directory:

exectrace --run=strace --cwd=tests -- make check

I guess this could be implemented using PTRACE_O_TRACEEXEC, and it should have less overhead than doing a full recursive strace, and less annoyance than trying to trace a child in gdb (which AFAIK is next to impossible).
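The ptrace plumbing is the hard part, but the trigger decision itself is simple. Here is a minimal Python sketch of it, assuming that --cmd is searched for anywhere in the space-joined command line and that --cwd matches the last component of the working directory. Both of those semantics, and the name should_trace, are guesses, since exectrace does not exist.

```python
import os
import re

def should_trace(argv, cwd, cmd_pattern=None, cwd_name=None):
    """Decide whether a freshly exec'd process matches the trigger options.

    argv        -- argument vector of the process, e.g. ["./prog", "-l"]
    cwd         -- its current working directory
    cmd_pattern -- regexp for the hypothetical --cmd option (assumed semantics)
    cwd_name    -- directory name for the hypothetical --cwd option (assumed)
    """
    if cmd_pattern is not None:
        cmdline = " ".join(argv)
        if not re.search(cmd_pattern, cmdline):
            return False
    if cwd_name is not None:
        # Assumption: compare against the final path component only.
        if os.path.basename(os.path.normpath(cwd)) != cwd_name:
            return False
    return True

print(should_trace(["./prog", "-v", "-l"], "/build/tests", cmd_pattern=r"\./prog.*-l"))
print(should_trace(["gcc", "-c", "prog.c"], "/build", cmd_pattern=r"\./prog.*-l"))
```

A real tool would call something like this on every exec event and only then attach strace or gdb to the matching PID.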



Tip: gdb break if first function argument has a value

I wanted to find out who was closing stdout in my program (i.e. who was calling close(1)).

This seems to work in gdb:

(gdb) break close if $rdi == 1

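The explanation is that on x86-64 Linux (the System V calling convention), register %rdi is used to pass the first integer or pointer argument to a function. In gdb conditionals, register names are prefixed by $. Hence the condition checks whether the first argument to the function is 1 at the moment the breakpoint at the function’s entry is hit.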
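The same trick extends to later arguments, because the System V x86-64 convention passes the first six integer or pointer arguments in a fixed order of registers. Here is a small Python sketch that builds the gdb command for any of them. The helper name break_if_arg_equals is made up, and it does not cover floating-point arguments or arguments passed on the stack.

```python
# Integer/pointer argument registers, in order, for the System V x86-64 ABI.
SYSV_X86_64_ARG_REGISTERS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")

def break_if_arg_equals(function, arg_index, value):
    """Build a gdb conditional breakpoint command.

    arg_index is zero-based: 0 is the first argument of the function.
    """
    if not 0 <= arg_index < len(SYSV_X86_64_ARG_REGISTERS):
        raise ValueError("only the first six integer arguments are in registers")
    register = SYSV_X86_64_ARG_REGISTERS[arg_index]
    return f"break {function} if ${register} == {value}"

print(break_if_arg_equals("close", 0, 1))
print(break_if_arg_equals("write", 0, 2))
```

The first call reproduces the command above. The second would stop on writes to stderr.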



Quick tip: Learn to love core dumps

Core dumps are one way to track down those segfaults which can’t easily be captured in gdb, either because they are rare and unpredictable, or they happen in some deeply nested subprocess. But no one likes core files being left all over the filesystem. Instead, capture your core dumps in a central directory and analyze them later at your pleasure.

Add this to /etc/rc.local and also run it once as root:

mkdir -p /var/log/core
chmod 0777 /var/log/core
echo "/var/log/core/core.%t.%p.%e" > /proc/sys/kernel/core_pattern
echo 0 > /proc/sys/kernel/core_uses_pid

(By the way, ABRT also uses core_pattern, so running these commands will disable ABRT, if you care.)

You may need to adjust /etc/security/limits.conf:

*               hard    core            unlimited
*               soft    core            unlimited

On some versions of Fedora, also comment out this line in /etc/profile. (What on earth was the thinking behind this?)

#ulimit -S -c 0 > /dev/null 2>&1

Check that core dumps are enabled. This is what you should see (you may need to log out and log in again):

$ ulimit -Hc
unlimited
$ ulimit -Sc
unlimited
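The same check can be made from inside a program, which is handy for finding out what limits a particular process actually inherited. Here is a minimal Python sketch using the standard resource module. The helper names are made up. Numeric values come back in bytes, and the shell’s ulimit may report them in a different unit.

```python
import resource

def describe_limit(value):
    """Render an rlimit value, using "unlimited" for RLIM_INFINITY."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)

def core_limits():
    """Return the (soft, hard) core file size limits of this process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return describe_limit(soft), describe_limit(hard)

if __name__ == "__main__":
    soft, hard = core_limits()
    print(f"soft core limit: {soft}")
    print(f"hard core limit: {hard}")
```

If the soft limit prints as 0, the process will not dump core regardless of core_pattern.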

Now, any time a process crashes, it will leave a core file in /var/log/core:

$ ls -l /var/log/core/
total 1800
-rw-------. 1 rjones rjones 303104 May  5 15:25 core.1273069556.28228.ls
$ gdb /bin/ls /var/log/core/core.1273069556.28228.ls
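With the core_pattern above, each file name encodes the time of the dump (%t, seconds since the epoch), the PID (%p) and the executable name (%e). Here is a small Python sketch for pulling those apart again, for example to sort or summarize a directory of cores. parse_core_name and CoreInfo are made-up names, and the code assumes exactly the core.%t.%p.%e pattern.

```python
from collections import namedtuple
from datetime import datetime, timezone

CoreInfo = namedtuple("CoreInfo", "dumped_at pid executable")

def parse_core_name(filename):
    """Split a name of the form core.%t.%p.%e into its parts.

    The executable name may itself contain dots, so split at most three times.
    """
    prefix, timestamp, pid, executable = filename.split(".", 3)
    if prefix != "core":
        raise ValueError(f"not a core file name: {filename!r}")
    dumped_at = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
    return CoreInfo(dumped_at, int(pid), executable)

info = parse_core_name("core.1273069556.28228.ls")
print(info.dumped_at.isoformat(), info.pid, info.executable)
```

The timestamp is converted to UTC, so it can differ by your timezone offset from the modification time that ls shows.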

This doesn’t enable core dumps for all services, since anything started at boot does not go through a login session and so never picks up these limits. If you restart individual services by hand from a shell where core dumps are enabled, then core dumps will be enabled for those.

