Tag Archives: ideas

Half-baked ideas: Log level viewer

For more half-baked ideas, see the ideas tag

Back when I worked on this sort of thing, we did a lot of testing involving loops. For 1000 passes, for 1000 voltage settings, for 1000 pulse widths, do the test. That sort of thing. Because logging all 1000×1000×1000 tests would produce far too much output, I wrote a little hierarchical logging library that saved the logs at each level:

for (pass = 0; pass < 1000; pass++) {
  log (1, "pass %d", pass);
  for (voltage = 0; voltage < 1000; voltage++) {
    log (2, "voltage %d", voltage);
    for (pulse = 0; pulse < 1000; pulse++) {
      log (3, "pulse %d", pulse);
      log (3, "performing test");
      log (3, "setting pulse");
      if (set_pulse () == -1)
        error ("setting pulse failed");
      /* etc */
    }
  }
}

The log function didn’t immediately print the message. Instead it saved it in a buffer. Only when a test failed would the log buffer be printed so you’d see something like:

pass 997
voltage 123
pulse 0
performing test
setting pulse
error: setting pulse failed

Well, I don’t recall the details now, but it strikes me that there are many situations where logging is hierarchical, and I’m not aware of any logging libraries that express this easily.

One example would be make. You have top level rules like all and check. At a high level, a user might only be interested in whether make all check is currently running the all rule or the check rule. At the next level down is each directory recursed into. Below that, individual Makefile rules. Below that, individual commands.

How nice would it be to have a log viewing program that could display this? On screen you’d see several subwindows, each individually scrolling:

▃▃ top level ▃▃▃▃▃▃▃▃▃▃▃▃▃▃
all: OK
check: running ...

▃▃ level 1 ▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃
make all in src: OK
make all in tools: OK
make check in src: running ...

▃▃ level 2 ▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃
  CC       config.o
  CC       domain.o
  CC       options.o
  TEST     API: running ...

That seems to me to be a much clearer way of presenting things than the usual make “splurge”.



Filed under Uncategorized

Half-baked ideas: wikipedia for facts

Want more half-baked ideas? See my ideas tag

Would you like to find out about Boston, USA? There’s Wikipedia for that. How about travelling there? Wikivoyage Boston.

How about the population of Boston in the years 1625-2013? Or the wages of bartenders in that fine city over the years? Or the peak summer temperature each year? Not so good.

My half-baked idea is a “wikipedia for curated facts”. These can be derived from many sources, but are presented in a uniform way (by place, time, variable, etc), with references to back them up.

This would be a great way to inject factual content into the air-headed anecdote-based nonsense that passes for opinion on the internet.



Half-baked idea: Content-addressable web proxy

For more half-baked ideas, see my ideas tag

There are several situations where you want to fetch some content and don’t particularly care which precise source it comes from:

  1. Downloading packages from Linux distro mirrors.
  2. Downloading git commits.
  3. Grabbing a bittorrent data block.

My proposal (which surely has been done??) is that clients can supply the hash of the file they want when connecting to a proxy; something like:

GET http://example.com/foo HTTP/1.1
Content-Hash: sha256 b32683017c9530[etc]

The proxy is entitled to return any object in its cache that has the desired hash. If it doesn’t have such an object, it fetches it from the URI in the usual way. We’d have to require that only cryptographically strong hashes are allowed, both to prevent the client from getting wrong data and to stop clients fishing unauthorized files out of the cache.

In the distro mirroring case, the metadata would contain the hashes of the packages (which it probably already does). The client would supply these to the proxy, which could then satisfy the request no matter which mirror was selected: you wouldn’t get the situation where the proxy downloads several copies of the same data from different mirrors.

In the git case, commits are already addressed by their hashes. This would finally let us have an intelligent git mirror, something I’ve been wanting for a while given that I’m on slow DSL and downloading gnulib multiple times per day is no fun for anyone.



Half-baked ideas: strace visualizer

For more half-baked ideas, see the ideas tag

When you strace -f a collection of processes, you end up with pretty frustrating interleaved output. It’s almost impossible to keep track without pen and paper and plenty of time.

The half-baked idea is a visualization tool that takes the log file and constructs a picture — changing over time — of what processes are related to what other processes, what binaries they correspond to, what sockets are connected to each process (and connected between processes), what each program wrote to stderr, and so on.

There would be some sort of timeline slider that lets you watch this picture evolving over time, or lets you jump to, say, the place where program X exited with an error.

Make it happen, lazyweb!



Half-baked idea: Enable strace/gdb on just one subprocess

For more half-baked ideas, see the ideas tag.

Why isn’t it possible to run a deeply nested set of programs (eg. a large build) and have strace or gdb trigger on just one particular program? For example, you could do a regular expression match on a command line:

exectrace --run=strace --cmd="\./prog.*-l" -- make check

would (given this theoretical exectrace tool) trigger strace when the command line of any child process of make check matches the regexp \./prog.*-l.

Or perhaps you could trigger on current directory:

exectrace --run=strace --cwd=tests -- make check

I guess this could be implemented using PTRACE_O_TRACEEXEC, and it should have less overhead than doing a full recursive strace, and less annoyance than trying to trace a child in gdb (which AFAIK is next to impossible).



Half-baked ideas: ELF VM

For more half-baked ideas, see the ideas tag.

dlopen and LD_PRELOAD are crude tools.

It should be possible to load any ELF binary or library into a program, introspect it to find out what functions it calls, individually intercept or replace those function calls, and then run it in a controlled, isolated environment where it cannot harm the host program.

I’m going to call this idea “ELF VM”. I’m using “VM” here in the “JVM” sense (not the “KVM” sense).

One particular use I have for this is to run programs using an alternate API; for example replacing POSIX API calls like open/read/write/chmod/… with libguestfs API calls. One would load up the target binary into the ELF VM, introspect it to find out what functions it uses, and then replace them or give an error if we don’t know how to replace them. And finally run the binary, but safely and controllably (it wouldn’t be able to overwrite the host program and you could control how much CPU time it could use).

Valgrind already runs binaries in a VM/emulator like this, although not (AFAIK) in a way that I can take that work and apply it in other areas.



Half-baked ideas: OCR VM console to diagnose state and errors

For more half-baked ideas, see the ideas tag.

This needs a better name, but the idea is simple. Take a screenshot of your guest’s graphical console, and OCR it in order to diagnose the VM’s state, any errors, etc.

Case 1: VM error on boot

Case 2: VM has reached graphical login screen

OCR would analyze the screenshot, find everything that looks like text and convert it to text, allowing for simple regular expressions to be used to identify VM state.


  1. Strictly speaking, in the first case you could read the screen text directly out of the text framebuffer, but OCRing is more general and handles the second case.
  2. virt-dmesg could also solve the first case, assuming virt-dmesg wasn’t broken.

