Tag Archives: ideas

Half-baked idea: Autodetect dependencies

For more half-baked ideas, see the ideas tag.

Large “coordination” libraries like libvirt unify a lot of disparate features through one API, and as a result they depend on many other libraries and external programs. Libvirt directly links to 20 libraries and requires countless other external programs.

We want users to be able to compile libvirt even when they don’t have the full set of libraries (you don’t need, say, PolKit, libvirt will still work with reduced functionality). The issue though is you end up with code which looks like:

switch (cred[i].type) {
case VIR_CRED_EXTERNAL: {
    if (STRNEQ(cred[i].challenge, "PolicyKit"))
        return -1;

#if defined(POLKIT_AUTH)
    if (virConnectAuthGainPolkit(cred[i].prompt) < 0)
        return -1;
#else
    /* Ignore & carry on. Although we can't auth
     * directly, the user may have authenticated
     * themselves already outside context of libvirt
     */
#endif
    break;
}

This code is fragile because (a) it’s hard to reason about all the pathways and (b) it’s combinatorially difficult to test all the different permutations of available libraries. This fragility leads to bug reports and possibly worse.

Before I get to the half-baked idea, I’ll throw in another thought: at the moment we do most of this detection at compile time using a long configure script. It might be better to do it at run time. You could imagine how this could work if you were a very patient programmer who liked writing tedious boilerplate:

libaudit = dlopen ("libaudit.so", 0);
//...
if (libaudit) {
  int (*audit_add_watch) (struct audit_rule_data **rulep,
                          const char *path);
  audit_add_watch = dlsym (libaudit, "audit_add_watch");
  if (audit_add_watch)
    r = audit_add_watch (rule, path);
  else
    goto no_func;
} else {
 no_func:
  // no libaudit, do something else
}

The half-baked idea is this: Write the code as if all the functions exist. Then transform the code into the runtime/dlsym version above. In the first iteration, for each libvirt API entry point we compute the sum of all optional libraries/functions that are required to execute that entry point, and we generate checks like this:

virFoo ()
{
  // The following checks are generated automatically:
  if (!libaudit)
    return error ("virFoo: you need to install libaudit");
  if (!libaudit_audit_add_watch)
    return error ("virFoo: wrong version of libaudit, "
                  "requires audit_add_watch function");
  //..
  // Here we run the programmer's code:
  //..
}

The first iteration is very conservative. In the second iteration of the project we’d allow the programmer to write fallback code, so that partial API functionality is available even if not all the libraries are. But how to do that and avoid the #ifdef problem?

I think you should be allowed to write alternate functions:

authenticate ()
{
  return polkit_context_is_caller_authorized (pkcontext, ...);
}

authenticate ()
{
  return 1;
}

(Remember this is not C, but some sort of C with transformations applied to it).

Our C transformation chooses the “best” function to call at runtime, where best is simply the one which has the most libraries available. In the above case, the first version of authenticate is chosen if the PolKit library is available, the second version if not.

5 Comments

Filed under Uncategorized

Half-baked idea: Standard machine readable output for command line programs

For more half-baked ideas, see the ideas tag.

It’s common, perhaps increasingly common, that one program needs to consume the output of another. This is the Unix philosophy — small, single purpose programs assembled together to carry out a more complex task.

However it’s not necessarily superior to alternative ways of composing programs, like COM or D-Bus.

There are two particular problems: 1. How do you find out if a particular feature is supported by the program? 2. How do you parse the output of the program (eg. to find progress bars or error messages)?

As a concrete example, let’s consider a program I wrote called virt-resize. 1. How do I find out if the version of virt-resize I have supports btrfs? 2. If I want to drive virt-resize from a graphical program, how can I parse the text progress bars that virt-resize prints?

For question 1, typical Unix programs take several approaches:

  1. Ignore the problem completely. Just blindly use the feature, and fail in unpredictable ways if it doesn’t work. This is probably the most popular “strategy”. People who write shell scripts tend to do this all the time. Shell scripts are often either unportable, or end up looking like “configure” because they try to use a very conservative subset of POSIX.
  2. Run the program, if it fails, run the program a bit differently (and if it fails, a bit differently again, …).
  3. Attempt to parse the output of program --help. This depends on help output being stable, when maybe it isn’t, so you end up chasing the upstream project.
  4. Parse program --version and work out if the feature was supported in that particular version. This is not very scalable, and doesn’t work with backports.

Question 2, how to get errors and progress bars, is usually too hard, unless the program offers a special “machine readable” flag (notable examples: rpm, parted).

Here’s my half-baked idea: We should standardize the way that program capabilities, help, progress bars, and error output is done.

An additional option is added to programs that support this:

$ program --machine-readable [...]

Programs that don’t support this, and programs that didn’t support it in earlier versions, ought to give an error if this option is not available.

Firstly, the caller just runs the program with this option on its own, and no other options:

$ program --machine-readable
resize-btrfs
resize-ext3
lv-expand
progress-bars

The program just prints out a list of capabilities, one per line, but with no defined format (that is a contract between the program and the caller). The program then exits with status 0. Using this option should not cause the program to perform any other action.

Secondly, if this was successful, the caller can use this option in combination with other options to produce machine-readable output. At least one other option or command line argument is required for this to work.

I would like to suggest the following standards for version numbers, progress bars and error messages.

Machine-readable version numbers are sent to stdout and have the form “program-name version”, where “program-name” should be one word. This is no different from how most GNU programs work:

$ program --version
program 0.1.2

Machine-readable progress bars are sent to stdout and have the form (example) “[10/100]“:

$ program --machine-readable foo
[0/100]
[1/100]
[2/100]
etc.

Error messages are anything sent to stderr when the status code of the program is non-zero. This is, of course, no change from standard Unix.

$ program --machine-readable foo
foo: File not found

24 Comments

Filed under Uncategorized

Half-baked ideas: C: please let me free (static data)

For more half-baked ideas, see the ideas tag.

In modern C, you can free (NULL) safely. I’d like to suggest a small extension: let it be safe to free static data too.

This would avoid many cases where we end up with a “useless” strdup, as in the pseudo-code below:

freeable char *
type_to_string (union type u)
{
  switch (u.type) {
  case INT: return "int"; // strdup avoided
  case CHAR: return "char"; // strdup avoided
  case STRUCT:
    asprintf (&str, "struct %s",
              type_to_string (u.strct.t));
    return str;
  }
}

Notice that

free (type_to_string (u));

would work, whether the type u was INT, CHAR or STRUCT.

This feature is inspired by OCaml. In OCaml, the garbage collector keeps track of pages allocated on the OCaml heap versus everything else, in a hash table. It uses the hash table to avoid following pointers outside the OCaml heap. This gives you great flexibility: an OCaml “pointer” can point to arbitrary stuff if you want (eg. C-allocated structures, static data), and the GC just ignores it. This makes clever stuff like OCaml ancient possible.

A C implementation of this would also need to track which pages come from the C heap, and which pages are static data. It’s quite possible that most C malloc implementations can or do have this already. glibc certainly seems to be able to detect when you free up a non-heap pointer.

Eagle-eyed reviewers will have noted this would break “const correctness” (which is a very woolly and broken concept in C anyway, compared to functional languages). It needs a new class of pointers which can be freed but cannot be written to. I used freeable char * in the pseudo-code above.

3 Comments

Filed under Uncategorized

Half-baked ideas: classify git commits and rule-based branch building

For more half-baked ideas, see my ideas tag.

This idea is an evolution of the previous idea for git commit dependency analysis.

Let’s say we have a development branch and a stable branch. Lots of bug fixes, features and so on go into the development branch, but for stable users we want to construct a branch containing only the stable, well-tested, obvious bug fixes. (It’s no coincidence that this sounds a lot like the scheme we use in libguestfs, but it’s also the basic scheme that Red Hat use to construct RHEL.)

What you end up doing is manually classifying the commits from the development branch, and cherry-picking the ones which look safe back to your stable branch. And you probably have a bunch of rules about what you want in your stable branch, such as “must fix a customer bug” or “must have had more than X months of testing in Fedora”.

A better idea would be to annotate the development commits with labels:

0d6fd9e fuse: Fix getxattr, listxattr calls and add a regression te
  labels: bugfix, fuse
  depends: 4e529e0

4e529e0 fish: fuse: Add -m dev:mnt:opts to allow mount options to b
  labels: feature, fish, fuse

feaddb0 roadmap: Move QMP to 'beyond 1.10'.
  labels: documentation

b8724e2 Open release notes for version 1.10.0.
  labels: documentation

e751293 ruby: Don't segfault if callbacks throw exceptions (RHBZ#664
  labels: bugfix, ruby
  depends: 6a64114

a0e3b21 RHEL 5: Use mke4fs on RHEL 5 as replacement for mke2fs.
  labels: bugfix, rhel5
  depends: 227bea6

(Before anyone jumps in here, Markus Armbruster pointed out to me that git has a great feature called git notes that makes the labelling part rather easy).

Now you need a rule as to what you’re going to allow into your stable branch, eg:

labels.contains ('bugfix') &&
  forall c in depends: c is in stable

Now your stable branch can be constructed entirely algorithmically. In fact, making many stable branches each with a slightly different emphasis (“bug fixes only”, “doc changes only”, “well-tested new features”, etc.) can be done automatically and algorithmically just by having more rules.

Update Here is a reply from Johan Herland who is one of the authors of git-notes. Thanks Johan for giving me permission to reproduce this email.

IMHO, using notes like you outline in the blog post makes a lot of sense. Attaching text strings as notes to the relevant commit is exactly what git-notes were created for. Storing simple strings for labels, and commit SHA1s (use full 40-char SHA1 sums to prevent future collisions) for dependencies sounds like the best plan.

A couple of different ways to encode this:

A. Place labels in one notes ref (refs/notes/labels), and dependencies in another notes ref (refs/notes/deps). The format of notes is simply one label/dependency per line, e.g.:

  $ git notes --ref=labels show HEAD
  bugfix
  fuse
  ruby
  $ git notes --ref=deps show HEAD
  4e529e0
  b8724e2

B. Place labels and deps in the same notes ref, and use a simple email-style header format, e.g.:

  $ git notes --ref=foo show HEAD
  Labels: bugfix, fuse, ruby
  Dependencies: 4e529e0, b8724e2

Either format is extensible: In (A), if you add a new data type, simply add a new notes ref; in (B) simply add a new header field name.

Whatever you feel most comfortable parsing/maintaining is probably the best choice for you.

3 Comments

Filed under Uncategorized

Half-baked idea: Libraries should run in their own security context

For more half-baked ideas, see the ideas tag.

When your program links to a library, the program and the library run in the same process and so have the same SELinux and other security context (UID, resource limits, etc).

Why can’t libraries have their own context? This would protect your program from bugs in libraries, and allow you to control more precisely what the library is doing, eg. what files it is trying to access.

I originally thought this would be simple: just have the kernel set the context according the userland instruction pointer (ie. %rip). Program code in certain pages would therefore have a different context when calling into the kernel from library code in other pages. You might also have to be careful with system calls that read and write user data. Anyway, this doesn’t quite work because a library could bypass this by changing the memory protection on some code in another part of the program, overwrite that code, then jump to it. Perhaps if the library’s security context prevented it also from calling mprotect etc it could be made to work? Edit: no it can’t be made to work, see the comments.

Another way to do it would be to (behind the scenes) run the library really in a separate process, turning local calls into RPCs. This is also not totally trivial if you consider global variables (shared memory?) and callbacks, and the performance implications could sink this idea too.

In any case the important thing would be to make any technique as transparent to the programmer as possible, so that programmers don’t have to explicitly split programs up into separate communicating processes in order to gain the security advantages. If no code changes were needed, we could apply the technique to existing programs in order to harden Linux still further.

9 Comments

Filed under Uncategorized

Half-baked idea: “Try this patch” tool for RPMs

For more half-baked ideas, see my ideas tag.

RPM has some nice features for easily rebuilding packages. You can, for example, easily structure a source tarball so that an end user can build RPMs from it in a single step, and you can also easily rebuild an RPM from a source RPM. (See my recent notes on how to do all that here).

However for a lot of end users even these simple commands are too complex. And applying a patch to an RPM is beyond even that stage.

Here’s the idea: it’s a “try this patch” graphical tool. It takes a patch from a pastebin or email, and tries to apply it to an installed package. It downloads the source, attempts to apply the patch, rebuilds a new binary RPM, and installs it. (Of course it may not be possible to apply the patch, in which case it should either give the user a very simple message about what went wrong, or help more advanced users to manually fix rejects).

With this tool I could in confidence ask a user: “try this patch and tell me if it works”.

All the user has to do is to drag the patch file into the “try this patch” tool, and it will do the rest. If the patch doesn’t fix the problem, the tool lets the user “yum downgrade” to the previous version.

See also: A “view source” button for Fedora

3 Comments

Filed under Uncategorized

Half-baked ideas: feedback-directed fuzz testing of filesystems

For more half-baked ideas, see my ideas tag

Fuzz-testing (a.k.a random testing) is an automatic method of testing where you feed in random data and try to make the program crash.

You might, for example, write random bytes to a disk, then ask Linux to mount it. This is not very sophisticated since Linux is unlikely to find a filesystem signature in pure random bytes, so the test will almost always fail to test anything. A better approach is to take an existing disk image and write random bits to parts of it. This is what Steve Grubb’s fsfuzzer does, but it’s still not the state of the art, although Steve found several bugs with it.

Suppose we think of a filesystem as just an array of integers:

4, 5, 13, 0, 2, ...

The code in the kernel to read this filesystem might start off:

if ((fs[0] & 3) != 0) {
  printk ("invalid filesystem");
  return -EINVAL;
}

If you generate random filesystems, then only 1 in 4 filesystems will get past this first test in the code, so only 1 in 4 of your random tests is testing anything beyond the first statement.

A better approach is to use feedback from the kernel code to evolve your fuzz tests. You score each fuzz test based on how far into the kernel code it gets, then you use that score to evolve your tests using standard genetic algorithm techniques. The idea is that your fuzz tests evolve to test more and more of the kernel code you are interested in testing, instead of just randomly falling down at the first hurdle.

(This technique is well-known in research as feedback-directed fuzz testing, feedback-directed random testing, or evolutionary fuzz testing. As far as I can tell no one is using it on Linux.)

Here’s the half-baked idea: Use systemtap probes as a way to score the fuzz-tests.

We write a systemtap module which inserts probes at every possible point in the file we are interested in testing, say, fs/minix/*.c. These probes just print a simple “I am here” type message.

When we run our fuzz test, we score it according to how much output it produces (by measuring “dmesg” before and after the test). Hitting probe points scores 1. Causing a kernel oops scores a lot more. You’d probably want to do this in a VM, rather than on your host kernel …

Starting with real, but small filesystems, we randomly fuzz them to generate our initial test cases, then evolve those according to how much they score. The test cases are individually tested using the same method as Steve’s fsfuzzer — ie. mounting the filesystem and running through a few simple system calls (read the directory, read files, read extended attrs ..)

Test cases which hit the equivalent of the 1-in-4 problem above are quickly evolved out, and the test cases we are left with hopefully explore and test more of the code.

Leave a Comment

Filed under Uncategorized

Half-baked ideas: git commit dependency analysis

For more half-baked ideas, see my ideas tag

Consider these three git commits A, B and C:

As far as git’s linear history is concerned, they are three independent commits and git only records the parent relationship A -> B -> C.

But could I cherry pick just patch C, omitting A and B? git won’t tell me, but I can examine the patches themselves (A, B, C) and answer this question (the answer is no, since the change in C depends on both changes A and B being applied first). The real dependency tree looks like this:

   C
  ^ ^
 /   \
A --> B

Many other real dependency trees could have been possible. With another choice of A, B and C these might have been completely independent of each other, or (A, B) might have to be applied together, with C being independent.

The half-baked idea is whether we can write an automatic tool which can untangle these dependencies from the raw git commits? (Or whether such a tool exists already, I cannot find one)

There would be one important practical use for such a tool. When cherry picking commits for the stable branch, I would like to know which previous commits that the commit I’m trying to apply depends on. This gives me extra information: I can decide that applying this commit is too disruptive — perhaps it depends on an earlier feature which I don’t want to add. I can decide to go back and apply the older commits, or that a manual backport is the best way.

The information you can derive from patches doesn’t tell the whole story. There are two particular problems, one revealed by the choice of patches A, B and C above. With a trivial change to B, it is possible to apply A and B independently. There is only a dependency A -> B because a little bit of the context of patch A “leaked” into patch B. It is also possible for two features to logically depend on each other, but not overlap in any way: Consider the case where you add a log collector and log file processor in separate commits. The log file processor might be completely useless without the log collector, but the commits could appear completely independent if you just examine the patches.

5 Comments

Filed under Uncategorized

Half-baked ideas: Accelerated testing for VMs

For more half-baked ideas, see my ideas tag

When manufacturers build consumer hardware it is often subjected to so-called “accelerated aging” or “accelerated testing”. For example, the life of an aircraft is not measured in years, but in pressurization cycles. Aircraft are repeatedly pressurized in tanks, and shaken (“fatigue testing”) to find out when they will fail.

So here’s the idea: Apply accelerated testing to virtual machines. Modify pvclock so it runs 100 or 1000 times faster than normal. Daily and weekly cron jobs accelerated, and so are daemons, so we rapidly find if they slowly leak memory or disk space, or have some other time-related failure. How will the guest behave after it has been running for a year, or 10 years? How about applications? Now we can find out, and not need to wait that long.

4 Comments

Filed under Uncategorized

Presentation software sucks – introducing …

How many dull presentations have you been to where the presenter simply reads bullet points off slides?

FrobSoft Express 2.0 is:

  • up to 5% faster
  • supports Windows
  • XML enabled!

I’m giving a talk about libguestfs on 18th March and I hate reading out slides to people as much as I hate listening to presenters reading out slides to me. In every talk I’ve given in the last few years I have tried to keep my notes separate (written on paper in front of me, or memorized) from what is on the slides. Presentation software, such as the mighty, all-pervasive OpenOffice, doesn’t make this easy. Nor does it make it easy to demonstrate software in the middle of your talk. You end up having to switch away to another virtual desktop, where (hopefully) you’ve remembered to set up some xterms “su”‘d to root and “cd”‘d into the right directory. I usually need several virtual desktops set up like this so I can demonstrate different parts of the software, so I’m standing in front of an audience using [Alt][←] and [Alt][→] while I hastily try to remember which virtual desktop has the next stage of the talk.

Enough!

Introducing “Tech Talk”. Actually, Tech Talk is too generic in Google, so we brainstormed adding extra words on the end until it became unique: Introducing “Tech Talk Platinum Supreme Edition!” (Tech Talk PSE).

The concept is simple. You create a directory and drop a mixture of HTML files and shell scripts in there:

$ ls
10-hello.html
20-shell.sh
30-goodbye.html
$ techtalk-pse

When Tech Talk PSE runs, it sorts the files numerically, and then displays the HTML ones (using Mozilla embedding) as slides and runs the shell script ones. Next and previous keys move through the slides, ensuring that your demonstrations [the shell scripts] run automatically at the right place in the talk.

Only files matching ^\d+(-.*)\.(html|sh)$ are considered, everything else is ignored. So you can style your HTML using stylesheets, include READMEs and Makefiles, and move common shell functionality into sourced shell files:

#!/bin/bash -
# Source common functions and variables.
source functions
# Pre-populate the shell history.
cat > $HISTFILE <<EOF
guestfish -a vm.img
EOF
# Open gnome-terminal.
exec $TERMINAL --geometry=+100+100

Tech Talk PSE itself doesn’t have to deal with rendering, which is pushed off to a browser, making it far more flexible, powerful and simpler than existing presentation software. This means you can show figures or play video in your presentation, or use Javascript to make your slides resolution-independent or to add animations. Additionally you can use any existing tool you want to write HTML. (If you’re like me, that tool will be emacs.)

You’ll be able to download Tech Talk PSE after my talk in two weeks time, or get early previews from my git repository. Requirements are Perl, Perl Gtk2 and Gtk2::MozEmbed.

7 Comments

Filed under Uncategorized