Half-baked ideas: classify git commits and rule-based branch building

For more half-baked ideas, see my ideas tag.

This idea is an evolution of the previous idea for git commit dependency analysis.

Let’s say we have a development branch and a stable branch. Lots of bug fixes, features and so on go into the development branch, but for stable users we want to construct a branch containing only the stable, well-tested, obvious bug fixes. (It’s no coincidence that this sounds a lot like the scheme we use in libguestfs, but it’s also the basic scheme that Red Hat use to construct RHEL.)

What you end up doing is manually classifying the commits from the development branch, and cherry-picking the ones which look safe back to your stable branch. And you probably have a bunch of rules about what you want in your stable branch, such as “must fix a customer bug” or “must have had more than X months of testing in Fedora”.

A better idea would be to annotate the development commits with labels:

0d6fd9e fuse: Fix getxattr, listxattr calls and add a regression te
  labels: bugfix, fuse
  depends: 4e529e0

4e529e0 fish: fuse: Add -m dev:mnt:opts to allow mount options to b
  labels: feature, fish, fuse

feaddb0 roadmap: Move QMP to 'beyond 1.10'.
  labels: documentation

b8724e2 Open release notes for version 1.10.0.
  labels: documentation

e751293 ruby: Don't segfault if callbacks throw exceptions (RHBZ#664
  labels: bugfix, ruby
  depends: 6a64114

a0e3b21 RHEL 5: Use mke4fs on RHEL 5 as replacement for mke2fs.
  labels: bugfix, rhel5
  depends: 227bea6

(Before anyone jumps in here, Markus Armbruster pointed out to me that git has a great feature called git notes that makes the labelling part rather easy).

Now you need a rule as to what you’re going to allow into your stable branch, eg:

labels.contains ('bugfix') &&
  forall c in depends: c is in stable

Now your stable branch can be constructed entirely algorithmically. In fact, making many stable branches each with a slightly different emphasis (“bug fixes only”, “doc changes only”, “well-tested new features”, etc.) can be done automatically and algorithmically just by having more rules.

Update Here is a reply from Johan Herland who is one of the authors of git-notes. Thanks Johan for giving me permission to reproduce this email.

IMHO, using notes like you outline in the blog post makes a lot of sense. Attaching text strings as notes to the relevant commit is exactly what git-notes were created for. Storing simple strings for labels, and commit SHA1s (use full 40-char SHA1 sums to prevent future collisions) for dependencies sounds like the best plan.

A couple of different ways to encode this:

A. Place labels in one notes ref (refs/notes/labels), and dependencies in another notes ref (refs/notes/deps). The format of notes is simply one label/dependency per line, e.g.:

  $ git notes --ref=labels show HEAD
  $ git notes --ref=deps show HEAD

B. Place labels and deps in the same notes ref, and use a simple email-style header format, e.g.:

  $ git notes --ref=foo show HEAD
  Labels: bugfix, fuse, ruby
  Dependencies: 4e529e0, b8724e2

Either format is extensible: In (A), if you add a new data type, simply add a new notes ref; in (B) simply add a new header field name.

Whatever you feel most comfortable parsing/maintaining is probably the best choice for you.



Filed under Uncategorized

3 responses to “Half-baked ideas: classify git commits and rule-based branch building

  1. We use a similar annotation workflow (and a half-baked little tool) to automatically generate release notes in the upstream Condor project: see this post about the “gliss” tool and this Condor wiki page about the workflow.

    It wouldn’t be hard to use gliss to get a list of SHAs to cherry-pick as well.

    • rich

      Nice, but I think it would be better if it used git notes rather than (I assume) commit messages. The point here is that we’d like to go back at a later date and classify or adjust the classification of commits. We often don’t know if a commit is really good until it’s been out in the field and tested for some time.

      • I agree completely regarding the desirability of “git notes,” or (more generally), a notes-like approach that allows multiple independent annotations after the fact of a commit, perhaps by having annotations in multiple branches with easily-distinguished names. (We aren’t using notes in our case because we can’t be sure that everyone who’d need to use this functionality has a sufficiently-modern version of git installed.)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.