Half-baked ideas: git commit dependency analysis

For more half-baked ideas, see my ideas tag

Consider these three git commits A, B and C:

As far as git’s linear history is concerned, they are three independent commits and git only records the parent relationship A -> B -> C.

But could I cherry pick just patch C, omitting A and B? git won’t tell me, but I can examine the patches themselves (A, B, C) and answer this question (the answer is no, since the change in C depends on both changes A and B being applied first). The real dependency tree looks like this:

  ^ ^
 /   \
A --> B

Many other real dependency trees could have been possible. With another choice of A, B and C these might have been completely independent of each other, or (A, B) might have to be applied together, with C being independent.

The half-baked idea is whether we can write an automatic tool which can untangle these dependencies from the raw git commits? (Or whether such a tool exists already, I cannot find one)

There would be one important practical use for such a tool. When cherry picking commits for the stable branch, I would like to know which previous commits that the commit I’m trying to apply depends on. This gives me extra information: I can decide that applying this commit is too disruptive — perhaps it depends on an earlier feature which I don’t want to add. I can decide to go back and apply the older commits, or that a manual backport is the best way.

The information you can derive from patches doesn’t tell the whole story. There are two particular problems, one revealed by the choice of patches A, B and C above. With a trivial change to B, it is possible to apply A and B independently. There is only a dependency A -> B because a little bit of the context of patch A “leaked” into patch B. It is also possible for two features to logically depend on each other, but not overlap in any way: Consider the case where you add a log collector and log file processor in separate commits. The log file processor might be completely useless without the log collector, but the commits could appear completely independent if you just examine the patches.


Filed under Uncategorized

6 responses to “Half-baked ideas: git commit dependency analysis

  1. foo

    Please bake this idea and stick it into git cherry-pick!

  2. Gabriel

    Convert your git repository to darcs (there is a fast-import tool available) and you will get patch commutation (and dependency analysis) for free!

  3. I recently implemented a tool that can analyze a piece of linear history for (obvious) dependencies: https://github.com/matthijskooijman/patchdeps

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.