Half-baked idea: Standard machine readable output for command line programs

It’s common, perhaps increasingly common, that one program needs to consume the output of another. This is the Unix philosophy — small, single purpose programs assembled together to carry out a more complex task.

However it’s not necessarily superior to alternative ways of composing programs, like COM or D-Bus.

There are two particular problems: 1. How do you find out if a particular feature is supported by the program? 2. How do you parse the output of the program (e.g. to find progress bars or error messages)?

As a concrete example, let’s consider a program I wrote called virt-resize. 1. How do I find out if the version of virt-resize I have supports btrfs? 2. If I want to drive virt-resize from a graphical program, how can I parse the text progress bars that virt-resize prints?

For question 1, typical Unix programs take several approaches:

  1. Ignore the problem completely. Just blindly use the feature, and fail in unpredictable ways if it doesn’t work. This is probably the most popular “strategy”. People who write shell scripts tend to do this all the time. Shell scripts are often either unportable, or end up looking like “configure” because they try to use a very conservative subset of POSIX.
  2. Run the program, if it fails, run the program a bit differently (and if it fails, a bit differently again, …).
  3. Attempt to parse the output of program --help. This depends on help output being stable, when maybe it isn’t, so you end up chasing the upstream project.
  4. Parse program --version and work out if the feature was supported in that particular version. This is not very scalable, and doesn’t work with backports.

Question 2, how to get errors and progress bars, is usually too hard, unless the program offers a special “machine readable” flag (notable examples: rpm, parted).

Here’s my half-baked idea: We should standardize the way that program capabilities, help, progress bars, and error output are done.

An additional option is added to programs that support this:

$ program --machine-readable [...]

Programs that don’t support this, including earlier versions of programs that only gained support later, ought to fail with an error when given this option. That failure is how the caller learns that machine-readable output is not available.

Firstly, the caller just runs the program with this option on its own, and no other options:

$ program --machine-readable
resize-btrfs
resize-ext3
lv-expand
progress-bars

The program just prints out a list of capabilities, one per line. The capability names themselves have no defined format (they are a contract between the program and the caller). The program then exits with status 0. Using this option should not cause the program to perform any other action.
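From the caller's side, this probing step could be sketched in Python roughly as follows. The helper name `probe_capabilities` is hypothetical, and the sketch assumes only the convention described above: exit status 0 and one capability per line on stdout. Any failure, including the program not being installed, is treated as "no machine-readable support".

```python
import subprocess


def probe_capabilities(program):
    """Return the set of capabilities a program advertises, or None.

    Hypothetical helper: assumes the convention proposed in this post,
    i.e. `program --machine-readable` exits 0 and prints one capability
    per line. None means the option (or the program) is unavailable.
    `program` may be a name/path or an argv-style list.
    """
    argv = [program] if isinstance(program, str) else list(program)
    try:
        result = subprocess.run(
            argv + ["--machine-readable"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return None  # program not installed or not executable
    if result.returncode != 0:
        return None  # option rejected: no machine-readable support
    # Ignore blank lines; each remaining line is one capability name.
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}
```

A caller could then write something like `caps = probe_capabilities("virt-resize")` and test `caps is not None and "resize-btrfs" in caps` before relying on btrfs support.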

Secondly, if this was successful, the caller can use this option in combination with other options to produce machine-readable output. At least one other option or command line argument is required for this to work.

I would like to suggest the following standards for version numbers, progress bars and error messages.

Machine-readable version numbers are sent to stdout and have the form “program-name version”, where “program-name” should be one word. This is no different from how most GNU programs work:

$ program --version
program 0.1.2
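Because “program-name” is a single word, a caller can split the line on the first run of whitespace. A minimal Python sketch (the function name `parse_version_line` is made up for illustration, and the version is deliberately left as an opaque string):

```python
def parse_version_line(line):
    """Split a 'program-name version' line into (name, version).

    Assumes the form suggested above: a one-word program name, then
    whitespace, then the version. Raises ValueError if either part is
    missing. The version is returned as a string and is not interpreted.
    """
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError("not a 'program-name version' line: %r" % line)
    name, version = parts
    return name, version
```

For the example above, `parse_version_line("program 0.1.2")` returns `("program", "0.1.2")`.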

Machine-readable progress bars are sent to stdout and have the form (example) “[10/100]”:

$ program --machine-readable foo
[0/100]
[1/100]
[2/100]
etc.
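A graphical front end could pick these lines out of stdout with a small parser. The following Python sketch assumes exactly the “[done/total]” form shown above, one update per line; the names `PROGRESS_RE` and `parse_progress` are hypothetical.

```python
import re

# Matches a whole line of the form "[10/100]" and nothing else.
PROGRESS_RE = re.compile(r"^\[(\d+)/(\d+)\]$")


def parse_progress(line):
    """Return (done, total) for a '[done/total]' line, else None."""
    match = PROGRESS_RE.match(line.strip())
    if match is None:
        return None  # ordinary output, not a progress update
    return int(match.group(1)), int(match.group(2))
```

Here `parse_progress("[10/100]")` returns `(10, 100)`, while any other line returns `None` and can be handled as normal output.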

Error messages are anything sent to stderr when the status code of the program is non-zero. This is, of course, no change from standard Unix.

$ program --machine-readable foo
foo: File not found
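
On the calling side, the rule above amounts to: capture stderr, and treat it as an error message only if the exit status is non-zero. A rough Python sketch (the function name `run_collecting_error` is an assumption for illustration, not part of any existing tool):

```python
import subprocess


def run_collecting_error(argv):
    """Run argv and return (status, stdout, error).

    Following the rule in this post, whatever was written to stderr
    counts as the error message only when the exit status is non-zero;
    on success, error is None even if stderr was not empty.
    """
    result = subprocess.run(
        list(argv),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    error = result.stderr.strip() if result.returncode != 0 else None
    return result.returncode, result.stdout, error
```

A caller would pass the full command line, for example `["program", "--machine-readable", "foo"]`, and show `error` to the user when it is not `None`.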

24 responses to “Half-baked idea: Standard machine readable output for command line programs”

  1. bochecha

    But why scrape the output of a program?

    That seems fragile and ugly, like parsing html to get an info out of a web page. 😡

    And it pains me every time I have to “pipe grep pipe awk pipe sed” out of a command line tool to isolate the information I need.

    Programs should rather provide reusable APIs.

    • rich

      Given that currently we’re “scraping” the output of “qemu -help”, having the output in a predictable form is an improvement.

      You don’t define what you mean by “reusable APIs”, so I can’t really comment on that point.

  2. What would be nice is if the shell had a bit more of an understanding of generic output from these tools. Then all the tools could emit a standardised format (let’s say JSON, even though it’s a fairly poor standard), so we needn’t use params like -0, -print0 or --machine-readable, or any of this second guessing, to tell the tools how to output. It just works.
    ls would gather a list of files and output that to whatever stream is listening. If it’s the shell sat on a term, it knows how to output in formatted text.

    I wrote similar thoughts when TermKit was announced, ‘@johndrinkwater TermKit looks nice, but only for the mime / json IPC. not the front end display :/ http://acko.net/blog/on-termkit’
    ‘@johndrinkwater seem to remember @bkuhn & @jdub having an identica conversation about future of shell, std(in|out|err) being json ala #termkit #xmlterm’
    That convo was http://identi.ca/conversation/34128300

    • rich

      If we standardize on something, please not JSON. JSON doesn’t even define what it means to be an integer. It’s wholly unsuitable as a way to exchange data reliably.

    • The shell has nothing whatsoever to do with passing data between processes. The problem cannot and should not be fixed by turning the shell into a router in the middle of interactions between every pair of communicating processes.

      • Actually I apologize, or “apologize if appropriate” I guess, because I may be misinterpreting what you’ve suggested.

        The shell (in my opinion, of course) shouldn’t be actively sticking its fingers between processes. That would be a fairly radical transformation of the semantics of the pipe as interprocess channel. I don’t think that other solutions are bad, but we’re no longer talking about the Unix model if we have pipes that are “active” somehow. (Makes my head hurt, at least.)

        Now if what you meant was that the shell script would be the place where the vagaries of differing applications were hashed out, then I agree of course (because that’s kind-of what the nuts-and-bolts of shell script construction involve already).

      • You are misinterpreting slightly. The problem with tying all these applications together with a descriptive output format is that you either need to write duplicate code to format it for stdout, or have something generic parse it when it’s viewed by term. Not sure if that should be shell or not (I’m not too familiar with the stack), but pushing stuff like filesize byte formatting into an external shim makes for better localisation and configuration.
        Something like `ls | humanreadable` with the last command being invisible to user.
        It would not get between processes, nor make pipes ‘active’.

  3. Though, in the case of wanting a version number, link to website/code, licence, description: I really do wish binaries contained more in their ELF header for this purpose.

  4. Kevin Granade

    An even less-baked idea I had in reaction to yours, perhaps a package manager could grow a query-able capabilities database. This of course doesn’t address error reporting, progress, etc.

    (quibble) I’d argue that if the output isn’t standardized, there’s no benefit to the argument being standardized either. If you already have to write custom handlers for each program you want to talk to, the additional overhead to add ‘machine-readable-switches[foo-program] = foo-switch’ is negligible. This is more arguing for output standardization rather than against the idea as a whole.

    • rich

      I think once you have a package manager, you can enforce capabilities through hard dependencies on version numbers. The distro as a whole ensures that version numbers are usable for this — it’s what we do in Fedora and RHEL.

      The problem is in the looser upstream world, where your software may have to exist in whatever environment it finds itself.

  5. Thoughts:

    * Version of a particular utility is not nearly as interesting as the version of its output format. If this is a problem worth solving, then the first step is to stop focusing on the utility applications themselves and start focusing on their output as things to be standardized. Then the tools can progress independently of the communications formats.

    * I would argue that the most powerful enhancement that could be made would be to have utility applications model their available outputs in such a way as to make it simple for their output to be exactly described by format strings on the command line. Several tools already do this: think about “date” and “find” (the “-printf” option). With facilities like that, what a shell script can do is provide the glue to mate the output of one utility to the exact input “shape” for the next utility in the pipe chain.

  6. mcandre

    As a CLI lover, I try my best to combine my APIs with usable CLI apps. That way, other CLI lovers can just run ./api.py --func, and GUI developers can do import api; api.func() to get the same functionality. It takes so long to describe this pattern that I’ve given it a name: “scripted main”.

    Cheers!
    https://github.com/mcandre/scriptedmain

  7. sidfarkus

    Sounds like you just invented a weaker version of Windows Powershell.

  8. c janscen

    I once wanted to create a whole ecosystem of tools that output YAML, which is nice because its machine readable output is also very human readable (like your above examples).

    YAML would also introduce types: strings, integers, and aggregations of types: lists, maps, etc.

    Then you could iterate over the output and open up lots of possibilities. Of course, to truly harness this system you need a shell language which would share the types and could iterate. It gets complicated fast.

    Someday, I’ll write all these in Haskell

  9. I was thinking about doing this, and I don’t see why would you not use JSON. Remember, Worse Is Better.

    For example, here is a sample df output:

    Filesystem 1K-blocks Used Available Use% Mounted on
    none 492228 264 491964 1% /dev
    none 496472 1796 494676 1% /dev/shm
    none 496472 344 496128 1% /var/run
    none 496472 0 496472 0% /var/lock

    And here is how it would look in JSON:

    {"filesystem": "none", "onek_blocks": 492228, "used": 264, "available": 491964, "use_percent": 1, "mounted_on": "/dev"}
    {"filesystem": "none", "onek_blocks": 496472, "used": 1796, "available": 494676, "use_percent": 1, "mounted_on": "/dev/shm"}
    {"filesystem": "none", "onek_blocks": 496472, "used": 344, "available": 496128, "use_percent": 1, "mounted_on": "/var/run"}
    {"filesystem": "none", "onek_blocks": 496472, "used": 0, "available": 496472, "use_percent": 0, "mounted_on": "/var/lock"}

    Then you could observe this as a table and do SQL-like things to it. For example:

    df | where use_percent -lt 1

    to return only {"filesystem": "none", "onek_blocks": 496472, "used": 0, "available": 496472, "use_percent": 0, "mounted_on": "/var/lock"}

    You don’t even have to have --machine-readable, a program can recognize whether it outputs to TTY and offer JSON if not.

    • rich

      The problem is that JSON isn’t defined. Undefined isn’t better.

      Anyway, compare:
      virt-df --csv format.

    • The problem with JSON relative to YAML is that JSON does not have a way to escape arbitrary bytes. JSON strings have to be Unicode strings.

      One concern I have about JSON and YAML is the relationship between ‘object’ parsing and streaming. An ‘object’ is an all-or-nothing kind of a thing — you have the keys or you don’t. Whereas, say you did away with objects and had plain rose trees instead: these are streamable and serve much the same purpose, in that they can represent hierarchical things.

  10. DDD

    Yay – Windows Powershell 🙂 Return an object instance with appropriate methods to display on screen or pass into another command line…

  11. ted stockwell

    RDF is the ultimate machine readable format so I would choose to output RDF, but I would use Turtle syntax (it’s like JSON for RDF; Turtle is much more human readable than the XML syntax for RDF).
    The output from programs would be a stream of Turtle statements.
    The statements can be stored by the consuming program and queried with SPARQL. There are lots of RDF libraries available that would make this easy.

  12. Chuck

    This is precisely what PowerShell does. It works because it standardizes on a rich object model they can all communicate with.

    What text format would you standardize on? XML? JSON? BSON? YAML? I see others have already come up with even more obscure format suggestions. I guarantee you’ll never get agreement on this.

    The silver lining to this pessimistic cloud however is that you might not have to get agreement: simply communicate the content-type in the output, and for this, you could just go with MIME. You could even communicate content negotiation over the pipes to get a preferred format.

    I still think doing textual format conversions on output is vastly inferior to a systemwide object model ala PowerShell, but at least you’d have something to go on.

  13. Chuck

    Another thing you might want to consider if you take a stab at doing this is not to require a new command-line option to enable parseable output formats, but to possibly look at an environment variable instead.

    While you’re tinkering with the way unix programs produce and consume I/O, you might also want to consider how you’d put a shell together for this. One thing to look at is reconsidering the lowly pipe. Pipes are great things sure, powerful and flexible things, but they’re also just about the most primitive thing that can glue two fd’s together, and there’s no reason a high level thing like a shell shouldn’t be able to express more powerful IPC mechanisms like message queues or sockets with shell literal syntax. If you rethink not just the output formats, but the glue itself, then you can have your object-oriented powershell and a text-oriented classic shell all at the same time.
