I was writing a little program to track monthly outgoings. “Only” £30/month for internet access or whatever can quickly add up …
But what format should I save the data in? XML is heavyweight and redundant compared to S-expressions. Compare:
<outgoing rate="monthly"> <price>30.</price> <name>Internet</name> </outgoing>
(outgoing (rate monthly) (price 30.) (name "Internet"))
(Update: fixed XML x 2)
One difference I always notice is the redundancy of attributes like rate="monthly". S-expressions let you decide later to give the attribute more structure, but with XML you’re stuck with a simple string unless you make an incompatible change to the schema.
Another difference is that S-expressions are typed: 30. is a float and “Internet” is a string. XML is all just strings, which sucks when your language is typed.
On the other hand, this article makes a good argument that XML is not (and is better than) S-expressions. More debate here.
A killer feature of OCaml is the sexplib syntax extension, which makes S-expressions really easy. You just define any OCaml type in the usual way and add with sexp after it, and that magically generates serializer and deserializer functions for your type, so you can slurp your data into and out of S-expression files effortlessly. A page of boilerplate disappears in just two words. That’s probably the reason why I’ll go with S-expressions for this.
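To give a feel for the boilerplate that disappears, here is a hand-written sketch of the kind of converters the extension would otherwise generate for you. It uses only the standard library: the sexp type below is a minimal stand-in for sexplib's own, and the rate and outgoing types, their field names and the to_string printer are assumptions made for illustration, not sexplib's actual output.

```ocaml
(* A tiny S-expression type, standing in for the one sexplib provides. *)
type sexp = Atom of string | List of sexp list

(* Hypothetical types for the outgoings tracker. *)
type rate = Monthly | Yearly

type outgoing = { rate : rate; price : float; name : string }

(* Hand-written converters, one pair per type. *)
let sexp_of_rate = function
  | Monthly -> Atom "monthly"
  | Yearly -> Atom "yearly"

let rate_of_sexp = function
  | Atom "monthly" -> Monthly
  | Atom "yearly" -> Yearly
  | _ -> failwith "rate_of_sexp: expected monthly or yearly"

let sexp_of_outgoing o =
  List [ Atom "outgoing";
         List [ Atom "rate"; sexp_of_rate o.rate ];
         List [ Atom "price"; Atom (string_of_float o.price) ];
         List [ Atom "name"; Atom o.name ] ]

(* This sketch only accepts the fields in the order written above. *)
let outgoing_of_sexp = function
  | List [ Atom "outgoing";
           List [ Atom "rate"; r ];
           List [ Atom "price"; Atom p ];
           List [ Atom "name"; Atom n ] ] ->
      { rate = rate_of_sexp r; price = float_of_string p; name = n }
  | _ -> failwith "outgoing_of_sexp: unexpected shape"

(* Naive printer: atoms are written bare, so strings containing
   spaces or parentheses are not quoted or escaped. *)
let rec to_string = function
  | Atom s -> s
  | List l -> "(" ^ String.concat " " (List.map to_string l) ^ ")"

let () =
  let internet = { rate = Monthly; price = 30.; name = "Internet" } in
  print_endline (to_string (sexp_of_outgoing internet))
```

Every new field or type means touching both directions by hand, which is exactly the chore the two words take away.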
14 responses to “XML or S-expressions?”
My view is that if you have a scalar value, make it an attribute; if you have a more complicated value, make it an element. Other people make everything an element with character data, because it is easier to write a DTD for that and it leaves the flexibility to give the value more structure later.
Remember that in S-expressions attributes are only a convention, whereas in XML they are defined as attributes and exist as such. Your S-expressions are effectively attribute-less because there is no consistent definition; the bonus is that you can store just any structure.
When you hit graphs or references you’re screwed in both worlds. XML has some references but it is pretty hokey-joe (excuse the expression).
Note that what you have on the left is not even xml, it has to be even more verbose than that.
I don’t buy your arguments about typedness. In any case this will end up as a string and the underlying software will have to parse it to convert it to the right datatype. Sexps are, however, much better behaved when it comes to whitespace management (because they don’t conflate “text” and data).
If you have to enter the entries by hand I would suggest using a friendlier textual format. Have a look at ledger (http://joyful.com/repos/ledger/doc/ledger.html) or hledger (http://hledger.org).
If not then the only advantage I see with xml is that it *may* make it easier for your users to reuse their data.
P.S. If you define your own XML language, make it easy to parse: mandate a fixed tag ordering, avoid attributes altogether and treat an element’s data like attribute data (whitespace stripping) unless the element’s content is real text (in which case use the xml:space attribute on the element). In effect that makes your XML a more verbose version of sexps. With that and a good XML parser it shouldn’t be hard to IO the data to your internal representation (but harder than adding “with sexp”).
Forgot to add, if your data is not hierarchical but flat (tree vs table), use neither xml nor sexps but comma separated values (and eat your own dog food).
Thanks Daniel, my obviously broken XML is now fixed in the article 🙂
You can also extract bits of an sexp without having to convert it back to an ocaml type if you want, using Sexplib.Path.get. This can be pretty handy at times when you aren’t exactly sure of the type of the sexp and only need part of the data.
Not completely fixed. You must write something like:
30 or or maybe that’s what you wanted to write :
Mmmh this blogging software doesn’t seem to know how to escape markup. Here are the two missing lines, hope it works :
<price>30</price> or <price value=’30’>
<outgoing rate=’monthly’ price=’30’ name=’internet’>
Well… forgot the ; in the second tag and the two last tags need to be closed with a / before >.
Yeah, I was getting really confused by the XML there. This possibly proves a point about sexprs being easier though, I’m not sure …
You may use Lua as well. E.g.:
rate = 'monthly',
price = 30,
name = 'Internet'
Semantic Web RDF/OWL format can also be of help here. In RDF, data is represented as a triple, subject predicate object.
In your case,
http://example.org/some_id is_of_type OUTGOING
http://example.org/some_id rate MONTHLY
http://example.org/some_id price 30.0
http://example.org/some_id name INTERNET
Note I am assuming Objects of a specific type but they can be literal values also.
Most of the cool kids use JSON these days. Just saying…
mpdehaan, hmm yes, well I’m not one who follows fashion so much 🙂
Ooops, previous link was to the JSON library. This is the JSON syntax extension.