What are the relative merits of CSV, JSON and XML for a REST API?
We're currently planning a new API for an application and debating the various data formats we should use for interchange. There's a fairly intense discussion going on about the relative merits of CSV, JSON and XML.
Basically, the crux of the argument is whether we should support CSV at all, given its lack of recursion (i.e. fetching a document which has multiple authors and multiple references would require multiple API calls to obtain all the information).
I'm interested in the experiences you've had working with information from Web APIs, and in anything we can do to make life easier for the developers working with our API.
We've decided to provide XML and JSON, because the lack of recursion in CSV would mean multiple calls for a single logical operation. JSON doesn't have a parser in Qt and Protocol Buffers doesn't seem to have a non-alpha PHP implementation, so they are out for the moment too but will probably be supported eventually.
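To make the "multiple calls" problem concrete, here is a minimal Python sketch. The document record and its field names are hypothetical, not taken from the actual API. It shows one nested JSON body next to the three flat CSV tables needed to carry the same information.

```python
import csv
import io
import json

# Hypothetical document record; field names are illustrative only.
document = {
    "id": 1,
    "title": "Example document",
    "authors": ["A. Author", "B. Writer"],
    "references": ["Ref one", "Ref two", "Ref three"],
}

# JSON carries the nested lists in a single response body.
json_body = json.dumps(document)


def to_csv(header, rows):
    """Serialise one flat table to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# CSV needs one flat table per repeated field, which in a REST API
# typically means one resource (and one call) per table.
doc_id = document["id"]
documents_csv = to_csv(["id", "title"], [[doc_id, document["title"]]])
authors_csv = to_csv(["document_id", "author"],
                     [[doc_id, name] for name in document["authors"]])
references_csv = to_csv(["document_id", "reference"],
                        [[doc_id, ref] for ref in document["references"]])
```

The client then has to join the three tables on `document_id` itself, which is the extra work the nested formats avoid.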
CSV is right out. JSON is a more compact object notation than XML, so if you're looking for high volumes it has the advantage. XML has wider market penetration (I love that phrase) and is supported by all programming languages and their core frameworks. JSON is getting there (if not already there).
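A rough way to check the compactness claim on your own data is to serialise the same record both ways and compare byte counts. This is only a sketch: the record and the XML element names are made up, and the ratio depends heavily on the data and on the element names chosen.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; field and element names are illustrative only.
record = {"id": 1, "title": "Example document",
          "authors": ["A. Author", "B. Writer"]}

# Compact JSON encoding (no whitespace between tokens).
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# An element-per-field XML encoding of the same record.
root = ET.Element("document")
ET.SubElement(root, "id").text = str(record["id"])
ET.SubElement(root, "title").text = record["title"]
authors = ET.SubElement(root, "authors")
for name in record["authors"]:
    ET.SubElement(authors, "author").text = name
xml_bytes = ET.tostring(root, encoding="utf-8")

print(len(json_bytes), "bytes as JSON")
print(len(xml_bytes), "bytes as XML")
```

On this sample the JSON body is the smaller of the two, mostly because XML repeats every element name in a closing tag.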
Personally, I like the brackets. I would bet more devs are comfortable working with XML data than with JSON.
Advantages:
- XML - Lots of libraries, devs are familiar with it, XSLT, can be easily validated by both client and server (XSD, DTD), hierarchical data
- JSON - Easily interpreted on the client side, compact notation, hierarchical data
- CSV - Opens in Excel(?)
Disadvantages:
- JSON - If used improperly it can pose a security hole (don't use eval), and not all languages have libraries to interpret it.
- CSV - Does not support hierarchical data, you'd be the only one doing it, and it's actually much harder than most devs think to parse valid CSV files (CSV values can contain new lines as long as they are between quotes, etc.).
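The point about CSV being harder to parse than it looks can be illustrated with a small Python sketch. The sample data is invented. A naive split on newlines and commas breaks on a quoted field that contains a line break or a comma, while a real CSV parser handles both.

```python
import csv
import io

# Invented sample: one field contains a newline, another contains
# a comma and doubled (escaped) quotes.
raw = (
    'id,comment\n'
    '1,"first line\nsecond line"\n'
    '2,"says ""hi"", then leaves"\n'
)

# Naive approach: split on line breaks, then on commas.
naive_rows = [line.split(",") for line in raw.splitlines()]

# Using the standard library's CSV parser instead.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows), "rows from the naive split")
print(len(parsed_rows), "rows from csv.reader")
```

The naive version reports four rows and splits the last comment in two; `csv.reader` returns the header plus the two intended records.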
Given the above, I wouldn't even bother supporting CSV. The client can generate it from either XML or JSON if it's really needed.
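As a sketch of what "generate it on the client" might look like, assuming a hypothetical JSON response shaped as a list of documents, a client that needs CSV can flatten the records itself and decide how to squash the nested fields:

```python
import csv
import io
import json

# Hypothetical JSON response body; the shape is an assumption.
response_body = (
    '[{"id": 1, "title": "First", "authors": ["A", "B"]},'
    ' {"id": 2, "title": "Second", "authors": ["C"]}]'
)


def json_to_csv(body):
    """Flatten a list of document records into CSV text.

    The nested author list is joined into one cell; that is a
    client-side choice, not something the API has to dictate.
    """
    records = json.loads(body)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "title", "authors"])
    for rec in records:
        writer.writerow([rec["id"], rec["title"], "; ".join(rec["authors"])])
    return buf.getvalue()


csv_text = json_to_csv(response_body)
print(csv_text)
```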
CSV has so many problems as a complex data model that I wouldn't use it. XML is very flexible and easy to program with; clients will have no problem coding XML generators and parsers, and you can even provide sample parsers using SAX.
Have you checked out Google's network data format? It's called Protocol Buffers. I don't know whether it is useful for a REST service, however, as it skips that whole HTTP layer too.
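A sample SAX parser of the kind mentioned above could be as small as the following Python sketch. The `<document>`/`<author>` element names are assumptions about what the API's XML might look like.

```python
import xml.sax


class AuthorHandler(xml.sax.ContentHandler):
    """Collects the text of every <author> element (hypothetical schema)."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self._in_author = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "author":
            self._in_author = True
            self._chunks = []

    def characters(self, content):
        # SAX may deliver one text node in several pieces.
        if self._in_author:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "author":
            self.authors.append("".join(self._chunks))
            self._in_author = False


sample = (
    b"<document><title>Example</title><authors>"
    b"<author>A. Author</author><author>B. Writer</author>"
    b"</authors></document>"
)
handler = AuthorHandler()
xml.sax.parseString(sample, handler)
print(handler.authors)
```

SAX streams events rather than building a tree, so a handler like this keeps only the state it needs, which suits large responses.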
XML can be a bit heavyweight at times. JSON is quite nice, though: it has good language support, and JSON data can be translated directly to native objects on many platforms.
I don't have any experience with JSON. CSV works up to a point when your data is very tabular and evenly structured. XML can become unwieldy very quickly, especially if you don't have a tool that creates the bindings to your objects automatically.
I have not tried this either, but Google's Protocol Buffers look really good: a simple format that creates automatic bindings to C++, Java and Python and implements serialisation and deserialisation of the created objects.
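For instance, in Python the standard `json` module maps a JSON body straight onto dicts, lists, strings, numbers, booleans and `None`. The payload below is invented for illustration. Because it is a parser rather than an evaluator, input that is not valid JSON is rejected instead of being executed.

```python
import json

# Invented payload for illustration.
body = (
    '{"id": 1, "title": "Example", "authors": ["A. Author", "B. Writer"],'
    ' "published": true, "score": null}'
)

doc = json.loads(body)  # object -> dict, array -> list, true -> True, null -> None
print(type(doc).__name__, type(doc["authors"]).__name__)

# A dedicated parser refuses anything that is not JSON.
try:
    json.loads("__import__('os')")
    rejected = False
except json.JSONDecodeError:
    rejected = True
```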
Aside from what Allain Lalonde already said, one additional advantage of CSV is that it tends to be more compact than XML or even JSON. So, if your data is strictly tabular, with a completely flat hierarchy, CSV may be a correct choice. Additional disadvantages of CSV are that it may use different delimiters and decimal separators, depending on which tool (and even country!) generated it.
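The delimiter and decimal-separator problem can be sketched as follows. Both exports are invented samples, and `read_prices` is a hypothetical helper. The same table needs different parsing settings depending on where it was produced, and nothing in the file itself says which ones apply.

```python
import csv
import io

# Two invented exports of the same table.
comma_export = "id,price\n1,1234.5\n"        # comma delimiter, decimal point
semicolon_export = "id;price\n1;1234,5\n"    # semicolon delimiter, decimal comma


def read_prices(text, delimiter, decimal):
    """Parse the price column given the file's conventions.

    The caller has to know both settings out of band.
    """
    rows = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [float(row["price"].replace(decimal, ".")) for row in rows]


prices_a = read_prices(comma_export, delimiter=",", decimal=".")
prices_b = read_prices(semicolon_export, delimiter=";", decimal=",")
print(prices_a, prices_b)
```

An API that does offer CSV would therefore need to document a single fixed dialect rather than leave it to whatever tool generates the file.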