> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Bill Dueber
> Sent: Friday, March 05, 2010 01:59 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Q: XML2JSON converter
> 
> On Fri, Mar 5, 2010 at 1:10 PM, Houghton,Andrew <[log in to unmask]>
> wrote:
> 
> >
> > I decided to stick closer to a MARC-XML type definition since it would be
> > easier to explain how the two specifications are related, rather than take a
> > more radical approach in producing a specification less familiar. Not to
> > say that other approaches are bad, they just have different advantages and
> > disadvantages.  I was going for simple and familiar.
> >
> >
> That makes sense, but please consider adding a format/version (which we get
> in MARC-XML from the namespace and isn't present here). In fact, please
> consider adding a format / version / URI, so people know what they've got.

This sounds reasonable, and I'll consider adding it to our specification.

> I'm also going to again push the newline-delimited-json stuff. The
> collection-as-array is simple and very clean, but leads to trouble
> for production (where for most of us we'd have to get the whole
> freakin' collection in memory first ...

As far as our MARC-JSON specification is concerned, a server application can return either a collection or a record. This mirrors the MARC-XML specification, where either the collection or the record element can serve as the document element.
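
A client that accepts either form might normalize the payload before processing it. The following is only a minimal sketch, assuming the collection is a JSON array (the "collection-as-array" form mentioned above) and a single record is a JSON object. The function name and the "leader" key in the usage example are hypothetical and are not taken from the specification.

```python
import json


def records_from_payload(text):
    """Return a list of records from a payload that is either a
    collection (assumed: JSON array) or one record (assumed: JSON object)."""
    payload = json.loads(text)
    if isinstance(payload, list):
        # Collection as the top-level value: already a list of records.
        return payload
    if isinstance(payload, dict):
        # Single record as the top-level value: wrap it in a list.
        return [payload]
    raise ValueError("expected a JSON array (collection) or object (record)")
```

With this, `records_from_payload('{"leader": "x"}')` and `records_from_payload('[{"leader": "x"}]')` both give a one-element list, so the rest of the client does not need to care which form the server chose.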

> Unless, of course, writing json to a stream and reading json from a stream
> is a lot easier than I make it out to be across a variety of languages and I
> just don't know it, which is entirely possible. The streaming writer
> interfaces for Perl
> (http://search.cpan.org/dist/JSON-Streaming-Writer/lib/JSON/Streaming/Writer.pm)
> and Java's Jackson
> (http://wiki.fasterxml.com/JacksonInFiveMinutes#Streaming_API_Example)
> are a little more daunting than I'd like them to be.
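
For comparison, the newline-delimited approach mentioned above can be sketched without a streaming JSON library, because each line is a complete JSON document on its own. This is a rough illustration in Python using only the standard json module. The function names are made up for the example, and it assumes one record per line with no pretty-printing.

```python
import io
import json


def write_ndjson(records, stream):
    """Write each record as one JSON document per line."""
    for record in records:
        # json.dumps without indent emits no raw newlines, so one
        # record always occupies exactly one line.
        stream.write(json.dumps(record) + "\n")


def read_ndjson(stream):
    """Yield records one at a time; only the current line is held in memory."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Round trip through an in-memory stream.
buffer = io.StringIO()
write_ndjson([{"id": 1}, {"id": 2}], buffer)
buffer.seek(0)
round_trip = list(read_ndjson(buffer))
```

The trade-off is that the file as a whole is no longer a single valid JSON document, so a client has to know to read it line by line.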

As you point out, JSON streaming doesn't work with all clients, and I am hesitant to build on anything that not every client can accept.  I think part of the issue here is proper API design.  Sending tens of megabytes back to a client and expecting it to process them seems like poor API design, regardless of whether the client can stream the response.  It might make more sense to have a server API send back 10 of our MARC-JSON records in a JSON collection and have the client request additional batches until it has the whole result set.  In addition, if I remember correctly, JSON streaming and other streaming methods keep the connection to the server open, which is not good for server throughput.
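
The batching idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the offset/limit parameters, the function names, and the in-memory result set are all hypothetical, and a real server would expose this over HTTP rather than as a direct function call.

```python
import json

BATCH_SIZE = 10  # assumed batch size, matching the "10 records" suggested above


def serve_batch(result_set, offset, limit=BATCH_SIZE):
    """Hypothetical server side: return one batch as a JSON collection (array)."""
    return json.dumps(result_set[offset:offset + limit])


def fetch_all(request_batch, limit=BATCH_SIZE):
    """Hypothetical client side: request batches until a short one arrives."""
    offset = 0
    while True:
        batch = json.loads(request_batch(offset, limit))
        yield from batch
        if len(batch) < limit:
            # A batch smaller than the limit means the result set is exhausted.
            break
        offset += limit
```

Each response stays a small, complete JSON collection that any client can parse, and no connection has to be held open between batches. The cost is one extra request when the result set size is an exact multiple of the batch size.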


Andy.