My initial problem with the marc-in-json approach, though, is the complexity of the JSON. I am looking for a simpler model so that my queries, in ES for example, are also simpler to implement.

If anyone has any examples of how to make use of this marc-in-json output with ES, it would be much appreciated.

Thank you




________________________________
From: Ross Singer <[log in to unmask]>
To: [log in to unmask]
Sent: 3:47 PM Tuesday, 24 September 2013
Subject: Re: [CODE4LIB] New perl module MARC::File::MiJ -- marc-in-json for
 

This serialization would actually be awful for the OP's use case, which (as
I understand it) is to put it in MongoDB and Elasticsearch (which are
exactly the use cases marc-in-json is designed for).

In this array of arrays approach, where the tag name is just another value
(as opposed to a key), you cannot take advantage of JsonPath, thereby
eliminating almost any possible way of querying this data in those
databases.
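
To make the difference concrete, here is a minimal Python sketch. The `resolve` and `find` helpers are hypothetical: they are a simplified stand-in for the key-path lookups that JsonPath-style queries rely on, not an actual JsonPath, MongoDB, or Elasticsearch API. The records are cut down from the example quoted below.

```python
# The same two fields in both serializations.
mij = {
    "leader": "01471cjm a2200349 a 4500",
    "fields": [
        {"001": "5674874"},
        {"906": {"subfields": [{"a": "7"}, {"c": "copycat"}],
                 "ind1": " ", "ind2": " "}},
    ],
}
array_style = [
    "01471cjm a2200349 a 4500",
    [["001", "5674874"],
     ["906", [["a", "7"], ["c", "copycat"]], [" ", " "]]],
]


def resolve(node, path):
    """Hypothetical helper: follow a list of keys, descending through lists."""
    if not path:
        return [node]
    if isinstance(node, list):
        # Try the remaining path against every element of the list.
        return [hit for item in node for hit in resolve(item, path)]
    if isinstance(node, dict) and path[0] in node:
        return resolve(node[path[0]], path[1:])
    return []


def find(doc, dotted):
    """Hypothetical helper: look up a dotted key path such as 'fields.906'."""
    return resolve(doc, dotted.split("."))


# The tag and subfield code are keys, so a path can address them.
mij_hits = find(mij, "fields.906.subfields.c")
# Here "906" and "c" are only values, so no key path can reach them.
array_hits = find(array_style, "fields.906.subfields.c")

print(mij_hits)
print(array_hits)
```

With the array form, the only way to reach subfield c of field 906 is to walk every field and compare the first element of each array yourself.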

This format is great for serializing/deserializing in and out of a MARC
record structure (because it's incredibly fast and efficient).  Not so much
for actually using in a JSON-native environment.

marc-in-json was an intentional compromise, made to get the benefits of
being optimized for JSON rather than for MARC.

-Ross.
On Sep 24, 2013 6:57 AM, "Marc Chantreux" <[log in to unmask]> wrote:

> hello,
>
> On Mon, Jul 15, 2013 at 11:00:35AM -0400, Bill Dueber wrote:
> > The marc-in-json<
> http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/
> >
>
> My 2 cents:
>
> * don't specify a MARC-in-Whatever format: define the way you store the
>   MARC record in memory then just use dumpers from the YAML, JSON, and
>   other serialization systems.
>
> * marc-in-json itself (as described in the document) uses dicts at every
>   level, which leads to 2 issues:
>
>   * implementations of transformations and querying are painful
>   * explicit use of useless keys = more useless data (not as crappy as
>     XML but still useless).
>
> MARC::MIR http://search.cpan.org/~marcc/marc-mir-0.4/lib/MARC/MIR.pod
> In-memory representation is much simpler to handle, whatever the
> programming language you use.
>
> As a comparison, the same record in MARC::MIR and MIJ.
> HTH
>
> MIR:
>
> [ "01471cjm a2200349 a 4500"
> , [ [ "001","5674874" ],
>     [ "005","20030305110405.0" ],
>     [ "007","sdubsmennmplu" ],
>     [ "008","930331s1963    nyuppn              eng d" ],
>     [ "035", [ [ "9","(DLC)   93707283" ] ]
>            , [ " "," " ] ],
>     [ "906", [ [ "a","7" ],
>                [ "b","cbc" ],
>                [ "c","copycat" ],
>                [ "d","4" ],
>                [ "e","ncip" ],
>                [ "f","19" ],
>                [ "g","y-soundrec" ] ],
>              [ " "," " ] ] ] ]
>
> MIJ:
>
> { "leader":"01471cjm a2200349 a 4500",
>     "fields":
>     [ { "001":"5674874" },
>         { "005":"20030305110405.0" },
>         { "007":"sdubsmennmplu" },
>         { "008":"930331s1963    nyuppn              eng d" },
>         { "035": { "subfields": [
>                     { "9":"(DLC)   93707283" }
>                 ],
>                 "ind1":" ",
>                 "ind2":" " } },
>         { "906":
>             { "subfields":
>                 [ { "a":"7" },
>                     { "b":"cbc" },
>                     { "c":"copycat" },
>                     { "d":"4" },
>                     { "e":"ncip" },
>                     { "f":"19" },
>                     { "g":"y-soundrec" }
>                 ], "ind1":" ", "ind2":" " } } ] }
>
> --
> Marc Chantreux
> Université de Strasbourg, Direction Informatique
> 14 Rue René Descartes,
> 67084  STRASBOURG CEDEX
> ☎: 03.68.85.57.40
> http://unistra.fr
> "Don't believe everything you read on the Internet"
>     -- Abraham Lincoln
>
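
For anyone who wants to compare the two serializations on their own records, here is a rough Python sketch of the "keep one in-memory structure, then use a stock dumper" idea from the quoted message. The function name `mij_to_arrays` is hypothetical, and the nested-array layout (tag, subfield pairs, indicator pair) is an assumption inferred from the 035 field of the quoted MIR example, not taken from the MARC::MIR documentation.

```python
import json


def mij_to_arrays(record):
    """Convert a marc-in-json dict to a nested-array layout (assumed, see above)."""
    fields = []
    for field in record["fields"]:
        # Each marc-in-json field is a single-key object: {tag: value}.
        (tag, value), = field.items()
        if isinstance(value, str):
            # Control field: the value is the field content itself.
            fields.append([tag, value])
        else:
            # Data field: flatten the single-key subfield objects into pairs.
            subfields = [[code, text]
                         for sf in value["subfields"]
                         for code, text in sf.items()]
            fields.append([tag, subfields, [value["ind1"], value["ind2"]]])
    return [record["leader"], fields]


# Two fields from the quoted record, in marc-in-json form.
mij = {
    "leader": "01471cjm a2200349 a 4500",
    "fields": [
        {"001": "5674874"},
        {"035": {"subfields": [{"9": "(DLC)   93707283"}],
                 "ind1": " ", "ind2": " "}},
    ],
}

arrays = mij_to_arrays(mij)

# Any stock JSON dumper can serialize either structure.
mij_text = json.dumps(mij)
arrays_text = json.dumps(arrays)
print(arrays_text)
print(len(mij_text), len(arrays_text))
```

The array form comes out shorter because it drops the repeated "subfields", "ind1", and "ind2" keys, which is the size argument made in the quoted message. The trade-off is the one raised in the reply: tags and subfield codes stop being addressable keys.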