So do you think the marc-hash-to-json "proto-spec" should suggest that
the encoding HAS to be UTF-8, or should it leave it open to anything
that's legal JSON? (Is there a problem I don't know about with
expressing "characters outside of the Basic Multilingual Plane" in
UTF-8? Any Unicode character can be encoded in any of the Unicode encodings,
right?)
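A quick way to check that claim is sketched below. It is only an illustration, not part of any proto-spec: the sample character (U+1D11E, a musical G clef) is my own pick, and the calls are from Python's standard library. It encodes one character from outside the Basic Multilingual Plane in UTF-8, UTF-16 and UTF-32, then shows where the surrogate pair turns up in JSON.

```python
import json

# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane.
# (The choice of character is arbitrary, for illustration only.)
ch = "\U0001D11E"

# Every Unicode encoding can represent it; only the byte layout differs.
utf8 = ch.encode("utf-8")       # four bytes, no surrogates involved
utf16 = ch.encode("utf-16-be")  # four bytes: the surrogate pair D834 DD1E
utf32 = ch.encode("utf-32-be")  # four bytes: the code point itself

# Each encoding round-trips back to the same single character.
assert utf8.decode("utf-8") == ch
assert utf16.decode("utf-16-be") == ch
assert utf32.decode("utf-32-be") == ch

# JSON's \uXXXX escapes are 16 bits wide, so an ASCII-only serialization
# has to write the character as an escaped surrogate pair.
escaped = json.dumps(ch)                  # ensure_ascii=True is the default
raw = json.dumps(ch, ensure_ascii=False)  # character written literally

# Both forms parse back to the same character.
assert json.loads(escaped) == ch
assert json.loads(raw) == ch
```

So surrogate pairs only come into play with UTF-16 or with escaped JSON strings; literal UTF-8 output does not need them.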
If "collections" means what I think, Bill's blog proto-spec says they
should be serialized as JSON-separated-by-newlines, right? That is,
JSON for each record, separated by newlines, rather than the alternative
approach you hypothesize there. There are various reasons to prefer
JSON-separated-by-newlines, which is an actual convention used in the
wild, not something made up just for here.
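To make the convention concrete, here is a rough sketch of writing and reading a collection as one JSON document per line. The record keys follow the sample quoted below in this thread; the leader values and the helper names `write_ndjson` and `read_ndjson` are hypothetical, made up for this example.

```python
import io
import json

# Hypothetical records: keys follow the sample quoted in this thread,
# values are placeholders invented for illustration.
records = [
    {"type": "marc-hash", "version": [1, 0], "leader": "leader-1", "fields": []},
    {"type": "marc-hash", "version": [1, 0], "leader": "leader-2", "fields": []},
]


def write_ndjson(recs, out):
    """Write each record as a single line of JSON followed by a newline."""
    for rec in recs:
        # json.dumps escapes newlines inside strings, so one record = one line.
        out.write(json.dumps(rec) + "\n")


def read_ndjson(inp):
    """Yield one parsed record per non-blank line."""
    for line in inp:
        line = line.strip()
        if line:
            yield json.loads(line)


buf = io.StringIO()
write_ndjson(records, buf)
buf.seek(0)
roundtrip = list(read_ndjson(buf))
```

One reason to prefer this over a single enclosing array: a reader can process records one at a time without parsing the whole collection first, and a writer can append records without rewriting a closing bracket.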
Jonathan
Dan Scott wrote:
> Hey Bill:
>
> Do you have unit tests for MARC-HASH / JSON anywhere? If you do, that would make it easier for me to create a compliant PHP File_MARC_JSON variant, which I'll be happy-ish to create.
>
> The only concerns I have with your write-up are:
> * JSON itself allows UTF-8, UTF-16, and UTF-32 encoding - and we've seen in Evergreen some cases where characters outside of the Basic Multilingual Plane are required. We eventually wound up resorting to surrogate pairs, in that case; so maybe this isn't a real issue.
> * You've mentioned that you would like to see better support for collections in File_MARC / File_MARCXML; but I don't see any mention of how collections would work in MARC-HASH / JSON. Would it just be something like the following?
>
> "collection": [
> {
> "type" : "marc-hash",
> "version" : [1, 0],
> "leader" : "…leader string … ",
> "fields" : [array, of, fields]
> },
> {
> "type" : "marc-hash",
> "version" : [1, 0],
> "leader" : "…leader string … ",
> "fields" : [array, of, fields]
> }
> ]
>
> Dan
>
>
>>>> Bill Dueber <[log in to unmask]> 03/15/10 12:22 PM >>>
>>>>
> I'm pretty sure Andrew was (a) completely unaware of anything I'd done, and
> (b) looking to match marc-xml as strictly as reasonable.
>
> I also like the array-based rather than hash-based format, but I'm not gonna
> go to the mat for it if no one else cares much.
>
> I would like to see ind1 and ind2 get their own fields, though, for easier
> use of stuff like jsonpath in json-centric nosql databases.
>
> On Mon, Mar 15, 2010 at 10:52 AM, Jonathan Rochkind <[log in to unmask]> wrote:
>
>
>> I would just ask why you didn't use Bill Dueber's already existing
>> proto-spec, instead of making up your own incompatible one.
>>
>> I'd think we could somehow all do the same consistent thing here.
>>
>> Since my interest in marc-json is getting as small a package as possible
>> for transfer across the wire, I prefer Bill's approach.
>>
>> http://robotlibrarian.billdueber.com/new-interest-in-marc-hash-json/
>>
>>
>> Houghton,Andrew wrote:
>>
>>
>>> From: Houghton,Andrew
>>>
>>>> Sent: Saturday, March 06, 2010 06:59 PM
>>>> To: Code for Libraries
>>>> Subject: RE: [CODE4LIB] Q: XML2JSON converter
>>>>
>>>> Depending on how much time I get next week I'll talk with the developer
>>>> network folks to see what I need to do to put a specification under
>>>> their infrastructure
>>>>
>>>>
>>>>
>>> I finished documenting our existing use of MARC-JSON. The specification
>>> can be found on the OCLC developer network wiki [1]. Since it is a wiki,
>>> registered developer network members can edit the specification and I would
>>> ask that you refrain from doing so.
>>>
>>> However, please do use the discussion tab to record issues with the
>>> specification or add additional information to existing issues. There are
>>> already two open issues on the discussion tab and you can use them as a
>>> template for new issues. The first issue is Bill Dueber's request for some
>>> sort of versioning and the second issue is whether the specification should
>>> specify the flavor of MARC, e.g., marc21, unicode, etc.
>>>
>>> It is recommended that you place issues on the discussion tab since that
>>> will be the official place for documenting and disposing of them. I do
>>> monitor this listserve and the OCLC developer network listserve, but I only
>>> selectively look at messages on those listserves. If you would like to use
>>> this listserve or the OCLC developer network listserve to discuss the
>>> MARC-JSON specification, make sure you place MARC-JSON in the subject line,
>>> to give me a clue that I *should* look at that message, or directly CC my
>>> e-mail address on your post.
>>>
>>> This message marks the beginning of a two week comment period on the
>>> specification which will end at midnight 2010-03-28.
>>>
>>> [1] <http://worldcat.org/devnet/wiki/MARC-JSON_Draft_2010-03-11>
>>>
>>>
>>> Thanks, Andy.
>>>
>>>
>>>
>
>
>