There's been some talk in code4lib about using MongoDB to store MARC
records in some kind of JSON format. I'd like to know if you have
experimented with indexing those documents in MongoDB. From my limited
exposure to MongoDB, it seems difficult, unless MongoDB supports some
kind of "custom indexing" functionality.
According to the MongoDB docs [1], "you can create an index by calling
the ensureIndex() function, and providing a document that specifies
one or more keys to index." Examples of this are:
db.things.ensureIndex({"city": 1})
db.things.ensureIndex({"address.city": 1})
That is, you specify each key by giving a path from the root of the
document to the data element you are interested in. Such a path acts
both as the index's name and as a specification of how to get the
keys' values.
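
To make clear what I mean by a "path", here is a rough sketch in plain
JavaScript of how I picture a dotted key being resolved against a
document. The resolvePath function and the sample document are my own
invention for illustration. This is not MongoDB code, and MongoDB's
real lookup rules may well differ.

```javascript
// Hypothetical helper, NOT part of MongoDB: walks a dotted path through
// nested objects, one key at a time, starting at the document root.
function resolvePath(doc, path) {
  let current = doc;
  for (const key of path.split(".")) {
    // Stop if we hit a scalar or a missing value before the path ends.
    if (current === null || typeof current !== "object") {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Made-up sample document shaped like the two ensureIndex() examples.
const thing = { city: "Bahia Blanca", address: { city: "Buenos Aires" } };

console.log(resolvePath(thing, "city"));
console.log(resolvePath(thing, "address.city"));
```
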
In the case of the two proposed MARC-JSON formats [2, 3], I can't see
any such "path". For example, say you want an index on field 001.
Simplifying, the JSON docs would look like this
{ "fields" : [ ["001", "001 value"], ... ] }
or this
{ "controlfield" : [ { "tag" : "001", "data" : "fst01312614" }, ... ] }
How would you specify field 001 to MongoDB?
It would be nice to have some kind of custom indexing, where one could
provide an index name and, separately, a JavaScript function specifying
how to obtain the keys' values for that index.
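
Something along these lines is what I have in mind. It is only a sketch
in plain JavaScript: the customIndexes object, the index names
"001_pairs" and "001_tagged", and the extra 008 entry in the sample
data are all made up by me, and as far as I know nothing like this
exists in MongoDB.

```javascript
// Hypothetical "custom index" definitions: an index name mapped to a
// function that returns the key values for one document.
// This is an illustration only, NOT a MongoDB feature.
const customIndexes = {
  // For the array-of-pairs layout: { "fields": [ ["001", "..."], ... ] }
  "001_pairs": (doc) =>
    (doc.fields || [])
      .filter((field) => field[0] === "001")
      .map((field) => field[1]),

  // For the tagged-object layout:
  // { "controlfield": [ { "tag": "001", "data": "..." }, ... ] }
  "001_tagged": (doc) =>
    (doc.controlfield || [])
      .filter((field) => field.tag === "001")
      .map((field) => field.data),
};

// Simplified sample documents; the 008 entry is invented filler.
const pairsDoc = { fields: [["001", "001 value"], ["008", "another value"]] };
const taggedDoc = { controlfield: [{ tag: "001", data: "fst01312614" }] };

console.log(customIndexes["001_pairs"](pairsDoc));
console.log(customIndexes["001_tagged"](taggedDoc));
```

Each function returns an array, so a repeatable field would simply
yield several keys for the same document.
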
Any suggestions? Do other document-oriented databases offer a better
solution for this?
BTW, I fed the example MARC records in [2] and [3] to MongoDB, and
it choked on them. Both are missing some commas :-)
[1] http://www.mongodb.org/display/DOCS/Indexes
[2] http://robotlibrarian.billdueber.com/new-interest-in-marc-hash-json/
[3] http://worldcat.org/devnet/wiki/MARC-JSON_Draft_2010-03-11
--
Fernando Gómez
Biblioteca "Antonio Monteiro"
INMABB (Conicet / Universidad Nacional del Sur)
Av. Alem 1253
B8000CPB Bahía Blanca, Argentina
Tel. +54 (291) 459 5116
http://inmabb.criba.edu.ar/