I’m surprised you didn’t recommend going straight to Solr and doing the reporting from there :) Index the records into Solr using your MARC indexer of choice (e.g. SolrMarc), then get all the authorities with &facet.field=authorities (or whatever field name you used).
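
For what it's worth, here is a rough sketch of that facet request in Python. It only builds the query URL and unpacks the response; the field name `authorities`, the core name `innz`, and the localhost URL are placeholders, not anything taken from a real setup. It assumes Solr's default JSON facet layout, where each facet field comes back as a flat list alternating term and count.

```python
from urllib.parse import urlencode


def facet_url(base, field="authorities"):
    """Build a Solr select URL that returns only facet counts for `field`."""
    params = {
        "q": "*:*",            # match every record
        "rows": 0,             # no documents needed, only the facet counts
        "facet": "true",
        "facet.field": field,  # placeholder field name
        "facet.limit": -1,     # return every term, not just the top 100
        "facet.mincount": 1,   # skip terms that no record uses
        "facet.sort": "count", # most-referenced first
        "wt": "json",
    }
    return base + "/select?" + urlencode(params)


def facet_counts(response, field="authorities"):
    """Turn Solr's flat [term, count, term, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    # Hypothetical core name and host; adjust to the local install.
    print(facet_url("http://localhost:8983/solr/innz"))
```

Fetching that URL and passing the decoded JSON to `facet_counts` should give (authority, count) pairs already sorted by how often each one is referenced, which is most of what (a) asks for.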
Erik
On Nov 2, 2014, at 7:24 PM, Jonathan Rochkind <[log in to unmask]> wrote:
> If you are, can become, or know a programmer, that would be relatively straightforward in any programming language, using the open source MARC processing library for that language (ruby-marc, pymarc, Perl MARC, whatever).
>
> You might run into more trouble than you expect with the authorities, though, since they may be less standardized in your corpus than you would like.
> ________________________________________
> From: Code for Libraries [[log in to unmask]] on behalf of Stuart Yeates [[log in to unmask]]
> Sent: Sunday, November 02, 2014 5:48 PM
> To: [log in to unmask]
> Subject: [CODE4LIB] MARC reporting engine
>
> I have ~800,000 MARC records from an indexing service (http://natlib.govt.nz/about-us/open-data/innz-metadata CC-BY). I am trying to generate:
>
> (a) a list of person authorities (and sundry metadata), sorted by how many times they're referenced, in wikimedia syntax
>
> (b) a view of a person authority, with all the records by which they're referenced, processed into a wikipedia stub biography
>
> I have established that this is too much data to process in XSLT or multi-line regexps in vi. What other MARC engines are there out there?
>
> The two options I'm aware of are learning multi-line processing in sed or learning enough Koha to write reports in whatever its reporting engine is.
>
> Any advice?
>
> cheers
> stuart
> --
> I have a new phone number: 04 463 5692