On 29/10/2012 15:07, Eric Lease Morgan wrote:
> On Oct 29, 2012, at 8:19 AM, Henri-Damien LAURENT <[log in to unmask]> wrote:
>
>> I am about to write a tool which would help index EPUB files in ILSes.
>> My first guess is to produce ISO2709 or MARCXML records from EPUB files,
>> but since MARCXML or ISO2709 is not really what I would call the most
>> portable (UNIMARC and MARC21 may both be handled in the same file
>> format), I am rather considering producing OAI-DC or HTML5 + schema.org
>> <http://schema.org/> + Dublin Core, but that would rely on EPUB3.
>>
>> Any comments, anyone?
>> Has anyone considered such a tool?
>> Is there any hidden corpse lurking around I should be aware of?
>
> A couple of years ago I wrote a TEI to ePub creation tool. I learned a lot from that process. Most importantly, I learned the inner HTML must validate. (Whew!) If you wanted to include full-text indexing of epub files into your catalog -- which I would personally endorse -- then you could do the following things:
>
>    * use the metadata coming from the ancillary epub files to create MARC records
>    * use the full text data coming from the HTML to support full text indexing
>
> My creation tool is located on GitHub -- https://github.com/ericleasemorgan/epub/  The most important file is buried in the bin directory and called alex2epub.pl or build.pl. I know, these files create epub files and do not really index them, but in order to index them a person needs to know how they are built.
>
> Good luck. I really like the idea of full text indexing in library "catalogs".
>
> --
> Eric Lease Morgan
> University of Notre Dame
>
> 574/631-8604
Thanks for your answer, Eric.
It was not my intention to index the full text directly, though.
Getting a biblio record out is my first step.
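
To make that first step concrete, here is a minimal sketch in Python (standard library only) of what I have in mind. It assumes a well-formed EPUB whose META-INF/container.xml points at the OPF package file, reads the Dublin Core elements found there, and serializes them as an OAI-DC record. The function names read_dc_metadata and to_oai_dc are made up for illustration; this is not an existing tool, and it ignores refinements, identifier schemes and anything MARC-specific.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the EPUB container format, OPF, Dublin Core and OAI-PMH.
CONTAINER = "urn:oasis:names:tc:opendocument:xmlns:container"
DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def read_dc_metadata(epub_file):
    """Return {dc element name: [values]} from an EPUB path or file object.

    Hypothetical helper: it only looks at dc:* elements in the package file.
    """
    with zipfile.ZipFile(epub_file) as z:
        # container.xml tells us where the OPF package file lives.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        rootfile = container.find(
            "c:rootfiles/c:rootfile", {"c": CONTAINER}
        )
        package = ET.fromstring(z.read(rootfile.get("full-path")))

    prefix = "{%s}" % DC
    record = {}
    for element in package.iter():
        # Keep only non-empty Dublin Core elements (dc:title, dc:creator, ...).
        if element.tag.startswith(prefix) and element.text:
            text = element.text.strip()
            if text:
                record.setdefault(element.tag[len(prefix):], []).append(text)
    return record


def to_oai_dc(record):
    """Serialize the dict from read_dc_metadata as an oai_dc:dc XML string."""
    ET.register_namespace("oai_dc", OAI_DC)
    ET.register_namespace("dc", DC)
    root = ET.Element("{%s}dc" % OAI_DC)
    for name, values in record.items():
        for value in values:
            ET.SubElement(root, "{%s}%s" % (DC, name)).text = value
    return ET.tostring(root, encoding="unicode")
```

Usage would be something like to_oai_dc(read_dc_metadata("book.epub")), where book.epub stands for any local EPUB file. Mapping the same dictionary to MARCXML would be a separate step, and that is where the UNIMARC versus MARC21 question comes back.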

Full-text indexing could be a nice goal, though, provided that there is
some kind of namespace for full-text metadata in the ILS.

-- 
Henri-Damien LAURENT