We're just talking about creating an index, not a separate copy of the
works, right? Because I imagine that copyright has a lot to do with why
this type of thing doesn't already exist.

On Wed, Sep 16, 2009 at 3:08 PM, Eric Lease Morgan wrote:

> Eric Morgan wrote:
>
>> http://infomotions.com/highlights/
>
> Rosalyn Metz wrote:
>
>> I have librarians that would kill for this. In fact I was talking to
>> one about it the other day. She felt there must be a way to handle
>> active reading and make it portable. This would be great in
>> conjunction with RefWorks or Zotero or something along those lines.
>
> Yep, when I was creating this application for myself I was wondering what
> it would be like if a whole group, say, an academic department, were to
> systematically contribute to such a thing? I thought the output would be
> pretty exciting.
>
> Mark A. Matienzo wrote:
>
>> Have you considered using Solr's ExtractingRequestHandler [1] for the
>> PDFs? We're using it at NYPL with pretty great success.
>>
>> [1] http://wiki.apache.org/solr/ExtractingRequestHandler
>
> Nope, never saw that previously. Thanks for the pointer.
>
> Peter Kiraly wrote:
>
>> I would like to suggest an API for extracting text (including highlighted
>> or annotated ones) from PDF: iText (http://www.lowagie.com/iText/).
>> This is a Java API (has C# port), and it helped me a lot, when we worked
>> with extraordinary PDF files.
>
> More tools! Thank you.
>
> danielle plumer wrote:
>
>> My (much more primitive) version of the same thing involves reading and
>> annotating articles using my Tablet PC. Although I do get a variety of
>> print publications, I find I don't tend to annotate them as much anymore.
>> I used to use EndNote to do the metadata, then I switched to Zotero. I
>> hadn't thought to try to create a full-text search of the articles -- hmm.
>
> Yes, for a growing number of the tools I create I need to be thinking about
> Zotero as a way of "remembering" content. Thanks for... reminding me.
>
> Erik Hatcher wrote:
>
>> Here's a post on how easy it is to send PDF documents to Solr from Java:
>>
>> <http://www.lucidimagination.com/blog/2009/09/14/posting-rich-documents-to-apache-solr-using-solrj-and-solr-cell-apache-tika/>
>
> I'm looking forward to the arrival of my Solr books any day now. After
> reading them I hope to have a better handle on the guts of Solr as well as
> increase my abilities to do the sorts of things discussed at the URL above.
>
> Thank you, one and all, for your replies.
>
> --
> Eric Morgan

--
Cindy Harper, Systems Librarian
Colgate University Libraries
315-228-7363