How can people tell that it searches content in bitstreams (PDFs, Word docs)? It looks like it only searches metadata.
Thanks.
________________________________________
From: Code for Libraries [[log in to unmask]] On Behalf Of Han, Yan [[log in to unmask]]
Sent: Wednesday, October 20, 2010 4:43 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?

DSpace does full-text search; you need to turn it on in the configuration file.
See UAL http://arizona.openrepository.com/arizona/
Yan

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Deng, Sai
Sent: Wednesday, October 20, 2010 2:14 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?

For access restriction, I mean we would like to have certain documents open only to certain communities (UpLib cannot do that, right?). I don't know how DRM affects file indexing.

On second thought, I searched for "DSpace full text search" and found this: https://wiki.duraspace.org/display/DSPACE/Configure+full+text+indexing
However, I haven't seen any instance that shows full-text search results the way vendor databases do.

Any idea on what system might be good/best for search within documents and DRM?
Thank you for the reply!
Sophie

________________________________________
From: Code for Libraries [[log in to unmask]] On Behalf Of Bill Janssen [[log in to unmask]]
Sent: Wednesday, October 20, 2010 4:01 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?

Deng, Sai <[log in to unmask]> wrote:

> Do you know the Digital Library systems which can search within the
> documents (e.g. PDFs) and handle access restrictions (e.g. DRM)?

Not sure what you mean by "handle access restrictions".  Do you mean it can index the documents put into it even if they have DRM encumbrances?

UpLib has "search within the documents" -- if you search for a word or phrase, it shows you all the documents which match, but also all the pages in each document which match.  It supports a wide variety of document formats, from JPEG2000 to PDF to PowerPoint.  But as far as I know it doesn't deal with DRM restrictions.

Bill