On Oct 16, 2009, at 3:12 PM, Eric James wrote:
> For our finding aids, we are using fedoragenericsearch 2.2 with solr
> as index. Because the EADs can be huge, the EADs are indexed but
> not stored (with stored EADs, search time for ~500 objects = 20 min
> rather than < 1 sec).
>
> However, we would like to have number of search terms found within
> each hit. For example, CDL's collection:
>
> http://www.oac.cdlib.org/search?query=Donner
>
> Also we would like highlighting/snippets of the search term similar
> to CDL's.
>
> Is it a lost cause to have this functionality without storing the
> EAD? Is there a way to store the EAD and have a reasonable response
> time?
Hmmm... I'm not an expert, only a novice Solr hacker, but I've had
pretty good success full-text indexing entire books, marking the field
as stored, and searching the index, with results that come back
complete with highlighted snippets. Here's my field definition:
<field name="fulltext" type="text" indexed="true" stored="true"
termVectors="true" />
While search response times are about 2 seconds or so, it certainly
does not take 20 minutes. There are about 16,000 indexed books. Try:
http://infomotions.com/alex/
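For what it's worth, here is a rough sketch of how a highlighting request against a stored field like the one above might be put together. It only builds the query URL using Solr's standard hl, hl.fl, hl.snippets, and hl.fragsize parameters. The endpoint address, the "id" key field, and the helper name are assumptions of mine for illustration, not anything from my actual setup.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, and path for your install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def highlight_query_url(terms, field="fulltext", snippets=3, fragsize=100):
    """Build a select URL asking Solr for highlighted snippets from one stored field."""
    params = {
        "q": f"{field}:({terms})",   # search only the full-text field
        "hl": "true",                # turn highlighting on
        "hl.fl": field,              # field to pull snippets from (must be stored)
        "hl.snippets": snippets,     # maximum snippets per document
        "hl.fragsize": fragsize,     # approximate snippet length in characters
        "fl": "id",                  # assumes the unique key field is named "id"
        "wt": "json",                # response format
    }
    return SOLR_SELECT + "?" + urlencode(params)


print(highlight_query_url("Donner"))
```

Returning only the key field in fl while letting the highlighter supply the snippets keeps the response small, since the full stored text is not sent back with every hit.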
Yes, things like snippets are a lost cause without storing the indexed
data, unless maybe you can link to the content. The latter was alluded
to in the (one and only) Solr book, but I didn't even consider it.
It seemed too expensive.
Query counts? I don't know about those.
--
Eric Lease Morgan