Yitzchak, are you interested in actually searching the fulltext? Or just
highlighting the terms?
If you're only interested in highlighting, it might be a whole lot easier
to implement this in JavaScript through something like jQuery:
http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html
That way you're not juggling mostly redundant Lucene indexes and trying to
keep them synced.
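To make the idea concrete, here is a rough sketch of what term highlighting boils down to, whichever side it runs on. It is in Python only for illustration and is not the plugin's code. The function name and the "highlight" CSS class are my own assumptions.

```python
import html
import re


def highlight_terms(text, terms, css_class="highlight"):
    """Return HTML-escaped text with each search term wrapped in a <span>.

    Hypothetical helper for illustration; matching is case-insensitive
    and purely literal (no stemming or analysis like Lucene would do).
    """
    terms = [t for t in terms if t]
    if not terms:
        return html.escape(text)
    # Try longer terms first so "indexing" is preferred over "index".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)

    pieces = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches, then wrap the match.
        pieces.append(html.escape(text[last:match.start()]))
        pieces.append(
            '<span class="%s">%s</span>' % (css_class, html.escape(match.group(0)))
        )
        last = match.end()
    pieces.append(html.escape(text[last:]))
    return "".join(pieces)
```

You would call it with the display text and the user's query terms, e.g. highlight_terms(page_text, ["lucene", "greenstone"]). Note that it only matches the literal terms, so it won't catch stemmed or wildcard hits the way a Lucene-side highlighter would.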
How are you getting your search results? Does Greenstone have some sort of
search API that returns the highlighted results? Would it make a difference
if you could add a field to the Lucene document (meaning would you have
access to it through your PHP API to Greenstone)? If so, you could probably
do this pretty easily via one of the JVM scripting languages (Groovy, JRuby,
Jython, Quercus -- PHP in the JVM) so you just have the single Lucene index
instead of multiple.
Another approach might be to serve the Lucene index via Solr [1] or
Lucene-WS (http://lucene-ws.net/) which would allow you to skip Greenstone
altogether for searching.
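If you did go the Solr route, asking for highlighted snippets is mostly a matter of adding the hl parameters to the query. A minimal sketch, assuming a Solr instance at a base URL of your choosing and a field named "text" (both placeholders of mine, not anything from your Greenstone setup):

```python
from urllib.parse import urlencode


def solr_highlight_url(base_url, query, field, rows=10):
    """Build a Solr select URL that requests highlighted snippets.

    base_url and field are assumptions for illustration; substitute
    whatever your Solr instance and schema actually use.
    """
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",    # ask for a JSON response
        "hl": "true",    # turn highlighting on
        "hl.fl": field,  # field(s) to generate snippets from
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)
```

As far as I know, Solr can only build snippets from fields that are stored, so this ties back to Erik's point that the text has to come from somewhere.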
Basically, I would try to avoid going the Zend_Lucene route if at all
possible.
-Ross.
[1] http://www.google.com/search?q=solr+on+an+existing+lucene+index&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
On Tue, Sep 29, 2009 at 11:32 AM, Yitzchak Schaffer wrote:
> Erik Hatcher wrote:
>
>> I'm a bit confused then. You mentioned that somehow Zend Lucene was going
>> to help, but if you don't have the text to highlight anywhere then the
>> Highlighter isn't going to be of any use. Again, you don't need the full
>> text in the Lucene index, but you do need to get it from somewhere in order
>> to be able to highlight it.
>>
>
> Erik,
>
> I started to port the native Greenstone Java Lucene wrapper to PHP, so I
> could then modify it to add this feature, as I don't know Java. This would
> mean using Zend Lucene for the actual indexing implementation. My question
> is whether anyone's already done it, in Java or otherwise.
>
> Thanks for the clarification,
>
>
> --
> Yitzchak Schaffer
> Systems Manager
> Touro College Libraries
> 33 West 23rd Street
> New York, NY 10010
> Tel (212) 463-0400 x5230
> Fax (212) 627-3197
>