These requirements fit Swish-e [1] to a "T". I've used it to index millions of XML records [2], and it places no particular requirements on the XML beyond being well-formed. You can have it automatically detect and index XML fields as well as index all words across all fields. This is all handled by a very simple text config file. The only downside is that you will need to write the user interface (CGI) in your favorite language to interact with Swish-e.

For example, here is my entire config file for Current Cites [3], where I store citations in my own XML format:

    DefaultContents XML*
    UndefinedMetaTags auto
    IndexDir /home/tennantr/public_html/currentcites/cites/
    ReplaceRules remove /home/tennantr/public_html/currentcites/cites/
    PropertyNames creator title description booktitle source
    IndexOnly .xml

The "DefaultContents XML*" line tells Swish-e to expect XML, and the "UndefinedMetaTags auto" line tells it to keep track of any XML tag it sees. The next two lines tell it where the files are and remove that path from the index, so each result comes back as just the file name without the server path. The "PropertyNames" line defines which elements are actually stored in the index, which I can then retrieve directly in the search results for display to the user. The "IndexOnly .xml" line tells Swish-e to ignore anything without that filename extension.

Nothing could be easier.
Roy

[1] http://swish-e.org/
[2] http://roytennant.com/proto/hathi/
[3] http://lists.webjunction.org/currentcites/

On Wed, Mar 16, 2011 at 8:00 AM, Edward M. Corrado wrote:
> Hi,
>
> I [will soon] have a small set (< 1000 records) of Dublin Core
> metadata published in OAI_DC format that I want to be searchable via a
> Web browser. Normally we would use Ex Libris's Primo for this, but
> this particular set of data may have some confidential information and
> our repository only has minimal built-in search functions.
> While we still may go with Primo for these records, I am looking at
> other possibilities. The requirements as I see them are:
>
> 1) Can ingest records in OAI_DC format
> 2) Allow remote end-users who are familiar with the collection to
> search these ingested records via a Web browser.
> 3) Search should be keyword anywhere or individual fields although it
> does not need to have every whizzbang feature out there. In other
> words, basic search features are fine.
> 4) Should support the ability to link to the display copy in our
> repository (probably goes without saying)
> 5) Should be simple to install and maintain (Thus, at least in my
> mind, eliminating something like Blacklight)
> 6) Preferably a LAMP application although a Windows server based
> solution is a possibility as well
> 7) Preferably Open Source, or at least no- or low-cost
>
> I haven't been able to find anything searching the Web, but it seems
> like something people may have done before. Before I re-invent the
> wheel or shoe-horn something together, does anyone have any
> suggestions?
>
> Edward
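
As for the user interface (CGI) that has to be written around Swish-e, here is a rough Python sketch of the search side. It is only an illustration under several assumptions: the swish-e binary is on the PATH, the index is the default "index.swish-e" in the working directory, and results come back in Swish-e's default line format (rank, path, quoted title, size). The function names are made up for this example.

```python
import re
import subprocess

# Assumption: the index file name and location; adjust to your setup.
INDEX_FILE = "index.swish-e"

# Default Swish-e result line: rank, path, "title", size in bytes.
RESULT_LINE = re.compile(r'^(\d+) (.+?) "(.*)" (\d+)$')


def build_command(query, index_file=INDEX_FILE, max_results=20):
    """Build the swish-e argument list (-f index, -w words, -m max hits)."""
    return ["swish-e", "-f", index_file, "-w", query, "-m", str(max_results)]


def parse_results(output):
    """Turn swish-e's text output into a list of dicts, one per hit."""
    hits = []
    for line in output.splitlines():
        # Skip header lines ("# ...") and the closing "." line.
        if line.startswith("#") or line.strip() == ".":
            continue
        match = RESULT_LINE.match(line)
        if match:
            rank, path, title, size = match.groups()
            hits.append(
                {"rank": int(rank), "path": path, "title": title, "size": int(size)}
            )
    return hits


def search(query):
    """Run swish-e for a user query and return the parsed hits."""
    # Passing a list (no shell) keeps user input from being interpreted
    # by a shell.
    proc = subprocess.run(
        build_command(query), capture_output=True, text=True, check=False
    )
    return parse_results(proc.stdout)
```

A CGI script would call search() with the text from the form and print each hit as a link built from its path. Since the config strips the server path, the script has to prepend the public URL of the records itself.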