I'd definitely consider CouchDB, as Patrick mentioned. It's a
"schema-free" JSON document database, and replication is its greatest
strength.
It does have Lucene integration:
Paul J. Davis of the core CouchDB team has a nice write-up:
There's also some Solr integration available:
From what you've described, CouchDB would be a great choice for your
needs.
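To give a rough idea of what storing documents and replicating them looks like over CouchDB's HTTP API, here is a minimal Python sketch. It only builds the requests: a PUT to /{db}/{docid} to create or update a document, and a POST to /_replicate to copy a database elsewhere. The server address, database name, document ID, and replica host are placeholders I made up for illustration, and a secured server may also want credentials, which I've left out.

```python
import json
from urllib import request
from urllib.parse import quote

# Assumption: a CouchDB server on its default local port.
COUCH_URL = "http://localhost:5984"


def put_document_request(db, doc_id, doc, rev=None):
    """Build (but do not send) a PUT request that stores a JSON document.

    When updating an existing document, pass its current revision as `rev`;
    CouchDB rejects the write with 409 Conflict if that revision is stale.
    """
    body = dict(doc)
    if rev is not None:
        body["_rev"] = rev
    url = f"{COUCH_URL}/{quote(db, safe='')}/{quote(doc_id, safe='')}"
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def replicate_request(source, target):
    """Build (but do not send) a POST to /_replicate for a one-off replication."""
    body = {"source": source, "target": target}
    return request.Request(
        f"{COUCH_URL}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req):
    """Send a prepared request and decode the JSON reply (needs a live server)."""
    with request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical database, document ID, and replica URL.
    update = put_document_request(
        "library", "record-1", {"title": "Example"}, rev="1-abc"
    )
    copy = replicate_request("library", "http://replica.example.org:5984/library")
    print(update.get_method(), update.full_url)
    print(copy.get_method(), copy.full_url)
```

The revision check on update is the closest thing to the "transaction-like" behaviour you asked about: you read a document, change it in memory, and write it back with the revision you read, and the write fails if someone else got there first.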
Hope that's helpful, Godmar.
Godmar Back wrote:
> we're currently looking for an XML database to store a variety of
> small-to-medium sized XML documents. The XML documents are
> unstructured in the sense that they do not follow a schema or DTD, and
> that their structure will be changing over time. We'll need to do
> efficient searching based on elements, attributes, and full text
> within text content. More importantly, the documents are mutable.
> We'd like to bring documents or fragments into memory in a DOM
> representation, manipulate them, then put them back into the database.
> Ideally, this should be done in a transaction-like manner. We need to
> efficiently serve document fragments over HTTP, ideally in a manner
> that allows for scaling through replication. We would prefer strong
> support for Java integration, but it's not a must.
> Have others encountered similar problems, and what have you been using?
> So far, we're researching: eXist-DB (http://exist.sourceforge.net/ ),
> Base-X (http://www.basex.org/ ), MonetDB/XQuery
> (http://www.monetdb.nl/XQuery/ ), Sedna
> (http://modis.ispras.ru/sedna/index.html ). Wikipedia lists a few
> others here: http://en.wikipedia.org/wiki/XML_database
> I'm wondering to what extent systems such as Lucene, or even digital
> object repositories such as Fedora could be coaxed into this usage
> Thanks for any insight you have or experience you can share.
> - Godmar