LISTSERV 16.5 - CODE4LIB Archives

Hi,

AustLit ( http://www.austlit.edu.au ) is in the early stages of a
migration from javaServlets/xslt/oracle to java/neo4j/gremlin.  The
web version of AustLit was developed in 2000 based on FRBR with a
strong emphasis on events realised with a topic map model, so the sql
implementation is close to a triple-store.  More information on the
details is here: http://www.austlit.edu.au/about ,
http://www.austlit.edu.au/about/metadata and
http://www.austlit.edu.au:7777/DataModel/index.html ("ALEG" was the
working name for AustLit redevelopment in 2000).

Last year a decision was taken to move AustLit from a subscription
service to open access, and from updates being performed solely by
dedicated bibliographers and researchers (members of various AustLit
teams distributed across Australia) to include community
contributions, so rather than work these changes into a 12-year-old
system, it was decided to start afresh with an approach which would
more naturally support the AustLit data model.

So, we experimented with Neo4j, and were impressed with its
performance.  For example, loading our current data from Oracle into
an empty neo4j database takes about 30 minutes (using a
run-of-the-mill 3-year-old server), producing a graph of 14m nodes and
20m relationships.  Performing custom indexing of this data using the
built-in Lucene integration takes about 2.5 hours, but that's a
function of the extensive indexing we're performing.
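
As a rough check on the load figures above, the arithmetic can be
sketched as below.  The class and method names are made up for
illustration; only the three input numbers come from the message, and
"about 30 minutes" is treated as exactly 30.

```java
public class LoadThroughput {
    // Elements (nodes plus relationships) written per second for a load
    // that takes the given number of minutes.
    static double elementsPerSecond(long nodes, long relationships, double minutes) {
        return (nodes + relationships) / (minutes * 60.0);
    }

    public static void main(String[] args) {
        long nodes = 14_000_000L;          // "14m nodes"
        long relationships = 20_000_000L;  // "20m relationships"
        double minutes = 30.0;             // "about 30 minutes", taken as exact
        System.out.printf("%.0f elements/second%n",
                elementsPerSecond(nodes, relationships, minutes));
    }
}
```

That works out to roughly 19,000 nodes and relationships per second
for the initial load, before any indexing.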

As you'd probably expect, we do have some "issues" we're working
through, such as

- integration with Lucene is "abstracted" by the neo4j index
interface, so it is difficult or impossible to use some native Lucene
features.  For example, boosting index nodes based on their inherent
importance and using this boost in lucene to determine relevance
cannot be done.

- our data model is complex, and combined with the requirement to
version every node and relationship (i.e., record changes, allow
rollback), our graph traversals are correspondingly complex.  I
suspect that as we become more familiar with graph traversal idioms in
gremlin and cypher, they'll become as "normal" as sql.
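
To illustrate why versioning complicates traversals, here is a minimal
sketch in plain Java (no neo4j or gremlin API involved).  It assumes
one possible scheme, in which each relationship carries a validity
interval of version numbers, so every traversal step has to filter by
version.  This is not AustLit's actual design; the class, the method
names, and the identifiers such as "work:1" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class VersionedGraphSketch {
    // Marks a relationship that has not been superseded yet.
    static final long OPEN = Long.MAX_VALUE;

    // A relationship visible to versions in [validFrom, validTo).
    static final class Edge {
        final String from, type, to;
        final long validFrom;
        long validTo = OPEN;

        Edge(String from, String type, String to, long validFrom) {
            this.from = from;
            this.type = type;
            this.to = to;
            this.validFrom = validFrom;
        }

        boolean visibleAt(long version) {
            return validFrom <= version && version < validTo;
        }
    }

    private final List<Edge> edges = new ArrayList<>();
    private long currentVersion = 0;

    // Adds a relationship and returns the version that introduced it.
    long add(String from, String type, String to) {
        currentVersion++;
        edges.add(new Edge(from, type, to, currentVersion));
        return currentVersion;
    }

    // "Deletes" by closing the validity interval, so earlier versions
    // still see the relationship (this is what makes rollback possible).
    long retire(String from, String type, String to) {
        currentVersion++;
        for (Edge e : edges) {
            if (e.validTo == OPEN && e.from.equals(from)
                    && e.type.equals(type) && e.to.equals(to)) {
                e.validTo = currentVersion;
            }
        }
        return currentVersion;
    }

    // One traversal step: targets reachable from a node as of a version.
    List<String> neighboursAt(String from, String type, long version) {
        List<String> out = new ArrayList<>();
        for (Edge e : edges) {
            if (e.from.equals(from) && e.type.equals(type) && e.visibleAt(version)) {
                out.add(e.to);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedGraphSketch g = new VersionedGraphSketch();
        long v1 = g.add("work:1", "CREATED_BY", "agent:A");
        g.retire("work:1", "CREATED_BY", "agent:A");
        long v3 = g.add("work:1", "CREATED_BY", "agent:B");
        System.out.println(g.neighboursAt("work:1", "CREATED_BY", v1)); // as of v1
        System.out.println(g.neighboursAt("work:1", "CREATED_BY", v3)); // as of v3
    }
}
```

Reading the graph at an earlier version number is the "rollback" view:
the same traversal returns different results depending on the version
passed in, which is the extra condition every step has to carry.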

But so far, neo4j seems fast and robust, and we're optimistic!

Kent Fitch

On Sat, Feb 11, 2012 at 9:42 AM, Chris Fitzpatrick wrote:
> Hej hej,
>
> Is anyone using neo4j in their library projects?
>
> If the answer is "ja", I would be very interested in hearing how it's going.
> How are you using it?
> Is it something that is in production and is adding value or is it
> more a skunkworks-type effort?
> What languages are you using? Are you using an ORM (like Rails or Django)?
>
> I would also be really interested in hearing thoughts, stories, and
> opinions about the idea of using a graph db or triple store in their
> stack.
>
> tack!
>
> b, fitz.