You might consider a NoSQL database, either in-memory (Redis, etc.) or
disk-based (MongoDB, etc.), depending on your needs. There are also
triple-store-specific DBs like SparkleDB.

Cary

> On Apr 17, 2015, at 5:01 PM, Stephen Schor <[log in to unmask]> wrote:
>
> Firstly - thanks for the thoughtful replies, links, and anecdotes.
>
> We end up storing a lot of MODS as text in a database.
> We map it out as other formats...but our app deals in *a lot* of MODS.
>
> A lot of time and line-count is dedicated to turning XML into an
> object/datastructure that can be sent to-and-from web forms in a way
> our web app likes..and because that object/datastructure is atypical
> we forgo the benefits (like first-class validation) of our framework.
> Not to mention querying gets hinky in XML and dealing with remediation
> within a hierarchy means updating what amounts to a denormalized cache.
> (http://martinfowler.com/bliki/TwoHardThings.html)
>
> It's hard to dissuade myself from the idea that we're simply hanging
> adjectives on nouns (our objects) and that different specs map these
> adjectives to different words and format them differently.
> *I think other projects store attributes in a traditional relational
> way and concoct different specs based on DB records. (Maybe Archivists'
> Toolkit? ArchivesSpace?)*
>
> Uff - anyway - maybe I'll get a chance to describe a collection's
> objects in a spec-agnostic way. I already can imagine peppering the
> schema with spec-specific columns and it being a slippery slope from
> there. But hey, dream big - right?
>
> I may reply to this thread with my success story one day.
> I'm also really eager to share if it goes totally wrong.
> Those stories are usually more entertaining.
>
> Stephen
>
> On Fri, Apr 17, 2015 at 6:56 PM, Cary Gordon <[log in to unmask]> wrote:
>
>> This is a beautiful response, and the payoff, at the end, is
>> perfect... Try it, you probably won't like it.
>> In practice, even with big hardware, relational databases get mired
>> down with MARC and MODS once the collection size becomes significant..
>>
>> Cary
>>
>> On Friday, April 17, 2015, Mark V. Sullivan <[log in to unmask]> wrote:
>>
>>> Stephen,
>>> As the lead developer on the SobekCM open-source digital repository
>>> project and formerly a developer for the University of Florida
>>> Libraries, I have looked at this quite a bit and learned a bit over
>>> time.
>>>
>>> I began development working on tracking systems to manage a fairly
>>> large-scale digitization shop at UF before I was even working on the
>>> public repository side. When I arrived (around 1999) metadata was
>>> double keyed several times for each item during the tracking and
>>> metadata creation process. It seemed obvious to me that we needed a
>>> tracking system and one that would hold metadata for each item. This
>>> was fairly easy to do when our metadata was very homogenous and based
>>> on simple Dublin Core. This worked well and the system could easily
>>> spit out ready METS (and MXF) packages.
>>>
>>> Over time, I began to experiment with MODS and increasingly started
>>> using specialized metadata schemas for different types of objects,
>>> such as herbarium or oral history materials. I envisioned a tracking
>>> system that would hold all of this metadata relationally and provide
>>> different tabs based on the material type. So, oral history items
>>> would have an extra tab exposing the oral history metadata and
>>> herbarium would have a similar special tab. While development of this
>>> moved ahead, the entire system seemed unwieldy. Adding a new schema
>>> was a bit laborious.. even adding a new field to use.
>>>
>>> After several years of this, we began the SobekCM digital repository
>>> software development. After that experience I swore off trying to
>>> store very complex structured data in the database in the same type
>>> of format.
>>> (This may also have had to do with an IMLS project I worked on that
>>> proved the futility of this approach.) I generally eschew
>>> triple-stores for the basis of libraries in favor of relational
>>> databases on the premise that we DO actually understand the basic
>>> relationships of digital resources to collection and the
>>> sub-relations there. We keep the data within METS files with one or
>>> more descriptive metadata sections and essentially the database only
>>> points to that METS file. For searching, we use a flattened table
>>> structure with one row per item, much like Solr/Lucene, as well as
>>> Solr/Lucene itself.
>>>
>>> My advice is to steer clear of trying to take beautifully (and
>>> deeply) structured metadata from MODS, Darwin Core, VRA Core (and who
>>> knows what else) and trying to create tables and relations for them.
>>>
>>> I think you can point some database tools at the schema and have it
>>> generate the tables for you. Just doing that will probably dissuade
>>> you. ;)
>>>
>>> Mark V. Sullivan
>>> CIO & Application Architect
>>> Sobek Digital Hosting and Consulting, LLC
>>> [log in to unmask]
>>> 352-682-9692 (mobile)
>>>
>>> ________________________________________
>>> From: Code for Libraries <[log in to unmask]> on behalf of
>>> Stephen Schor <[log in to unmask]>
>>> Sent: Friday, April 17, 2015 1:27 PM
>>> To: [log in to unmask]
>>> Subject: [CODE4LIB] Modeling a repository's objects in a relational
>>> database
>>>
>>> Hullo.
>>>
>>> I'm interested to hear about people's approaches for modeling
>>> repository objects in a normalized, spec-agnostic, _relational_ way
>>> while maintaining the ability to cast objects as various specs (MODS,
>>> Dublin Core).
>>>
>>> People often resort to storing an object as one specification (the
>>> text of the MODS, for example), and then convert it to other specs
>>> using XSLT or their favorite language, using established
>>> mappings / conversions.
>>> (http://www.loc.gov/standards/mods/mods-conversions.html)
>>>
>>> Baking a MODS representation into a database text field can introduce
>>> problems with queryability and remediation that I _feel_ would be
>>> hedged by factoring out information from the XML document, and
>>> modeling it in a relational DB.
>>>
>>> This is an idea that's been knocking around in my head for a while.
>>> I'd like to hear if people have gone down this road...and I'm
>>> especially eager to hear both success and horror stories about what
>>> kind of results they got.
>>>
>>> Stephen
>>
>> --
>> Cary Gordon
>> The Cherry Hill Company
>> http://chillco.com
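
For anyone who wants to try the layout Mark describes in the quoted
message (the database only points at the METS file, and searching runs
against a flattened table with one row per item), here is a minimal
sketch using Python's built-in sqlite3. The table names, column names,
file paths, and sample values are all made up for illustration; this is
not SobekCM's actual schema, and it assumes the flattened fields have
already been pulled out of the METS/MODS by some other step.

```python
import sqlite3


def build_catalog(conn):
    """Create the two hypothetical tables: a pointer table and a flat search table."""
    conn.executescript(
        """
        CREATE TABLE item (
            item_id   INTEGER PRIMARY KEY,
            mets_path TEXT NOT NULL        -- the full metadata stays in this file
        );
        CREATE TABLE item_search (
            item_id  INTEGER PRIMARY KEY REFERENCES item(item_id),
            title    TEXT,
            creator  TEXT,
            subjects TEXT                  -- repeated values flattened into one string
        );
        """
    )


def add_item(conn, mets_path, title, creator, subjects):
    """Register a METS file and store one flattened search row for it."""
    cur = conn.execute("INSERT INTO item (mets_path) VALUES (?)", (mets_path,))
    item_id = cur.lastrowid
    conn.execute(
        "INSERT INTO item_search (item_id, title, creator, subjects) "
        "VALUES (?, ?, ?, ?)",
        (item_id, title, creator, "; ".join(subjects)),
    )
    return item_id


def find_mets_paths(conn, term):
    """Search the flat table and return the METS paths of matching items."""
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT item.mets_path
        FROM item_search JOIN item USING (item_id)
        WHERE title LIKE ? OR creator LIKE ? OR subjects LIKE ?
        ORDER BY item.item_id
        """,
        (pattern, pattern, pattern),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_catalog(conn)
    add_item(conn, "/data/mets/0001.mets.xml", "Oral history interview",
             "Doe, Jane", ["oral history", "Florida"])
    add_item(conn, "/data/mets/0002.mets.xml", "Herbarium specimen sheet",
             "Roe, Richard", ["botany"])
    print(find_mets_paths(conn, "herbarium"))
```

The point of the split is that adding a new descriptive schema only
changes what goes into the METS file and how the flat row is derived
from it; the relational side never has to mirror the schema's
hierarchy.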