Interesting, Chris.

I have looked toward solr for feeding instances for controlled vocabulary
(and to take advantage of Orbeon's autocomplete/autosuggest capabilities),
but I haven't gotten very far into it.  I assume it's possible.
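
A minimal sketch of what a Solr-backed lookup for such an autosuggest field
might look like, assuming a Solr select handler that returns JSON. The URL
and the field name "term" are hypothetical placeholders, not taken from any
real install, and the prefix is assumed to be a single plain word with no
query-syntax characters in it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical values: adjust to the local Solr install and schema.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"
TERM_FIELD = "term"


def build_suggest_url(prefix, rows=10):
    """Build a Solr select URL that matches terms starting with `prefix`."""
    params = {
        # A trailing * makes this a prefix query on the term field.
        "q": f"{TERM_FIELD}:{prefix}*",
        "fl": TERM_FIELD,   # return only the term field
        "rows": rows,       # cap the number of suggestions
        "wt": "json",       # ask Solr for a JSON response
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


def parse_suggestions(response_text):
    """Pull the term values out of a Solr JSON response body."""
    docs = json.loads(response_text)["response"]["docs"]
    return [doc[TERM_FIELD] for doc in docs if TERM_FIELD in doc]


def suggest(prefix, rows=10):
    """Query Solr and return matching terms (needs a running Solr)."""
    with urlopen(build_suggest_url(prefix, rows)) as resp:
        return parse_suggestions(resp.read().decode("utf-8"))
```

Whether a lookup like this is case-sensitive depends on how the field is
indexed, so that is worth checking before wiring it to a form control.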

I too have had PermGen space issues with Orbeon running in Tomcat with Solr
and Cocoon on a VM instance with 512 MB of memory.  I fixed this problem by
turning on garbage collection in the PermGen space.  For example, on a 512 MB
system, setting CATALINA_OPTS to:

CATALINA_OPTS="-server -Xms356m -Xmx356m -XX:+UseConcMarkSweepGC
-XX:+CMSPermGenSweepingEnabled -XX:NewSize=64m -XX:MaxNewSize=64m
-XX:MaxPermSize=128m"

works fine for collecting garbage and keeps Tomcat from crashing.
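
As a rough sanity check on those numbers, the sketch below adds up the
maximum heap (-Xmx) and the maximum PermGen (-XX:MaxPermSize), since on these
JVMs PermGen is allocated in addition to the heap, while the new generation
(-XX:NewSize) is carved out of the heap and so is not counted again. The
helper names are made up for illustration, and the total ignores thread
stacks and other native memory.

```python
import re

# The options string from the message above, joined onto one line.
CATALINA_OPTS = (
    "-server -Xms356m -Xmx356m -XX:+UseConcMarkSweepGC "
    "-XX:+CMSPermGenSweepingEnabled -XX:NewSize=64m -XX:MaxNewSize=64m "
    "-XX:MaxPermSize=128m"
)

# Multipliers that convert a JVM size suffix to megabytes.
_UNITS_TO_MB = {"k": 1 / 1024, "m": 1, "g": 1024}


def size_mb(opts, flag_pattern):
    """Return the size given for a JVM flag in MB, or None if absent."""
    match = re.search(flag_pattern + r"(\d+)([kmg])", opts, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS_TO_MB[match.group(2).lower()]


def jvm_budget_mb(opts):
    """Return (max heap, max PermGen, their sum) in MB."""
    heap = size_mb(opts, r"-Xmx")
    perm = size_mb(opts, r"-XX:MaxPermSize=")
    return heap, perm, (heap or 0) + (perm or 0)


heap, perm, total = jvm_budget_mb(CATALINA_OPTS)
print(f"heap={heap} MB, permgen={perm} MB, total={total} MB")
```

For the settings above this comes to 356 + 128 = 484 MB, which leaves very
little of a 512 MB machine for anything else.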

Ethan



On Fri, Nov 13, 2009 at 6:52 PM, Chris Fitzpatrick <[log in to unmask]> wrote:

> I too have written a metadata editor in Orbeon xforms, using their new Form
> Runner framework.
>
>  I put a semi-up-to-date beta demo version of it here, if anyone is curious
>  --> https://mdtoolkit-dev.stanford.edu/ops/fr/mods/mlm/
> (Feel free to edit/delete records, as this is just a dev instance. You'll
> probably have to accept a self-signed cert tho....).
>
> (Records probably look a little weird because they were just blindly
>  imported from MARC records from our ILMS. )
>
> I've written a version that back-ends into Fedora and Solr, but we're
> still using the default eXist database in production.
> Some features this version has:
>
> 1. The "Import Record from Catalog Key" is based on a REST-ful service
> written by my coworker Richard Anderson that pulls MARC xml records from our
> SOLR db and converts them into MODS.
>   You can try it out by entering "8257892" and hitting the plus...
> 2. The language section has the ability to do a real-time autosuggest
> lookup of a value list. In this case, it's from this xml file -->>
> http://www.loc.gov/standards/codelists/languages.xml
>  If you want to try this out in a record, add a new language node (hit the
> green plus), and type something  (bug -- it has to start with an uppercase
> letter) into the box (Something  like "Ger") and wait a couple of seconds.
> Not too long...
> I've also done demo versions that query value lists from SOLR and from
> LCSH genre RDF in Mulgara, as well as queried the OCLC grid naming
> authority service to add nodes from their authority file.  So, there are a
> lot of possibilities there.
> 3. When you create a new record, the uuids are generated by a REST request
> to our uuid generator.
>
> The performance seems OK, but I haven't done any heavy stress testing
> on it. It's a little slow, I guess. This really is just a way for our
> catalogers/project managers to create records to be loaded into SOLR, so it
> gets very light traffic. And it runs into perm gen space problems if you're
> running things like Mulgara, SOLR, or multiple Orbeon applications in the
> same container, especially on a VM.
>
> And, yes, it is very ugly and a little weird, but so are most of us in
> the library business, so I've been comfortable with it.
>
> Any suggestions, comments, and barbs are welcome...
> best, chris.
>
>
>
>
> On Nov 13, 2009, at 9:49 AM, [Your Name] wrote:
>
>  In discussion with colleagues around this topic, the question of
>> controlled vocabularies has been prominent. We're looking to move away from
>> list instances that are packed into the XForm at render time to lists that
>> are exposed from other services through REST interfaces, which can be
>> dynamically coupled into a form.
>>
>> On the other hand, 4 seconds is really not terribly long. {grin}
>>
>> ---
>> A. Soroka
>> Digital Research and Scholarship R & D
>> the University of Virginia Library
>>
>>
>>
>> On Nov 13, 2009, at 12:45 PM, Ford, Kevin wrote:
>>
>>  We've been using Orbeon forms for about a year now for cataloging our
>>> digital collections.  We use Fedora Commons, so using the XML as input and
>>> outputting to XML seemed a no brainer.  It has worked very nicely for
>>> editing VRA Core4 records. But, instead of doing anything terribly fancy
>>> with Orbeon, we simply use the little sandbox application that comes with
>>> Orbeon (there's an online demo [1]).  The URL to the XForm is part of the
>>> query string. This solution has greatly reduced our time investment in
>>> making Orbeon part of our workflow and, more importantly, getting Orbeon to
>>> work for us.  All that being said, Ethan's sharp looking EAD editor makes me
>>> jealous that we haven't created our own custom editor.
>>>
>>> As for Orbeon's performance, once we worked out some quirks, we've been
>>> quite happy with Orbeon.  Orbeon hosts a useful performance and tuning page
>>> [2].  We also learned that it is helpful to stop the Orbeon app and restart
>>> it about once every two weeks as performance can become progressively
>>> slower.  It seems to need a little reboot.  In any event, a typical XForm
>>> for us is about 200k, with a number of authority lists, one of which
>>> includes nearly 1500 items.  Orbeon loads and renders the XForm fairly
>>> quickly (less than 4 seconds) and editing performance hasn't been an issue
>>> either, which is great considering that a 1500-item-subject-authority drop
>>> down list is created for each subject being added to a record.
>>>
>>> Moving such a large XForm to a server-based solution was necessary.  Our
>>> XForm cataloging application, which began with a simple DC record and
>>> focused on producing a viable XForm, initially used the Mozilla XForm add-on
>>> [3].  The Firefox add-on, which of course runs on the client, easily scaled
>>> for a VRA Core4 record, but it couldn't handle a burgeoning subject
>>> authority file.  Hence the need for an alternative solution, quick.
>>>
>>> -Kevin
>>>
>>> [1] http://www.orbeon.com/ops/xforms-sandbox/
>>> [2] http://wiki.orbeon.com/forms/doc/developer-guide/performance-tuning
>>> [3] http://www.mozilla.org/projects/xforms/
>>>
>>> --
>>> Kevin Ford
>>> Library Digital Collections
>>> Columbia College Chicago
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
>>> Andrew Ashton
>>> Sent: Friday, November 13, 2009 8:37 AM
>>> To: [log in to unmask]
>>> Subject: Re: [CODE4LIB] XForms EAD editor sandbox available
>>>
>>> Nice job, Ethan.  This looks really cool.
>>>
>>> We have an Orbeon-based MODS editor, but I have found Orbeon to be a bit
>>> tough to develop/maintain and more heavyweight than we really need.  We're
>>> considering more Xforms implementations, but I would love to find a more
>>> lightweight Xforms application.  Does anyone have any recommendations?
>>>
>>> The only one I know of is XSLTForms (http://www.agencexml.com/xsltforms)
>>> but I haven't messed with it yet.
>>>
>>> -Andy
>>>
>>> On 11/13/09 9:13 AM, "Eric Hellman" <[log in to unmask]> wrote:
>>>
>>>  XForms and Orbeon are very interesting tools for developing metadata
>>>> management tools.
>>>>
>>>> The ONIX developers have used this stack to produce an interface for
>>>> ONIX-PL called OPLE that people should try out.
>>>>
>>>> http://www.jisc.ac.uk/whatwedo/programmes/pals3/onixeditor.aspx
>>>>
>>>> Questions about Orbeon relate to performance and integrability, but I
>>>> think it's an impressive use of XForms nonetheless.
>>>>
>>>> - Eric
>>>>
>>>> On Nov 12, 2009, at 1:30 PM, Ethan Gruber wrote:
>>>>
>>>>  Hello all,
>>>>>
>>>>> Over the past few months I have been working on and off on a research
>>>>> project to develop a XForms, web-based editor for EAD finding aids that
>>>>> runs within the Orbeon tomcat application.  While still in a very early
>>>>> alpha stage (I have probably put only 60-80 hours of work into it thus
>>>>> far), I think that it's ready for a general demonstration to solicit
>>>>> opinions, criticism, etc. from librarians, and technical staff.
>>>>>
>>>>> Background:
>>>>> For those not familiar with XForms, it is a W3C standard for creating
>>>>> next-generation forms.  It is powerful and can allow you to create XML
>>>>> in the way that it is intended to be created, without limits to
>>>>> repeatability, complex hierarchies, or mixed content.  Orbeon adds a
>>>>> level on top of that, taking care of all the ajax calls, serialization,
>>>>> CRUD operations, and a variety of widgets that allow nice features like
>>>>> tabs and autocomplete/autosuggest that can be bound to authority lists
>>>>> and controlled access terms.  By default, Orbeon reads and writes data
>>>>> from and to an eXist database that comes packaged with it, but you can
>>>>> have it serialize the XML to disk or have it interact with any REST
>>>>> interface such as Fedora.
>>>>>
>>>>> Goals:
>>>>> Ultimately, I wish to create a system of forms that can open any EAD
>>>>> 2002-compliant XML file without any data loss or XML transformation
>>>>> whatsoever.  I think that this is the shortcoming of systems such as
>>>>> Archon and Archivists' Toolkit.  I want to integrate authority lists
>>>>> that can be integrated into certain fields with autosuggest (such as
>>>>> corporate names, people, and subjects).  If there is demand, I can build
>>>>> a public interface for viewing the entire EAD collection, complete with
>>>>> solr for faceted browse and search, but this is secondary to producing a
>>>>> form that people with some basic archiving knowledge and EAD background
>>>>> can use to easily and effectively create finding aids.  A public
>>>>> interface is the easy part, in any case.  It wouldn't take more than a
>>>>> week or two to build something fairly nice and robust.
>>>>>
>>>>> Here is the link:  http://beta.scholarslab.org:9080/cocoon/eaditor/
>>>>>
>>>>> I should stress that the application is *not complete.*  I am using
>>>>> cocoon for providing a list of EAD content in the system.  I will remove
>>>>> that application eventually and utilize Orbeon's internal pipelining
>>>>> features to achieve the same objective.  I haven't delved too deeply
>>>>> into Orbeon's pipelines yet.
>>>>>
>>>>> Here are some things to note:
>>>>>
>>>>> 1. If you click on a link to open the main part of the guide or any of
>>>>> its components, you have to click the "Load" link on the top of the
>>>>> form.  Forms aren't being loaded on page load yet.
>>>>> 2. Elements that accept mixed content per the EAD 2002 schema (e.g.
>>>>> paragraphs) only accept PCDATA.  I haven't worked on mixed content yet;
>>>>> it is by far the most challenging aspect of the project.
>>>>> 3. I only have a few C-level elements available to add.
>>>>> 4. Not all did elements are available yet.
>>>>> 5. A lot of the generic attributes, like type and label, are not
>>>>> available for editing yet.  This may be the type of thing that is best
>>>>> customized per institution relative to their own best practices.  I
>>>>> don't want more input fields than necessary right now.
>>>>> 6. The only thing you can add into the archdesc right now is the <dsc>.
>>>>> Once I finish all of the c-level elements, I can just put some
>>>>> xi:includes into the archdesc XForm file to show them in the archdesc
>>>>> level.
>>>>>
>>>>> I think those are the major issues for now.  As I stated earlier, this
>>>>> is sort of a pre-alpha.  The project is open source and available
>>>>> (through svn) to anyone who wants it.  http://code.google.com/p/eaditor/ .
>>>>> I have put together an easy package to get the application up and
>>>>> running without difficulty.  All you have to do is unzip the download,
>>>>> go into the apache tomcat folder and execute the startup script.  This
>>>>> assumes you have nothing running on port 8080 already.
>>>>>
>>>>> Download page: http://code.google.com/p/eaditor/downloads/list
>>>>>
>>>>> Wiki instructions:
>>>>>
>>>>> http://code.google.com/p/eaditor/wiki/QuickstartInstallation?ts=1257887453&updated=QuickstartInstallation
>>>>>
>>>>> Comments, questions, criticism welcome.  The editor is a sandbox.  Feel
>>>>> free to experiment.
>>>>>
>>>>> Ethan Gruber
>>>>> University of Virginia Library
>>>>>
>>>>
>>>> Eric Hellman
>>>> President, Gluejar, Inc.
>>>> 41 Watchung Plaza, #132
>>>> Montclair, NJ 07042
>>>> USA
>>>>
>>>> [log in to unmask]
>>>> http://go-to-hellman.blogspot.com/
>>>>
>>>