Can you start a FOP job as its own process? If so, something like
Condor[1] or Sun Grid Engine will probably be a whole lot easier to
deploy (like, you could be testing next week) than something
MapReduce-based.
Usage is essentially:
condor_run your_process
and that queues the job up and runs it on an available core. The only
other setup you need is to run the Condor daemons on each machine
that can execute jobs.
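So for FOP, something like this (a rough sketch -- the file names are
made up, and it assumes fop is on the PATH of the execute machines):

  condor_run "fop -fo report.fo -pdf report.pdf"

condor_run submits that shell command as a job, waits for it to
finish, and hands you back the output, so it's a quick way to test
before you script anything fancier.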
And if you have a LOT of jobs, you can probably get some time on the
global Condor pool.
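For a big batch, the usual route is a submit description file that
queues one job per document with a single condor_submit. A minimal
sketch, assuming the inputs are numbered doc_0.fo through doc_99.fo,
fop lives at /usr/bin/fop on the execute machines, and the count of
100 is just an example:

  universe   = vanilla
  executable = /usr/bin/fop
  arguments  = -fo doc_$(Process).fo -pdf doc_$(Process).pdf
  should_transfer_files   = YES
  transfer_input_files    = doc_$(Process).fo
  when_to_transfer_output = ON_EXIT
  log    = fop.log
  output = fop_$(Process).out
  error  = fop_$(Process).err
  queue 100

Then "condor_submit fop.sub" fans the whole thing out across the
pool; $(Process) counts from 0 to 99, and the PDFs come back when
each job exits.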
Of course, that only helps if you have lots of jobs that each take a
somewhat-reasonable amount of time to run. If your problem is that
processing one document takes three days, it's not gonna help.
Cheers,
-Nate
1: Disclaimer: Condor is built here at Madison. That being said, it
really is sweet.
On Tue, Oct 26, 2010 at 4:41 PM, Mark A. Matienzo <[log in to unmask]> wrote:
> Has anyone out there built a distributed application using Hadoop (or
> another MapReduce framework) and FOP? I'm interested in ways we can
> potentially allow our XSL-FO processing to scale.
>
> Mark A. Matienzo
> Digital Archivist, Manuscripts and Archives
> Yale University Library
>