On Mon, Apr 7, 2008 at 8:30 AM, Esha Datta <[log in to unmask]> wrote:
> NYU is looking at e-publishing in general and how it ties in with
> preservation requirements. Have any of you done any work with PDF/A
> and generating access files from that format? We have a number of
> books that will be converted to the PDF format. We're looking at
> PDF/A for ingestion into our preservation repository (a DSpace
> instance) and generating access files from it. How easy or difficult
> was it to set up a workflow for working with PDFs, generating PDF/As,
> and generating access files from PDF/As?
One more vote for OJS, here -- we're running the Journal of Insect
Science[1] with it, very successfully.
With respect to generating HTML and PDFs, my understanding (it's a
little fuzzy) is that we have manuscripts converted to XML by a
third party, and then use a combination of XSLT and Prince to generate
professional-quality documents. Prince isn't cheap, but man, is it
good at what it does. If you were starting from scratch, you might be
able to build a wrapper around Gecko or WebKit to do the rendering...
but that'd take time.
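
To make the XML -> XSLT -> Prince chain a bit more concrete, here's a
rough sketch of how the two steps might be strung together. This is
not our actual script: I'm assuming xsltproc as the XSLT processor (I
don't know offhand which one we use), and the file names and the
build_commands/convert helpers are made up for illustration. The only
Prince call it relies on is the basic "prince input.html -o output.pdf"
form.

```python
import subprocess
from pathlib import Path


def build_commands(xml_path, xslt_path, out_dir):
    """Return the two argv lists for one article: XML -> HTML -> PDF.

    Hypothetical helper; xsltproc is an assumed choice of XSLT processor.
    """
    xml_path, out_dir = Path(xml_path), Path(out_dir)
    html_path = out_dir / (xml_path.stem + ".html")
    pdf_path = out_dir / (xml_path.stem + ".pdf")
    return [
        # Step 1: apply the stylesheet to the article XML to get HTML.
        ["xsltproc", "-o", str(html_path), str(xslt_path), str(xml_path)],
        # Step 2: have Prince typeset that HTML as a PDF.
        ["prince", str(html_path), "-o", str(pdf_path)],
    ]


def convert(xml_path, xslt_path, out_dir):
    """Run both steps in order, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for argv in build_commands(xml_path, xslt_path, out_dir):
        subprocess.run(argv, check=True)
```

Something like convert("articles/a01.xml", "article.xsl", "out") would
then leave out/a01.html and out/a01.pdf behind, provided both tools are
installed and on the PATH.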
IIRC, it's all pretty cheap (dunno if I can disclose our XML
processing rate -- suffice to say, it's cheaper than undergrads), and
takes somewhere around 3-4 hours per article. I think there's fairly
good potential for economies of scale, were we to add more titles.
A great person to talk to is Andrew Gough <[log in to unmask]> --
he developed most of the workflow and procedures we use at Madison
(I've copied him here, in case this message contains gross
inaccuracies).
Cheers,
-Nate
[1]: http://insectscience.org/