Do you need OCR? This script:

http://bookscanner.pbworks.com/w/page/45609343/Homer%20bash%20script

will OCR a directory of TIFFs (using Tesseract) and build a PDF from the
results (using PDFBeads). It's a little old, but I still use it pretty much
every day. I think you'll need to have Ruby 1.9 installed, since the
PDFBeads library uses Hpricot.

There are lots of document viewers, book widgets, and page turners out
there; the Internet Archive one is good. I also really like the NYTimes
Document Viewer (https://github.com/documentcloud/document-viewer). The
DocumentCloud people also have a tool that rips your PDFs apart so the
pieces can be put into the viewer
(https://github.com/documentcloud/docsplit).

On Fri, Nov 8, 2013 at 8:23 PM, Karen Coyle wrote:

> +1 for the viewer concept, and I'll add that viewing & downloading meet
> different needs and should both be offered if possible. (said because of
> recently having had to download huge PDFs just to glance at a few pages).
>
> kc
>
> On 11/8/13 11:10 AM, Edward Summers wrote:
>
>> It is sad to me that converting to PDF for viewing off the Web seems like
>> the answer. Isn’t there a tiling viewer (like Leaflet) that could be used
>> to render jpeg derivatives of the original tif files in Omeka?
>>
>> For an example of using Leaflet (usually used for working with maps) in
>> this way, check out NYTimes Machine Beta:
>>
>> http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html
>>
>> //Ed
>>
>> On Nov 8, 2013, at 2:00 PM, Kyle Banerjee wrote:
>>
>>> We are in the process of migrating our digital collections from
>>> CONTENTdm to Omeka and are trying to figure out what to do about the
>>> compound objects -- the vast majority of which are digitized books.
>>>
>>> The source files are actually hi res tiffs, but since ginormous objects
>>> broken into hundreds of pieces (each of which can be well over 100MB in
>>> size) aren't exactly friendly to use, we'd like to stitch them into
>>> individual PDFs that can be viewed more conveniently.
>>>
>>> My game plan is to simply have a script pull the files down as jpegs
>>> which can be fed to imagemagick, which can theoretically do everything
>>> I need. However, I've never actually done anything like this before,
>>> so I wanted to see if there's a method that people have used for
>>> combining lots of images into pdfs that works particularly well.
>>> Thanks,
>>>
>>> kyle
>>>
>>
> --
> Karen Coyle
> http://kcoyle.net
> m: 1-510-435-8234
> skype: kcoylenet
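
P.S. For the plain ImageMagick route Kyle describes in the quoted message, here is a rough sketch of the stitching step, not something taken from the Homer script. It assumes the page JPEGs for one book already sit in a single directory, that file names carry the page order (page1.jpg, page2.jpg, ...), and that ImageMagick's `convert` command is on the PATH (newer ImageMagick 7 installs may call it `magick` instead). The function names are made up for illustration.

```python
import re
import subprocess
from pathlib import Path


def natural_key(name):
    # Split "page10.jpg" into ["page", 10, ".jpg"] so that embedded
    # numbers sort numerically (page2 before page10).
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def build_convert_command(image_dir, output_pdf):
    """Return the ImageMagick argv that joins every JPEG in image_dir into one PDF."""
    pages = sorted(
        (p for p in Path(image_dir).iterdir()
         if p.suffix.lower() in {".jpg", ".jpeg"}),
        key=lambda p: natural_key(p.name),
    )
    if not pages:
        raise ValueError(f"no JPEG files found in {image_dir}")
    # "convert a.jpg b.jpg ... out.pdf" writes one PDF page per input image.
    return ["convert", *[str(p) for p in pages], str(output_pdf)]


def images_to_pdf(image_dir, output_pdf):
    # Assumes ImageMagick is installed; raises CalledProcessError on failure.
    subprocess.run(build_convert_command(image_dir, output_pdf), check=True)
```

Calling `images_to_pdf("book_0001", "book_0001.pdf")` (hypothetical names) would produce one PDF per book directory. The numeric-aware sort is the part worth keeping whatever tool you end up using, since a plain alphabetical sort puts page10 ahead of page2. This gives you an image-only PDF with no text layer, so it does not replace the OCR step above.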