On Wed, 29 Apr 2015, Sergio Letuche wrote:
> Dear all,
> we have a pdf, that is taken from a to be printed pdf, full of tables. The
> text is split in two columns. How would you suggest we uploaded this pdf to
> the web? We would like to keep the structure, and split each section taken
> from the table of contents as a page, but also keep the format, and if
> possible, serve the content both in an html view, and in a pdf view, based
> on the preference of the user.
The last time I spoke to someone from AAS about how they extracted their
'Data Behind the Table' (aka 'DbT'), it mostly depended on getting the
data from the author while it was still in a useful format.
> The document is made with Indesign CS6, and i do not know in which format i
> could transform it into
There are a few ways to do tables in InDesign, as it's page layout
software. If it's a single table within a text block, and there's
nothing strange within any cell, you should be able to just select the
table, copy it, and paste it into a text editor. You'll get line
returns between rows, and tabs between cells.
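For the simple case above, a minimal Python sketch of turning the pasted
text into an HTML table might look like this. The function name
tsv_to_html_table is made up for illustration, and it assumes every line
return is a row break and every tab is a cell break, which only holds if
no cell contains its own line returns or tabs.

```python
import html


def tsv_to_html_table(pasted: str) -> str:
    """Convert pasted text (tabs between cells, line returns between rows)
    into a bare HTML table. Hypothetical helper, not part of any library."""
    rows = []
    # splitlines() handles \n, \r\n and bare \r, whichever the paste produced
    for line in pasted.splitlines():
        if not line.strip():
            continue  # skip blank lines left over from the paste
        cells = "".join(
            f"<td>{html.escape(cell.strip())}</td>" for cell in line.split("\t")
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(tsv_to_html_table("Object\tMagnitude\nM31\t3.4\n"))
```

The cell text is HTML-escaped so that a stray < or & in the data does not
break the markup.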
If they've placed line returns within the cells, those will get pasted in
the middle of the cell, and you can no longer tell a return inside a cell
from the end of a row, which can really screw you up.
For cases like that, it's sometimes easiest to go through the file and
paste an HTML tag (<td>) at the beginning of each cell as a marker, so
when you export, you can tell which breaks are legitimate changes of
cell, and which are just line returns in the text.
I then do post-processing to add the closing cell tags (</td>) and the
row markers (<tr> and </tr>).
If I were using BBEdit, I'd do:
If you're doing it in some other editor that supports search/replace, you
should be able to do something similar, but you might need to figure out
how to specify tabs & line returns in your program.
... and then fix the initial & final lines (and maybe convert some of
the <td>s into <th>s).
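If you'd rather script the post-processing than do it in an editor, here
is a rough Python sketch of the same idea. It is not the BBEdit recipe,
just one way to apply the rule described above: a tab followed by <td>
is a cell break, a line return followed by <td> is a row break, and any
other line return belongs to the cell. The function name
marked_text_to_html and the choice to turn in-cell returns into <br> are
my assumptions; it also assumes the pasted text starts with a <td>
marker and that the cell text contains no other HTML-like characters.

```python
import re


def marked_text_to_html(pasted: str) -> str:
    """Rebuild a table from pasted text in which every cell was prefixed
    with a literal '<td>' marker. Hypothetical helper for illustration."""
    # Normalise line endings and drop leading/trailing blank lines
    text = pasted.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    # A line return NOT followed by a marker sits inside a cell
    # (converting it to <br> is an assumption, not from the original post)
    text = re.sub(r"\n(?!<td>)", "<br>", text)
    # A line return followed by a marker ends the row and starts a new one
    text = text.replace("\n<td>", "</td></tr>\n<tr><td>")
    # A tab followed by a marker ends the cell and starts a new one
    text = text.replace("\t<td>", "</td><td>")
    # Fix the initial and final lines: open the first row, close the last
    return "<table>\n<tr>" + text + "</td></tr>\n</table>"


if __name__ == "__main__":
    sample = "<td>Name\t<td>Notes\n<td>M31\t<td>spiral\nvery bright\n"
    print(marked_text_to_html(sample))
```

The in-cell returns are handled first, because the row replacement itself
introduces new line returns that must not be mistaken for in-cell ones.
Converting header cells to <th> is left as a manual step.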
ps. after getting in trouble last week, I should mention that all
statements are my own, and I don't represent NASA or any other
organizations in this matter.