
On Wed, 29 Apr 2015, Sergio Letuche wrote:

> Dear all,
>
> we have a pdf, that is taken from a to be printed pdf, full of tables. The
> text is split in two columns. How would you suggest we uploaded this pdf to
> the web? We would like to keep the structure, and split each section taken
> from the table of contents as a page, but also keep the format, and if
> possible, serve the content both in an html view, and in a pdf view, based
> on the preference of the user.

The last time I spoke to someone from AAS about how they extracted their 
'Data Behind the Table' (aka 'DbT'), it mostly depended on getting 
something from the author while it was still in a useful format.


> The document is made with Indesign CS6, and i do not know in which format i
> could transform it into

There are a few ways to do tables in InDesign, as it's page layout 
software.  If it's in a single table within a text block, and there's 
nothing strange within each cell, you should be able to just select the 
table, copy it, and paste it out into a text editor.  You'll get line 
returns between each row, and tabs between each cell.
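
If you'd rather script the simple case than do it by hand, here's a rough 
sketch in Python.  It assumes the pasted text really is one row per line 
with tabs between cells, and nothing odd inside the cells.  The function 
name is just something I made up for illustration.

```python
import html


def tsv_to_html_table(pasted: str) -> str:
    """Turn tab/line-return delimited text into a minimal HTML table."""
    rows = []
    # splitlines() copes with \r, \n and \r\n line returns alike
    for line in pasted.splitlines():
        if not line.strip():
            continue  # skip lines that are empty or only whitespace
        cells = "".join(
            f"<td>{html.escape(cell.strip())}</td>" for cell in line.split("\t")
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(tsv_to_html_table("Name\tValue\rx<y\t2"))
```

Cell text gets HTML-escaped, so a stray < or & in the data won't break 
the markup.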

If they've placed line returns within the cells, those will get pasted in 
the middle of the cell, so you can't tell them apart from the line returns 
that end a row, which can really screw you up.

For cases like that, it's sometimes easiest to go through the file, and 
paste HTML elements at the beginning of each cell to mark table cells 
(<td>), so when you export, you have markers as to which are legitimate 
changes in cells, and which are line returns in the file.

I then do post-processing to add the closing cell tags and the row markers.

If I were using BBEdit, I'd do:

 	Find:
 		\t<td>
 	Replace:
 		</td><td>

 	Find:
 		\r<td>
 	Replace:
 		</td></tr>\r<tr><td>

If you're doing it in some other editor that supports search/replace, you 
should be able to do something similar, but you might need to figure out 
how to specify tabs & line returns in your program.

... and then fix the initial & final lines (the first row still needs its 
opening <tr>, and the last one its closing </td></tr>).  You may also want 
to convert some of the <td>s into <th>s.
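
The same two replacements, plus the first/last line fix, can be sketched in 
Python.  This assumes the exported text starts with a <td> marker and that 
every cell was marked by hand as described; the function name is made up, 
and line returns left inside a cell are passed through untouched.

```python
def markers_to_table(exported: str) -> str:
    """Rebuild table rows from text where each cell starts with '<td>'."""
    # Normalise line returns so one replacement handles \r, \n and \r\n
    text = exported.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    # A tab followed by a marker is a cell boundary within a row
    text = text.replace("\t<td>", "</td><td>")
    # A line return followed by a marker is a legitimate row boundary;
    # a line return without a marker stays inside its cell
    text = text.replace("\n<td>", "</td></tr>\n<tr><td>")
    # Fix the initial and final lines
    return "<table>\n<tr>" + text + "</td></tr>\n</table>"


if __name__ == "__main__":
    sample = "<td>Name\t<td>Notes\r<td>A\t<td>line one\rline two\r"
    print(markers_to_table(sample))
```

In the sample, "line one" and "line two" end up in the same cell, because 
the line return between them isn't followed by a marker.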

-Joe


ps.  after getting in trouble last week, I should mention that all
      statements are my own, and I don't represent NASA or any other
      organizations in this matter.