Matt Butler wrote:
> Depending on how your Excel file is set up, the least painful way
> might just be to do it all in Excel. Add columns in between each
> field, throw XML strings into those, then concatenate each row into
> a single cell at the end of the row and copy-paste that final column
> out. If you want cleaner XML (i.e. one attribute per line rather
> than all the item's attributes strung together) you can add
> regular expressions in with your XML and parse them out later.

I've done something similar recently, and it seems to be the most
efficient way.  The process was actually: Excel file (with no tabs in
it), to tab-separated values (TSV), to TSV with &, < and > replaced by
their XML entities, to XML.
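
For illustration only, here is a minimal Python sketch of the TSV-to-XML
step with &, < and > escaped. It is not the script I used. The function
name tsv_to_xml, the <records>/<record> wrapper elements, and the
assumption that the first row holds column names which are already valid
XML element names are all hypothetical choices for the example.

```python
import csv
import io
from xml.sax.saxutils import escape


def tsv_to_xml(tsv_text, record_tag="record"):
    """Turn TSV text into a simple XML string.

    Assumes (hypothetically) that the first row holds column names
    and that those names are already valid XML element names.
    """
    reader = csv.reader(
        io.StringIO(tsv_text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary data
    )
    header = next(reader)
    lines = ["<records>"]
    for row in reader:
        if not row:
            continue  # skip blank lines
        lines.append(f"  <{record_tag}>")
        for name, value in zip(header, row):
            # escape() replaces &, < and > with &amp;, &lt; and &gt;
            lines.append(f"    <{name}>{escape(value)}</{name}>")
        lines.append(f"  </{record_tag}>")
    lines.append("</records>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(tsv_to_xml("title\tauthor\nFish & Chips\tA <B>\n"))
```

Doing the escaping at the moment each field is written keeps the chain
short: there is no separate "TSV with replacements" file to keep in sync.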

The most time-consuming part of getting bibliographic data out of
spreadsheets and into an XML format is finding the special-but-invalid
cases in the spreadsheet - the spreadsheet probably didn't check for
them, and your XML-using tool will probably reject them.  So I think
it's more efficient to keep the toolchain as short as possible,
because you may want to repeat it a few times until the output is as
good as it is going to get.
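
As a rough sketch of how such cases might be flushed out before each
run, the Python below flags rows whose field count differs from the
header row and rows containing control characters that XML 1.0 does not
allow. The function name find_suspect_rows and these two particular
checks are only examples I've assumed here; real bibliographic data will
have its own oddities to look for.

```python
import re

# Control characters that XML 1.0 forbids (everything below 0x20
# except tab, line feed and carriage return).
_XML_INVALID = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_suspect_rows(tsv_text):
    """Return (line_number, reason) pairs for rows worth a manual look.

    Hypothetical checks: field count versus the header row, and
    control characters an XML parser would reject.
    """
    rows = tsv_text.rstrip("\n").split("\n")
    expected = len(rows[0].rstrip("\r").split("\t"))
    problems = []
    for lineno, line in enumerate(rows[1:], start=2):
        fields = line.rstrip("\r").split("\t")
        if len(fields) != expected:
            problems.append(
                (lineno, f"expected {expected} fields, found {len(fields)}")
            )
        if _XML_INVALID.search(line):
            problems.append((lineno, "control character not allowed in XML 1.0"))
    return problems


if __name__ == "__main__":
    sample = "title\tauthor\nGood\tOne\nBad row\nCtrl\x0bchar\tTwo\n"
    for lineno, reason in find_suspect_rows(sample):
        print(f"line {lineno}: {reason}")
```

Fixing the reported rows in the spreadsheet and re-exporting is then one
pass of the same short chain.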

Good luck!
-- 
MJ Ray (slef), member of www.software.coop, a for-more-than-profit co-op.
http://koha-community.org supporter, web and LMS developer, statistician.
In My Opinion Only: see http://mjr.towers.org.uk/email.html
Available for hire for Koha work http://www.software.coop/products/koha