Print

Print


Thanks for the responses, on the list and off, so far.

As I'm sure is true for so many of us, my interest in learning more 
about how to solve this type of problem is balanced against my need to 
just get the project done so I can move on to other things.  One of the 
great things about this list is the ability to learn from the collective 
experiences of colleagues.  For this project specifically, even clues 
about better search terms is useful; as Chris Gray pointed out, basic 
Google searches present too many hits.

I did try following the "Create an XML data file and XML schema file 
from worksheet data" instructions on the Microsoft site.  And it did 
produce an XML document, but it wasn't able to transform this:
Row1    Question1    Q1Answer1
Row2                        Q1Answer2

...into something like this:
<row1>Row One Data</row1>
<question1>This is a question</question1>
<answers>
<q1answer1>Answer 1</q1answer1>
<q1answer2>Answer2</q1answer2>
</answers

Instead, I could get it to either convert every row into its own XML 
entry, meaning that I had a lot of answers with no associated questions, 
or I got an XML file that had 1 question with EVERY SINGLE answer nested 
beneath it -- effectively all questions after the first question were 
ignored.  Based on those results, I wasn't sure if there is more 
tweaking I could do in Excel, or if there is some programmed logic in 
Excel that can't be accounted for when associating a schema.


Another suggestion I received was to "fill" the question column so that 
every row had a question listed.  I did consider this, but the problem 
then is during the data import, I'd have to convince my CMS to put all 
the answers back together based on the question, something I'm sure 
Drupal COULD do, but I'm not sure how to do that either.


Finally, this project is a spreadsheet with 225,270 rows, so you can 
imagine why I'd like a process that is reasonably trustworthy AND that 
can run locally.


Anyway, any/all additional suggestions appreciated, even if they are 
"try searching for "blah blah python parser", or "I made something that 
solves a similar process, and you can download it from Git".

Ryan
___

Ryan Engel
Web Stuff
UW-Madison
> Dana Pearson <mailto:[log in to unmask]>
> June 13, 2014 at 7:14 PM
> I don't use Excel but a client did who wanted to use XSL I had created 
> ONIX
> to MARC to transform bibliographic metadata in Excel to XML. The built
> in Excel XML converter was not very helpful since empty cells were skipped
> so that it was impossible to use that result.
>
> There is an add on that allow you to map your data to XML elements by
> creating a schema which is pretty cool.
>
> http://bit.ly/1jpwtqM
>
> This might be helpful.
>
> regards,
> dana
>
>
>
>
>
> Terry Brady <mailto:[log in to unmask]>
> June 13, 2014 at 6:53 PM
> The current version of Excel offers a save as XML option.
>
> It will produce something like this. There is other wrapping metadata, but
> the table is pretty easy to parse.
>
> <Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="7"
> x:FullColumns="1"
> x:FullRows="1" ss:DefaultRowHeight="15">
> <Row>
> <Cell ss:StyleID="s62"><Data ss:Type="String">row 1</Data></Cell>
> <Cell><Data ss:Type="String">question 1</Data></Cell>
> <Cell><Data ss:Type="String">answer 1</Data></Cell>
> </Row>
> <Row>
> <Cell ss:StyleID="s62"><Data ss:Type="String">row 2</Data></Cell>
> <Cell ss:Index="3"><Data ss:Type="String">answer 2</Data></Cell>
> </Row>
> <Row>
> <Cell ss:StyleID="s62"><Data ss:Type="String">row 3</Data></Cell>
> <Cell ss:Index="3"><Data ss:Type="String">answer 3</Data></Cell>
> </Row>
> <Row>
> <Cell ss:StyleID="s62"><Data ss:Type="String">row 4</Data></Cell>
> <Cell><Data ss:Type="String">question 2</Data></Cell>
> <Cell><Data ss:Type="String">answer 1</Data></Cell>
> </Row>
> <Row>
> <Cell ss:StyleID="s62"><Data ss:Type="String">row 5 </Data></Cell>
> <Cell ss:Index="3"><Data ss:Type="String">answer 2</Data></Cell>
> </Row>
> <Row>
> <Cell ss:StyleID="s62"><Data ss:Type="String">row 6</Data></Cell>
> <Cell><Data ss:Type="String">quest </Data></Cell>
> <Cell><Data ss:Type="String">answer 3</Data></Cell>
> </Row>
> <Row>
> <Cell ss:StyleID="s62"/>
> </Row>
> </Table>
>
>
>
>
>
> Ryan Engel <mailto:[log in to unmask]>
> June 13, 2014 at 4:28 PM
> Hello -
>
> I have an Excel spreadsheet that, for the purposes of an easy import 
> into a Drupal site, I'd like to convert to XML.  I know people more 
> knowledgeable than I could code up something in Python or Perl to 
> convert a CSV version of the data to XML (and I have a colleague who 
> offered to do just that for me), but I am looking for recommendations 
> for something more immediately accessible.
>
> Here's an idea of how the spreadsheet is structured:
>
> Row1    Question1    Q1Answer1
> Row2                        Q1Answer2
> Row3                        Q1Answer3
> Row4    Question2    Q2Answer1
> Row5                        Q2Answer2
> Row6    Question3    Q3Answer1
> etc.
>
> How do other people approach this?  Import the data to an SQL 
> database, write some clever queries, and then export that to XML?  
> Work some wizardry in GoogleRefine/OpenRefine?  Are scripting 
> languages really the best all around solution?  Excel's built in XML 
> mapping function wasn't able to process the one-to-many relationship 
> of questions to answers, though maybe I just don't know how to build 
> the mapping structure correctly.
>
> In the interest immanent deadlines, I have handed the spreadsheet off 
> to my Perl-writing colleague.  But as a professional growth 
> opportunity, I'm interested in suggestions from Libraryland about ways 
> others have approached this successfully.
>
> Thanks!
>
> Ryan Engel
> Web Stuff
> UW-Madison

--