I'd just do this the old fashioned way. Awk is great for problems like this. For example, if your file is tab delimited, the following should work awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}'' yourfile In the example above, I just print the fields but you could easily encase them in tags. kyle On Mon, Jun 16, 2014 at 9:29 AM, Ryan Engel <[log in to unmask]> wrote: > Thanks for the responses, on the list and off, so far. > > As I'm sure is true for so many of us, my interest in learning more about > how to solve this type of problem is balanced against my need to just get > the project done so I can move on to other things. One of the great things > about this list is the ability to learn from the collective experiences of > colleagues. For this project specifically, even clues about better search > terms is useful; as Chris Gray pointed out, basic Google searches present > too many hits. > > I did try following the "Create an XML data file and XML schema file from > worksheet data" instructions on the Microsoft site. And it did produce an > XML document, but it wasn't able to transform this: > Row1 Question1 Q1Answer1 > Row2 Q1Answer2 > > ...into something like this: > <row1>Row One Data</row1> > <question1>This is a question</question1> > <answers> > <q1answer1>Answer 1</q1answer1> > <q1answer2>Answer2</q1answer2> > </answers > > Instead, I could get it to either convert every row into its own XML > entry, meaning that I had a lot of answers with no associated questions, or > I got an XML file that had 1 question with EVERY SINGLE answer nested > beneath it -- effectively all questions after the first question were > ignored. Based on those results, I wasn't sure if there is more tweaking I > could do in Excel, or if there is some programmed logic in Excel that can't > be accounted for when associating a schema. > > > Another suggestion I received was to "fill" the question column so that > every row had a question listed. I did consider this, but the problem then > is during the data import, I'd have to convince my CMS to put all the > answers back together based on the question, something I'm sure Drupal > COULD do, but I'm not sure how to do that either. > > > Finally, this project is a spreadsheet with 225,270 rows, so you can > imagine why I'd like a process that is reasonably trustworthy AND that can > run locally. > > > Anyway, any/all additional suggestions appreciated, even if they are "try > searching for "blah blah python parser", or "I made something that solves a > similar process, and you can download it from Git". > > Ryan > ___ > > Ryan Engel > Web Stuff > UW-Madison > >> Dana Pearson <mailto:[log in to unmask]> >> June 13, 2014 at 7:14 PM >> I don't use Excel but a client did who wanted to use XSL I had created >> ONIX >> to MARC to transform bibliographic metadata in Excel to XML. The built >> in Excel XML converter was not very helpful since empty cells were skipped >> so that it was impossible to use that result. >> >> There is an add on that allow you to map your data to XML elements by >> creating a schema which is pretty cool. >> >> http://bit.ly/1jpwtqM >> >> This might be helpful. >> >> regards, >> dana >> >> >> >> >> >> Terry Brady <mailto:[log in to unmask]> >> June 13, 2014 at 6:53 PM >> The current version of Excel offers a save as XML option. >> >> It will produce something like this. There is other wrapping metadata, but >> the table is pretty easy to parse. >> >> <Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="7" >> x:FullColumns="1" >> x:FullRows="1" ss:DefaultRowHeight="15"> >> <Row> >> <Cell ss:StyleID="s62"><Data ss:Type="String">row 1</Data></Cell> >> <Cell><Data ss:Type="String">question 1</Data></Cell> >> <Cell><Data ss:Type="String">answer 1</Data></Cell> >> </Row> >> <Row> >> <Cell ss:StyleID="s62"><Data ss:Type="String">row 2</Data></Cell> >> <Cell ss:Index="3"><Data ss:Type="String">answer 2</Data></Cell> >> </Row> >> <Row> >> <Cell ss:StyleID="s62"><Data ss:Type="String">row 3</Data></Cell> >> <Cell ss:Index="3"><Data ss:Type="String">answer 3</Data></Cell> >> </Row> >> <Row> >> <Cell ss:StyleID="s62"><Data ss:Type="String">row 4</Data></Cell> >> <Cell><Data ss:Type="String">question 2</Data></Cell> >> <Cell><Data ss:Type="String">answer 1</Data></Cell> >> </Row> >> <Row> >> <Cell ss:StyleID="s62"><Data ss:Type="String">row 5 </Data></Cell> >> <Cell ss:Index="3"><Data ss:Type="String">answer 2</Data></Cell> >> </Row> >> <Row> >> <Cell ss:StyleID="s62"><Data ss:Type="String">row 6</Data></Cell> >> <Cell><Data ss:Type="String">quest </Data></Cell> >> <Cell><Data ss:Type="String">answer 3</Data></Cell> >> </Row> >> <Row> >> <Cell ss:StyleID="s62"/> >> </Row> >> </Table> >> >> >> >> >> >> Ryan Engel <mailto:[log in to unmask]> >> June 13, 2014 at 4:28 PM >> Hello - >> >> I have an Excel spreadsheet that, for the purposes of an easy import into >> a Drupal site, I'd like to convert to XML. I know people more >> knowledgeable than I could code up something in Python or Perl to convert a >> CSV version of the data to XML (and I have a colleague who offered to do >> just that for me), but I am looking for recommendations for something more >> immediately accessible. >> >> Here's an idea of how the spreadsheet is structured: >> >> Row1 Question1 Q1Answer1 >> Row2 Q1Answer2 >> Row3 Q1Answer3 >> Row4 Question2 Q2Answer1 >> Row5 Q2Answer2 >> Row6 Question3 Q3Answer1 >> etc. >> >> How do other people approach this? Import the data to an SQL database, >> write some clever queries, and then export that to XML? Work some wizardry >> in GoogleRefine/OpenRefine? Are scripting languages really the best all >> around solution? Excel's built in XML mapping function wasn't able to >> process the one-to-many relationship of questions to answers, though maybe >> I just don't know how to build the mapping structure correctly. >> >> In the interest immanent deadlines, I have handed the spreadsheet off to >> my Perl-writing colleague. But as a professional growth opportunity, I'm >> interested in suggestions from Libraryland about ways others have >> approached this successfully. >> >> Thanks! >> >> Ryan Engel >> Web Stuff >> UW-Madison >> > > -- > > >