Thanks for everyone's ideas. I think Shaun's solution is exactly what I'm trying to do. > Shaun Ellis <mailto:[log in to unmask]> > June 16, 2014 at 10:36 PM > Ryan, it sounds like you simply want to pull two relational tables > into drupal using referenced entities in a one (question) to many > (answers) relationship? > > This can be accomplished, albeit unintuitively (it is Drupal > afterall), by using the Feeds CSV parser with Feeds Tamper, and I > would skip what seems like an unnecessary conversion to XML step. > > First, you import your questions and set a GUID target (typically your > db id). Then import your answers -- each should have a reference to > the id mapped using the "Entity Reference by Feeds GUID". I have done > it in reverse order too, in which case you'd import all your answers > first. In this case, your second (questions) import needs to contain a > single column of delimited answer GUIDs into a single column. Use the > Feeds Tamper explode plugin on that field during mapping as explained > (rather vaguely) here: > > http://drupal.stackexchange.com/questions/32234/how-to-use-feeds-module-to-import-multi-value-fields > > > If that's not what you're trying to do, can you clarify? > > -Shaun > > > > Joshua Welker <mailto:[log in to unmask]> > June 16, 2014 at 2:35 PM > Sorry, the last line got messed up by outlook. > > #now save the whole thing as an xml file > > with open('myfile.xml', 'wb') as file > ElementTree(rootNode).write(file) > > > Josh Welker > > > -----Original Message----- > From: Joshua Welker [mailto:[log in to unmask]] > Sent: Monday, June 16, 2014 2:32 PM > To: Code for Libraries > Subject: RE: [CODE4LIB] Excel to XML (for a Drupal Feeds import) > > This should be quite doable in most programming languages with > out-of-the-box tools and no tricky parsing code. The gist is to save in > Excel as a delimited text file (tab is a good choice), then have your > script > ingest the document and turn it into an array, and then turn the array > into > XML. In Python, it could be something like the code below (not tested but > the principles should be sound): > > import 'csv' > from elementtree.ElementTree import Element, SubElement > > #create a list > mylist = [] > > #open your delimited file with a csv reader with open('myfile.txt', > 'rb') as > textfile: > reader = csv.reader( textfile, delimiter='\t', quotechat='"') #this > assumes your file is tab-delimited (\t) > > #loop through rows in your file and save each row as a key/value pair > (dictionary) > for row in textfile: > fields = { > 'field1': row[0] > 'field2': row[1] > 'field3': row[2] > 'field4': row[3] > } > > #append this row to our master list > mylist.append( fields ) > > > #create an xml root node > rootNode = Element("XmlRoot") > > #loop through our list of "rows" from the text file and create xml > nodes for > row in mylist: > rowNode = Element("record") > > #loop through all the fields on this "row" and turn them into xml nodes > for fieldName, fieldValue in row: > fieldNode = Element(fieldName) > fieldNode.text = fieldValue > > #append each field node to the parent row node > rowNode.append(fieldNode) > > #append each row node to the document root node > rootNode.append(rowNode) > > #now save the whole thing as an xml file with open('myfile.xml', 'wb') as > file > ElementTree(rootNode).write(file) > > > > Josh Welker > > -----Original Message----- > From: Code for Libraries [mailto:[log in to unmask]] On Behalf > Of Kyle > Banerjee > Sent: Monday, June 16, 2014 1:04 PM > To: [log in to unmask] > Subject: Re: [CODE4LIB] Excel to XML (for a Drupal Feeds import) > > I'd just do this the old fashioned way. Awk is great for problems like > this. > For example, if your file is tab delimited, the following should work > > awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}'' > yourfile > > In the example above, I just print the fields but you could easily encase > them in tags. > > kyle > > > Joshua Welker <mailto:[log in to unmask]> > June 16, 2014 at 2:32 PM > This should be quite doable in most programming languages with > out-of-the-box tools and no tricky parsing code. The gist is to save in > Excel as a delimited text file (tab is a good choice), then have your > script > ingest the document and turn it into an array, and then turn the array > into > XML. In Python, it could be something like the code below (not tested but > the principles should be sound): > > import 'csv' > from elementtree.ElementTree import Element, SubElement > > #create a list > mylist = [] > > #open your delimited file with a csv reader > with open('myfile.txt', 'rb') as textfile: > reader = csv.reader( textfile, delimiter='\t', quotechat='"') #this > assumes your file is tab-delimited (\t) > > #loop through rows in your file and save each row as a key/value pair > (dictionary) > for row in textfile: > fields = { > 'field1': row[0] > 'field2': row[1] > 'field3': row[2] > 'field4': row[3] > } > > #append this row to our master list > mylist.append( fields ) > > > #create an xml root node > rootNode = Element("XmlRoot") > > #loop through our list of "rows" from the text file and create xml nodes > for row in mylist: > rowNode = Element("record") > > #loop through all the fields on this "row" and turn them into xml nodes > for fieldName, fieldValue in row: > fieldNode = Element(fieldName) > fieldNode.text = fieldValue > > #append each field node to the parent row node > rowNode.append(fieldNode) > > #append each row node to the document root node > rootNode.append(rowNode) > > #now save the whole thing as an xml file > with open('myfile.xml', 'wb') as file > ElementTree(rootNode).write(file) > > > > Josh Welker > > -----Original Message----- > From: Code for Libraries [mailto:[log in to unmask]] On Behalf > Of Kyle > Banerjee > Sent: Monday, June 16, 2014 1:04 PM > To: [log in to unmask] > Subject: Re: [CODE4LIB] Excel to XML (for a Drupal Feeds import) > > I'd just do this the old fashioned way. Awk is great for problems like > this. > For example, if your file is tab delimited, the following should work > > awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}'' > yourfile > > In the example above, I just print the fields but you could easily encase > them in tags. > > kyle > > > Kyle Banerjee <mailto:[log in to unmask]> > June 16, 2014 at 1:03 PM > I'd just do this the old fashioned way. Awk is great for problems like > this. For example, if your file is tab delimited, the following should > work > > awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}'' > yourfile > > In the example above, I just print the fields but you could easily encase > them in tags. > > kyle > > > Ryan Engel <mailto:[log in to unmask]> > June 16, 2014 at 11:29 AM > Thanks for the responses, on the list and off, so far. > > As I'm sure is true for so many of us, my interest in learning more > about how to solve this type of problem is balanced against my need to > just get the project done so I can move on to other things. One of > the great things about this list is the ability to learn from the > collective experiences of colleagues. For this project specifically, > even clues about better search terms is useful; as Chris Gray pointed > out, basic Google searches present too many hits. > > I did try following the "Create an XML data file and XML schema file > from worksheet data" instructions on the Microsoft site. And it did > produce an XML document, but it wasn't able to transform this: > Row1 Question1 Q1Answer1 > Row2 Q1Answer2 > > ...into something like this: > <row1>Row One Data</row1> > <question1>This is a question</question1> > <answers> > <q1answer1>Answer 1</q1answer1> > <q1answer2>Answer2</q1answer2> > </answers > > Instead, I could get it to either convert every row into its own XML > entry, meaning that I had a lot of answers with no associated > questions, or I got an XML file that had 1 question with EVERY SINGLE > answer nested beneath it -- effectively all questions after the first > question were ignored. Based on those results, I wasn't sure if there > is more tweaking I could do in Excel, or if there is some programmed > logic in Excel that can't be accounted for when associating a schema. > > > Another suggestion I received was to "fill" the question column so > that every row had a question listed. I did consider this, but the > problem then is during the data import, I'd have to convince my CMS to > put all the answers back together based on the question, something I'm > sure Drupal COULD do, but I'm not sure how to do that either. > > > Finally, this project is a spreadsheet with 225,270 rows, so you can > imagine why I'd like a process that is reasonably trustworthy AND that > can run locally. > > > Anyway, any/all additional suggestions appreciated, even if they are > "try searching for "blah blah python parser", or "I made something > that solves a similar process, and you can download it from Git". > > Ryan > ___ > > Ryan Engel > Web Stuff > UW-Madison > -- Ryan Engel Web Services Architect Learning Support Services College of Letters and Science University of Wisconsin - Madison 291 Van Hise Hall [log in to unmask] 608-263-5002