Print

Print


Thanks for everyone's ideas.  I think Shaun's solution is exactly what 
I'm trying to do.


> Shaun Ellis <mailto:[log in to unmask]>
> June 16, 2014 at 10:36 PM
> Ryan, it sounds like you simply want to pull two relational tables 
> into drupal using referenced entities in a one (question) to many 
> (answers) relationship?
>
> This can be accomplished, albeit unintuitively (it is Drupal 
> afterall), by using the Feeds CSV parser with Feeds Tamper, and I 
> would skip what seems like an unnecessary conversion to XML step.
>
> First, you import your questions and set a GUID target (typically your 
> db id).  Then import your answers -- each should have a reference to 
> the id mapped using the "Entity Reference by Feeds GUID".  I have done 
> it in reverse order too, in which case you'd import all your answers 
> first. In this case, your second (questions) import needs to contain a 
> single column of delimited answer GUIDs into a single column.  Use the 
> Feeds Tamper explode plugin on that field during mapping as explained 
> (rather vaguely) here:
>
> http://drupal.stackexchange.com/questions/32234/how-to-use-feeds-module-to-import-multi-value-fields 
>
>
> If that's not what you're trying to do, can you clarify?
>
> -Shaun
>
>
>
> Joshua Welker <mailto:[log in to unmask]>
> June 16, 2014 at 2:35 PM
> Sorry, the last line got messed up by outlook.
>
> #now save the whole thing as an xml file
>
> with open('myfile.xml', 'wb') as file
> ElementTree(rootNode).write(file)
>
>
> Josh Welker
>
>
> -----Original Message-----
> From: Joshua Welker [mailto:[log in to unmask]]
> Sent: Monday, June 16, 2014 2:32 PM
> To: Code for Libraries
> Subject: RE: [CODE4LIB] Excel to XML (for a Drupal Feeds import)
>
> This should be quite doable in most programming languages with
> out-of-the-box tools and no tricky parsing code. The gist is to save in
> Excel as a delimited text file (tab is a good choice), then have your 
> script
> ingest the document and turn it into an array, and then turn the array 
> into
> XML. In Python, it could be something like the code below (not tested but
> the principles should be sound):
>
> import 'csv'
> from elementtree.ElementTree import Element, SubElement
>
> #create a list
> mylist = []
>
> #open your delimited file with a csv reader with open('myfile.txt', 
> 'rb') as
> textfile:
> reader = csv.reader( textfile, delimiter='\t', quotechat='"') #this
> assumes your file is tab-delimited (\t)
>
> #loop through rows in your file and save each row as a key/value pair
> (dictionary)
> for row in textfile:
> fields = {
> 'field1': row[0]
> 'field2': row[1]
> 'field3': row[2]
> 'field4': row[3]
> }
>
> #append this row to our master list
> mylist.append( fields )
>
>
> #create an xml root node
> rootNode = Element("XmlRoot")
>
> #loop through our list of "rows" from the text file and create xml 
> nodes for
> row in mylist:
> rowNode = Element("record")
>
> #loop through all the fields on this "row" and turn them into xml nodes
> for fieldName, fieldValue in row:
> fieldNode = Element(fieldName)
> fieldNode.text = fieldValue
>
> #append each field node to the parent row node
> rowNode.append(fieldNode)
>
> #append each row node to the document root node
> rootNode.append(rowNode)
>
> #now save the whole thing as an xml file with open('myfile.xml', 'wb') as
> file
> ElementTree(rootNode).write(file)
>
>
>
> Josh Welker
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf 
> Of Kyle
> Banerjee
> Sent: Monday, June 16, 2014 1:04 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Excel to XML (for a Drupal Feeds import)
>
> I'd just do this the old fashioned way. Awk is great for problems like 
> this.
> For example, if your file is tab delimited, the following should work
>
> awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}''
> yourfile
>
> In the example above, I just print the fields but you could easily encase
> them in tags.
>
> kyle
>
>
> Joshua Welker <mailto:[log in to unmask]>
> June 16, 2014 at 2:32 PM
> This should be quite doable in most programming languages with
> out-of-the-box tools and no tricky parsing code. The gist is to save in
> Excel as a delimited text file (tab is a good choice), then have your 
> script
> ingest the document and turn it into an array, and then turn the array 
> into
> XML. In Python, it could be something like the code below (not tested but
> the principles should be sound):
>
> import 'csv'
> from elementtree.ElementTree import Element, SubElement
>
> #create a list
> mylist = []
>
> #open your delimited file with a csv reader
> with open('myfile.txt', 'rb') as textfile:
> reader = csv.reader( textfile, delimiter='\t', quotechat='"') #this
> assumes your file is tab-delimited (\t)
>
> #loop through rows in your file and save each row as a key/value pair
> (dictionary)
> for row in textfile:
> fields = {
> 'field1': row[0]
> 'field2': row[1]
> 'field3': row[2]
> 'field4': row[3]
> }
>
> #append this row to our master list
> mylist.append( fields )
>
>
> #create an xml root node
> rootNode = Element("XmlRoot")
>
> #loop through our list of "rows" from the text file and create xml nodes
> for row in mylist:
> rowNode = Element("record")
>
> #loop through all the fields on this "row" and turn them into xml nodes
> for fieldName, fieldValue in row:
> fieldNode = Element(fieldName)
> fieldNode.text = fieldValue
>
> #append each field node to the parent row node
> rowNode.append(fieldNode)
>
> #append each row node to the document root node
> rootNode.append(rowNode)
>
> #now save the whole thing as an xml file
> with open('myfile.xml', 'wb') as file
> ElementTree(rootNode).write(file)
>
>
>
> Josh Welker
>
> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf 
> Of Kyle
> Banerjee
> Sent: Monday, June 16, 2014 1:04 PM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] Excel to XML (for a Drupal Feeds import)
>
> I'd just do this the old fashioned way. Awk is great for problems like 
> this.
> For example, if your file is tab delimited, the following should work
>
> awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}''
> yourfile
>
> In the example above, I just print the fields but you could easily encase
> them in tags.
>
> kyle
>
>
> Kyle Banerjee <mailto:[log in to unmask]>
> June 16, 2014 at 1:03 PM
> I'd just do this the old fashioned way. Awk is great for problems like
> this. For example, if your file is tab delimited, the following should 
> work
>
> awk '{FS="\t"}{if ($2 != "") question = $2;}{print $1,question,$3}''
> yourfile
>
> In the example above, I just print the fields but you could easily encase
> them in tags.
>
> kyle
>
>
> Ryan Engel <mailto:[log in to unmask]>
> June 16, 2014 at 11:29 AM
> Thanks for the responses, on the list and off, so far.
>
> As I'm sure is true for so many of us, my interest in learning more 
> about how to solve this type of problem is balanced against my need to 
> just get the project done so I can move on to other things.  One of 
> the great things about this list is the ability to learn from the 
> collective experiences of colleagues.  For this project specifically, 
> even clues about better search terms is useful; as Chris Gray pointed 
> out, basic Google searches present too many hits.
>
> I did try following the "Create an XML data file and XML schema file 
> from worksheet data" instructions on the Microsoft site.  And it did 
> produce an XML document, but it wasn't able to transform this:
> Row1    Question1    Q1Answer1
> Row2                        Q1Answer2
>
> ...into something like this:
> <row1>Row One Data</row1>
> <question1>This is a question</question1>
> <answers>
> <q1answer1>Answer 1</q1answer1>
> <q1answer2>Answer2</q1answer2>
> </answers
>
> Instead, I could get it to either convert every row into its own XML 
> entry, meaning that I had a lot of answers with no associated 
> questions, or I got an XML file that had 1 question with EVERY SINGLE 
> answer nested beneath it -- effectively all questions after the first 
> question were ignored.  Based on those results, I wasn't sure if there 
> is more tweaking I could do in Excel, or if there is some programmed 
> logic in Excel that can't be accounted for when associating a schema.
>
>
> Another suggestion I received was to "fill" the question column so 
> that every row had a question listed.  I did consider this, but the 
> problem then is during the data import, I'd have to convince my CMS to 
> put all the answers back together based on the question, something I'm 
> sure Drupal COULD do, but I'm not sure how to do that either.
>
>
> Finally, this project is a spreadsheet with 225,270 rows, so you can 
> imagine why I'd like a process that is reasonably trustworthy AND that 
> can run locally.
>
>
> Anyway, any/all additional suggestions appreciated, even if they are 
> "try searching for "blah blah python parser", or "I made something 
> that solves a similar process, and you can download it from Git".
>
> Ryan
> ___
>
> Ryan Engel
> Web Stuff
> UW-Madison
>

-- 

Ryan Engel
Web Services Architect
Learning Support Services
College of Letters and Science
University of Wisconsin - Madison
291 Van Hise Hall
[log in to unmask]
608-263-5002