Also, just to be clear, the data file is a tab-delimited text file, not a
CSV (comma-separated values) file. When processing data, it's important to
be clear about what format you are working with. I happen to prefer
tab-delimited text files over CSV myself because, in this case as in many
others, the data itself can contain quotation marks, which can play havoc
with a program that expects them only as the characters enclosing a field.
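
For what it's worth, here is a minimal sketch of how the csv module might be
told that the data is tab-delimited and unquoted. The function name and the
sample row are made up for illustration, and it assumes Python 3. Note that
it prepends an empty string rather than '\t': the writer adds the delimiter
itself, and a field that contains the delimiter has to be escaped, which is
probably where the stray character in the output comes from.

```python
import csv
import io


def add_blank_first_column(src, tgt):
    """Copy tab-delimited rows from src to tgt, adding an empty first column."""
    # QUOTE_NONE tells the reader to treat quotation marks as ordinary data.
    reader = csv.reader(src, delimiter='\t', quoting=csv.QUOTE_NONE)
    # quotechar=None means the writer never needs to escape a '"' in the data.
    writer = csv.writer(tgt, delimiter='\t', quoting=csv.QUOTE_NONE,
                        quotechar=None, lineterminator='\n')
    for row in reader:
        # Prepend an empty field; the writer supplies the tab after it.
        writer.writerow([''] + row)


# Hypothetical one-row sample with a comma and quotation marks in a field.
sample = io.StringIO('Journal\tKim, "B."\t2013\n')
out = io.StringIO()
add_blank_first_column(sample, out)
print(repr(out.getvalue()))
```

With real files, the same function could be called on two handles opened
with open(..., newline=''), which is what the csv documentation recommends.
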
Roy


On Mon, Nov 25, 2013 at 9:49 AM, Joshua Gomez <[log in to unmask]> wrote:

> If all you want to do is add a tab to the beginning of each line, then you
> don't need to bother using the csv library.  Just open your file, read it
> line by line, prepend a tab to each line and write it out again.
>
> # Open both files; 'with' closes them even if an error occurs.
> with open('noid_refworks.txt') as src, open('withid.txt', 'w') as tgt:
>     for line in src:
>         # Prepend a tab so every row gains an empty first column.
>         tgt.write('\t' + line)
>
> -Joshua
>
> ________________________________________
> From: Code for Libraries <[log in to unmask]> on behalf of Bohyun
> Kim <[log in to unmask]>
> Sent: Monday, November 25, 2013 9:10 AM
> To: [log in to unmask]
> Subject: [CODE4LIB] Tab delimited file with Python CSV
>
> Hi all,
>
> I am new to Python and was wondering if I can get some help with my short
> script. What I would like the script to do is:
> (1) Read the tab delimited file generated by Refworks
> (2) Output exactly the same file but with a blank column added in front.
> (This is for prepping the exported tab delimited file from refworks so
> that it can be imported into MySQL; so any suggestions along the lines of
> timtoady would also be appreciated.)
>
> This is what I have so far. It works, but then in the output file, I end
> up getting some weird character in each line in the second column (first
> column in the original input file). I also don't really get what
> escapechar=' ' does or what I am supposed to put in there.
>
> import csv
> with open('noid_refworks.txt','rU') as csvinput:
>     with open('withid.txt', 'w') as csvoutput:
>         dialect = csv.Sniffer().sniff(csvinput.read(1024))
>         csvinput.seek(0)
>         reader = csv.reader(csvinput, dialect)
>         writer = csv.writer(csvoutput, dialect, escapechar='\'',
>                             quoting=csv.QUOTE_NONE)
>         for row in reader:
>             writer.writerow(['\t']+row)
>
> A row in the original file is like this (Tab delimited and no quotations,
> some fields have commas and quotation marks inside.):
>
> Reference Type    Authors, Primary    Title Primary    Periodical Full
>  Periodical Abbrev    Pub Year    Pub Date Free From    Volume    Issue
>  Start Page    Other Pages    Keywords    Abstract    Notes    Personal
> Notes    Authors, Secondary    Title Secondary    Edition    Publisher
>  Place Of Publication    Authors, Tertiary    Authors, Quaternary
>  Authors, Quinary    Title, Tertiary    ISSN/ISBN    Availability
>  Author/Address    Accession Number    Language    Classification    Sub
> file/Database    Original Foreign Title    Links    DOI    Call Number
>  Database    Data Source    Identifying Phrase    Retrieved Date
>  Shortened Title    User 1    User 2    User 3    User 4    User 5    User
> 6    User 7    User 8    User 9    User 10    User 11    User 12    User 13
>    User 14    User 15
>
> A row in the output file is like this:
> (The tab is successfully inserted. But I don't get why an L is inserted
> after it, no matter what I put in escapechar.)
>
>     LReference Type    Authors, Primary    Title Primary    Periodical
> Full    Periodical Abbrev    Pub Year    Pub Date Free From    Volume
>  Issue    Start Page    Other Pages    Keywords    Abstract    Notes
>  Personal Notes    Authors, Secondary    Title Secondary    Edition
>  Publisher    Place Of Publication    Authors, Tertiary    Authors,
> Quaternary    Authors, Quinary    Title, Tertiary    ISSN/ISBN
>  Availability    Author/Address    Accession Number    Language
>  Classification    Sub file/Database    Original Foreign Title    Links
>  DOI    Call Number    Database    Data Source    Identifying Phrase
>  Retrieved Date    Shortened Title    User 1    User 2    User 3    User 4
>    User 5    User 6    User 7    User 8    User 9    User 10    User 11
>  User 12    User 13    User 14    User 15
>
>
> Any help or pointers would be greatly appreciated!
> ~Bohyun
>