CSV is an unfortunate format because there's not a single accepted way to
do quoting and escaping. Excel is a problematic tool because it often does
things to your data that you don't want.
Counting commas won't work unless none of your data fields contain commas.
However, if that happens to be a true statement, awk makes quick work of
the problem:
awk 'BEGIN{FS=OFS=","}{if (NF == 10){print $0}}' yourfile.csv >
loadablefile.csv
awk 'BEGIN{FS=OFS=","}{if (NF != 10){print $0}}' yourfile.csv >
unloadablerecords.csv
assuming you expect 10 fields.
If values in your file might contain commas, translating it to something
else before analyzing simplifies things. Lots of ways to do this, csvtool
is particularly easy:
csvtool -t COMMA -u TAB cat yourfile.csv > yourfile.tsv (convert to tab
delimited)
awk 'BEGIN{FS=OFS=","}{if (NF == 10){print $0}}' yourfile.tsv > goodfile.tsv
csvtool -t TAB -u COMMA cat goodfile.tsv > loadablefile.csv
kyle
On Mon, Jan 16, 2023 at 4:26 PM Max <[log in to unmask]> wrote:
> Hi code4lib folks:
>
> Does anyone know a tool or hack to help fix a problem at a CSV that's
> causing a "The rows are not all the same number of columns." error when
> trying to import the CSV at a web application? I'm trying to use the CSV
> Import module <https://omeka.org/s/docs/user-manual/modules/csvimport/> at
> Omeka S. I've had success in the past with different CSV files. But some
> kinda problem at the CSV I'm trying to import right now is causing this
> error, & reviewing the CSV in Excel & as plain text (literally counting
> commas to confirm rows are the same number of columns) isn't helping.
> Thanks!
>
> Cheers all,
> Max
>
> Maxwell Gray
> https://maxgray20.com
>
|