Nitpick: RFC 4180 (the standard for CSV files in MIME, although, as noted
by others in this thread, CSV in practice might as well not have a standard
at all since different tools feel free to do whatever the heck they want)
says to always use Windows-style line endings in CSV files (CRLF), whatever
system you're on.

I don't see any explicit line ending handling in the github repo you linked
to, but other things will likely expect this ending (including, but not
limited to, Python's csv library, which also has some suggestions for usage
to fix things.)
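
For what it's worth, here is a minimal sketch of what I mean on the Python
side. The function name and sample rows are made up for illustration. It
relies on two things from the csv module docs: the default "excel" dialect
ends rows with CRLF, and the underlying text stream should be opened with
newline="" so nothing rewrites those endings.

```python
import csv
import io


def to_csv_bytes(rows):
    """Serialize rows to UTF-8 CSV bytes with CRLF row terminators."""
    # newline="" stops the text layer from translating line endings,
    # so the writer's own lineterminator is what lands in the output.
    buffer = io.StringIO(newline="")
    # "\r\n" is already the default for csv.writer; it is spelled out
    # here only to make the intent obvious.
    writer = csv.writer(buffer, lineterminator="\r\n")
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical sample data, not taken from the file in question.
data = to_csv_bytes([["title", "creator"], ["Moby-Dick", "Melville, Herman"]])
```

When writing to a real file, the equivalent is
open(path, "w", newline="", encoding="utf-8").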

CSVImport also seems to raise an entirely different error message if it
thinks your file isn't UTF-8, but it doesn't hurt to verify in advance.
It's certainly one of the easier encodings to tell a bunch of bytes
definitely isn't using, but I didn't verify whether CSVImport is actually
doing that in a foolproof way...
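
If you want to check the encoding yourself before uploading, a strict decode
is about the simplest test I know of. This is only a sketch (the function
name is mine, and it says nothing about how CSVImport does its own check):

```python
def first_invalid_utf8_offset(raw):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        # Strict decoding raises on the first byte sequence that is not
        # well-formed UTF-8.
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        return err.start
    return None
```

Read the file in binary mode (open(path, "rb").read()) and pass the bytes in.
A result of None means the whole file decoded cleanly; a number tells you
roughly where to look in a hex or text editor.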

On Tue, Jan 17, 2023 at 10:08 AM Benjamin Armintor <[log in to unmask]>
wrote:

> Sometimes in a situation like this it's useful to look at the source of the
> message. In the event that this is true here...
>
> This error is raised by the CSV import module here:
>
> https://github.com/omeka-s-modules/CSVImport/blob/30857ff5cbab31bb53713fc7c837b8c2c1247f6e/src/Source/AbstractSource.php#L89-L94
> The checkNumberOfColumnsByRow function returning false (and triggering the
> error) is here:
>
> https://github.com/omeka-s-modules/CSVImport/blob/30857ff5cbab31bb53713fc7c837b8c2c1247f6e/src/Source/CsvFile.php#L126-L144
>
> That function appears to verify two things:
> 1. The iterator over the CSV is not empty
> 2. The rows all have the same number of values as the header row had
> headers
>
> The iterator in question is from an SplFileObject (
> https://www.php.net/manual/en/class.splfileobject.php). So either your
> uploaded file appears empty to the CSV reader, or some row has a number of
> cells different from the header row. If I were you, before I started
> digging into escaped values and what not, I would:
> 1. Make sure I had the number of headers I intended
> 2. Make sure my file is UTF8 encoded, and make sure I was using Unix-style
> line endings (a single newline character)
> 3. Make sure I didn't have empty lines (I notice that
>
> https://github.com/omeka-s-modules/CSVImport/blob/30857ff5cbab31bb53713fc7c837b8c2c1247f6e/src/Source/CsvFile.php#L189-L190
> sets the SKIP_EMPTY flag, but not DROP_NEW_LINE), including the end of the
> file
>
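
(Inline note: the two checks Ben describes above can be approximated outside
of Omeka with a few lines of Python. This is a rough sketch using Python's
csv parser rather than PHP's SplFileObject, so the two may disagree on edge
cases; the function name and the skip_blank option are my own inventions.)

```python
import csv
import io


def mismatched_rows(text, skip_blank=True):
    """Yield (line_number, column_count, expected) for rows unlike the header."""
    # newline="" leaves line endings alone so the csv module can handle
    # line breaks inside quoted fields itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:
        return  # empty input: there is no header to compare against
    expected = len(header)
    for row in reader:
        # csv.reader returns an empty list for a completely blank line.
        if skip_blank and not row:
            continue
        if len(row) != expected:
            # line_num counts physical lines read so far, so it points at
            # the last line of a record that spans several lines.
            yield reader.line_num, len(row), expected
```

Running list(mismatched_rows(text)) on the decoded file contents gives the
physical line numbers worth opening in a text editor; pass skip_blank=False
to have blank lines reported as well.
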
> If you've already done these things, I apologize for being so rudimentary -
> but it's always good to verify the basic assumptions before you dive into
> more elaborate data inspection.
>
> Good luck!
> Ben
>
> PS: I know this stuff is frustrating - it might be worth opening an issue
> on that CSVImport github repository to improve the error message!
>
> On Tue, Jan 17, 2023 at 9:15 AM Max <[log in to unmask]> wrote:
>
> > Hi again folks:
> >
> > Many thanks for everyone's replies yesterday evening! I retried fixing via
> > OpenRefine, & no success. & I'm using double quotes for CSV comma
> > enclosure. So I don't think commas inside values are the problem. (I've had
> > success in the past with multiple different CSV files that included extra
> > commas inside values & used double quotes for CSV comma enclosure, & so
> > they weren't a problem.) RE: counting blanks in columns, Jackie Keith
> > recommended using the COUNTBLANK formula at Excel/Google Sheets, which was
> > easy to use, but still no success. (I didn't find any blanks in the data.)
> > Anyway, I wanted to update folks, & say thanks again, everyone!
> >
> > Cheers all,
> > Max
> >
> > Maxwell Gray
> > https://maxgray20.com
> >
> >
> > On Mon, Jan 16, 2023 at 8:59 PM Joe Hourclé <[log in to unmask]> wrote:
> >
> > > > On Jan 16, 2023, at 7:26 PM, Max <[log in to unmask]> wrote:
> > > >
> > > > Hi code4lib folks:
> > > >
> > > > Does anyone know a tool or hack to help fix a problem at a CSV that's
> > > > causing a "The rows are not all the same number of columns." error when
> > > > trying to import the CSV at a web application? I'm trying to use the CSV
> > > > Import module <https://omeka.org/s/docs/user-manual/modules/csvimport/>
> > > > at Omeka S. I've had success in the past with different CSV files. But
> > > > some kinda problem at the CSV I'm trying to import right now is causing
> > > > this error, & reviewing the CSV in Excel & as plain text (literally
> > > > counting commas to confirm rows are the same number of columns) isn't
> > > > helping.
> > >
> > > I’ve been known to do a find/replace on commas, then set the tab width to
> > > something very large and then look for the rows that don’t line up.
> > >
> > > But CSV is tricky, as it’s more than just commas that are significant, you
> > > also have to consider quotation marks… which allow it so you can put
> > > commas or line returns within a string field.
> > >
> > > -Joe
> > >
> > > Sent from a mobile device with a crappy on screen keyboard and obnoxious
> > > "autocorrect"
> > >
> >
>