Hi again folks:

Many thanks for everyone's replies this week! We figured it out here with y'all's help, & so I wanna share back what we learned.

Like folks suggested, the problem was some combination of line breaks, commas & double quotes inside values. Maintaining the breaks/characters isn't important for us. So I used OpenRefine to "trim leading & trailing whitespace" & "collapse consecutive whitespace" at our very messy OCR transcript column to clean it up some, & then used find & replace at Excel to replace the breaks, commas & double quotes. & success! 👍👍

Thanks again for everyone's replies & personal messages & help! Y'all are great folks. 🙏🙏

Cheers all,
Max

Maxwell Gray
https://maxgray20.com

On Tue, Jan 17, 2023 at 10:49 AM Geoffrey Spear wrote:

> Nitpick: RFC 4180 (the standard for CSV files in MIME, although, as noted
> by others in this thread, CSV in practice might as well not have a standard
> at all since different tools feel free to do whatever the heck they want)
> says to always use Windows-style line endings in CSV files (CRLF), whatever
> system you're on.
>
> I don't see any explicit line ending handling in the github repo you linked
> to, but other things will likely expect this ending (including, but not
> limited to, Python's csv library, which also has some suggestions for usage
> to fix things).
>
> CSVImport also seems to raise an entirely different error message if it
> thinks your file isn't UTF-8, but it doesn't hurt to verify in advance.
> It's certainly one of the easier encodings to tell a bunch of bytes
> definitely isn't using, but I didn't verify whether CSVImport is actually
> doing that in a foolproof way...
>
> On Tue, Jan 17, 2023 at 10:08 AM Benjamin Armintor wrote:
>
> > Sometimes in a situation like this it's useful to look at the source of
> > the message. In the event that this is true here...
> >
> > This error is raised by the CSV import module here:
> >
> > https://github.com/omeka-s-modules/CSVImport/blob/30857ff5cbab31bb53713fc7c837b8c2c1247f6e/src/Source/AbstractSource.php#L89-L94
> >
> > The checkNumberOfColumnsByRow function returning false (and triggering
> > the error) is here:
> >
> > https://github.com/omeka-s-modules/CSVImport/blob/30857ff5cbab31bb53713fc7c837b8c2c1247f6e/src/Source/CsvFile.php#L126-L144
> >
> > That function appears to verify two things:
> > 1. The iterator over the CSV is not empty
> > 2. The rows all have the same number of values as the header row had
> >    headers
> >
> > The iterator in question is from an SplFileObject (
> > https://www.php.net/manual/en/class.splfileobject.php). So either your
> > uploaded file appears empty to the CSV reader, or some row has a number
> > of cells different from the header row. If I were you, before I started
> > digging into escaped values and what not, I would:
> > 1. Make sure I had the number of headers I intended
> > 2. Make sure my file is UTF8 encoded, and make sure I was using
> >    Unix-style line endings (a single newline character)
> > 3. Make sure I didn't have empty lines (I notice that
> >    https://github.com/omeka-s-modules/CSVImport/blob/30857ff5cbab31bb53713fc7c837b8c2c1247f6e/src/Source/CsvFile.php#L189-L190
> >    sets the SKIP_EMPTY flag, but not DROP_NEW_LINE), including the end of
> >    the file
> >
> > If you've already done these things, I apologize for being so
> > rudimentary - but it's always good to verify the basic assumptions before
> > you dive into more elaborate data inspection.
> >
> > Good luck!
> > Ben
> >
> > PS: I know this stuff is frustrating - it might be worth opening an issue
> > on that CSVImport github repository to improve the error message!
> >
> > On Tue, Jan 17, 2023 at 9:15 AM Max wrote:
> >
> > > Hi again folks:
> > >
> > > Many thanks for everyone's replies yesterday evening!
> > > I retried fixing via OpenRefine, & no success. & I'm using double
> > > quotes for CSV comma enclosure. So I don't think commas inside values
> > > are the problem. (I've had success in the past with multiple different
> > > CSV files that included extra commas inside values & used double quotes
> > > for CSV comma enclosure, & so they weren't a problem.) RE: counting
> > > blanks in columns, Jackie Keith recommended using the COUNTBLANK formula
> > > at Excel/Google Sheets, which was easy to use, but still no success. (I
> > > didn't find any blanks in the data.) Anyway, I wanted to update folks, &
> > > say thanks again, everyone!
> > >
> > > Cheers all,
> > > Max
> > >
> > > Maxwell Gray
> > > https://maxgray20.com
> > >
> > > On Mon, Jan 16, 2023 at 8:59 PM Joe Hourclé wrote:
> > >
> > > > > On Jan 16, 2023, at 7:26 PM, Max wrote:
> > > > >
> > > > > Hi code4lib folks:
> > > > >
> > > > > Does anyone know a tool or hack to help fix a problem at a CSV
> > > > > that's causing a "The rows are not all the same number of columns."
> > > > > error when trying to import the CSV at a web application? I'm trying
> > > > > to use the CSV Import module
> > > > > <https://omeka.org/s/docs/user-manual/modules/csvimport/> at
> > > > > Omeka S. I've had success in the past with different CSV files. But
> > > > > some kinda problem at the CSV I'm trying to import right now is
> > > > > causing this error, & reviewing the CSV in Excel & as plain text
> > > > > (literally counting commas to confirm rows are the same number of
> > > > > columns) isn't helping.
> > > >
> > > > I’ve been known to do a find/replace on commas, then set the tab width
> > > > to something very large and then look for the rows that don’t line up.
> > > >
> > > > But CSV is tricky, as it’s more than just commas that are significant;
> > > > you also have to consider quotation marks… which let you put commas or
> > > > line returns within a string field.
> > > >
> > > > -Joe
> > > >
> > > > Sent from a mobile device with a crappy on screen keyboard and
> > > > obnoxious "autocorrect"
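
For anyone who lands on this thread later: below is a minimal Python sketch of the two steps discussed above, namely finding records whose field count differs from the header row (the condition Ben traced the error to) and flattening line breaks, commas & double quotes inside values (what was done here with OpenRefine and Excel). The function names find_ragged_rows, flatten_value and clean_csv are made up for illustration. This is not how CSVImport itself works, and it assumes comma delimiters, double-quote enclosure, and that losing those characters inside values is acceptable.

```python
import csv
import io
import re


def find_ragged_rows(text):
    """Return (record_number, field_count) for records whose width differs
    from the header row. Record numbers count CSV records, not physical
    lines, because a quoted value may span several lines. Blank lines are
    reported too (they parse as records with zero fields)."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to compare against
    expected = len(header)
    return [
        (number, len(row))
        for number, row in enumerate(reader, start=2)
        if len(row) != expected
    ]


def flatten_value(value):
    """Drop double quotes, turn commas into spaces, then collapse every run
    of whitespace (line breaks included) into one space and trim the ends."""
    value = value.replace('"', "").replace(",", " ")
    return re.sub(r"\s+", " ", value).strip()


def clean_csv(text):
    """Re-write CSV text with every cell flattened, using CRLF line endings."""
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\r\n")
    for row in reader:
        writer.writerow([flatten_value(cell) for cell in row])
    return out.getvalue()
```

To try it on a real file, read the text with open(path, newline="", encoding="utf-8") so Python does not translate line endings before the csv module sees them. Run find_ragged_rows first to see which records are off, then clean_csv if flattening the values is acceptable for your data.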