Other than truncating the shelving code, you really can’t do that without adding more data into the mix.  I would actually not pick the cutoff arbitrarily; instead, once the list is produced, sort the items largest to smallest by distance.  That way you start with the books most out of place and work down until the rest are close enough.
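
A minimal sketch of that ordering, assuming the distances have already been computed into a dict keyed by barcode. The function name `worst_first`, the `close_enough` cutoff, and the sample barcodes are all hypothetical, not anything from Koha:

```python
def worst_first(distances, close_enough=0):
    """Return (item, distance) pairs, largest distance first.

    Items whose distance is at or below close_enough are left out,
    so the list ends where the books are "close enough".
    """
    ranked = sorted(distances.items(), key=lambda pair: pair[1], reverse=True)
    return [(item, d) for item, d in ranked if d > close_enough]


# Hypothetical distances keyed by barcode.
distances = {"b1": 0, "b2": 12, "b3": 3000, "b4": 4}
todo = worst_first(distances, close_enough=10)
print(todo)
```

The cutoff is only a stopping point for the pages working down the list; it does not have to be chosen before the list is produced.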

You're absolutely correct about the checked-out thing though, because that would throw everything off.  If you can’t get a list that excludes checked-out items, just take the list and remove everything not in your other list.  That should make it close enough to get decent results.
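
A rough sketch of that filtering step, assuming both lists hold unique barcodes. The name `drop_unscanned` and the sample data are made up for illustration:

```python
def drop_unscanned(belong, actual):
    """Keep only the items from the expected-order list that were actually scanned."""
    seen = set(actual)  # set membership keeps the filter fast on long lists
    return [item for item in belong if item in seen]


# Hypothetical barcodes: b4 is checked out, so it never shows up in the scan.
belong = ["b1", "b2", "b3", "b4", "b5"]
actual = ["b1", "b3", "b2", "b5"]
filtered = drop_unscanned(belong, actual)
print(filtered)
```

After filtering, both lists contain the same items, so position differences reflect shelving rather than missing books.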



From: Pikas, Christina K.
Sent: Thursday, January 22, 2015 7:08 AM
To: Code for Libraries

That would work - make sure you're comparing to a list that checked-out books are not on. Also, you probably don't have to pick 10 or 3000 completely arbitrarily. Authors like Danielle Steel or Nora Roberts are quite a bit more prolific than Bulgakov. You could sort of normalize by the number of items in the collection. So if an author has 40 items in the collection, then being off by 39 is OK.
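
One way to read that normalization, as a sketch only: allow each item to be off by up to (number of items by the same author minus one) before flagging it. The function `flag_by_author`, the `author_of` mapping, and the sample data are assumptions for illustration:

```python
from collections import Counter


def flag_by_author(distances, author_of):
    """Flag items whose distance exceeds (items by that author - 1)."""
    counts = Counter(author_of.values())  # items per author in the collection
    return [
        item
        for item, d in distances.items()
        if d > counts[author_of[item]] - 1
    ]


# Hypothetical data: three Steel titles, one Bulgakov title.
author_of = {"b1": "Steel", "b2": "Steel", "b3": "Steel", "b4": "Bulgakov"}
distances = {"b1": 2, "b2": 0, "b3": 5, "b4": 1}
flagged = flag_by_author(distances, author_of)
print(flagged)
```

With this rule a prolific author gets a wide tolerance and a single-title author gets none, which matches the 40-items, off-by-39 example.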

I remember in some class reading something like most misshelved books are a shelf above or a shelf below. I guess that makes sense. Doing some sort of normalization would help.

-----Original Message-----
From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Brent Hanner
Sent: Wednesday, January 21, 2015 8:20 PM
To: [log in to unmask]
Subject: Re: [CODE4LIB] Identifying misshelved items

I was doing something and it came to me.  

So basically you should calculate the distance between where it is and where it should be.

So you take the belong list and the actual place list and put each into an array.  Then you go through the belong list and find each item's location in the actual place list.  Subtract the actual list location from the belong list location.  Take the absolute value and that is your distance.  

So if that distance is 10 it's probably in the right area for an author.  If it's 3000 it is in the wrong place.  
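
The steps above can be sketched in a few lines of Python. This assumes each item has a unique identifier such as a barcode; the function name `shelf_distances` and the sample barcodes are hypothetical:

```python
def shelf_distances(belong, actual):
    """Return {item: |expected position - actual position|} for items in both lists."""
    # Look up each item's index in the scanned (actual) order.
    actual_pos = {item: i for i, item in enumerate(actual)}
    distances = {}
    for expected_i, item in enumerate(belong):
        if item in actual_pos:  # items never scanned are skipped
            distances[item] = abs(expected_i - actual_pos[item])
    return distances


# Hypothetical barcodes: b3 was shelved at the end, three slots from where it belongs.
belong = ["b1", "b2", "b3", "b4", "b5", "b6"]
actual = ["b1", "b2", "b4", "b5", "b6", "b3"]
distances = shelf_distances(belong, actual)
print(distances)
```

Note that one badly misplaced book also shifts its neighbors by one position each, so small nonzero distances are expected even for books that are shelved correctly.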

And it should work in any programming language and I think you could even do it in Excel.



From: Cab Vinton
Sent: Tuesday, January 20, 2015 6:07 AM
To: Code for Libraries

Thanks, Ron & Becky.

I remember Shelvar, but hadn't heard anything about it for a while. Adding tags to our entire collection is an initial hurdle, but could obviously be worthwhile in the long run.

Ron's diff command approach is a bit too fine-grained for us as there are multiple acceptable shelflist orders for novels by the same author. I'd probably also need to come up with a way to make the output a bit more user-friendly for our pages :-)  That said, I'll probably still spend some time messing around with Windows equivalents of diff (PowerShell, WinMerge, etc.) as that's the OS our pages are most comfortable with.
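
As a sketch of a coarser, OS-independent comparison: sort on the author's surname only and diff the result against the shelf order using Python's standard `difflib`. Because Python's sort is stable, items with the same surname keep their scanned order, so title order within an author is never flagged. The `author_key` helper is hypothetical, and the rows are the sample titles from the original post with trailing punctuation trimmed:

```python
import difflib


def author_key(row):
    """Sort key: the author's surname only, lowercased (row = title, author, call no.)."""
    return row[1].split(",")[0].strip().lower()


# Sample rows in the order they were scanned on the shelf.
rows = [
    ("The last jihad", "Rosenberg, Joel", "FIC ROSEN"),
    ("Home repair", "Rosenbarg, Liz", "FIC ROSEN"),
    ("Abuse of power", "Rosen, Fred", "FIC ROSEN"),
    ("California angel", "Rosenberg, Nancy Taylor", "FIC ROSEN"),
]

shelved = [f"{author} | {title}" for title, author, _ in rows]
# sorted() is stable: rows with equal surnames stay in scanned order.
expected = [f"{author} | {title}" for title, author, _ in sorted(rows, key=author_key)]

diff_lines = list(
    difflib.unified_diff(expected, shelved, "expected", "shelved", lineterm="")
)
print("\n".join(diff_lines))
```

The output is still diff-style, so it would likely need further formatting before being handed to pages.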

Thanks for the suggestions!

Cab Vinton
Plaistow Public Library
Plaistow, NH

On Thu, Jan 15, 2015 at 7:30 PM, Ronald Houk <[log in to unmask]> wrote:
> Just realized I had a typo. Should look something like.
> diff -Nau <(sort -k[[whatever field you want to sort by]] original.csv) original.csv
> On Jan 15, 2015 2:29 PM, "Ronald Houk" <[log in to unmask]> wrote:
>> This sounds like a perfect job for a Unix/Linux system.  I'd export
>> this xls into a nice tab-separated csv.  Then sort the column that
>> contains the call no.  Then compare the sorted column to the original column with diff.
>> Something along the lines of
>> diff -Nau <(original.csv | sort -k[[whatever field you want to sort by]]) original.csv
>> For the Dewey titles you could add the -n flag to sort.
>> This is just a rough sketch, but with a little work I think it will
>> work for you and, what's better, it won't cost you a dime. :)
>> On Thu, Jan 15, 2015 at 1:32 PM, Cab Vinton <[log in to unmask]> wrote:
>>> We're doing inventory here and would love to combine this with 
>>> finding items out of call number order. (The inventory process 
>>> simply updates the datelastseen field.)
>>> Koha's inventory tool generates an XLS file in the following format 
>>> (barcodes, too, actually):
>>>
>>>   Title               Author                    Call number
>>>   The last jihad :    Rosenberg, Joel,          FIC ROSEN
>>>   Home repair /       Rosenbarg, Liz.           FIC ROSEN
>>>   Abuse of power /    Rosen, Fred.              FIC ROSEN
>>>   California angel /  Rosenberg, Nancy Taylor.  FIC ROSEN
>>>
>>> What we'd ideally like is a programmatic method of:
>>> 1./ identifying items like Home Repair and Abuse of Power, and
>>> 2./ specifying where such misshelved titles are currently located.
>>> For fiction, we're mostly concerned with authors out of order (i.e., 
>>> title order *within* the same author can be ignored). For 
>>> non-fiction, Dewey/ call number order is, of course, the desired result.
>>> Thoughts on how best to tackle this? And no, shelf-reading while 
>>> scanning is not an acceptable solution :-)
>>> My VBA skills are seriously rusty at this point, and there are some
>>> complicating factors (e.g., how to handle two books in a row which
>>> are misshelved -- the second book's location should be compared to
>>> the last correctly shelved book; see Rosen/ Rosenberg above).
>>> Has this wheel already been invented?
>>> Grateful for any & all suggestions!
>>> Best,
>>> Cab Vinton, Director
>>> Plaistow Public Library
>>> Plaistow, NH
>> --
>> Ronald Houk
>> Assistant Director
>> Ottumwa Public Library
>> 102 W. Fourth Street
>> Ottumwa, IA 52501
>> (641)682-7563x203
>> [log in to unmask]