I built a Queryset class using pysolr that does this for one of my projects. It's not available as a standalone package, but the code is here:

https://github.com/unt-libraries/catalog-api/blob/master/django/sierra/utils/solr.py

Django-Haystack does this as well, but I'm not sure how usable Haystack is outside of Django. In fact, what I built was an attempt to re-implement parts of Haystack's SearchQuerySet interface on top of something more barebones. Haystack's interfaces are a great fit when you want to use Solr in place of Django models (e.g., a Solr queryset in place of Django's QuerySets), but in this particular case it was overkill for my needs and added quite a bit of overhead to Solr searches. Going barebones let me shave a few hundred milliseconds off search requests, and since this is in the context of a web API, that was important to me.

It works about the way you'd expect. You instantiate a utils.solr.Queryset object (passing connection details, a page_by parameter, and the kwargs you want to send to Solr), and then you access individual results by indexing the Queryset object as you would a list. Behind the scenes, it sends queries to Solr as needed to fetch the result (or results) you're accessing and caches the last result set, so it only sends a new query when you try to access a result outside the cached result set.

Some of the filters still need some work, and it *is* barebones--features like highlighting and faceting aren't implemented at all--but it works for what I use it for, and it shows how you might go about abstracting Solr paging logic with pysolr.

(FWIW, I haven't tried any of the other numerous Python modules for interacting with Solr, so I'm not sure whether others do something similar...)
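To make the paging idea concrete, here is a minimal sketch of the general technique. It is not the utils.solr.Queryset code linked above: the class name LazySolrResults and its attributes are made up for illustration, and it only assumes a connection object with a pysolr-style search(q, **kwargs) method whose return value exposes .docs (the documents on the current page) and .hits (the total number of matches).

```python
class LazySolrResults:
    """List-like view over a Solr result set that fetches pages on demand.

    Hypothetical illustration only. `conn` is assumed to be any object with
    a pysolr-style search(q, **kwargs) method returning an object that has
    `.docs` (a list of documents) and `.hits` (total match count).
    """

    def __init__(self, conn, q="*:*", page_by=100, **search_kwargs):
        self.conn = conn
        self.q = q
        self.page_by = page_by
        self.search_kwargs = search_kwargs
        self._page_start = None  # offset of the cached page, if any
        self._page_docs = []     # documents in the cached page
        self._hits = None        # total hits, known after the first query

    def _fetch(self, start):
        """Query Solr for the page beginning at `start` and cache it."""
        results = self.conn.search(
            self.q, start=start, rows=self.page_by, **self.search_kwargs
        )
        self._page_start = start
        self._page_docs = list(results.docs)
        self._hits = results.hits

    def __len__(self):
        if self._hits is None:
            self._fetch(0)
        return self._hits

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indexes are supported in this sketch")
        if index < 0:
            index += len(self)
        if index < 0 or index >= len(self):
            raise IndexError("result index out of range")
        in_cache = (
            self._page_start is not None
            and self._page_start <= index < self._page_start + len(self._page_docs)
        )
        if not in_cache:
            # Align to a page boundary so sequential access reuses the cache.
            self._fetch(index - index % self.page_by)
        return self._page_docs[index - self._page_start]

    def __iter__(self):
        # Walk the whole result set; a new query is sent once per page.
        for i in range(len(self)):
            yield self[i]
```

With pysolr installed, passing a pysolr.Solr instance as conn (for example, one created with a placeholder URL such as 'http://localhost:8983/solr/mycore') should be enough to try it: results[0] triggers the first query, results[5] is served from the cached page, and a for loop over the object issues one query per page_by documents without the caller tracking the start parameter.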
Jason

> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of Tod Olson
> Sent: Thursday, September 01, 2016 5:28 AM
> To: [log in to unmask]
> Subject: Re: [CODE4LIB] python for solr
>
> Exactly! The question is whether there is a python solr library that
> provides a layer of abstraction over that paging logic.
>
> -Tod
>
> Sent from from the æther.
>
> > On Sep 1, 2016, at 04:59, Andrew Hankinson <[log in to unmask]> wrote:
> >
> > Solr itself has an internal limit to the number of results you can
> > return on a single page (I think it is 1000) and AFAIK always returns a
> > paged result. For speed and memory usage over large result sets it would
> > probably be most efficient to build in paging logic.
> >
> >> On Aug 31, 2016, at 10:45 PM, Tod Olson <[log in to unmask]> wrote:
> >>
> >> On a related note, do any of the libraries allow the user to iterate
> >> over a large result set without having to be aware of repeated calls,
> >> incrementing the start parameter, and that sort of bookkeeping?
> >>
> >> It seems like someone must have built an iterator to hide that when
> >> you're trying to sift through a large number of hits.
> >>
> >> -Tod
> >>
> >>> On Aug 31, 2016, at 4:09 PM, Rhoads, Joseph <[log in to unmask]> wrote:
> >>>
> >>> I've used several of these. I like the interface of mysolr but (as
> >>> mentioned) it hasn't been updated in a while.
> >>>
> >>> pysolr is fairly up to date (v3.5 came out in May this year), and is used
> >>> in django-haystack for the solr backend.
> >>> https://github.com/django-haystack/pysolr
> >>>
> >>> Haystack itself is great if you want an ORM-like interface for solr and use
> >>> django.
> >>> https://github.com/django-haystack/django-haystack
> >>>
> >>> -Joseph
> >>>
> >>>
> >>>
> >>>> On Wed, Aug 31, 2016 at 3:42 PM, Chris Gray <[log in to unmask]> wrote:
> >>>>
> >>>> I haven't done much of that but you can submit documents via the API and
> >>>> have them indexed (and processed by Tika). Once you understand how to do
> >>>> that, you might find that you can do everything you want to do.
> >>>>
> >>>> An alternative would be reading the source of one of those libraries. In
> >>>> the list you referenced, the only mention of inserting documents was for
> >>>> sunburnt. I would be inclined to look there first, especially since it
> >>>> mentions a pythonic interface to Solr.
> >>>>
> >>>> A good, and amusing, cautionary tale about overwritten Python libraries is
> >>>> at https://www.youtube.com/watch?v=o9pEzgHorH0.
> >>>>
> >>>> Chris
> >>>>
> >>>>
> >>>>> On 2016-08-31 03:28 PM, Eric Lease Morgan wrote:
> >>>>>
> >>>>> On Aug 31, 2016, at 3:25 PM, Chris Gray <[log in to unmask]> wrote:
> >>>>>
> >>>>>>> Okay, there are SO many Python libraries [1] for Solr, and I'd like to
> >>>>>>> know which one is the most popular (not necessarily the "best").
> >>>>>> What do you want to do with it?
> >>>>>>
> >>>>>> I didn't feel the need to even look for a Python library for my needs.
> >>>>>> I use Python to submit searches to the Solr web API and consume the results
> >>>>>> as JSON.
> >>>>>
> >>>>> Good question. I want to add documents to a Solr index, and I want to
> >>>>> query the same index. Hmmm. -Eric M.
> >>