Posted to solr-user@lucene.apache.org by Tommaso Teofili <to...@gmail.com> on 2014/05/14 14:34:11 UTC

deep paging without sorting / keep IRs open

Hi all,

In one use case I'm working on [1], I am using Solr in combination with an
MVCC system [2][3]. The (Solr) index is kept up to date with the system and
must handle search requests that are tied to a certain state / version of
it; of course, multiple searches based on different versions of the system
have to be able to run concurrently.

As an example: an indexing request (with commit) creates docs x and y, and
a search for all docs retrieves x and y. A second indexing request (with
commit) then adds doc z, and a search for all docs retrieves x, y and z.
That's fine as long as the number of results is small, but if search
requests are paged (with the start and rows parameters) the example breaks
down, because getting all the pages takes multiple requests while the
underlying data keeps changing.
In the above scenario, if rows = 1 each request would retrieve 1 doc at a
time, and 'numFound' would change on the second request (from 2 to 3),
which is not consistent.
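
To make the inconsistency concrete, here is a minimal sketch of the scenario
using a toy in-memory index. It is not Solr code: ToyIndex, add_and_commit
and search are hypothetical names made up for illustration, and search only
mimics a match-all query with start/rows paging.

```python
# Toy in-memory model of the scenario above; not Solr code.
# ToyIndex and its methods are hypothetical names used for illustration only.
class ToyIndex:
    def __init__(self):
        self.docs = []

    def add_and_commit(self, *doc_ids):
        """Index the given docs and make them visible immediately."""
        self.docs.extend(doc_ids)

    def search(self, start, rows):
        """Match-all query with start/rows paging against the latest commit."""
        return {"numFound": len(self.docs),
                "docs": self.docs[start:start + rows]}


index = ToyIndex()
index.add_and_commit("x", "y")
page1 = index.search(start=0, rows=1)  # first page, sees the first commit

index.add_and_commit("z")              # a commit lands between the two pages
page2 = index.search(start=1, rows=1)  # second page, sees the second commit

print(page1["numFound"], page2["numFound"])
```

The two pages report different 'numFound' values because each request runs
against whatever the latest commit is at that moment.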

Basically I need the ability to keep running searches against a specified
commit point / index reader / state of the Lucene / Solr index.
So I wonder whether something similar to what was done for "cursorMark"
could be done to address that; of course such "long running IndexReaders"
would have to be disposed of after some time.
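
A rough sketch of what I have in mind, again as a toy model and not as an
existing Solr or Lucene API: a client acquires a lease on a pinned snapshot,
pages against that lease, and the snapshot is dropped once the lease times
out. SnapshotLeases, acquire, search, expire and ttl_seconds are all
hypothetical names, and renewing the lease on every use is just one possible
policy.

```python
import itertools
import time


# Hypothetical lease table; an illustration of the idea, not a Solr API.
class SnapshotLeases:
    """Maps a lease id to (pinned snapshot, expiry time)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self.leases = {}
        self.ids = itertools.count(1)

    def acquire(self, docs):
        """Pin an immutable copy of the current state and return a lease id."""
        lease_id = next(self.ids)
        self.leases[lease_id] = (tuple(docs), self.clock() + self.ttl)
        return lease_id

    def search(self, lease_id, start, rows):
        """Page through the pinned snapshot; KeyError if the lease is gone."""
        self.expire()
        snapshot, _ = self.leases[lease_id]
        # Assumed policy: renew on use so an active pager keeps its snapshot.
        self.leases[lease_id] = (snapshot, self.clock() + self.ttl)
        return {"numFound": len(snapshot),
                "docs": list(snapshot[start:start + rows])}

    def expire(self):
        """Dispose of snapshots whose lease has timed out."""
        now = self.clock()
        expired = [k for k, (_, exp) in self.leases.items() if exp <= now]
        for lease_id in expired:
            del self.leases[lease_id]


live = ["x", "y"]                     # state after the first commit
leases = SnapshotLeases(ttl_seconds=60)
lease = leases.acquire(live)
first = leases.search(lease, start=0, rows=1)

live.append("z")                      # second commit; the lease is unaffected
second = leases.search(lease, start=1, rows=1)

print(first["numFound"], second["numFound"])
```

With the lease in place both pages see the same 'numFound', and a lease that
is not used within ttl_seconds disappears on the next call to expire.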

WDYT?
Regards,
Tommaso

[1] : http://jackrabbit.apache.org/oak
[2] : http://en.wikipedia.org/wiki/Multiversion_concurrency_control
[3] : http://wiki.apache.org/jackrabbit/RepositoryMicroKernel?action=AttachFile&do=view&target=MicroKernel+Revision+Model.pdf

Re: deep paging without sorting / keep IRs open

Posted by Tommaso Teofili <to...@gmail.com>.
thanks Yonik, that looks promising, I'll have a look at it.

Tommaso


2014-05-17 17:57 GMT+02:00 Yonik Seeley <yo...@heliosearch.com>:

> On Sat, May 17, 2014 at 10:30 AM, Yonik Seeley <yo...@heliosearch.com>
> wrote:
> > I think searcher leases would fit the bill here?
> > https://issues.apache.org/jira/browse/SOLR-2809
> >
> > Not yet implemented though...
>
> FYI, I just put up a simple LeaseManager implementation on that issue.
>
> -Yonik
> http://heliosearch.org - facet functions, subfacets, off-heap
> filters&fieldcache
>

Re: deep paging without sorting / keep IRs open

Posted by Yonik Seeley <yo...@heliosearch.com>.
On Sat, May 17, 2014 at 10:30 AM, Yonik Seeley <yo...@heliosearch.com> wrote:
> I think searcher leases would fit the bill here?
> https://issues.apache.org/jira/browse/SOLR-2809
>
> Not yet implemented though...

FYI, I just put up a simple LeaseManager implementation on that issue.

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache

Re: deep paging without sorting / keep IRs open

Posted by Yonik Seeley <yo...@heliosearch.com>.
On Wed, May 14, 2014 at 8:34 AM, Tommaso Teofili
<to...@gmail.com> wrote:
> Basically I need the ability to keep running searches against a specified
> commit point / index reader / state of the Lucene / Solr index.

I think searcher leases would fit the bill here?
https://issues.apache.org/jira/browse/SOLR-2809

Not yet implemented though...

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache