Posted to solr-user@lucene.apache.org by Nicholas Chase <nc...@earthlink.net> on 2011/07/18 16:53:20 UTC

NRT and commit behavior

Very glad to hear that NRT is finally here!  But my question is this: 
will things still come to a standstill during a commit?

Thanks...

----  Nick

Re: NRT and commit behavior

Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
One of the users of NRT reported that their system was freezing during
commits at about 1.5 million docs because of how frequently they committed.
With NRT (Solr with RankingAlgorithm) update performance and a commit
interval of about 15 mins, they no longer have the freeze problem.

Regards,

- Nagendra Nagarajayya
http://solr-ra.tgels.org  <http://solr-ra.tgels.com>
http://rankingalgorithm.tgels.org  <http://rankingalgorithm.tgels.com>





Re: NRT and commit behavior

Posted by Mark Miller <ma...@gmail.com>.
I've written a blog post on some of the recent improvements that explains things a bit:

http://www.lucidimagination.com/blog/2011/07/11/benchmarking-the-new-solr-%E2%80%98near-realtime%E2%80%99-improvements/


- Mark Miller
lucidimagination.com









Re: NRT and commit behavior

Posted by Vadim Kisselmann <v....@googlemail.com>.
Tirthankar,

are you indexing (1) smaller docs or (2) books?
If 1: your caches are too big for your memory, as Erick already said.
Try allocating 10GB to the JVM, leave 14GB for the OS disk cache, and make
your caches smaller.

If 2: read the blog posts on hathitrust.org:
http://www.hathitrust.org/blogs/large-scale-search

Regards
Vadim
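
To put a rough number on "caches too big for your memory", here is a back-of-the-envelope sketch in Python. It assumes the worst case, in which every filterCache entry is held as a bitset with one bit per document in the index (small result sets are typically stored more compactly, so real usage may be lower). Only size=16384 comes from the solrconfig posted in this thread; the document count and the smaller cache size are made-up placeholders, since the thread never states them.

```python
def filter_cache_worst_case_bytes(cache_size: int, max_doc: int) -> int:
    """Upper bound on filterCache memory if every entry is a full bitset."""
    bytes_per_entry = (max_doc + 7) // 8  # one bit per document, rounded up
    return cache_size * bytes_per_entry


ASSUMED_MAX_DOC = 10_000_000  # hypothetical document count, not from the thread
POSTED_SIZE = 16384           # filterCache size from the posted solrconfig
SMALLER_SIZE = 512            # hypothetical reduced size, for comparison only

for size in (POSTED_SIZE, SMALLER_SIZE):
    worst = filter_cache_worst_case_bytes(size, ASSUMED_MAX_DOC)
    print(f"filterCache size={size}: up to {worst / 2**30:.1f} GiB")
```

Under these assumptions the posted size could, in the worst case, need more memory than the whole JVM heap, which is one way to read the advice to make the caches smaller.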



Re: NRT and commit behavior

Posted by Erick Erickson <er...@gmail.com>.
No <G>. The problem is that "number of documents" isn't a reliable
indicator of resource consumption. Consider the difference between
indexing a twitter message and a book. I can put a LOT more docs
of 140 chars on a single machine of size X than I can books.

Unfortunately, the only way I know of is to test. Use something like
JMeter or SolrMeter to fire enough queries at your machine to
determine when you're over-straining resources, and shard at that
point (or get a bigger machine <G>).

Best
Erick
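
As a minimal sketch of how the results of such a load test might be read, the Python below takes query-time samples recorded at increasing load levels and reports the first level at which the 95th-percentile latency exceeds a budget. The sample numbers, the 500 ms budget, and the function names are all invented for illustration; in practice the samples would come from a tool such as JMeter or SolrMeter.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def first_overloaded_level(results, budget_ms, pct=95):
    """results maps concurrent-client count -> list of query times in ms.

    Returns the lowest client count whose pct-th percentile exceeds
    budget_ms, or None if every level stays within the budget.
    """
    for clients in sorted(results):
        if percentile(results[clients], pct) > budget_ms:
            return clients
    return None


# Hypothetical measurements, not taken from the thread.
sample_results = {
    1: [20, 25, 30, 22],
    8: [40, 55, 60, 48],
    32: [300, 900, 1500, 700],
}

print(first_overloaded_level(sample_results, budget_ms=500))
```

The load level this returns is only meaningful for the hardware and document mix that were tested, which is the point of the advice above: the threshold has to be measured, not looked up.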


RE: NRT and commit behavior

Posted by Tirthankar Chatterjee <tc...@commvault.com>.
Okay, but is there any threshold on index size, total docs in the index, or physical memory beyond which sharding should be considered?

I am trying to find the winning combination.
Tirthankar
-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Friday, September 16, 2011 7:46 AM
To: solr-user@lucene.apache.org
Subject: Re: NRT and commit behavior

Uhm, you're putting  a lot of index into not very much memory. I really think you're going to have to shard your index across several machines to get past this problem. Simply increasing the size of your caches is still limited by the physical memory you're working with.

You really have to put a profiler on the system to see what's going on. At that size there are too many things that it *could* be to definitively answer it with e-mails....

Best
Erick

On Wed, Sep 14, 2011 at 7:35 AM, Tirthankar Chatterjee <tc...@commvault.com> wrote:
> Erick,
> Also, here is our solrconfig, where we have tried increasing the cache sizes. Setting the autowarmCount values below to 0 makes the commit call return within a second, but that will slow down our searches.
>
> <filterCache
>      class="solr.FastLRUCache"
>      size="16384"
>      initialSize="4096"
>      autowarmCount="4096"/>
>
>    <!-- Cache used to hold field values that are quickly accessible
>         by document id.  The fieldValueCache is created by default
>         even if not configured here.
>      <fieldValueCache
>        class="solr.FastLRUCache"
>        size="512"
>        autowarmCount="128"
>        showItems="32"
>      />
>    -->
>
>   <!-- queryResultCache caches results of searches - ordered lists of
>         document ids (DocList) based on a query, a sort, and the range
>         of documents requested.  -->
>    <queryResultCache
>      class="solr.LRUCache"
>      size="16384"
>      initialSize="4096"
>      autowarmCount="4096"/>
>
>  <!-- documentCache caches Lucene Document objects (the stored fields for each document).
>       Since Lucene internal document ids are transient, this cache 
> will not be autowarmed.  -->
>    <documentCache
>      class="solr.LRUCache"
>      size="512"
>      initialSize="512"
>      autowarmCount="512"/>
>
> -----Original Message-----
> From: Tirthankar Chatterjee [mailto:tchatterjee@commvault.com]
> Sent: Wednesday, September 14, 2011 7:31 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NRT and commit behavior
>
> Erick,
> Here are the answers to your questions:
> Our index is 267 GB.
> We are not optimizing.
> No, we have not profiled yet to find the bottleneck, but the logs indicate that opening the searchers is taking time.
> Nothing except Solr is running on the machine.
> Total memory is 16GB; Tomcat has 8GB allocated. Everything is 64-bit: OS, JVM, and Tomcat.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Sunday, September 11, 2011 11:37 AM
> To: solr-user@lucene.apache.org
> Subject: Re: NRT and commit behavior
>
> Hmm, OK. You might want to look at the non-cached filter query stuff; it's quite recent.
> The point here is that it is a filter that is applied only after all of the less expensive filter queries are run. One of its uses is exactly ACL calculations. Rather than calculate the ACL for the entire doc set, it only calculates access for docs that have made it past all the other elements of the query. See SOLR-2429 and note that it is in 3.4 (currently being released) only.
>
> As to why your commits are taking so long, I have no idea given that you really haven't given us much to work with.
>
> How big is your index? Are you optimizing? Have you profiled the application to see what the bottleneck is (I/O, CPU, etc?). What else is running on your machine? It's quite surprising that it takes that long. How much memory are you giving the JVM? etc...
>
> You might want to review: 
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
>
> On Fri, Sep 9, 2011 at 9:41 AM, Tirthankar Chatterjee <tc...@commvault.com> wrote:
>> Erick,
>> What you said is correct. For us, searches are based on Active Directory permissions, which are passed in the filter query parameter. So we have no warming queries as such, since we cannot fire one for every user ahead of time.
>>
>> What we do instead is this: when a user logs in, we run an invalid query (one that returns no results, instead of '*') with the correct filter query (the user's permissions based on the login). This way the cache gets warmed up with valid docs.
>>
>> It works then.
>>
>>
>> Also, can you please let me know why a commit takes 45 minutes to 1 hour on well-resourced hardware with multiple processors, 16GB RAM, a 64-bit VM, etc.? We tried passing waitSearcher as false and found that inside the code it is hard-coded to true. Is there any specific reason for that? Can we change that value to honor what is being passed?
>>
>> Thanks,
>> Tirthankar
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerickson@gmail.com]
>> Sent: Thursday, September 01, 2011 8:38 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: NRT and commit behavior
>>
>> Hmm, I'm guessing a bit here, but using an invalid query doesn't sound very safe, but I suppose it *might* be OK.
>>
>> What does "invalid" mean? A syntax error? Not safe.
>>
>> A search that returns 0 results? I'd guess that filling your caches, which is the point of warming queries, might be short-circuited if the query returns 0 results, but I don't know for sure.
>>
>> But the fact that "invalid queries return quicker" does not inspire confidence since the *point* of warming queries is to spend the time up front so your users don't have to wait.
>>
>> So here's a test. Comment out your warming queries.
>> Restart your server and fire the warming query from the browser with &debugQuery=on and look at the QTime parameter.
>>
>> Now fire the same form of the query (as in the same sort, facet, grouping, etc, but presumably a valid term). See the QTime.
>>
>> Now fire the same form of the query with a *different* value in the query. That is, it should search on different terms but with the same sort, facet, etc. to avoid getting your data straight from the queryResultCache.
>>
>> My guess is that the last query will return much more quickly than the second query. Which would indicate that the first form isn't doing you any good.
>>
>> But a test is worth a thousand opinions.
>>
>> Best
>> Erick
>>
>> On Wed, Aug 31, 2011 at 11:04 AM, Tirthankar Chatterjee <tc...@commvault.com> wrote:
>>> Also noticed that the "waitSearcher" parameter value is not honored inside commit. It always defaults to true, which makes indexing slow.
>>>
>>> What we are trying to do is use an invalid query (which won't return any results) as a warming query. This way the commit returns faster. Are we doing something wrong here?
>>>
>>> Thanks,
>>> Tirthankar
>>>
>>> -----Original Message-----
>>> From: Jonathan Rochkind [mailto:rochkind@jhu.edu]
>>> Sent: Monday, July 18, 2011 11:38 AM
>>> To: solr-user@lucene.apache.org; yonik@lucidimagination.com
>>> Subject: Re: NRT and commit behavior
>>>
>>> In practice, in my experience at least, a very 'expensive' commit can still slow down searches significantly, I think just due to CPU (or i/o?) starvation. Not sure anything can be done about that.  That's my experience in Solr 1.4.1, but since searches have always been async with commits, it probably is the same situation even in more recent versions, I'd guess.
>>>
>>> On 7/18/2011 11:07 AM, Yonik Seeley wrote:
>>>> On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase<nc...@earthlink.net>  wrote:
>>>>> Very glad to hear that NRT is finally here!  But my question is this:
>>>>> will things still come to a standstill during a commit?
>>>> New updates can now proceed in parallel with a commit, and searches 
>>>> have always been completely asynchronous w.r.t. commits.
>>>>
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>>
>>>
>>
>
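
The three-query test that Erick describes in the quoted Sep 1 message can be sketched as a small Python helper that only builds the URLs to fire; it does not contact any server. The base URL, the field names (`text`, `acl`), and the search terms are hypothetical placeholders, and the no-match term stands in for whatever "invalid" query is actually in use.

```python
from urllib.parse import urlencode

# Assumed Solr endpoint; adjust to the real host, port, and handler.
BASE_URL = "http://localhost:8983/solr/select"


def build_query_url(term, filter_query, sort="score desc"):
    """Build a select URL with debugQuery=on so QTime can be compared across runs."""
    params = [
        ("q", term),
        ("fq", filter_query),
        ("sort", sort),
        ("debugQuery", "on"),
    ]
    return BASE_URL + "?" + urlencode(params)


# Hypothetical permission filter for one logged-in user.
user_filter = "acl:groupA"

# 1. The current warming query: a term assumed to match nothing.
warming_url = build_query_url("text:zzzznomatch", user_filter)
# 2. Same form (same sort and filter) with a term that matches documents.
valid_url = build_query_url("text:solr", user_filter)
# 3. Same form again with a different term, to bypass the queryResultCache.
different_url = build_query_url("text:lucene", user_filter)

for url in (warming_url, valid_url, different_url):
    print(url)
```

After a restart with the configured warming queries commented out, the QTime of the second and third requests is the comparison Erick is asking for: if the third is much faster than the second, the no-match warming query was not filling the caches in any useful way.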

Re: NRT and commit behavior

Posted by Erick Erickson <er...@gmail.com>.
Uhm, you're putting a lot of index into not very much memory. I really
think you're going to have to shard your index across several machines to
get past this problem. Simply increasing the size of your caches is still
limited by the physical memory you're working with.

You really have to put a profiler on the system to see what's going on. At
that size there are too many things that it *could* be to definitively answer
it with e-mails....

Best
Erick
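
The "lot of index into not very much memory" point can be made concrete with the numbers reported in this thread (267 GB index, 16GB RAM, 8GB allocated to Tomcat). The Python sketch below computes an optimistic upper bound on how much of the index the operating system could keep in its file cache; it assumes that everything outside the JVM heap is available for caching, which overstates the real figure because the OS and other processes need memory too.

```python
def os_cache_coverage(index_gb, total_ram_gb, jvm_heap_gb):
    """Best-case fraction of the index that fits in RAM left over after the JVM heap."""
    free_for_cache = max(total_ram_gb - jvm_heap_gb, 0)
    return min(free_for_cache / index_gb, 1.0)


# Figures as reported in the thread.
coverage = os_cache_coverage(index_gb=267, total_ram_gb=16, jvm_heap_gb=8)
print(f"At most {coverage:.1%} of the index can be cached by the OS")
```

With only a small fraction of the index cacheable, most of the reads needed to open and warm a new searcher are likely to go to disk, which is consistent with the suggestion to shard or add memory rather than grow the Solr caches.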

On Wed, Sep 14, 2011 at 7:35 AM, Tirthankar Chatterjee
<tc...@commvault.com> wrote:
> Erick,
> Also, here is our solrconfig, where we have tried increasing the cache sizes. Setting the autowarmCount values below to 0 makes the commit call return within a second, but that will slow down our searches.
>
> <filterCache
>      class="solr.FastLRUCache"
>      size="16384"
>      initialSize="4096"
>      autowarmCount="4096"/>
>
>    <!-- Cache used to hold field values that are quickly accessible
>         by document id.  The fieldValueCache is created by default
>         even if not configured here.
>      <fieldValueCache
>        class="solr.FastLRUCache"
>        size="512"
>        autowarmCount="128"
>        showItems="32"
>      />
>    -->
>
>   <!-- queryResultCache caches results of searches - ordered lists of
>         document ids (DocList) based on a query, a sort, and the range
>         of documents requested.  -->
>    <queryResultCache
>      class="solr.LRUCache"
>      size="16384"
>      initialSize="4096"
>      autowarmCount="4096"/>
>
>  <!-- documentCache caches Lucene Document objects (the stored fields for each document).
>       Since Lucene internal document ids are transient, this cache will not be autowarmed.  -->
>    <documentCache
>      class="solr.LRUCache"
>      size="512"
>      initialSize="512"
>      autowarmCount="512"/>
>
> -----Original Message-----
> From: Tirthankar Chatterjee [mailto:tchatterjee@commvault.com]
> Sent: Wednesday, September 14, 2011 7:31 AM
> To: solr-user@lucene.apache.org
> Subject: RE: NRT and commit behavior
>
> Erick,
> Here are the answers to your questions:
> Our index is 267 GB.
> We are not optimizing.
> No, we have not profiled yet to find the bottleneck, but the logs indicate that opening the searchers is taking time.
> Nothing except Solr is running on the machine.
> Total memory is 16GB; Tomcat has 8GB allocated. Everything is 64-bit: OS, JVM, and Tomcat.
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Sunday, September 11, 2011 11:37 AM
> To: solr-user@lucene.apache.org
> Subject: Re: NRT and commit behavior
>
> Hmm, OK. You might want to look at the non-cached filter query stuff, it's quite recent.
> The point here is that it is a filter that is applied only after all of the less expensive filter queries are run, One of its uses is exactly ACL calculations. Rather than calculate the ACL for the entire doc set, it only calculates access for docs that have made it past all the other elements of the query.... See SOLR-2429 and note that it is a 3.4 (currently being released) only.
>
> As to why your commits are taking so long, I have no idea given that you really haven't given us much to work with.
>
> How big is your index? Are you optimizing? Have you profiled the application to see what the bottleneck is (I/O, CPU, etc?). What else is running on your machine? It's quite surprising that it takes that long. How much memory are you giving the JVM? etc...
>
> You might want to review: http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
>
> On Fri, Sep 9, 2011 at 9:41 AM, Tirthankar Chatterjee <tc...@commvault.com> wrote:
>> Erick,
>> What you said is correct for us the searches are based on some Active Directory permissions which are populated in Filter query parameter. So we don't have any warming query concept as we cannot fire for every user ahead of time.
>>
>> What we do here is that when user logs in we do an invalid query(which return no results instead of '*') with the correct filter query (which is his permissions based on the login). This way the cache gets warmed up with valid docs.
>>
>> It works then.
>>
>>
>> Also, can you please let me know why commit is taking 45 mins to 1 hours on a good resourced hardware with multiple processors and 16gb RAM 64 bit VM, etc. We tried passing waitSearcher as false and found that inside the code it hard coded to be true. Is there any specific reason. Can we change that value to honor what is being passed.
>>
>> Thanks,
>> Tirthankar
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerickson@gmail.com]
>> Sent: Thursday, September 01, 2011 8:38 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: NRT and commit behavior
>>
>> Hmm, I'm guessing a bit here, but using an invalid query doesn't sound very safe, but I suppose it *might* be OK.
>>
>> What does "invalid" mean? Syntax error? not safe.
>>
>> search that returns 0 results? I don't know, but I'd guess that
>> filling your caches, which is the point of warming queries, might be
>> short circuited if the query returns
>> 0 results but I don't know for sure.
>>
>> But the fact that "invalid queries return quicker" does not inspire confidence since the *point* of warming queries is to spend the time up front so your users don't have to wait.
>>
>> So here's a test. Comment out your warming queries.
>> Restart your server and fire the warming query from the browser with&debugQuery=on and look at the QTime parameter.
>>
>> Now fire the same form of the query (as in the same sort, facet, grouping, etc, but presumably a valid term). See the QTime.
>>
>> Now fire the same form of the query with a *different* value in the query. That is, it should search on different terms but with the same sort, facet, etc. to avoid getting your data straight from the queryResultCache.
>>
>> My guess is that the last query will return much more quickly than the second query. Which would indicate that the first form isn't doing you any good.
>>
>> But a test is worth a thousand opinions.
>>
>> Best
>> Erick
>>
>> On Wed, Aug 31, 2011 at 11:04 AM, Tirthankar Chatterjee <tc...@commvault.com> wrote:
>>> Also noticed that "waitSearcher" parameter value is not  honored inside commit. It is always defaulted to true which makes it slow during indexing.
>>>
>>> What we are trying to do is use an invalid query (which wont return any results) as a warming query. This way the commit returns faster. Are we doing something wrong here?
>>>
>>> Thanks,
>>> Tirthankar
>>>
>>> -----Original Message-----
>>> From: Jonathan Rochkind [mailto:rochkind@jhu.edu]
>>> Sent: Monday, July 18, 2011 11:38 AM
>>> To: solr-user@lucene.apache.org; yonik@lucidimagination.com
>>> Subject: Re: NRT and commit behavior
>>>
>>> In practice, in my experience at least, a very 'expensive' commit can
>>> still slow down searches significantly, I think just due to CPU (or
>>> i/o?) starvation. Not sure anything can be done about that.  That's my experience in Solr 1.4.1, but since searches have always been async with commits, it probably is the same situation even in more recent versions, I'd guess.
>>>
>>> On 7/18/2011 11:07 AM, Yonik Seeley wrote:
>>>> On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase<nc...@earthlink.net>  wrote:
>>>>> Very glad to hear that NRT is finally here!  But my question is this:
>>>>> will things still come to a standstill during a commit?
>>>> New updates can now proceed in parallel with a commit, and searches
>>>> have always been completely asynchronous w.r.t. commits.
>>>>
>>>> -Yonik
>>>> http://www.lucidimagination.com
>>>>
>>> ******************Legal Disclaimer***************************
>>> "This communication may contain confidential and privileged material
>>> for the sole use of the intended recipient. Any unauthorized review,
>>> use or distribution by others is strictly prohibited. If you have
>>> received the message in error, please advise the sender by reply
>>> email and delete the message. Thank you."
>>> *********************************************************
>>>
>>
>

RE: NRT and commit behavior

Posted by Tirthankar Chatterjee <tc...@commvault.com>.
Erick,
Also, here is our solrconfig, where we have tried increasing the cache sizes. Setting the autowarmCount values below to 0 lets the commit call return within a second, but that will slow down our searches.

<filterCache
      class="solr.FastLRUCache"
      size="16384"
      initialSize="4096"
      autowarmCount="4096"/>

    <!-- Cache used to hold field values that are quickly accessible
         by document id.  The fieldValueCache is created by default
         even if not configured here.
      <fieldValueCache
        class="solr.FastLRUCache"
        size="512"
        autowarmCount="128"
        showItems="32"
      />
    -->

   <!-- queryResultCache caches results of searches - ordered lists of
         document ids (DocList) based on a query, a sort, and the range
         of documents requested.  -->
    <queryResultCache
      class="solr.LRUCache"
      size="16384"
      initialSize="4096"
      autowarmCount="4096"/>

  <!-- documentCache caches Lucene Document objects (the stored fields for each document).
       Since Lucene internal document ids are transient, this cache will not be autowarmed.  -->
    <documentCache
      class="solr.LRUCache"
      size="512"
      initialSize="512"
      autowarmCount="512"/>

-----Original Message-----
From: Tirthankar Chatterjee [mailto:tchatterjee@commvault.com] 
Sent: Wednesday, September 14, 2011 7:31 AM
To: solr-user@lucene.apache.org
Subject: RE: NRT and commit behavior

Erick,
Here is the answer to your questions:
Our index is 267 GB
We are not optimizing...
No we have not profiled yet to check the bottleneck, but logs indicate opening the searchers is taking time...
Nothing except SOLR
Total memory is 16GB tomcat has 8GB allocated Everything 64 bit OS and JVM and Tomcat


RE: NRT and commit behavior

Posted by Tirthankar Chatterjee <tc...@commvault.com>.
Erick,
Here are the answers to your questions:
Our index is 267 GB.
We are not optimizing.
No, we have not profiled yet to find the bottleneck, but the logs indicate that opening the searchers is what takes the time.
Nothing except Solr is running on the machine.
Total memory is 16 GB; Tomcat has 8 GB allocated.
Everything is 64-bit: OS, JVM, and Tomcat.
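
A quick sanity check on those numbers, sketched in Python. The only assumption is that whatever RAM is not given to the Tomcat heap is available to the operating system for caching index files; in practice the OS itself and other JVM overhead take some of it too.

```python
INDEX_GB = 267      # index size reported above
TOTAL_RAM_GB = 16   # total memory reported above
HEAP_GB = 8         # heap allocated to Tomcat, as reported above

# Assumption: everything outside the heap can be used as OS disk cache.
os_cache_gb = TOTAL_RAM_GB - HEAP_GB
cached_fraction = os_cache_gb / INDEX_GB

print(f"At most {os_cache_gb} GB of OS cache, about {cached_fraction:.1%} of the index")
```

So only a small slice of the index can sit in memory at any time, which would be consistent with opening and warming a new searcher being slow.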

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Sunday, September 11, 2011 11:37 AM
To: solr-user@lucene.apache.org
Subject: Re: NRT and commit behavior

Hmm, OK. You might want to look at the non-cached filter query stuff, it's quite recent.
The point here is that it is a filter that is applied only after all of the less expensive filter queries are run, One of its uses is exactly ACL calculations. Rather than calculate the ACL for the entire doc set, it only calculates access for docs that have made it past all the other elements of the query.... See SOLR-2429 and note that it is a 3.4 (currently being released) only.

As to why your commits are taking so long, I have no idea given that you really haven't given us much to work with.

How big is your index? Are you optimizing? Have you profiled the application to see what the bottleneck is (I/O, CPU, etc?). What else is running on your machine? It's quite surprising that it takes that long. How much memory are you giving the JVM? etc...

You might want to review: http://wiki.apache.org/solr/UsingMailingLists

Best
Erick



Re: NRT and commit behavior

Posted by Erick Erickson <er...@gmail.com>.
Hmm, OK. You might want to look at the non-cached filter query stuff;
it's quite recent. The point here is that it is a filter that is applied
only after all of the less expensive filter queries are run. One of its
uses is exactly ACL calculations. Rather than calculate the ACL for the
entire doc set, it only calculates access for docs that have made it past
all the other elements of the query. See SOLR-2429, and note that it is
available in 3.4 (currently being released) only.
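
A minimal sketch of what such a request could look like, written in Python only to build the query string. The cache=false and cost local params are the ones introduced by SOLR-2429; the field names (type, acl), the group values, and the query term are hypothetical. Note that cost only orders the non-cached filters. As far as I understand it, true post-filtering (running after the main query and the other filters) applies only when cost is 100 or more and the query type supports it, so a plain field query like the one below is not necessarily evaluated as a post filter.

```python
from urllib.parse import urlencode

# Hypothetical field names and values, for illustration only.
params = [
    ("q", "quarterly report"),
    # Cheap, frequently reused filter: leave it cached (the default).
    ("fq", "type:document"),
    # Per-user ACL filter: keep it out of the filterCache and give it a high cost.
    ("fq", "{!cache=false cost=200}acl:(group_a OR group_b)"),
]

query_string = urlencode(params)
print("/select?" + query_string)
```

Keeping per-user filters out of the filterCache also means there is less to autowarm on every commit.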

As to why your commits are taking so long, I have no idea, given that you
really haven't given us much to work with.

How big is your index? Are you optimizing? Have you profiled the application
to see what the bottleneck is (I/O, CPU, etc.)? What else is running on your
machine? It's quite surprising that it takes that long. How much memory are
you giving the JVM? etc...

You might want to review: http://wiki.apache.org/solr/UsingMailingLists

Best
Erick


On Fri, Sep 9, 2011 at 9:41 AM, Tirthankar Chatterjee
<tc...@commvault.com> wrote:
> Erick,
> What you said is correct for us the searches are based on some Active Directory permissions which are populated in Filter query parameter. So we don't have any warming query concept as we cannot fire for every user ahead of time.
>
> What we do here is that when user logs in we do an invalid query(which return no results instead of '*') with the correct filter query (which is his permissions based on the login). This way the cache gets warmed up with valid docs.
>
> It works then.
>
>
> Also, can you please let me know why commit is taking 45 mins to 1 hours on a good resourced hardware with multiple processors and 16gb RAM 64 bit VM, etc. We tried passing waitSearcher as false and found that inside the code it hard coded to be true. Is there any specific reason. Can we change that value to honor what is being passed.
>
> Thanks,
> Tirthankar

RE: NRT and commit behavior

Posted by Tirthankar Chatterjee <tc...@commvault.com>.
Erick,
What you said is correct. For us, the searches are based on Active Directory permissions, which are passed in the filter query (fq) parameter. So we have no fixed warming query, as we cannot fire one for every user ahead of time.

What we do here is that when a user logs in, we run an invalid query (one that returns no results, instead of '*') with the correct filter query (the user's permissions based on the login). This way the cache gets warmed up with valid docs.

It works then.


Also, can you please let me know why a commit takes 45 minutes to 1 hour on well-resourced hardware with multiple processors, 16 GB RAM, a 64-bit VM, etc.? We tried passing waitSearcher as false and found that inside the code it is hard-coded to true. Is there any specific reason? Can we change that value to honor what is being passed?

Thanks,
Tirthankar

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Thursday, September 01, 2011 8:38 AM
To: solr-user@lucene.apache.org
Subject: Re: NRT and commit behavior

Hmm, I'm guessing a bit here, but using an invalid query doesn't sound very safe, but I suppose it *might* be OK.

What does "invalid" mean? Syntax error? not safe.

search that returns 0 results? I don't know, but I'd guess that filling your caches, which is the point of warming queries, might be short circuited if the query returns
0 results but I don't know for sure.

But the fact that "invalid queries return quicker" does not inspire confidence since the *point* of warming queries is to spend the time up front so your users don't have to wait.

So here's a test. Comment out your warming queries.
Restart your server and fire the warming query from the browser with &debugQuery=on and look at the QTime parameter.

Now fire the same form of the query (as in the same sort, facet, grouping, etc, but presumably a valid term). See the QTime.

Now fire the same form of the query with a *different* value in the query. That is, it should search on different terms but with the same sort, facet, etc. to avoid getting your data straight from the queryResultCache.

My guess is that the last query will return much more quickly than the second query. Which would indicate that the first form isn't doing you any good.

But a test is worth a thousand opinions.

Best
Erick


Re: NRT and commit behavior

Posted by Erick Erickson <er...@gmail.com>.
Hmm, I'm guessing a bit here, but using an invalid query
doesn't sound very safe, though I suppose it *might* be OK.

What does "invalid" mean? A syntax error? Not safe.

A search that returns 0 results? I don't know, but I'd guess
that filling your caches, which is the point of warming
queries, might be short-circuited if the query returns
0 results. I don't know for sure.

But the fact that "invalid queries return quicker" does not
inspire confidence since the *point* of warming queries
is to spend the time up front so your users don't have to
wait.

So here's a test. Comment out your warming queries.
Restart your server, fire the warming query from
the browser with &debugQuery=on, and look at the
QTime value in the response.

Now fire the same form of the query (as in the same
sort, facet, grouping, etc, but presumably a valid
term). See the QTime.

Now fire the same form of the query with a *different*
value in the query. That is, it should search on different
terms but with the same sort, facet, etc. to avoid
getting your data straight from the queryResultCache.

My guess is that the last query will return much more quickly
than the second query. Which would indicate that the first
form isn't doing you any good.
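
The three steps could be scripted roughly like this. It is a sketch in Python under a few assumptions: the Solr URL, the field names, the filter value, and the query terms are all placeholders, and it asks for wt=json so that QTime can be read from the responseHeader of the response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_SELECT = "http://localhost:8983/solr/select"  # assumed location of your Solr

# Placeholder filter and sort; keep them identical across the three queries.
FQ, SORT = "acl:group_a", "date desc"

# (label, query term) for each step; all three terms are placeholders.
STEPS = [
    ("warming query (no hits)", "zzzz_no_such_term"),
    ("same form, valid term", "report"),
    ("same form, different term", "invoice"),
]


def build_url(q, fq, sort):
    """Build a select URL with debugQuery=on and a JSON response."""
    params = {"q": q, "fq": fq, "sort": sort, "debugQuery": "on", "wt": "json"}
    return SOLR_SELECT + "?" + urlencode(params)


def qtime_ms(response_text):
    """Return the QTime (in milliseconds) from a Solr JSON response."""
    return json.loads(response_text)["responseHeader"]["QTime"]


def run(q, fq, sort):
    """Fire one query against Solr and return its QTime."""
    with urlopen(build_url(q, fq, sort)) as resp:
        return qtime_ms(resp.read().decode("utf-8"))


def measure():
    """Run the three steps in order and print each QTime."""
    for label, term in STEPS:
        print(label, run(term, FQ, SORT), "ms")
```

Call measure() right after restarting Solr with the warming queries commented out, so that nothing else has filled the caches first.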

But a test is worth a thousand opinions.

Best
Erick

On Wed, Aug 31, 2011 at 11:04 AM, Tirthankar Chatterjee
<tc...@commvault.com> wrote:
> Also noticed that "waitSearcher" parameter value is not  honored inside commit. It is always defaulted to true which makes it slow during indexing.
>
> What we are trying to do is use an invalid query (which wont return any results) as a warming query. This way the commit returns faster. Are we doing something wrong here?
>
> Thanks,
> Tirthankar

RE: NRT and commit behavior

Posted by Tirthankar Chatterjee <tc...@commvault.com>.
Also noticed that the "waitSearcher" parameter value is not honored inside commit. It always defaults to true, which slows down indexing.

What we are trying to do is use an invalid query (which won't return any results) as a warming query. This way the commit returns faster. Are we doing something wrong here?

Thanks,
Tirthankar
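
For reference, a commit that asks Solr not to wait for the new searcher can be expressed either as request parameters on the update handler or as an XML commit command. The sketch below only builds those two forms. The base URL and the helper names (commit_url, commit_xml) are hypothetical, and whether a given Solr version honors the flag is exactly the question raised above.

```python
from urllib.parse import urlencode

# Hypothetical update handler URL; adjust for your installation.
SOLR_UPDATE = "http://localhost:8983/solr/update"


def commit_url(wait_searcher=False, base=SOLR_UPDATE):
    """Build an update URL that issues a commit with an explicit waitSearcher."""
    flag = "true" if wait_searcher else "false"
    return base + "?" + urlencode([("commit", "true"), ("waitSearcher", flag)])


def commit_xml(wait_searcher=False):
    """Build the equivalent XML commit command to POST to the update handler."""
    flag = "true" if wait_searcher else "false"
    return '<commit waitSearcher="%s"/>' % flag
```

Comparing how long the request takes to return with waitSearcher set to true and to false is one way to check whether the parameter is being honored.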

-----Original Message-----
From: Jonathan Rochkind [mailto:rochkind@jhu.edu] 
Sent: Monday, July 18, 2011 11:38 AM
To: solr-user@lucene.apache.org; yonik@lucidimagination.com
Subject: Re: NRT and commit behavior

In practice, in my experience at least, a very 'expensive' commit can still slow down searches significantly, I think just due to CPU (or i/o?) starvation. Not sure anything can be done about that.  That's my experience in Solr 1.4.1, but since searches have always been async with commits, it probably is the same situation even in more recent versions, I'd guess.

On 7/18/2011 11:07 AM, Yonik Seeley wrote:
> On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase<nc...@earthlink.net>  wrote:
>> Very glad to hear that NRT is finally here!  But my question is this: 
>> will things still come to a standstill during a commit?
> New updates can now proceed in parallel with a commit, and searches 
> have always been completely asynchronous w.r.t. commits.
>
> -Yonik
> http://www.lucidimagination.com
>

Re: NRT and commit behavior

Posted by Jonathan Rochkind <ro...@jhu.edu>.
In practice, in my experience at least, a very 'expensive' commit can 
still slow down searches significantly, I think just due to CPU (or 
i/o?) starvation. Not sure anything can be done about that.  That's my 
experience in Solr 1.4.1, but since searches have always been async with 
commits, it probably is the same situation even in more recent versions, 
I'd guess.

On 7/18/2011 11:07 AM, Yonik Seeley wrote:
> On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase<nc...@earthlink.net>  wrote:
>> Very glad to hear that NRT is finally here!  But my question is this: will
>> things still come to a standstill during a commit?
> New updates can now proceed in parallel with a commit, and
> searches have always been completely asynchronous w.r.t. commits.
>
> -Yonik
> http://www.lucidimagination.com
>

Re: NRT and commit behavior

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Jul 18, 2011 at 10:53 AM, Nicholas Chase <nc...@earthlink.net> wrote:
> Very glad to hear that NRT is finally here!  But my question is this: will
> things still come to a standstill during a commit?

New updates can now proceed in parallel with a commit, and
searches have always been completely asynchronous w.r.t. commits.

-Yonik
http://www.lucidimagination.com