Posted to solr-user@lucene.apache.org by Webster Homer <we...@sial.com> on 2017/08/18 18:06:58 UTC

Re: Tlogs not being deleted/truncated

I have an update on this. While I was on vacation, there were a number of
alerts.
Our autoCommit settings were (and are) the following:
     <autoCommit>
       <maxTime>${solr.autoCommit.maxTime:600000}</maxTime>
       <openSearcher>false</openSearcher>
     </autoCommit>

The startup script was NOT setting solr.autoCommit.maxTime. It seemed that
autoCommits were sporadic at best. Our autoSoftCommit was working.
Our administrators changed the Solr startup script to set
solr.autoCommit.maxTime, which they set as follows in the script:
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoCommit.maxTime=60000"
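
For reference, here is a minimal Python sketch of how I understand the
${name:default} substitution in the <maxTime> element: the system property
wins if it is set, otherwise the default after the colon is used. The
resolve() helper is hypothetical and written only for illustration; it is
not Solr code or a Solr API.

```python
import re

# Matches ${name} or ${name:default}. This is a simplified model of the
# placeholder syntax used in the <maxTime> element, not Solr's own parser.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, system_props):
    """Replace each ${name:default} with system_props[name], else the default."""
    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in system_props:
            return system_props[name]
        if default is None:
            raise KeyError(f"no value or default for {name}")
        return default
    return _PLACEHOLDER.sub(_sub, text)


max_time = "${solr.autoCommit.maxTime:600000}"

# Startup script does not pass -Dsolr.autoCommit.maxTime: the default applies.
without_flag = resolve(max_time, {})
# Startup script passes -Dsolr.autoCommit.maxTime=60000.
with_flag = resolve(max_time, {"solr.autoCommit.maxTime": "60000"})

print(without_flag, with_flag)
```

On that reading, the only difference the startup change makes is a hard
commit interval of 60 seconds instead of the 10 minute default.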

They claim that this has fixed our tlog problems across the board. Commits
appear to be reliable now. As a developer I don't have visibility into our
production systems. I find it odd that explicitly setting the value in the
Solr startup fixed the issue. We had wanted to have this value determined
per collection, but it does seem to address the problem.

This seems like a bug in solr to have it behave like this!

We are running Solr 6.2.0 with our production systems in Google Cloud. We
use CDCR to replicate from our on-prem systems to Google Cloud.





On Wed, Jul 12, 2017 at 9:19 AM, Webster Homer <we...@sial.com>
wrote:

> We have buffers disabled as described in the CDCR documentation. We also
> have autoCommit set for hard commits, but openSearcher false. We also have
> autoSoftCommit set.
>
>
> On Tue, Jul 11, 2017 at 5:00 PM, Xie, Sean <Se...@finra.org> wrote:
>
>> Please see my previous thread. I had to disable the buffer on the source
>> cluster and run a scheduled hard commit with a scheduled logscheduler to
>> make it work.
>>
>>
>> -- Thank you
>> Sean
>>
>> From: jmyatt <jm...@wayfair.com>
>> Date: Tuesday, Jul 11, 2017, 1:56 PM
>> To: solr-user@lucene.apache.org
>> Subject: [EXTERNAL] Re: Tlogs not being deleted/truncated
>>
>> Another interesting clue in my case (different from what WebsterHomer is
>> seeing): the response from /cdcr?action=QUEUES reflects what I would expect
>> to see in the tlog directory, but it's not accurate.  By that I mean
>> tlogTotalSize shows 1500271 (bytes) and tlogTotalCount shows 2.  This
>> changes as more updates come in and autoCommit runs - sometimes
>> tlogTotalCount is 1 instead of 2, and the tlogTotalSize changes but stays
>> in that low range.
>>
>> But on the filesystem, all the tlogs are still there.  Perhaps the ignored
>> exception noted above is in fact a problem?
>>
>>
>>
>
>


Re: Tlogs not being deleted/truncated

Posted by Erick Erickson <er...@gmail.com>.
Anyway, when commit N happens, only then are the earlier tlogs examined to
see whether they should be deleted. So you have an interval of _at least_ 20
minutes where you'll have old tlogs hanging around. And that's not even
considering CDCR.
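
One way to read that arithmetic, as a rough Python sketch: a tlog is closed
by one hard commit and only examined for deletion at the next one, so its
oldest updates hang around for about two commit intervals. This is only a
simplified model of the reasoning above. The function name is made up for
illustration, and it does not model CDCR or any of Solr's other tlog
retention rules.

```python
def min_tlog_linger_ms(hard_commit_interval_ms):
    """Rough lower bound on how long the oldest update in a tlog lingers.

    Assumption: the tlog is closed at the first hard commit after it was
    opened and is only examined for deletion at the following hard commit,
    so roughly two commit intervals pass.
    """
    return 2 * hard_commit_interval_ms


for interval_ms in (600_000, 60_000, 15_000):
    minutes = min_tlog_linger_ms(interval_ms) / 60_000
    print(f"maxTime={interval_ms} ms: old tlogs linger at least {minutes} minutes")
```

With the 10 minute default that gives the 20 minutes mentioned above; with
a 60 second interval the same reasoning gives about 2 minutes.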

I have noticed that tlogs can get cleared only after "a while" with CDCR,
and this assumes that buffering is turned off on the source. With buffering
on they'll accumulate forever.

What I'm saying is that I've seen CDCR take quite a while to clean out old
tlogs, and with a 10 minute interval it may take quite a while.

That said, there's no good reason to have long hard commit intervals.
Among other things, if Solr is killed for some reason, on restart it can
replay all the updates that were unclosed at the time of the kill, so your
Solr node can take 10 minutes to come back up in that case.
See: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
And if setting them to 60 seconds makes everyone happy I'd stick with that.
In fact I often just set them to 15 seconds. All that's happening with
openSearcher set to false is that you're opening about 15 new files and
closing that many other ones. Any efficiency savings are not worth the
risk.

Best,
Erick


