Posted to solr-user@lucene.apache.org by Rallavagu <ra...@gmail.com> on 2015/10/30 16:28:09 UTC

growth of tlog

Solr 4.10.4 (SolrCloud), 3-node ZK quorum, JDK 8

autocommit: 15 sec, softcommit: 2 min

Under heavy indexing load with the above settings, I have seen the tlog
grow (into GB). After the updates stop coming in, it settles down, but
the cluster takes a while to recover before it becomes "green".

With a 15-second autocommit setting, what could cause the tlog to
grow? What should I look for?

Re: growth of tlog

Posted by Shawn Heisey <ap...@elyograg.org>.

On 10/30/2015 9:46 AM, Rallavagu wrote:
> Also, this affects available physical memory as tlog continues to grow
> and it is memory mapped.

I think this is a common misconception.

MMAP does *not* use up physical memory, at least not in the detrimental
way your sentence suggests.  Any memory (OS disk cache) used when
reading files this way can be immediately claimed by any program that
needs it.

https://en.wikipedia.org/wiki/Page_cache

MMAP allows programs to allocate LESS memory, not more.  It is far more
efficient when there is spare memory that is not explicitly needed by
applications, because the OS can use that memory as disk cache.
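
As a rough illustration of the point above (a sketch only, not Solr's
own code), the following maps a file read-only and copies out just a
small slice. The helper name read_slice_via_mmap and the temporary file
are made up for the example. The program never allocates a buffer the
size of the file; the pages touched are served from the OS page cache.

```python
import mmap
import os
import tempfile


def read_slice_via_mmap(path, offset, length):
    """Return `length` bytes at `offset` without reading the whole file
    into a Python-side buffer. (Hypothetical helper for illustration.)"""
    with open(path, "rb") as f:
        # Length 0 maps the entire file; ACCESS_READ makes it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies only the requested bytes into process memory.
            return mapped[offset:offset + length]


def demo():
    # Build a throwaway ~1 MB file with a marker at the end.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1_000_000 + b"tlog-entry")
        path = tmp.name
    try:
        return read_slice_via_mmap(path, 1_000_000, 10)
    finally:
        os.remove(path)
```

Calling demo() returns only the ten marker bytes, even though the
mapping covers the whole file.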

Thanks,
Shawn

Re: growth of tlog

Posted by Rallavagu <ra...@gmail.com>.


On 10/30/15 8:39 AM, Erick Erickson wrote:
> I infer that this statement: "takes a while to recover before cloud
> becomes green" indicates that the node is in recovery or something
> while indexing. If you're still indexing, the new documents will be
> written to the follower's tlog while the follower is recovering,
> leading to it growing. I expect that after followers all recover, the
> tlog shrinks after a few commits have gone by.

Correct. The recovery time is extended though. Also, this affects 
available physical memory as tlog continues to grow and it is memory mapped.

>
> If that's all true, the question is why the follower goes into
> recovery in the first place. Prior to 5.2, there was a situation in
> which very heavy indexing could cause a follower to go into Leader
> Initiated Recovery (LIR) (look for this in both the leader and
> follower logs). Here's the blog Tim Potter wrote on this subject:
> https://lucidworks.com/blog/2015/06/10/indexing-performance-solr-5-2-now-twice-fast/
>
> The smoking gun here is
> 1> heavy indexing is required
> 2> the _leader_ stays up
> 3> the _follower_ goes into recovery for no readily apparent reason
> 4> the nail in the coffin for this particular issue is seeing that the follower
>       went into LIR.
> 5> You'll also see a very large number of threads on the leader waiting
>        on sending the updates to the follower.
>
>
> If this is a problem, prior to 5.2 there are really only two solutions
> 1> throttle indexing
> 2> take all of the followers offline during indexing. When indexing is
>       completed, bring the followers back up and let them replicate the
>       full index down from the leader.

Other than shutting followers down, is there an elegant/graceful way of
taking follower nodes offline? Also, to give you more context, I am
testing the "Index heavy, Query heavy" situation described in the
following document:

https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks


Re: growth of tlog

Posted by Erick Erickson <er...@gmail.com>.

I infer that this statement: "takes a while to recover before cloud
becomes green" indicates that the node is in recovery or something
while indexing. If you're still indexing, the new documents will be
written to the follower's tlog while the follower is recovering,
leading to it growing. I expect that after followers all recover, the
tlog shrinks after a few commits have gone by.
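
One way to check whether the tlog really does shrink after recovery is
to sample the size of the tlog directory over time. A minimal sketch,
assuming you know where a core's tlog directory lives on disk (the
directory passed in and the file names in the demo are illustrative,
not taken from this thread):

```python
import os
import tempfile


def tlog_size_bytes(tlog_dir):
    """Sum the sizes of the regular files directly inside tlog_dir."""
    total = 0
    with os.scandir(tlog_dir) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total


def demo():
    # Stand-in directory with two fake log files (names are illustrative).
    with tempfile.TemporaryDirectory() as d:
        for name, size in (("tlog.0000000000000000001", 1024),
                           ("tlog.0000000000000000002", 2048)):
            with open(os.path.join(d, name), "wb") as f:
                f.write(b"\0" * size)
        return tlog_size_bytes(d)
```

Logging this number every few seconds on each follower during an
indexing run shows whether growth lines up with the node entering
recovery.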

If that's all true, the question is why the follower goes into
recovery in the first place. Prior to 5.2, there was a situation in
which very heavy indexing could cause a follower to go into Leader
Initiated Recovery (LIR) (look for this in both the leader and
follower logs). Here's the blog Tim Potter wrote on this subject:
https://lucidworks.com/blog/2015/06/10/indexing-performance-solr-5-2-now-twice-fast/

The smoking gun here is
1> heavy indexing is required
2> the _leader_ stays up
3> the _follower_ goes into recovery for no readily apparent reason
4> the nail in the coffin for this particular issue is seeing that the follower
     went into LIR.
5> You'll also see a very large number of threads on the leader waiting
      on sending the updates to the follower.

If this is a problem, prior to 5.2 there are really only two solutions
1> throttle indexing
2> take all of the followers offline during indexing. When indexing is
     completed, bring the followers back up and let them replicate the
     full index down from the leader.
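
For option 1, throttling is typically done in the indexing client. The
following is a minimal sketch of one possible approach, pacing batches
so the average rate stays under a cap. BatchThrottle, max_docs_per_sec
and the send step are hypothetical names for illustration; they are not
part of Solr or SolrJ.

```python
import time


class BatchThrottle:
    """Pace batch sends so the average rate stays at or below
    max_docs_per_sec. (Hypothetical client-side helper.)"""

    def __init__(self, max_docs_per_sec, clock=time.monotonic,
                 sleep=time.sleep):
        if max_docs_per_sec <= 0:
            raise ValueError("max_docs_per_sec must be positive")
        self._rate = float(max_docs_per_sec)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next batch may go out

    def wait_for(self, num_docs):
        """Block until a batch of num_docs may be sent; return the
        number of seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the time this batch "costs" at the configured rate.
        self._next_allowed = now + delay + num_docs / self._rate
        return delay
```

In an indexing loop you would call throttle.wait_for(len(batch)) right
before sending each batch to Solr. The right cap depends on the cluster
and has to be found by experiment.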

Best,
Erick
