Posted to solr-user@lucene.apache.org by Thomas Tickle <th...@gm.com> on 2016/11/10 22:09:17 UTC

CDCR: Help With Tlog Growth Issues

I am having an issue with cdcr that I could use some assistance in resolving.

I followed the instructions found here: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462

The CDCR setup has a single source replicating to a single target.  The source and target clusters are set up identically: 3 machines, each running an external ZooKeeper and a Solr instance.  I've enabled data replication and have seen documents replicated from the source to the target with no errors in the log files.

However, when examining the output of the /cdcr?action=QUEUES command, I noticed that tlogTotalSize and tlogTotalCount were alarmingly high.  Checking the data directory for each shard, I confirmed that there were several thousand log files of 3-4 MB each, adding up to almost 35 GB of tlogs.  Obviously, this volume of tlogs causes serious problems when restarting a Solr server after activities such as patching.
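
To track these two numbers over time, a minimal Python sketch like the one below can pull tlogTotalSize and tlogTotalCount out of a QUEUES response. The sample XML is illustrative only (its shape is assumed from the CDCR documentation linked above, and the values are made up), and the function name tlog_totals is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative sample, NOT real output: actual values come from
# /cdcr?action=QUEUES on the source collection.
SAMPLE_QUEUES_RESPONSE = """<response>
  <lst name="queues"/>
  <long name="tlogTotalSize">4194304</long>
  <long name="tlogTotalCount">2</long>
  <str name="updateLogSynchronizer">stopped</str>
</response>"""


def tlog_totals(xml_text):
    """Return (total_size_bytes, total_count) from a QUEUES response body."""
    root = ET.fromstring(xml_text)
    values = {}
    # Solr's XML writer names each value through the "name" attribute.
    for elem in root.iter("long"):
        name = elem.get("name")
        if name in ("tlogTotalSize", "tlogTotalCount"):
            values[name] = int(elem.text)
    return values["tlogTotalSize"], values["tlogTotalCount"]


size_bytes, count = tlog_totals(SAMPLE_QUEUES_RESPONSE)
print(f"{count} tlog files, {size_bytes / 1024 ** 2:.1f} MiB in total")
```

Fetching the response body (for example with urllib.request) is left out so the sketch stays independent of any particular host or collection name.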

Is it normal for old tlogs to never get removed in a CDCR setup?


Thomas Tickle



Nothing in this message is intended to constitute an electronic signature unless a specific statement to the contrary is included in this message.

Confidentiality Note: This message is intended only for the person or entity to which it is addressed. It may contain confidential and/or privileged material. Any review, transmission, dissemination or other use, or taking of any action in reliance upon this message by persons or entities other than the intended recipient is prohibited and may be unlawful. If you received this message in error, please contact the sender and delete it from your computer.

Re: CDCR: Help With Tlog Growth Issues

Posted by Renaud Delbru <re...@siren.solutions>.
Hi Shalin,

When the buffer is enabled, tlogs are no longer removed, even if they 
have been replicated [1]:
"When buffering updates, the updates log will store all the updates 
indefinitely."

Once you disable the buffer, all the old tlogs should be cleaned up (the 
next time the tlog cleaning process is triggered).
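
One way to confirm that the cleanup actually happened is to compare the number and total size of the tlog files before and after disabling the buffer. The following is a minimal Python sketch, assuming the transaction log files sit in one directory per core and have names starting with "tlog." (check this against your own data directory). The helper name summarize_tlogs is hypothetical, and the demo runs against a throwaway directory rather than a real Solr core.

```python
import tempfile
from pathlib import Path


def summarize_tlogs(tlog_dir):
    """Return (file_count, total_bytes) for files named tlog.* in tlog_dir."""
    files = [p for p in Path(tlog_dir).glob("tlog.*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Demo against a temporary directory holding two fake 1 KiB tlog files.
# Point summarize_tlogs at a real core's tlog directory instead.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(2):
        (Path(demo_dir) / f"tlog.{i:019d}").write_bytes(b"x" * 1024)
    file_count, total_bytes = summarize_tlogs(demo_dir)
    print(f"{file_count} files, {total_bytes} bytes")
```

Running it once before DISABLEBUFFER and again after the next cleaning cycle should show whether the counts are dropping.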

The buffer is useful when you want to ensure that the source cluster 
does not clean up updates until the target clusters are fully 
initialized. For example, let's say we perform a whole-index replication 
(SOLR-6465): while it is in progress, the source cluster should buffer 
updates until the replication completes; otherwise we might miss some 
updates.

[1] 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62687462#CrossDataCenterReplication(CDCR)-TheBufferElement

Kind Regards
-- 
Renaud Delbru

On 01/12/2016 17:58, Shalin Shekhar Mangar wrote:
> Even if the buffer is enabled, the old tlogs should be removed once the
> updates in those tlogs have been replicated to the target. So the real
> question is: why haven't they been removed automatically?
>
> On Thu, Dec 1, 2016 at 9:13 PM, Renaud Delbru <re...@siren.solutions> wrote:
>> Hi Thomas,
>>
>> It looks like the buffer is enabled on the update log, so tlogs are not
>> removed even after their updates have been replicated.
>>
>> What is the output of the command `/cdcr?action=STATUS` on both clusters?
>>
>> If you see `<str name="buffer">enabled</str>` in the response, then the
>> buffer is enabled.
>> To disable it, run the command `/cdcr?action=DISABLEBUFFER`.
>>
>> Kind Regards
>> --
>> Renaud Delbru


Re: CDCR: Help With Tlog Growth Issues

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Even if the buffer is enabled, the old tlogs should be removed once the
updates in those tlogs have been replicated to the target. So the real
question is: why haven't they been removed automatically?

On Thu, Dec 1, 2016 at 9:13 PM, Renaud Delbru <re...@siren.solutions> wrote:
> Hi Thomas,
>
> It looks like the buffer is enabled on the update log, so tlogs are not
> removed even after their updates have been replicated.
>
> What is the output of the command `/cdcr?action=STATUS` on both clusters?
>
> If you see `<str name="buffer">enabled</str>` in the response, then the
> buffer is enabled.
> To disable it, run the command `/cdcr?action=DISABLEBUFFER`.
>
> Kind Regards
> --
> Renaud Delbru



-- 
Regards,
Shalin Shekhar Mangar.

Re: CDCR: Help With Tlog Growth Issues

Posted by Renaud Delbru <re...@siren.solutions>.
Hi Thomas,

It looks like the buffer is enabled on the update log, so tlogs are not 
removed even after their updates have been replicated.

What is the output of the command `/cdcr?action=STATUS` on both clusters?

If you see `<str name="buffer">enabled</str>` in the response, then the 
buffer is enabled.
To disable it, run the command `/cdcr?action=DISABLEBUFFER`.
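
The check above can be scripted. The following is a minimal Python sketch, not a definitive procedure: it reads the buffer state from a STATUS response and, if the buffer is enabled, prints the DISABLEBUFFER URL to call. The sample XML is illustrative, and the base URL, the collection name "my_collection", and the helper names buffer_state and cdcr_url are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Illustrative sample, NOT real output: actual content comes from
# /cdcr?action=STATUS on each cluster.
SAMPLE_STATUS_RESPONSE = """<response>
  <lst name="status">
    <str name="process">started</str>
    <str name="buffer">enabled</str>
  </lst>
</response>"""


def buffer_state(xml_text):
    """Return the text of <str name="buffer">, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for elem in root.iter("str"):
        if elem.get("name") == "buffer":
            return elem.text
    return None


def cdcr_url(base_url, collection, action):
    """Build a CDCR request URL for one collection (hypothetical helper)."""
    query = urlencode({"action": action})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


if buffer_state(SAMPLE_STATUS_RESPONSE) == "enabled":
    # Assumed host and collection name; substitute your own.
    print(cdcr_url("http://localhost:8983/solr", "my_collection", "DISABLEBUFFER"))
```

The sketch only prints the URL instead of calling it, so that nothing changes on a cluster until the STATUS output has been checked by hand.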

Kind Regards
-- 
Renaud Delbru
