Posted to users@kafka.apache.org by Iftach Ben-Yosef <ib...@outbrain.com> on 2020/07/29 05:52:59 UTC

Mirrormaker 2 logs - WARN Catching up to assignment's config offset

Hello

I'm running a mirrormaker 2 cluster which copies from 3 source clusters
into 1 destination. Yesterday I restarted the cluster, and it took one of the
mirrored topics quite a long time (~2 hours) to recover.

Since the restart, the mm2 cluster has been logging a lot of these warning
messages for all 3 source clusters:

 WARN [Worker clientId=connect-4, groupId=local-ny-mm2] Catching up to
assignment's config offset.
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1020)

Here is a snippet of what the logs look like:

STDOUT: [2020-07-29 05:44:17,143] INFO [Worker clientId=connect-2,
groupId=local-chi-mm2] Rebalance started
(org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:222)
STDOUT: [2020-07-29 05:44:17,143] INFO [Worker clientId=connect-2,
groupId=local-chi-mm2] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:552)
STDOUT: [2020-07-29 05:44:17,144] INFO [Worker clientId=connect-2,
groupId=local-chi-mm2] Successfully joined group with generation 9005
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:503)
STDOUT: [2020-07-29 05:44:17,144] INFO [Worker clientId=connect-2,
groupId=local-chi-mm2] Joined group at generation 9005 with protocol
version 2 and got assignment: Assignment{error=0,
leader='connect-1-cb7aa52c-a29a-4cf9-8f50-b691ba38aa3d',
leaderUrl='NOTUSED/local-chi', offset=1178, connectorIds=[], taskIds=[],
revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1549)
STDOUT: [2020-07-29 05:44:17,144] WARN [Worker clientId=connect-2,
groupId=local-chi-mm2] Catching up to assignment's config offset.
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1020)
STDOUT: [2020-07-29 05:44:17,144] INFO [Worker clientId=connect-2,
groupId=local-chi-mm2] Current config state offset -1 is behind group
assignment 1178, reading to end of config log
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1081)
STDOUT: [2020-07-29 05:44:17,183] INFO [Worker clientId=connect-6,
groupId=local-chi-mm2] Successfully joined group with generation 9317
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:503)
STDOUT: [2020-07-29 05:44:17,184] INFO [Worker clientId=connect-6,
groupId=local-chi-mm2] Joined group at generation 9317 with protocol
version 2 and got assignment: Assignment{error=0,
leader='connect-5-9a82cf55-e113-4112-86bf-d3cabcd44f54',
leaderUrl='NOTUSED/local-chi', offset=1401, connectorIds=[], taskIds=[],
revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1549)
STDOUT: [2020-07-29 05:44:17,184] WARN [Worker clientId=connect-6,
groupId=local-chi-mm2] Catching up to assignment's config offset.
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1020)
STDOUT: [2020-07-29 05:44:17,184] INFO [Worker clientId=connect-6,
groupId=local-chi-mm2] Current config state offset -1 is behind group
assignment 1401, reading to end of config log
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1081)
STDOUT: [2020-07-29 05:44:17,239] INFO [Worker clientId=connect-8,
groupId=local-sa-mm2] Finished reading to end of log and updated config
snapshot, new config log offset: -1
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1085)
STDOUT: [2020-07-29 05:44:17,239] INFO [Worker clientId=connect-8,
groupId=local-sa-mm2] Current config state offset -1 does not match group
assignment 1387. Forcing rebalance.
(org.apache.kafka.connect.runtime.distributed.DistributedHerder:1053)
STDOUT: [2020-07-29 05:44:17,239] INFO [Worker clientId=connect-8,
groupId=local-sa-mm2] Rebalance started
(org.apache.kafka.connect.runtime.distributed.WorkerCoordinator:222)
STDOUT: [2020-07-29 05:44:17,239] INFO [Worker clientId=connect-8,
groupId=local-sa-mm2] (Re-)joining group
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:552)
STDOUT: [2020-07-29 05:44:17,276] INFO [Worker clientId=connect-8,
groupId=local-sa-mm2] Successfully joined group with generation 9077
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator:503)

Is anyone familiar with this and able to help? So far I didn't find anything
useful online specifically for mm2. I tried deleting the config topics for
each source cluster from the destination cluster and then restarting the
mm2 service. This did seem to help somewhat, but I am still getting those
warnings, and more log lines per minute than expected.
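
For reference, this is roughly the kind of thing I mean by deleting the config
topics. It's just a sketch using the Java AdminClient (not exactly what I ran);
the bootstrap address and the mm2-configs.<alias>.internal topic names are
assumptions based on MM2's default naming and our source aliases:

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class DeleteMm2ConfigTopics {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Destination cluster that hosts MM2's internal topics (placeholder address).
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "dest-kafka:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Assumed names following MM2's default
                // "mm2-configs.<source-alias>.internal" convention; adjust to your aliases.
                admin.deleteTopics(Arrays.asList(
                        "mm2-configs.ny.internal",
                        "mm2-configs.chi.internal",
                        "mm2-configs.sa.internal"))
                     .all()
                     .get();
            }
        }
    }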


Thanks,
Iftach


Re: Mirrormaker 2 logs - WARN Catching up to assignment's config offset

Posted by Iftach Ben-Yosef <ib...@outbrain.com>.
Hey Ryanne,

Interesting points. I wasn't aware of the relationship between the number of
partitions and the number of connect workers. I will test this out and update.

Thanks!

On Wed, Jul 29, 2020, 21:01 Ryanne Dolan <ry...@gmail.com> wrote:

> Iftach, you can try deleting Connect's internal config and status topics.
> The status topic records, among other things, the offsets within the config
> topics iirc, so if you delete the configs without deleting the status,
> you'll get messages such as those. Just don't delete the mm2-offsets
> topics, as doing so would result in MM2 starting from the beginning of all
> source partitions and re-replicating everything.
>
> You can also check that there are enough partitions in the config and
> status topics to account for all the Connect workers. It's possible that
> you're in a rebalance loop from too many consumers to those internal
> topics.
>
> Ryanne
>


Re: Mirrormaker 2 logs - WARN Catching up to assignment's config offset

Posted by Ryanne Dolan <ry...@gmail.com>.
Iftach, you can try deleting Connect's internal config and status topics.
The status topic records, among other things, the offsets within the config
topics iirc, so if you delete the configs without deleting the status,
you'll get messages such as those. Just don't delete the mm2-offsets
topics, as doing so would result in MM2 starting from the beginning of all
source partitions and re-replicating everything.
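
Before deleting anything, it might help to list which internal topics actually
exist so you don't touch the mm2-offsets ones by accident. Something along
these lines with the Java AdminClient should work (untested sketch; the
bootstrap address is a placeholder, and I'm assuming MM2's internal topics all
end in ".internal"):

    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class ListMm2InternalTopics {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Destination cluster that hosts MM2's internal topics (placeholder address).
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "dest-kafka:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Print only the internal topics so the config/status topics are easy
                // to tell apart from the mm2-offsets ones you want to keep.
                admin.listTopics().names().get().stream()
                     .filter(name -> name.endsWith(".internal"))
                     .sorted()
                     .forEach(System.out::println);
            }
        }
    }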

You can also check that there are enough partitions in the config and
status topics to account for all the Connect workers. It's possible that
you're in a rebalance loop from too many consumers to those internal topics.
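
To check, you could describe those topics and compare the partition counts
against the number of workers, e.g. (again an untested sketch; the topic names
assume the default mm2-configs/mm2-status.<alias>.internal naming and a source
alias of "ny"):

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;

    public class CheckInternalTopicPartitions {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // Destination cluster that hosts MM2's internal topics (placeholder address).
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "dest-kafka:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // Assumed internal topic names for one source alias; repeat per alias.
                admin.describeTopics(Arrays.asList(
                        "mm2-configs.ny.internal",
                        "mm2-status.ny.internal"))
                     .all()
                     .get()
                     .forEach((name, description) -> System.out.printf(
                             "%s: %d partition(s)%n", name, description.partitions().size()));
            }
        }
    }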

Ryanne
