Posted to jira@kafka.apache.org by "Julien C (Jira)" <ji...@apache.org> on 2020/11/12 09:46:00 UTC

[jira] [Updated] (KAFKA-10710) MirrorMaker 2 creates all combinations of herders

     [ https://issues.apache.org/jira/browse/KAFKA-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien C updated KAFKA-10710:
-----------------------------
    Description: 
We are using MM2 distributed to synchronize topics from a "Central" broker down to multiple "Local" brokers. 
{quote}
# enable and configure individual replication flows
replica_CENTRAL->replica_OLS.enabled = true
replica_CENTRAL->replica_OLS.topics = _schemas
replica_CENTRAL->replica_OLS.replication.factor = 3

replica_CENTRAL->replica_HBG.enabled = true
replica_CENTRAL->replica_HBG.topics = _schemas
replica_CENTRAL->replica_HBG.replication.factor = 3

# many more

replica_CENTRAL->replica_VIT.enabled = true
replica_CENTRAL->replica_VIT.topics = _schemas
replica_CENTRAL->replica_VIT.replication.factor = 3

replica_CENTRAL->replica_UGO.enabled = true
replica_CENTRAL->replica_UGO.topics = _schemas
replica_CENTRAL->replica_UGO.replication.factor = 3
{quote}
 

When looking into the MirrorMaker logs, we discovered that a herder is created for every combination of clusters, even when we have not described a link between the two clusters.

 

Examples:
{quote}{{[2020-11-12 08:43:30,351] INFO creating herder for replica_VIT->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)}}
{{[2020-11-12 08:43:33,697] INFO creating herder for replica_CNO->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)}}
{{[2020-11-12 08:43:38,508] INFO creating herder for replica_UMO->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)}}
{quote}
... and many, many more.

So many, in fact, that we recently reached the limit set by our user's LimitNOFILE property when trying to add a new "Local" cluster.
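
To give a rough sense of the scale, here is a minimal sketch that compares the number of flows we configure with the number of herders we would expect if one is created for every ordered pair of distinct clusters, which is an assumption based on the log lines above. The class and method names are hypothetical, and only the five cluster aliases shown in the configuration excerpt are used; the real deployment has more.

```java
import java.util.ArrayList;
import java.util.List;

public class HerderPairCount {

    // Every ordered pair (source -> target) of distinct clusters: n * (n - 1) pairs.
    static List<String> allOrderedPairs(List<String> clusters) {
        List<String> pairs = new ArrayList<>();
        for (String source : clusters) {
            for (String target : clusters) {
                if (!source.equals(target)) {
                    pairs.add(source + "->" + target);
                }
            }
        }
        return pairs;
    }

    // Hub-and-spoke topology: one flow from the hub to each other cluster, n - 1 flows.
    static List<String> hubAndSpokeFlows(String hub, List<String> clusters) {
        List<String> flows = new ArrayList<>();
        for (String target : clusters) {
            if (!hub.equals(target)) {
                flows.add(hub + "->" + target);
            }
        }
        return flows;
    }

    public static void main(String[] args) {
        // Only the aliases visible in the configuration excerpt.
        List<String> clusters = List.of(
                "replica_CENTRAL", "replica_OLS", "replica_HBG", "replica_VIT", "replica_UGO");

        System.out.println("all ordered pairs: " + allOrderedPairs(clusters).size());
        System.out.println("configured flows:  " + hubAndSpokeFlows("replica_CENTRAL", clusters).size());
    }
}
```

With five clusters this gives 20 ordered pairs against 4 configured flows, and the gap grows quadratically with every "Local" cluster added.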

I believe this behavior leads to unnecessary connections and resource usage that could easily be avoided by limiting herder creation to the flows explicitly described in the mirrormaker.properties file.

[https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorMaker.java#L130-L136]
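
As an illustration of the suggestion above, here is a minimal, self-contained sketch that reads a properties file in the format shown earlier and keeps only the flows explicitly marked {{enabled = true}}. This is not the MirrorMaker implementation: the class {{EnabledFlows}} and its method are hypothetical, and the sketch only assumes that flow keys have the form {{source->target.enabled}}.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

public class EnabledFlows {

    private static final String SUFFIX = ".enabled";

    // Returns the "source->target" part of every key of the form
    // "source->target.enabled" whose value is "true".
    static Set<String> enabledFlows(Properties props) {
        Set<String> flows = new TreeSet<>();
        for (String key : props.stringPropertyNames()) {
            boolean isFlowSwitch = key.endsWith(SUFFIX) && key.contains("->");
            if (isFlowSwitch && Boolean.parseBoolean(props.getProperty(key).trim())) {
                flows.add(key.substring(0, key.length() - SUFFIX.length()));
            }
        }
        return flows;
    }

    public static void main(String[] args) throws IOException {
        // Two of the flows from the configuration excerpt.
        String config = String.join("\n",
                "# enable and configure individual replication flows",
                "replica_CENTRAL->replica_OLS.enabled = true",
                "replica_CENTRAL->replica_OLS.topics = _schemas",
                "replica_CENTRAL->replica_UGO.enabled = true",
                "replica_CENTRAL->replica_UGO.topics = _schemas");

        Properties props = new Properties();
        props.load(new StringReader(config));

        // Only explicitly enabled flows would get a herder in this sketch.
        for (String flow : enabledFlows(props)) {
            System.out.println("would create herder for " + flow);
        }
    }
}
```

In this sketch a pair such as replica_VIT->replica_UGO never appears, because no key enables it.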

 

  was:
We are using MM2 distributed to synchronize topics from a "Central" broker down to multiple "Local" brokers. 
{quote} 

# enable and configure individual replication flows
replica_CENTRAL->replica_OLS.enabled = true
replica_CENTRAL->replica_OLS.topics = _schemas
replica_CENTRAL->replica_OLS.replication.factor = 3

replica_CENTRAL->replica_HBG.enabled = true
replica_CENTRAL->replica_HBG.topics = _schemas
replica_CENTRAL->replica_HBG.replication.factor = 3

# ...

# many more

# ...

replica_CENTRAL->replica_VIT.enabled = true
replica_CENTRAL->replica_VIT.topics = _schemas
replica_CENTRAL->replica_VIT.replication.factor = 3

replica_CENTRAL->replica_UGO.enabled = true
replica_CENTRAL->replica_UGO.topics = _schemas
replica_CENTRAL->replica_UGO.replication.factor = 3

 
{quote}
# enable and configure individual replication flows

When looking into the Mirror Maker logs, we discover that a herder is created for each combination even if we specifically don't describe a link

Exemples:

[2020-11-12 08:43:30,351] INFO creating herder for replica_VIT->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)
[2020-11-12 08:43:33,697] INFO creating herder for replica_CNO->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)
[2020-11-12 08:43:38,508] INFO creating herder for replica_UMO->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)


> MirrorMaker 2 creates all combinations of herders
> -------------------------------------------------
>
>                 Key: KAFKA-10710
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10710
>             Project: Kafka
>          Issue Type: Bug
>          Components: mirrormaker
>    Affects Versions: 2.5.1
>            Reporter: Julien C
>            Priority: Major
>
> We are using MM2 distributed to synchronize topics from a "Central" broker down to multiple "Local" brokers. 
> {quote}
> # enable and configure individual replication flows
> replica_CENTRAL->replica_OLS.enabled = true
> replica_CENTRAL->replica_OLS.topics = _schemas
> replica_CENTRAL->replica_OLS.replication.factor = 3
> replica_CENTRAL->replica_HBG.enabled = true
> replica_CENTRAL->replica_HBG.topics = _schemas
> replica_CENTRAL->replica_HBG.replication.factor = 3
> # many more
> replica_CENTRAL->replica_VIT.enabled = true
> replica_CENTRAL->replica_VIT.topics = _schemas
> replica_CENTRAL->replica_VIT.replication.factor = 3
> replica_CENTRAL->replica_UGO.enabled = true
> replica_CENTRAL->replica_UGO.topics = _schemas
> replica_CENTRAL->replica_UGO.replication.factor = 3
> {quote}
>  
> When looking into the MirrorMaker logs, we discovered that a herder is created for every combination of clusters, even when we have not described a link between the two clusters.
>  
> Examples:
> {quote}{{[2020-11-12 08:43:30,351] INFO creating herder for replica_VIT->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)}}
> {{[2020-11-12 08:43:33,697] INFO creating herder for replica_CNO->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)}}
> {{[2020-11-12 08:43:38,508] INFO creating herder for replica_UMO->replica_UGO (org.apache.kafka.connect.mirror.MirrorMaker)}}
> {quote}
> ... and many, many more.
> So many, in fact, that we recently reached the limit set by our user's LimitNOFILE property when trying to add a new "Local" cluster.
> I believe this behavior leads to unnecessary connections and resource usage that could easily be avoided by limiting herder creation to the flows explicitly described in the mirrormaker.properties file.
> [https://github.com/apache/kafka/blob/trunk/connect/mirror/src/main/java/org/apache/kafka/connect/mirror/MirrorMaker.java#L130-L136]
>  


