Posted to jira@kafka.apache.org by "tinaselenge (via GitHub)" <gi...@apache.org> on 2023/04/27 16:25:11 UTC

[GitHub] [kafka] tinaselenge opened a new pull request, #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

tinaselenge opened a new pull request, #13650:
URL: https://github.com/apache/kafka/pull/13650

   Most of the content in README.md was already covered in the docs, so this PR only adds the section on exactly-once support.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [kafka] showuon commented on pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "showuon (via GitHub)" <gi...@apache.org>.

showuon commented on PR #13650:
URL: https://github.com/apache/kafka/pull/13650#issuecomment-1591083560

   Re-triggering the CI: https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13650/3/



[GitHub] [kafka] showuon commented on a diff in pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "showuon (via GitHub)" <gi...@apache.org>.

showuon commented on code in PR #13650:
URL: https://github.com/apache/kafka/pull/13650#discussion_r1229514718


##########
connect/mirror/README.md:
##########
@@ -1,297 +0,0 @@
-
-# MirrorMaker 2.0
-
-MM2 leverages the Connect framework to replicate topics between Kafka
-clusters. MM2 includes several new features, including:
-
- - both topics and consumer groups are replicated
- - topic configuration and ACLs are replicated
- - cross-cluster offsets are synchronized
- - partitioning is preserved
-
-## Replication flows
-
-MM2 replicates topics and consumer groups from upstream source clusters
-to downstream target clusters. These directional flows are notated
-`A->B`.
-
-It's possible to create complex replication topologies based on these
-`source->target` flows, including:
-
- - *fan-out*, e.g. `K->A, K->B, K->C`
- - *aggregation*, e.g. `A->K, B->K, C->K`
- - *active/active*, e.g. `A->B, B->A`
-
-Each replication flow can be configured independently, e.g. to replicate
-specific topics or groups:
-
-    A->B.topics = topic-1, topic-2
-    A->B.groups = group-1, group-2
-
-By default, all topics and consumer groups are replicated (except
-excluded ones), across all enabled replication flows. Each
-replication flow must be explicitly enabled to begin replication:
-
-    A->B.enabled = true
-    B->A.enabled = true
-
-## Starting an MM2 process
-
-You can run any number of MM2 processes as needed. Any MM2 processes
-which are configured to replicate the same Kafka clusters will find each
-other, share configuration, load balance, etc.
-
-To start an MM2 process, first specify Kafka cluster information in a
-configuration file as follows:
-
-    # mm2.properties
-    clusters = us-west, us-east
-    us-west.bootstrap.servers = host1:9092
-    us-east.bootstrap.servers = host2:9092
-
-You can list any number of clusters this way.
-
-Optionally, you can override default MirrorMaker properties:
-
-    topics = .*
-    groups = group1, group2
-    emit.checkpoints.interval.seconds = 10
-
-These will apply to all replication flows. You can also override default
-properties for specific clusters or replication flows:
-
-    # configure a specific cluster
-    us-west.offset.storage.topic = mm2-offsets
-
-    # configure a specific source->target replication flow
-    us-west->us-east.emit.heartbeats = false
-
-Next, enable individual replication flows as follows:
-
-    us-west->us-east.enabled = true     # disabled by default
-
-Finally, launch one or more MirrorMaker processes with the `connect-mirror-maker.sh`
-script:
-
-    $ ./bin/connect-mirror-maker.sh mm2.properties
-
-## Multicluster environments
-
-MM2 supports replication between multiple Kafka clusters, whether in the
-same data center or across multiple data centers. A single MM2 cluster
-can span multiple data centers, but it is recommended to keep MM2's producers
-as close as possible to their target clusters. To do so, specify a subset
-of clusters for each MM2 node as follows:
-
-    # in west DC:
-    $ ./bin/connect-mirror-maker.sh mm2.properties --clusters west-1 west-2
-
-This signals to the node that the given clusters are nearby, and prevents the
-node from sending records or configuration to clusters in other data centers.
-
-### Example
-
-Say there are three data centers (west, east, north) with two Kafka
-clusters in each data center (west-1, west-2 etc). We can configure MM2
-for active/active replication within each data center, as well as cross data
-center replication (XDCR) as follows:
-
-    # mm2.properties
-    clusters: west-1, west-2, east-1, east-2, north-1, north-2
-
-    west-1.bootstrap.servers = ...
-    ---%<---
-
-    # active/active in west
-    west-1->west-2.enabled = true
-    west-2->west-1.enabled = true
-
-    # active/active in east
-    east-1->east-2.enabled = true
-    east-2->east-1.enabled = true
-
-    # active/active in north
-    north-1->north-2.enabled = true
-    north-2->north-1.enabled = true
-
-    # XDCR via west-1, east-1, north-1
-    west-1->east-1.enabled = true
-    west-1->north-1.enabled = true
-    east-1->west-1.enabled = true
-    east-1->north-1.enabled = true
-    north-1->west-1.enabled = true
-    north-1->east-1.enabled = true
-
-Then, launch MM2 in each data center as follows:
-
-    # in west:
-    $ ./bin/connect-mirror-maker.sh mm2.properties --clusters west-1 west-2
-
-    # in east:
-    $ ./bin/connect-mirror-maker.sh mm2.properties --clusters east-1 east-2
-
-    # in north:
-    $ ./bin/connect-mirror-maker.sh mm2.properties --clusters north-1 north-2
-    
-With this configuration, records produced to any cluster will be replicated
-within the data center, as well as across to other data centers. By providing
-the `--clusters` parameter, we ensure that each node only produces records to
-nearby clusters.
-
-N.B. that the `--clusters` parameter is not technically required here. MM2 will work fine without it; however, throughput may suffer from "producer lag" between
-data centers, and you may incur unnecessary data transfer costs.
-
-## Configuration
-The following sections target for dedicated MM2 cluster. If running MM2 in a Connect cluster, please refer to [KIP-382: MirrorMaker 2.0](https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0) for guidance.
- 
-### General Kafka Connect Config
-All Kafka Connect, Source Connector, Sink Connector configs, as defined in [Kafka official doc](https://kafka.apache.org/documentation/#connectconfigs), can be 
-directly used in MM2 configuration without prefix in the configuration name. As the starting point, most of these default configs may work well with the exception of `tasks.max`.
-
-In order to evenly distribute the workload across more than one MM2 instance, it is advised to set `tasks.max` at least to 2 or even larger depending on the hardware resources
-and the total number partitions to be replicated.
-
-### Kafka Connect Config for a Specific Connector
-If needed, Kafka Connect worker-level configs could be even specified "per connector", which needs to follow the format of `cluster_alias.config_name` in MM2 configuration. For example,
- 
-    backup.ssl.truststore.location = /usr/lib/jvm/zulu-8-amd64/jre/lib/security/cacerts // SSL cert location
-    backup.security.protocol = SSL // if target cluster needs SSL to send message
-    
-### MM2 Config for a Specific Connector
-MM2 itself has many configs to control how it behaves. To override those default values, add the config name by the format of `source_cluster_alias->target_cluster_alias.config_name` in MM2 configuration. For example,
-    
-    backup->primary.enabled = false // set to false if one-way replication is desired
-    primary->backup.topics.blacklist = topics_to_blacklist
-    primary->backup.emit.heartbeats.enabled = false
-    primary->backup.sync.group.offsets = true 
-
-### Producer / Consumer / Admin Config used by MM2
-In many cases, customized values for producer or consumer configurations are needed. In order to override the default values of producer or consumer used by MM2, 
-`target_cluster_alias.producer.producer_config_name`, `source_cluster_alias.consumer.consumer_config_name` or `cluster_alias.admin.admin_config_name` are the formats to use in MM2 configuration. For example,
-
-     backup.producer.compression.type = gzip
-     backup.producer.buffer.memory = 32768
-     primary.consumer.isolation.level = read_committed
-     primary.admin.bootstrap.servers = localhost:9092
-     
-### Shared configuration

Review Comment:
   Ah, you're right. It just doesn't have the same title. Thanks.
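
The quoted README describes three key formats: unprefixed keys that apply to all flows, `cluster_alias.config_name` keys, and `source->target.config_name` keys. As a rough illustration only, here is a minimal Python sketch of how those formats could be told apart. The helpers `parse_properties`, `match_alias`, and `classify` are hypothetical names invented for this example; they are not part of Kafka or MirrorMaker, and the parsing is far simpler than real Java properties handling.

```python
def parse_properties(text):
    """Parse simple `key = value` lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def match_alias(name, clusters):
    """Return the longest known cluster alias that prefixes `name`, or None."""
    for alias in sorted(clusters, key=len, reverse=True):
        if name.startswith(alias + "."):
            return alias
    return None


def classify(props):
    """Group properties into global, per-cluster, and per-flow scopes."""
    clusters = [c.strip() for c in props.get("clusters", "").split(",") if c.strip()]
    scopes = {"global": {}, "cluster": {}, "flow": {}}
    for key, value in props.items():
        source, arrow, rest = key.partition("->")
        if arrow:
            # source->target.config_name
            target = match_alias(rest, clusters)
            if source in clusters and target:
                name = rest[len(target) + 1:]
                scopes["flow"].setdefault((source, target), {})[name] = value
                continue
        else:
            # cluster_alias.config_name
            alias = match_alias(key, clusters)
            if alias:
                scopes["cluster"].setdefault(alias, {})[key[len(alias) + 1:]] = value
                continue
        # Anything not recognized above is treated as global in this sketch.
        scopes["global"][key] = value
    return scopes


SAMPLE = """
# mm2.properties
clusters = us-west, us-east
us-west.bootstrap.servers = host1:9092
us-east.bootstrap.servers = host2:9092
topics = .*
us-west.offset.storage.topic = mm2-offsets
us-west->us-east.enabled = true
us-west->us-east.emit.heartbeats = false
"""

scopes = classify(parse_properties(SAMPLE))
```

With the sample taken from the README, `scopes["flow"]` ends up with a single `("us-west", "us-east")` entry holding `enabled` and `emit.heartbeats`, while `topics` and `clusters` stay global.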




[GitHub] [kafka] showuon commented on pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "showuon (via GitHub)" <gi...@apache.org>.

showuon commented on PR #13650:
URL: https://github.com/apache/kafka/pull/13650#issuecomment-1592238006

   Failed tests are unrelated:
   ```
       Build / JDK 11 and Scala 2.13 / org.apache.kafka.connect.mirror.integration.MirrorConnectorsIntegrationSSLTest.testSyncTopicConfigs()
       Build / JDK 11 and Scala 2.13 / kafka.admin.DescribeConsumerGroupTest.testDescribeOffsetsOfExistingGroupWithNoMembers()
       Build / JDK 11 and Scala 2.13 / kafka.zk.ZkMigrationIntegrationTest.[1] Type=ZK, Name=testNewAndChangedTopicsInDualWrite, MetadataVersion=3.4-IV0, Security=PLAINTEXT
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testMultiNodeCluster()
       Build / JDK 8 and Scala 2.12 / kafka.api.ConsumerBounceTest.testClose()
       Build / JDK 8 and Scala 2.12 / kafka.api.PlaintextConsumerTest.testMaxPollIntervalMs()
       Build / JDK 8 and Scala 2.12 / org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest.shouldMigratePersistentKeyValueStoreToTimestampedKeyValueStoreUsingPapi
   ```



[GitHub] [kafka] machi1990 commented on a diff in pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "machi1990 (via GitHub)" <gi...@apache.org>.

machi1990 commented on code in PR #13650:
URL: https://github.com/apache/kafka/pull/13650#discussion_r1179535812


##########
docs/ops.html:
##########
@@ -694,6 +694,45 @@ <h5 class="anchor-heading"><a id="georeplication-config-syntax" class="anchor-li
 us-east.admin.bootstrap.servers = broker8-secondary:9092
 </code></pre>
 
+  <h5 class="anchor-heading"><a id="georeplication-exactly_once" class="anchor-link"></a><a href="#georeplication-exactly_once">Exactly once</a></h5>
+
+  <p>
+    Exactly-once semantics are supported for dedicated MirrorMaker clusters as of version 3.5.0.</p>
+  
+  <p>
+    For new MirrorMaker clusters, set the <code>exactly.once.source.support</code> property to <code>enabled</code> for all target Kafka clusters that should be written to with exactly-once semantics. For example, to enable exactly-once for writes to cluster <code>us-east</code>, the following configuration can be used:
+  </p>
+
+<pre class="line-numbers"><code class="language-text">us-east.exactly.once.source.support = enabled
+</code></pre>
+  
+  <p>
+    For existing MirrorMaker clusters, a two-step upgrade is necessary. Instead of immediately setting the <code>exactly.once.source.support</code> property to <code>enabled</code>, first set it to <code>preparing</code> on all nodes in the cluster. Once this is complete, it can be set to <code>enabled</code> on all nodes in the cluster, in a second round of restarts.
+  </p>
+  
+  <p>
+    In either case, it is also necessary to enable intra-cluster communication between the MirrorMaker nodes, as described in <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters">KIP-710</a>. To do this, the <code>dedicated.mode.enable.internal.rest</code> property must be set to <code>true</code>. In addition, many of the REST-related <a href="https://kafka.apache.org/documentation/#connectconfigs">configuration properties available for Kafka Connect</a> can be specified in the MirrorMaker config. For example, to enable intra-cluster communication in a MirrorMaker cluster with each node listening on port 8080 of its local machine, the following should be added to the MirrorMaker config file:
+  </p>
+
+<pre class="line-numbers"><code class="language-text">dedicated.mode.enable.internal.rest = true
+listeners = http://localhost:8080
+</code></pre>
+  
+  <p><b>
+    Note that, if intra-cluster communication is enabled in production environments, it is highly recommended to secure the REST servers brought up by each MirrorMaker node. See the configuration properties for Kafka Connect for information on how this can be accomplished.

Review Comment:
   ```suggestion
       Note that, if intra-cluster communication is enabled in production environments, it is highly recommended to secure the REST servers brought up by each MirrorMaker node. See the <a href="https://kafka.apache.org/documentation/#connectconfigs">configuration properties for Kafka Connect</a> for information on how this can be accomplished.
   ```
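
The new docs section ties two settings together: exactly-once for a target cluster, and `dedicated.mode.enable.internal.rest = true`. A minimal Python sketch of a pre-flight check for that pairing, plus the two-step rollout order, might look like the following. Both functions are hypothetical helpers written for this example, not MirrorMaker APIs, and the sketch assumes the config has already been loaded into a plain dict and that a missing `dedicated.mode.enable.internal.rest` counts as not enabled.

```python
EOS_SUFFIX = ".exactly.once.source.support"
INTERNAL_REST = "dedicated.mode.enable.internal.rest"


def exactly_once_problems(props):
    """Return a list of problems for a MirrorMaker config given as a dict."""
    problems = []
    # Target clusters that are preparing for, or already using, exactly-once.
    targets = sorted(
        key[: -len(EOS_SUFFIX)]
        for key, value in props.items()
        if key.endswith(EOS_SUFFIX) and value in ("preparing", "enabled")
    )
    # Assumption: an absent property is treated the same as "false".
    rest_enabled = props.get(INTERNAL_REST, "false").strip().lower() == "true"
    if targets and not rest_enabled:
        problems.append(
            f"{INTERNAL_REST} must be true when exactly-once is configured for: "
            + ", ".join(targets)
        )
    return problems


def next_rollout_value(current, existing_cluster):
    """Suggest the next exactly.once.source.support value to roll out.

    New clusters go straight to "enabled"; existing clusters go through
    "preparing" first, then "enabled" in a second round of restarts.
    """
    if not existing_cluster or current in ("preparing", "enabled"):
        return "enabled"
    return "preparing"
```

For example, `exactly_once_problems({"us-east.exactly.once.source.support": "enabled"})` reports one problem until `dedicated.mode.enable.internal.rest` is added with the value `true`.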




[GitHub] [kafka] showuon commented on a diff in pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "showuon (via GitHub)" <gi...@apache.org>.

showuon commented on code in PR #13650:
URL: https://github.com/apache/kafka/pull/13650#discussion_r1222665546


##########
connect/mirror/README.md:
##########
@@ -1,297 +0,0 @@
-### Shared configuration

Review Comment:
   I don't see this `Shared configuration` section mentioned in the docs. Could you check again?




[GitHub] [kafka] tinaselenge commented on a diff in pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "tinaselenge (via GitHub)" <gi...@apache.org>.

tinaselenge commented on code in PR #13650:
URL: https://github.com/apache/kafka/pull/13650#discussion_r1229238520


##########
connect/mirror/README.md:
##########
@@ -1,297 +0,0 @@
-### Shared configuration

Review Comment:
   I believe this is already included in the doc. https://github.com/apache/kafka/blob/f275ed812a71a02362f7c75c3fbde46c381cbe1a/docs/ops.html#L874 
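
For reference, the multi-data-center example in the README being removed lists twelve `source->target.enabled` lines by hand. A minimal Python sketch that generates the same lines, assuming clusters are named `<dc>-1`, `<dc>-2` and that cross-data-center replication goes through the first cluster of each data center as in that example, could be written as follows. `replication_flows` is a hypothetical helper made up for illustration, not a MirrorMaker tool.

```python
from itertools import permutations


def replication_flows(data_centers, clusters_per_dc=2):
    """Return `source->target.enabled = true` lines for the example topology."""
    lines = []
    for dc in data_centers:
        local = [f"{dc}-{i}" for i in range(1, clusters_per_dc + 1)]
        # Active/active inside the data center: every ordered pair of local clusters.
        lines += [f"{a}->{b}.enabled = true" for a, b in permutations(local, 2)]
    # XDCR between the first cluster of each data center.
    gateways = [f"{dc}-1" for dc in data_centers]
    lines += [f"{a}->{b}.enabled = true" for a, b in permutations(gateways, 2)]
    return lines


if __name__ == "__main__":
    for line in replication_flows(["west", "east", "north"]):
        print(line)
```

Generating the lines keeps the flow list consistent as data centers are added; the output would still need the `clusters` and `bootstrap.servers` entries from the README example to form a usable config.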




[GitHub] [kafka] showuon merged pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "showuon (via GitHub)" <gi...@apache.org>.

showuon merged PR #13650:
URL: https://github.com/apache/kafka/pull/13650



[GitHub] [kafka] machi1990 commented on a diff in pull request #13650: KAFKA-14709: Move content in connect/mirror/README.md to the docs

Posted by "machi1990 (via GitHub)" <gi...@apache.org>.

machi1990 commented on code in PR #13650:
URL: https://github.com/apache/kafka/pull/13650#discussion_r1179537982


##########
docs/ops.html:
##########
@@ -694,6 +694,45 @@ <h5 class="anchor-heading"><a id="georeplication-config-syntax" class="anchor-li
+  <p><b>
+    Note that, if intra-cluster communication is enabled in production environments, it is highly recommended to secure the REST servers brought up by each MirrorMaker node. See the configuration properties for Kafka Connect for information on how this can be accomplished.

Review Comment:
   nit: adding a link here as well. The config properties are linked above, but I thought it could be useful to link them here too.


