Posted to jira@kafka.apache.org by "Fernando Vega (JIRA)" <ji...@apache.org> on 2018/07/27 18:20:00 UTC

[jira] [Commented] (KAFKA-5407) Mirrormaker dont start after upgrade

    [ https://issues.apache.org/jira/browse/KAFKA-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560099#comment-16560099 ] 

Fernando Vega commented on KAFKA-5407:
--------------------------------------

[~omkreddy] [~hachikuji] [~huxi_2b]
Just double checking. I tried this again and found a few things:

a- Once I upgraded the cluster, I attempted to use the new consumer file again for the MirrorMakers we have, whitelisting the same topics, and I get the same exception.

b- However, I ran another test using the exact same configs as production; the only difference was that I created a single topic, to check whether the issue was related to Kafka itself or to the installed package. Mirroring that single topic with all new files worked just fine, but with the current production topics it doesn't.

c- We have also seen the MirrorMaker threads die for no apparent reason: the logs state that the MirrorMaker was shut down successfully, even though we never stopped or restarted it.

d- Sometimes when we use the consumer-groups script to check consumption lag, we see the list of topics and their consumers, but in some cases the output shows topics with no consumers. What we do then is stop MirrorMaker, remove the consumer group, and start MirrorMaker again, and that seems to fix it (see the commands sketched right after this list).
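
For d-, this is roughly the kind of check I mean (a sketch: the broker host and group name are taken from the logs above; on 0.10.2 the tool also needs --new-consumer to talk to the brokers directly, and as far as I can tell deleting a Kafka-backed group with the tool is not supported on that version, the group's offsets just expire per offsets.retention.minutes):

{noformat}
# List the consumer groups known to the source cluster
kafka-consumer-groups.sh --new-consumer --bootstrap-server app454.sjc2.mytest.com:9092 --list

# Show per-partition lag and which consumer owns each partition
kafka-consumer-groups.sh --new-consumer --bootstrap-server app454.sjc2.mytest.com:9092 \
  --describe --group MirrorMaker_hkg1
{noformat}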



If you can provide any suggestions, that would be great. Any tools you would suggest for checking, monitoring, or troubleshooting this behavior would be appreciated as well.

> Mirrormaker dont start after upgrade
> ------------------------------------
>
>                 Key: KAFKA-5407
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5407
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.10.2.1
>         Environment: Operating system
> CentOS 6.8
> HW
> Board Mfg             : HP
>  Board Product         : ProLiant DL380p Gen8
> CPUs x 2
> Product Manufacturer  : Intel
>  Product Name          :  Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
>  Memory Type           : DDR3 SDRAM
>  SDRAM Capacity        : 2048 MB
>  Total Memory          : 64GB
> Hard drives size and layout:
> 9 drives using jbod
> drive size 3.6TB each
>            Reporter: Fernando Vega
>            Priority: Critical
>         Attachments: broker.hkg1.new, debug.hkg1.new, mirrormaker-repl-sjc2-to-hkg1.log.8
>
>
> Currently I'm upgrading the cluster from 0.8.2-beta to 0.10.2.1.
> So I followed the rolling upgrade procedure.
> Here the config files:
> Consumer
> {noformat}
> #
> # Cluster: repl
> # Topic list (goes on the command line): REPL-ams1-global,REPL-atl1-global,REPL-sjc2-global,REPL-ams1-global-PN_HXIDMAP_.*,REPL-atl1-global-PN_HXIDMAP_.*,REPL-sjc2-global-PN_HXIDMAP_.*,REPL-ams1-global-PN_HXCONTEXTUALV2_.*,REPL-atl1-global-PN_HXCONTEXTUALV2_.*,REPL-sjc2-global-PN_HXCONTEXTUALV2_.*
> bootstrap.servers=app001:9092,app002:9092,app003:9092,app004:9092
> group.id=hkg1_cluster
> auto.commit.interval.ms=60000
> partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
> {noformat}
> Producer
> {noformat}
> # Producer
> # hkg1
> bootstrap.servers=app001:9092,app002:9092,app003:9092,app004:9092
> compression.type=gzip
> acks=0
> {noformat}
> Broker
> {noformat}
> auto.leader.rebalance.enable=true
> delete.topic.enable=true
> socket.receive.buffer.bytes=1048576
> socket.send.buffer.bytes=1048576
> default.replication.factor=2
> auto.create.topics.enable=true
> num.partitions=1
> num.network.threads=8
> num.io.threads=40
> log.retention.hours=1
> log.roll.hours=1
> num.replica.fetchers=8
> zookeeper.connection.timeout.ms=30000
> zookeeper.session.timeout.ms=30000
> inter.broker.protocol.version=0.10.2
> log.message.format.version=0.8.2
> {noformat}
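> The staged version settings for the rolling bounce look roughly like this (a sketch based on the 0.10.2 upgrade notes; each change is applied with a rolling restart, and the broker config above matches the intermediate stage):
> {noformat}
> # Stage 1: before swapping in the 0.10.2.1 binaries, pin both versions to the old release
> inter.broker.protocol.version=0.8.2
> log.message.format.version=0.8.2
>
> # Stage 2: once every broker runs 0.10.2.1, bump the protocol version and bounce again
> inter.broker.protocol.version=0.10.2
> log.message.format.version=0.8.2
>
> # Stage 3: only after all consumers are on 0.10.x clients, bump the message format
> inter.broker.protocol.version=0.10.2
> log.message.format.version=0.10.2
> {noformat}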
> I also tried using the stock configuration, with no luck.
> The error that I get is this:
> {noformat}
> [2017-06-07 12:24:45,476] INFO ConsumerConfig values:
> 	auto.commit.interval.ms = 60000
> 	auto.offset.reset = latest
> 	bootstrap.servers = [app454.sjc2.mytest.com:9092, app455.sjc2.mytest.com:9092, app456.sjc2.mytest.com:9092, app457.sjc2.mytest.com:9092, app458.sjc2.mytest.com:9092, app459.sjc2.mytest.com:9092]
> 	check.crcs = true
> 	client.id = MirrorMaker_hkg1-1
> 	connections.max.idle.ms = 540000
> 	enable.auto.commit = false
> 	exclude.internal.topics = true
> 	fetch.max.bytes = 52428800
> 	fetch.max.wait.ms = 500
> 	fetch.min.bytes = 1
> 	group.id = MirrorMaker_hkg1
> 	heartbeat.interval.ms = 3000
> 	interceptor.classes = null
> 	key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
> 	max.partition.fetch.bytes = 1048576
> 	max.poll.interval.ms = 300000
> 	max.poll.records = 500
> 	metadata.max.age.ms = 300000
> 	metric.reporters = []
> 	metrics.num.samples = 2
> 	metrics.recording.level = INFO
> 	metrics.sample.window.ms = 30000
> 	partition.assignment.strategy = [org.apache.kafka.clients.consumer.RoundRobinAssignor]
> 	receive.buffer.bytes = 65536
> 	reconnect.backoff.ms = 50
> 	request.timeout.ms = 305000
> 	retry.backoff.ms = 100
> 	sasl.jaas.config = null
> 	sasl.kerberos.kinit.cmd = /usr/bin/kinit
> 	sasl.kerberos.min.time.before.relogin = 60000
> 	sasl.kerberos.service.name = null
> 	sasl.kerberos.ticket.renew.jitter = 0.05
> 	sasl.kerberos.ticket.renew.window.factor = 0.8
> 	sasl.mechanism = GSSAPI
> 	security.protocol = PLAINTEXT
> 	send.buffer.bytes = 131072
> 	session.timeout.ms = 10000
> 	ssl.cipher.suites = null
> 	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
> 	ssl.endpoint.identification.algorithm = null
> 	ssl.key.password = null
> 	ssl.keymanager.algorithm = SunX509
> 	ssl.keystore.location = null
> 	ssl.keystore.password = null
> 	ssl.keystore.type = JKS
> 	ssl.protocol = TLS
> 	ssl.provider = null
> 	ssl.secure.random.implementation = null
> 	ssl.trustmanager.algorithm = PKIX
> 	ssl.truststore.location = null
> 	ssl.truststore.password = null
> 	ssl.truststore.type = JKS
> 	value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
> INFO Kafka commitId : e89bffd6b2eff799 (org.apache.kafka.common.utils.AppInfoParser)
> [2017-06-07 12:24:45,497] INFO [mirrormaker-thread-0] Starting mirror maker thread mirrormaker-thread-0 (kafka.tools.MirrorMaker$MirrorMakerThread)
> [2017-06-07 12:24:45,497] INFO [mirrormaker-thread-1] Starting mirror maker thread mirrormaker-thread-1 (kafka.tools.MirrorMaker$MirrorMakerThread)
> [2017-06-07 12:24:48,619] INFO Discovered coordinator app458.sjc2.mytest.com:9092 (id: 2147483613 rack: null) for group MirrorMaker_hkg1. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2017-06-07 12:24:48,620] INFO Discovered coordinator app458.sjc2.mytest.com:9092 (id: 2147483613 rack: null) for group MirrorMaker_hkg1. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2017-06-07 12:24:48,625] INFO Revoking previously assigned partitions [] for group MirrorMaker_hkg1 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2017-06-07 12:24:48,625] INFO Revoking previously assigned partitions [] for group MirrorMaker_hkg1 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2017-06-07 12:24:48,648] INFO (Re-)joining group MirrorMaker_hkg1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2017-06-07 12:24:48,649] INFO (Re-)joining group MirrorMaker_hkg1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
> [2017-06-07 12:24:53,560] FATAL [mirrormaker-thread-1] Mirror maker thread failure due to  (kafka.tools.MirrorMaker$MirrorMakerThread)
> org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The server experienced an unexpected error when processing the request
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:548)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupResponseHandler.handle(AbstractCoordinator.java:521)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:784)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:765)
> 	at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:186)
> 	at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:149)
> 	at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:116)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:493)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:322)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:253)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:172)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:347)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:290)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1029)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
> 	at kafka.tools.MirrorMaker$MirrorMakerNewConsumer.receive(MirrorMaker.scala:625)
> 	at kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:431)
> {noformat}
> I'm using MirrorMaker.
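> Roughly, it is launched like this (a sketch: the property-file names are placeholders for the consumer/producer configs above, the whitelist is abbreviated since the full topic list goes on the command line, and --num.streams 2 matches the two mirrormaker threads in the log):
> {noformat}
> kafka-mirror-maker.sh \
>   --consumer.config consumer-repl.properties \
>   --producer.config producer-hkg1.properties \
>   --num.streams 2 \
>   --whitelist 'REPL-ams1-global,REPL-atl1-global,REPL-sjc2-global'
> {noformat}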


