Posted to dev@kafka.apache.org by "Henry Cai (JIRA)" <ji...@apache.org> on 2019/03/11 06:34:02 UTC

[jira] [Created] (KAFKA-8089) High level consumer from MirrorMaker is slow to deal with SSL certification expiration

Henry Cai created KAFKA-8089:
--------------------------------

             Summary: High level consumer from MirrorMaker is slow to deal with SSL certification expiration
                 Key: KAFKA-8089
                 URL: https://issues.apache.org/jira/browse/KAFKA-8089
             Project: Kafka
          Issue Type: Bug
          Components: clients, consumer
    Affects Versions: 2.0.0
            Reporter: Henry Cai


We have been using the MirrorMaker from Kafka 2.0 (which uses the high-level consumer) for replication.  The topic is SSL-enabled, and the certificate expires at a random time within 12 hours.  When the certificate expires, we see many SSL-related exceptions in the log:
 
[2019-03-07 18:02:54,128] ERROR [Consumer clientId=kafkamirror-euw1-use1-m10nkafka03-1, groupId=kafkamirror-euw1-use1-m10nkafka03] Connection to node 3005 failed authentication due to: SSL handshake failed (org.apache.kafka.clients.NetworkClient)

This error repeats for several hours.

However, even with the SSL error, the pre-existing socket connections still work, so the main fetching activity is not actually affected.  The metadata operations from the client and the heartbeats from the heartbeat thread are affected, since they might open new socket connections.  I think the errors most likely originate from those side activities.

The situation lasts for several hours, until the main fetcher thread tries to open a new connection (usually because of a consumer rebalance); the SSL authentication exception then aborts the operation and MirrorMaker exits.

During those several hours, the client cannot get the latest metadata and the heartbeats also falter (we see rebalancing triggered because of this).

When NetworkClient.processDisconnection() prints the ERROR message above, could it just throw the AuthenticationException up to the caller?  That would abort KafkaConsumer.poll() and speed up the certificate recycling (in our case, we restart MirrorMaker with the new certificate).
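
To make the difference concrete, here is a toy model of the two behaviours. It does not use the Kafka client at all: SketchAuthException is a hypothetical stand-in for org.apache.kafka.common.errors.AuthenticationException, and the boolean array is an assumed stand-in for "a new connection failed its SSL handshake during this poll". It is only a sketch of the idea, not how NetworkClient is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Kafka's AuthenticationException.
class SketchAuthException extends RuntimeException {
    SketchAuthException(String message) {
        super(message);
    }
}

// Toy model: each array entry is one poll attempt; true means a new
// connection failed its SSL handshake during that attempt (an assumption
// made for illustration only).
class FailFastSketch {

    // Behaviour described in the report: log the error and keep polling.
    static List<String> logAndContinue(boolean[] handshakeFailed) {
        List<String> log = new ArrayList<>();
        for (int i = 0; i < handshakeFailed.length; i++) {
            if (handshakeFailed[i]) {
                log.add("ERROR poll " + i + ": SSL handshake failed");
            }
        }
        return log;
    }

    // Proposed behaviour: surface the first failure to the caller of poll().
    // Returns the number of polls completed if no failure occurred.
    static int failFast(boolean[] handshakeFailed) {
        for (int i = 0; i < handshakeFailed.length; i++) {
            if (handshakeFailed[i]) {
                throw new SketchAuthException("SSL handshake failed at poll " + i);
            }
        }
        return handshakeFailed.length;
    }

    public static void main(String[] args) {
        boolean[] attempts = {false, true, true, true};
        System.out.println(logAndContinue(attempts).size() + " errors logged, loop still running");
        try {
            failFast(attempts);
        } catch (SketchAuthException e) {
            // In our setup this is where the process would exit and be
            // restarted with the new certificate.
            System.out.println("Exiting for restart: " + e.getMessage());
        }
    }
}
```

With the fail-fast variant, the process would stop at the first failed handshake instead of logging the same error for hours while metadata and heartbeats degrade.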
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)