Posted to jira@kafka.apache.org by "shylaja kokoori (Jira)" <ji...@apache.org> on 2021/12/23 00:31:00 UTC
[jira] [Commented] (KAFKA-13418) Brokers disconnect intermittently with TLS1.3
[ https://issues.apache.org/jira/browse/KAFKA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464190#comment-17464190 ]
shylaja kokoori commented on KAFKA-13418:
-----------------------------------------
After enabling SSL debug logging (javax.net.debug=ssl,handshake), I see that the unwrap call in SslTransportLayer.read returns handshakeStatus=NEED_WRAP when a TLS 1.3 key update (KeyUpdate) takes place (log snippet below).
According to RFC 8446 [https://datatracker.ietf.org/doc/html/rfc8446], key updates normally happen during a read/write, and the connection has to be closed when one arrives during the handshake.
Given that the key updates here happen after the handshake is done, would something like the attached patch work? I am new to Kafka, and any feedback would be helpful.
Kafka log:
{code:java}
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.574 UTC|KeyUpdate.java:192|Consuming KeyUpdate post-handshake message (
"KeyUpdate": {
"request_update": update_requested
}
)
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|KeyUpdate.java:236|KeyUpdate: read key updated
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|KeyUpdate.java:271|Produced KeyUpdate post-handshake message (
"KeyUpdate": {
"request_update": update_not_requested
}
)
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|KeyUpdate.java:323|KeyUpdate: write key updated
[2021-12-21 06:14:09,575] ERROR [SslTransportLayer channelId=2 key=channel=java.nio.channels.SocketChannel[connection-pending remote=/192.168.24.11:9093], selector=sun.nio.ch.EPollSelectorImpl@2eb1a872, interestOps=8, readyOps=0] Renegotiation requested, but it is not supported, channelId 2, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 147 handshakeStatus NEED_WRAP State READY (org.apache.kafka.common.network.SslTransportLayer)
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.578 UTC|Alert.java:238|Received alert message (
"Alert": {
"level" : "warning",
"description": "close_notify"
}
)
javax.net.ssl|ALL|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.580 UTC|SSLEngineImpl.java:752|Closing outbound of SSLEngine{code}
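To make the question concrete, here is a minimal sketch of the kind of check I have in mind. It is not the attached patch and not the existing SslTransportLayer code: the class RenegotiationCheck and the method isUnsupportedRenegotiation are hypothetical names used only for illustration. It assumes the negotiated protocol string comes from SSLEngine.getSession().getProtocol(), which reports "TLSv1.3" for TLS 1.3 on JDK 11, and that the handshake status comes from the SSLEngineResult returned by unwrap or wrap once the channel is in the READY state.

```java
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

// Hypothetical helper, for illustration only; not part of Kafka.
final class RenegotiationCheck {

    // Standard JSSE protocol name for TLS 1.3.
    private static final String TLS_1_3 = "TLSv1.3";

    private RenegotiationCheck() { }

    /**
     * Decides whether a handshake status seen after the initial handshake
     * has completed should be treated as an unsupported renegotiation.
     *
     * Assumption: under TLS 1.3 a status such as NEED_WRAP at this point
     * comes from a post-handshake message like KeyUpdate, so it is not
     * reported as a renegotiation. For other protocols any status other
     * than NOT_HANDSHAKING or FINISHED is reported.
     */
    static boolean isUnsupportedRenegotiation(String protocol, HandshakeStatus status) {
        boolean handshaking = status != HandshakeStatus.NOT_HANDSHAKING
                && status != HandshakeStatus.FINISHED;
        return handshaking && !TLS_1_3.equals(protocol);
    }
}
```

With a helper like this, the read and write paths would raise the renegotiation error only when isUnsupportedRenegotiation returns true. On a TLS 1.3 connection the NEED_WRAP in the log above would then be left for the next wrap call to handle rather than closing the channel. Whether that is sufficient for the write side is part of what I would like feedback on.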
> Brokers disconnect intermittently with TLS1.3
> ---------------------------------------------
>
> Key: KAFKA-13418
> URL: https://issues.apache.org/jira/browse/KAFKA-13418
> Project: Kafka
> Issue Type: Bug
> Components: clients
> Affects Versions: 2.8.0
> Reporter: shylaja kokoori
> Assignee: shylaja kokoori
> Priority: Minor
> Attachments: tls1_3.patch
>
>
> Using TLS1.3 (with JDK11) is causing a regression and an increase in inter-broker p99 latency, as mentioned by Yiming in [Kafka-9320|https://issues.apache.org/jira/browse/KAFKA-9320?focusedCommentId=17401818&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17401818]. We tested this with Kafka 2.8.
> The issue seems to be caused by a renegotiation exception thrown by
> {code:java}
> read(ByteBuffer dst)
> {code}
> and
> {code:java}
> write(ByteBuffer src)
> {code}
> in
> _clients/src/main/java/org/apache/kafka/common/network/SslTransportLayer.java_
> This exception causes the connection between the brokers to close before the read/write has completed. In our internal experiments, the p99 latency stabilized when we removed this exception.
> Given that TLS1.3 does not support renegotiation, I would like to apply the renegotiation check only to TLS1.2.