You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by radai-rosenblatt <gi...@git.apache.org> on 2017/11/15 23:23:00 UTC

[GitHub] kafka pull request #4223: when closing a socket in response to an IOExceptio...

GitHub user radai-rosenblatt opened a pull request:

    https://github.com/apache/kafka/pull/4223

    when closing a socket in response to an IOException, also print the root issue if closing fails

    `Selector.pollSelectionKeys()` attempts to close the channel in response to an Exception (lets call this exception the root issue).
    
    if the root issue itself is an IOException, its printed to log at debug level (which is usually off for production users):
    
    ```java
     catch (Exception e) {
        String desc = channel.socketDescription();
        if (e instanceof IOException)
           log.debug("Connection with {} disconnected", desc, e);   <----- does not appear in real-life log
        else if (e instanceof AuthenticationException) // will be logged later as error by clients
           log.debug("Connection with {} disconnected due to authentication exception", desc, e);
        else
           log.warn("Unexpected error from {}; closing connection", desc, e);
        close(channel, true); 
    }
    ```
    
    for some cases, close itself would throw an exception. this exception is printed to log as a warning (`Selector.doClose()`):
    ```java
    try {
       channel.close();
    } catch (IOException e) {
       log.error("Exception closing connection to node {}:", channel.id(), e);
    }
    ```
    
    this tends to actually show up in user log, looking something like this (note - line numbers are from kafka 10.2.*):
    ```
    java.io.IOException: Connection reset by peer
    	at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    	at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:470)
    	at org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:195)
    	at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:163)
    	at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:731)
    	at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:54)
    	at org.apache.kafka.common.network.Selector.doClose(Selector.java:540)
    	at org.apache.kafka.common.network.Selector.close(Selector.java:531)
    	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:378)
    	at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
    	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
    	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:233)
    	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:131)
    	at java.lang.Thread.run(Thread.java:745)
    ```
    
    this issue spams user's logs and is not really helpful in diagnosing the real underlying cause, which (after turning debug logs on) turned out to be (for this particular case):
    ```
    javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    	at sun.security.ssl.Handshaker.checkThrown(Handshaker.java:1409)
    	at sun.security.ssl.SSLEngineImpl.checkTaskThrown(SSLEngineImpl.java:535)
    	at sun.security.ssl.SSLEngineImpl.writeAppRecord(SSLEngineImpl.java:1214)
    	at sun.security.ssl.SSLEngineImpl.wrap(SSLEngineImpl.java:1186)
    	at javax.net.ssl.SSLEngine.wrap(SSLEngine.java:469)
    	at org.apache.kafka.common.network.SslTransportLayer.handshakeWrap(SslTransportLayer.java:382)
    	at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:243)
    	at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:69)
    	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:350)
    	at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
    	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
    	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:233)
    	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:131)
    	at java.lang.Thread.run(Thread.java:745)
    Caused by: javax.net.ssl.SSLHandshakeException: General SSLEngine problem
    	at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
    	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:304)
    	at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
    	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1478)
    	at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:212)
    	at sun.security.ssl.Handshaker.processLoop(Handshaker.java:957)
    	at sun.security.ssl.Handshaker$1.run(Handshaker.java:897)
    	at sun.security.ssl.Handshaker$1.run(Handshaker.java:894)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at sun.security.ssl.Handshaker$DelegatedTask.run(Handshaker.java:1347)
    	at org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks(SslTransportLayer.java:336)
    	at org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(SslTransportLayer.java:417)
    	at org.apache.kafka.common.network.SslTransportLayer.handshake(SslTransportLayer.java:270)
    	... 7 more
    Caused by: sun.security.validator.ValidatorException: No trusted certificate found
    	at sun.security.validator.SimpleValidator.buildTrustedChain(SimpleValidator.java:384)
    	at sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:133)
    	at sun.security.validator.Validator.validate(Validator.java:260)
    	at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324)
    	at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:281)
    	at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:136)
    	at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1465)
    	... 16 more
    ```
    this patch allows submitting the root cause exception (the one caught in pollSelectionKeys) to close so that in the event close encounters an exception the root cause is also printed alongside the close exception.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/radai-rosenblatt/kafka better-logging

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/4223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4223
    
----
commit d05e4fda9171f8cf0b73d175c05dc15ed99deba4
Author: radai-rosenblatt <ra...@gmail.com>
Date:   2017-11-15T23:12:42Z

    when closing a socket in response to an IOException, also print the root issue if closing fails

----


---