Posted to dev@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/04/04 11:37:25 UTC

[jira] [Commented] (KAFKA-3488) commitAsync() fails if metadata update creates new SASL/SSL connection

    [ https://issues.apache.org/jira/browse/KAFKA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223872#comment-15223872 ] 

ASF GitHub Bot commented on KAFKA-3488:
---------------------------------------

GitHub user rajinisivaram opened a pull request:

    https://github.com/apache/kafka/pull/1183

    KAFKA-3488: Avoid failing of unsent requests in consumer where possible

    Fail unsent requests only when returning from KafkaConsumer.poll().

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rajinisivaram/kafka KAFKA-3488

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1183
    
----
commit dfad7b0215573800bed56abd3bcc2cf7f6134513
Author: Rajini Sivaram <ra...@googlemail.com>
Date:   2016-04-04T08:38:27Z

    KAFKA-3488: Avoid failing of unsent requests in consumer where possible

----


> commitAsync() fails if metadata update creates new SASL/SSL connection
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-3488
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3488
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.9.0.1
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>             Fix For: 0.10.0.0
>
>
> Sasl/SslConsumerTest.testSimpleConsumption() fails intermittently because {{commitAsync()}} fails. The exception stack trace shows:
> {quote}
> kafka.api.SaslPlaintextConsumerTest.testSimpleConsumption FAILED
> java.lang.AssertionError: expected:<1> but was:<0>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:834)
> 	at org.junit.Assert.assertEquals(Assert.java:645)
> 	at org.junit.Assert.assertEquals(Assert.java:631)
> 	at kafka.api.BaseConsumerTest.awaitCommitCallback(BaseConsumerTest.scala:340)
> 	at kafka.api.BaseConsumerTest.testSimpleConsumption(BaseConsumerTest.scala:85)
> {quote}
> I have reproduced this with some additional trace logging. The tests run with a very small metadata expiry interval, which triggers metadata updates quite often. If a metadata request immediately following a {{commitAsync()}} call creates a new SSL/SASL connection, {{ConsumerNetworkClient.poll()}} returns to process the connection handshake packets. Since {{ConsumerNetworkClient.poll()}} discards all unsent requests before returning, this can cause the commit to fail - the callback is invoked with {{SendFailedException}}.
> I understand that {{ConsumerNetworkClient.poll()}} discards unsent requests rather than buffering them to keep the code simple. And perhaps it is ok to fail {{commitAsync()}} occasionally, since the callback does indicate that the caller should retry. But it feels like an unnecessary limitation: it requires error handling in client applications when there are no real failures, and it makes reliable testing much harder. Since the special handling added to fix issues like KAFKA-3412 and KAFKA-2672 adds more complexity to the code anyway, and because failures that affect only SSL/SASL are much harder to debug, it may be worth considering improving this behaviour.
> I will see if I can submit a PR for the specific issue I was seeing with the impact of handshakes on {{commitAsync()}}, but I would be interested in views on improving the logic in {{ConsumerNetworkClient}}.
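
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: UnsentRequestSketch, ToyClient, Request and run() are hypothetical names invented for illustration, not Kafka classes, and the model ignores everything except the unsent queue and a single handshake round. It compares failing unsent requests on every inner poll with failing them only once, at the end of the outer poll, which is the approach the pull request description proposes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: none of these types are Kafka classes. */
public class UnsentRequestSketch {

    /** A queued request whose outcome stands in for the commit callback result. */
    static final class Request {
        String outcome = "pending";
    }

    /** Hypothetical stand-in for a client that queues requests until a connection is ready. */
    static final class ToyClient {
        private final Deque<Request> unsent = new ArrayDeque<>();
        private final boolean failUnsentOnEveryPoll;
        private int handshakePollsRemaining;

        ToyClient(boolean failUnsentOnEveryPoll, int handshakePolls) {
            this.failUnsentOnEveryPoll = failUnsentOnEveryPoll;
            this.handshakePollsRemaining = handshakePolls;
        }

        void send(Request request) {
            unsent.add(request);
        }

        /** One inner poll: either advances the handshake or sends everything queued. */
        void poll() {
            if (handshakePollsRemaining > 0) {
                // Connection not ready yet, so nothing can be sent in this poll.
                handshakePollsRemaining--;
            } else {
                while (!unsent.isEmpty()) {
                    unsent.poll().outcome = "sent";
                }
            }
            if (failUnsentOnEveryPoll) {
                failUnsent();
            }
        }

        /** Fails whatever is still queued. */
        void failUnsent() {
            while (!unsent.isEmpty()) {
                unsent.poll().outcome = "failed";
            }
        }
    }

    /** Queues one commit, polls twice (handshake, then send), and returns the commit outcome. */
    public static String run(boolean failUnsentOnEveryPoll) {
        ToyClient client = new ToyClient(failUnsentOnEveryPoll, 1);
        Request commit = new Request();
        client.send(commit);
        client.poll();       // returns early to process the handshake
        client.poll();       // connection is ready now
        client.failUnsent(); // end of the outer poll
        return commit.outcome;
    }

    public static void main(String[] args) {
        System.out.println("fail unsent on every poll:   " + run(true));
        System.out.println("fail unsent only at the end: " + run(false));
    }
}
```

In this model the commit fails in the first mode even though nothing is actually wrong with the connection, and it is sent in the second mode because the request survives the handshake poll.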


