Posted to commits@samza.apache.org by "Jake Maes (JIRA)" <ji...@apache.org> on 2017/09/20 19:42:00 UTC

[jira] [Resolved] (SAMZA-1392) KafkaSystemProducer performance and correctness with concurrent sends and flushes

     [ https://issues.apache.org/jira/browse/SAMZA-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jake Maes resolved SAMZA-1392.
------------------------------
    Resolution: Fixed

Issue resolved by pull request 272
[https://github.com/apache/samza/pull/272]

> KafkaSystemProducer performance and correctness with concurrent sends and flushes
> ---------------------------------------------------------------------------------
>
>                 Key: SAMZA-1392
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1392
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jake Maes
>            Assignee: Jake Maes
>             Fix For: 0.14.0
>
>         Attachments: Producer Performance Tests for SAMZA-1392 - Sheet1.pdf
>
>
> There are two issues we need to fix in the KafkaSystemProducer when send() and flush() are called concurrently:
> 1. Concurrent sends contend for the send lock, especially when producer compression is enabled. The fix is to use the producer.flush() API, which Kafka has supported since at least version 0.9.x. With flush() we no longer need to track the latest future, so we no longer need the lock.
> 2. When task.async.commit is enabled, the threads calling send() could set exceptionInCallback to null before the exception is handled in user code or in flush(). This could allow us to checkpoint offsets for which the corresponding output was not successfully sent.
> The short-term solution is to handle callback exceptions only in flush(), and to let users configure those exceptions as ignorable in case they don't want flush() to fail.
> The long-term solution is to support a fully asynchronous SystemProducer, tracked in SAMZA-1393.
> I found issue #2 while working on issue #1, so although they are separate issues, it is easier to fix them in one ticket/patch.
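
The short-term approach described above can be sketched roughly as follows. This is an illustration only, not the code from pull request 272: AsyncProducer, InMemoryProducer, SketchSystemProducer, and the ignoreCallbackErrors flag are hypothetical names invented here, and AsyncProducer stands in for the small part of the Kafka producer (an asynchronous send with a callback, plus a blocking flush()) that the description relies on. The sketch assumes send() only records a callback failure and never clears it, while flush() is the single place where the failure is read, cleared, and surfaced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical stand-in for the producer calls used in this sketch. */
interface AsyncProducer {
    /** Sends asynchronously; the callback receives null on success or the failure. */
    void send(String message, Consumer<Exception> callback);

    /** Blocks until every previously sent message has completed. */
    void flush();
}

/** Hypothetical in-memory producer: messages starting with "bad" fail on completion. */
final class InMemoryProducer implements AsyncProducer {
    private final List<Runnable> pending = new ArrayList<>();

    @Override
    public synchronized void send(String message, Consumer<Exception> callback) {
        Exception result = message.startsWith("bad")
                ? new IllegalStateException("send failed: " + message)
                : null;
        pending.add(() -> callback.accept(result));
    }

    @Override
    public synchronized void flush() {
        // Complete all outstanding sends, invoking their callbacks.
        pending.forEach(Runnable::run);
        pending.clear();
    }
}

/** Sketch of a system producer that handles callback exceptions only in flush(). */
final class SketchSystemProducer {
    private final AsyncProducer producer;
    private final boolean ignoreCallbackErrors; // hypothetical config switch
    private final AtomicReference<Exception> exceptionInCallback = new AtomicReference<>();

    SketchSystemProducer(AsyncProducer producer, boolean ignoreCallbackErrors) {
        this.producer = producer;
        this.ignoreCallbackErrors = ignoreCallbackErrors;
    }

    void send(String message) {
        // No lock and no "latest future" bookkeeping. The callback keeps the
        // first failure; send() itself never resets exceptionInCallback.
        producer.send(message, e -> {
            if (e != null) {
                exceptionInCallback.compareAndSet(null, e);
            }
        });
    }

    void flush() {
        // Wait for all outstanding sends instead of waiting on a tracked future.
        producer.flush();
        // Read and clear the failure in one atomic step, only here.
        Exception e = exceptionInCallback.getAndSet(null);
        if (e != null && !ignoreCallbackErrors) {
            throw new RuntimeException("Unable to send message; not safe to checkpoint", e);
        }
    }
}
```

With this shape, a failed send between two flushes makes the next flush() throw, so a commit that follows flush() would not checkpoint past unsent output. When the hypothetical ignoreCallbackErrors flag is true, the failure is dropped and flush() returns normally.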



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)