You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Jason Gustafson (JIRA)" <ji...@apache.org> on 2016/09/07 04:16:20 UTC

[jira] [Created] (KAFKA-4137) Refactor multi-threaded consumer for safer network layer access

Jason Gustafson created KAFKA-4137:
--------------------------------------

             Summary: Refactor multi-threaded consumer for safer network layer access
                 Key: KAFKA-4137
                 URL: https://issues.apache.org/jira/browse/KAFKA-4137
             Project: Kafka
          Issue Type: Improvement
          Components: consumer
            Reporter: Jason Gustafson


In KIP-62, we added a background thread to send heartbeats while the user is processing fetched data from a call to poll(). In the implementation, we elected to share the instance of {{NetworkClient}} between the foreground thread and this background thread. After working with the system test failure in KAFKA-3807, we've realized that this probably wasn't a good decision. It is very tricky to get the synchronization correct with respect to response callbacks and reasoning about the multi-threaded behavior is very difficult. For example, a common pattern is to send a request and then call {{NetworkClient.poll()}} to await its return. With another thread also potentially calling poll(), the response can actually return before the sending thread itself invokes poll(). This can cause unnecessary (and potentially unbounded) blocking, and avoiding it is quite complex. 

A different approach we've discussed would be to use two instances of NetworkClient, one dedicated to fetching, and one dedicated to coordinator communication. The fetching NetworkClient can continue to work exclusively in the foreground thread and we can confine the coordinator NetworkClient to the background thread. This provides much better isolation and avoids all of the race conditions with calling poll() from two threads. The main complication is in how to expose blocking APIs to interact with the background thread. For example, in the current consumer API, rebalance are completed in the foreground thread, so we would need to coordinate with the background thread to preserve this (e.g. by using a Future abstraction).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)