You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/04/05 09:23:41 UTC

[jira] [Commented] (KAFKA-4901) Make ProduceRequest thread-safe

    [ https://issues.apache.org/jira/browse/KAFKA-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956565#comment-15956565 ] 

ASF GitHub Bot commented on KAFKA-4901:
---------------------------------------

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/2810

    KAFKA-4901: Make ProduceRequest thread-safe

    A more conservative version of the change for the 0.10.2
    branch.
    
    Trunk commit: 1659ca1773596b.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka kafka-4901-produce-request-thread-safety-0-10-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2810.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2810
    
----
commit 91b0aa305e368b84e506d2ab8bc92d1a3a031d7c
Author: Ismael Juma <is...@juma.me.uk>
Date:   2017-03-16T16:13:58Z

    KAFKA-4901: Make ProduceRequest thread-safe
    
    A more conservative version of the change for the 0.10.2
    branch.

----


> Make ProduceRequest thread-safe
> -------------------------------
>
>                 Key: KAFKA-4901
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4901
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Ismael Juma
>            Assignee: Ismael Juma
>             Fix For: 0.11.0.0, 0.10.2.1
>
>
> If request logging is enabled, ProduceRequest can be accessed
> and mutated concurrently from a network thread (which calls
> toString) and the request handler thread (which calls
> clearPartitionRecords()).
> That can lead to a ConcurrentModificationException when iterating
> the partitionRecords map.
> The underlying thread-safety issue has existed since the server
> started using the Java implementation of ProduceRequest in 0.10.0.
> However, we were incorrectly not clearing the underlying struct until
> 0.10.2, so toString itself was thread-safe until that change. In 0.10.2,
> toString is no longer thread-safe and we could potentially see a
> NullPointerException given the right set of interleavings between
> toString and clearPartitionRecords although we haven't seen that
> happen yet.
> In trunk, we changed the requests to have a toStruct method
> instead of creating a struct in the constructor and toString was
> no longer printing the contents of the Struct. This accidentally
> fixed the race condition, but it meant that request logging was less
> useful.
> A couple of days ago, AbstractRequest.toString was changed to
> print the contents of the request by calling toStruct().toString()
> and reintroduced the race condition. The impact is more visible
> because we iterate over a HashMap, which proactively
> checks for concurrent modification (unlike arrays).
> We will need a separate PR for 0.10.2.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)