Posted to issues@geode.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2019/12/12 22:10:00 UTC

[jira] [Commented] (GEODE-7569) Hang during StateFlush due to new flipping the containsRegionContentChange on PartitionMessageWithDirectReply

    [ https://issues.apache.org/jira/browse/GEODE-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995147#comment-16995147 ] 

ASF subversion and git services commented on GEODE-7569:
--------------------------------------------------------

Commit cf019ddf238f7ad6b005897484733073bf4d1ed9 in geode's branch refs/heads/develop from Bill Burcham
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=cf019dd ]

GEODE-7569: repair play dead in membership tests (#4467)



> Hang during StateFlush due to new flipping the containsRegionContentChange on PartitionMessageWithDirectReply
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-7569
>                 URL: https://issues.apache.org/jira/browse/GEODE-7569
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Dan Smith
>            Assignee: Dan Smith
>            Priority: Major
>             Fix For: 1.12.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> The recent changes for GEODE-7435 (commit e3a31e190031f094ac3bd1517722d6bead710418) have caused a distributed deadlock when making a copy of a bucket.
> These changes flipped the value of containsRegionContentChange for PartitionMessageWithDirectReply.
> That flag controls which messages participate in a state flush operation. As a result, many more messages are now part of a state flush, including messages that trigger bucket creation. This causes the following distributed deadlock:
> 1. Member A is waiting for a StateFlush to finish
> 2. Member B is stuck in StateStabilizationMessage, waiting for messages to be processed
> 3. Member B is in the middle of processing some messages, which is what is holding up the StateStabilizationMessage
> 4. Some of those messages are PartitionMessageWithDirectReply messages that end up triggering createBucketAtomically. That method blocks, waiting for bucket creation on Member A to finish.
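
The four steps above form a circular wait, which can be sketched as a small wait-for graph. This is only an illustration, not Geode code: the class name, method names, and node labels are hypothetical, and the edge from "A: bucket creation" to "A: StateFlush" assumes that the bucket copy on Member A is the operation waiting on the StateFlush, which the description implies but does not state outright.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WaitForCycleSketch {

    // Each key waits for its value. Labels are illustrative, not Geode API names.
    public static Map<String, String> waitsFor() {
        Map<String, String> graph = new LinkedHashMap<>();
        // Assumption: the bucket copy on A is what is waiting on the StateFlush.
        graph.put("A: bucket creation", "A: StateFlush");
        // Step 1/2: A's StateFlush waits for B's StateStabilizationMessage.
        graph.put("A: StateFlush", "B: StateStabilizationMessage");
        // Step 3: that message waits for B's in-flight messages to be processed.
        graph.put("B: StateStabilizationMessage",
                "B: in-flight PartitionMessageWithDirectReply");
        // Step 4: createBucketAtomically blocks on bucket creation in A.
        graph.put("B: in-flight PartitionMessageWithDirectReply", "A: bucket creation");
        return graph;
    }

    // Follows the single outgoing edge from each node, starting at 'start'.
    // Returns the nodes on the cycle if the walk revisits a node, else an empty list.
    public static List<String> findCycle(Map<String, String> graph, String start) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null) {
            int seenAt = path.indexOf(current);
            if (seenAt >= 0) {
                return new ArrayList<>(path.subList(seenAt, path.size()));
            }
            path.add(current);
            current = graph.get(current);
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> cycle = findCycle(waitsFor(), "A: StateFlush");
        System.out.println(cycle.isEmpty() ? "no circular wait" : "circular wait: " + cycle);
    }
}
```

In this model, removing the edge out of "B: StateStabilizationMessage" (that is, if the bucket-creating messages did not participate in the state flush) leaves no cycle, which matches the description's point that the flipped flag is what pulled those messages into the flush.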



--
This message was sent by Atlassian Jira
(v8.3.4#803005)