Posted to issues@geode.apache.org by "Bill Burcham (Jira)" <ji...@apache.org> on 2019/12/12 22:10:00 UTC
[jira] [Resolved] (GEODE-7569) Hang during StateFlush due to new
flipping the containsRegionContentChange on PartitionMessageWithDirectReply
[ https://issues.apache.org/jira/browse/GEODE-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Burcham resolved GEODE-7569.
---------------------------------
Fix Version/s: 1.12.0
Resolution: Fixed
> Hang during StateFlush due to new flipping the containsRegionContentChange on PartitionMessageWithDirectReply
> -------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-7569
> URL: https://issues.apache.org/jira/browse/GEODE-7569
> Project: Geode
> Issue Type: Bug
> Components: membership
> Reporter: Dan Smith
> Assignee: Dan Smith
> Priority: Major
> Fix For: 1.12.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> The recent changes in GEODE-7435 in e3a31e190031f094ac3bd1517722d6bead710418 have caused a distributed deadlock when making a copy of a bucket.
> These changes flipped the value of containsRegionContentChange for PartitionMessageWithDirectReply.
> That flag controls which messages participate in a state flush operation. With the flag flipped, many more messages are now part of a state flush, including messages that trigger bucket creation. This causes the following distributed deadlock:
> 1. Member A is waiting for a StateFlush to finish
> 2. Member B is stuck in StateStabilizationMessage, waiting for messages to be processed
> 3. Member B is in the middle of processing some messages, which is what is holding up the StateStabilizationMessage
> 4. Some of those messages are PartitionMessageWithDirectReply messages that end up triggering createBucketAtomically. That method blocks, waiting for bucket creation in Member A to finish.
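
The four steps above can be read as a wait-for chain that loops back on itself. The sketch below is only an illustration of that reasoning, not Geode code: the class name, the node labels, and the helper methods are hypothetical. The last edge (bucket creation in Member A being held up behind the StateFlush) is an assumption inferred from the description, which says the deadlock occurs while making a copy of a bucket.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the deadlock in GEODE-7569. Not part of Geode.
final class StateFlushDeadlockSketch {

    // Each key is a blocked activity; its value is the activity it waits for.
    static Map<String, String> waitsFor() {
        Map<String, String> g = new LinkedHashMap<>();
        // Step 1: Member A waits for the StateFlush, which needs B to stabilize.
        g.put("A: StateFlush", "B: StateStabilizationMessage");
        // Steps 2-3: B's stabilization waits for in-flight messages to be processed.
        g.put("B: StateStabilizationMessage", "B: PartitionMessageWithDirectReply");
        // Step 4: one of those messages ends up in createBucketAtomically...
        g.put("B: PartitionMessageWithDirectReply", "B: createBucketAtomically");
        // ...which blocks until bucket creation in Member A finishes.
        g.put("B: createBucketAtomically", "A: bucket creation");
        // ASSUMPTION: bucket creation in A cannot finish until the StateFlush does.
        g.put("A: bucket creation", "A: StateFlush");
        return g;
    }

    // Follows the single outgoing edge from each node, starting at 'start'.
    // Returns the nodes forming a cycle, or an empty list if the chain ends.
    static List<String> findCycle(Map<String, String> g, String start) {
        List<String> path = new ArrayList<>();
        String cur = start;
        while (cur != null) {
            int seen = path.indexOf(cur);
            if (seen >= 0) {
                // We came back to a node already on the path: that suffix is the cycle.
                return new ArrayList<>(path.subList(seen, path.size()));
            }
            path.add(cur);
            cur = g.get(cur);
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> cycle = findCycle(waitsFor(), "A: StateFlush");
        System.out.println(cycle.isEmpty() ? "no deadlock" : "deadlock cycle: " + cycle);
    }
}
```

In this model, removing any single edge breaks the cycle. Keeping the bucket-creating messages out of the state flush would correspond to removing the edge from StateStabilizationMessage to PartitionMessageWithDirectReply.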
--
This message was sent by Atlassian Jira
(v8.3.4#803005)