Posted to issues@geode.apache.org by "nabarun (JIRA)" <ji...@apache.org> on 2018/10/03 21:38:32 UTC
[jira] [Closed] (GEODE-5307) Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
[ https://issues.apache.org/jira/browse/GEODE-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
nabarun closed GEODE-5307.
--------------------------
> Hang with servers all in waitForPrimaryMember and one server in NO_PRIMARY_HOSTING state
> ----------------------------------------------------------------------------------------
>
> Key: GEODE-5307
> URL: https://issues.apache.org/jira/browse/GEODE-5307
> Project: Geode
> Issue Type: Bug
> Components: regions
> Affects Versions: 1.1.0, 1.2.0, 1.3.0, 1.2.1, 1.4.0, 1.5.0, 1.6.0
> Reporter: Bruce Schuchardt
> Assignee: Bruce Schuchardt
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> I've run into a hang in a test where servers continuously create partitioned regions (PRs), do putAll ops on them, and close or locally destroy the PR. Sometimes the servers hang, with every thread that needs a particular bucket stuck in waitForPrimaryMember().
> This seems to happen because of this sequence of events:
> 1. two servers create a partitioned region
> 2. one server initiates a putAll and requests the other server manage a bucket
> 3. the putAll server closes or locally-destroys its region
> 4. the close() operation completes
> 5. the other server initializes its bucket and still uses the requesting server as a primaryElector. This keeps it from deciding to volunteer to become primary.
> The problem is that when the requesting server closed its region, exceptions were thrown in the putAll thread, which then abandoned creation of the bucket. No one will ever trip the switch that makes the other server become the primary for the bucket.
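
The sequence above can be replayed in a toy model. This is only a sketch of the reasoning, not Geode code: `PrimaryElectorSketch`, `BucketView`, `volunteerForPrimary`, and the `checkElectorMembership` flag are all hypothetical names invented for illustration, and the membership check is an assumed remedy, not necessarily the change that shipped in 1.7.0.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class PrimaryElectorSketch {

    /** Toy stand-in for one server's view of a bucket; NOT Geode's real classes. */
    static final class BucketView {
        private final String self;
        private final String primaryElector;      // member expected to choose the primary
        private final Set<String> currentMembers; // members that still host the region
        private String primary;                   // null means no primary is known

        BucketView(String self, String primaryElector, Set<String> currentMembers) {
            this.self = self;
            this.primaryElector = primaryElector;
            this.currentMembers = new HashSet<>(currentMembers);
        }

        /**
         * Step 5: volunteer only when no elector is expected to act.
         * With checkElectorMembership set (a hypothetical remedy), an elector
         * that has already left the region is ignored.
         */
        boolean volunteerForPrimary(boolean checkElectorMembership) {
            boolean electorStillRelevant = primaryElector != null
                && (!checkElectorMembership || currentMembers.contains(primaryElector));
            if (electorStillRelevant) {
                return false; // defer to the elector, which may never act
            }
            primary = self;
            return true;
        }

        String getPrimary() {
            return primary;
        }
    }

    /** Replays steps 1-5 and returns the primary that server B ends up with. */
    static String replay(boolean checkElectorMembership) {
        // Steps 1-4: A and B created the region, A asked B to manage a bucket,
        // then A closed its region, so only B remains a member.
        Set<String> membersAfterClose = new HashSet<>(Arrays.asList("B"));
        // Step 5: B initializes its bucket and still names A as the elector.
        BucketView onB = new BucketView("B", "A", membersAfterClose);
        onB.volunteerForPrimary(checkElectorMembership);
        return onB.getPrimary();
    }

    public static void main(String[] args) {
        System.out.println("stale elector trusted: primary = " + replay(false));
        System.out.println("stale elector ignored: primary = " + replay(true));
    }
}
```

With the stale elector trusted, B never volunteers and the primary stays `null`, which corresponds to threads waiting indefinitely for a primary. With the assumed membership check, B becomes primary.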