Posted to issues@geode.apache.org by "Dave Barnes (Jira)" <ji...@apache.org> on 2020/09/10 15:52:11 UTC

[jira] [Closed] (GEODE-7945) Cluster restart recovery from disk blocked by waiting replies of CreateRegionMessage

     [ https://issues.apache.org/jira/browse/GEODE-7945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Barnes closed GEODE-7945.
------------------------------

> Cluster restart recovery from disk blocked by waiting replies of CreateRegionMessage
> ------------------------------------------------------------------------------------
>
>                 Key: GEODE-7945
>                 URL: https://issues.apache.org/jira/browse/GEODE-7945
>             Project: Geode
>          Issue Type: Improvement
>            Reporter: Jianxia Chen
>            Assignee: Jianxia Chen
>            Priority: Major
>              Labels: GeodeCommons
>             Fix For: 1.13.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> A cluster restart that recovers from disk shows unexpected delays in some of the members. The logs show that the delayed members wait for replies to CreateRegionMessage before loading the krf files. The replies are likely delayed because other members hold a lock while they are busy loading their own krf files.
> Once a delayed member receives the replies to CreateRegionMessage, it starts loading its krf files. If the delayed members contain the latest data, this can in turn block other members that are waiting for the latest data.
> Because the cluster members block each other at different stages of the restart recovery process, the whole process takes unexpectedly long when the disk store contains a large amount of data.
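
The blocking pattern described in the issue can be illustrated with a small, self-contained sketch. It does not use any Geode API: the class RecoveryBlockingSketch, the Member type, and the methods loadKeyFiles and replyToCreateRegion are hypothetical names invented for illustration. It also assumes, as the description only suspects ("likely"), that the reply needs the same lock that is held while the krf files are loaded.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the delay described above; not Geode code.
final class RecoveryBlockingSketch {

    /** Stand-in for a member whose reply path needs the lock held during loading. */
    static final class Member {
        private final ReentrantLock lock = new ReentrantLock();

        /** Simulates loading krf files while holding the lock (assumed behavior). */
        void loadKeyFiles(long loadMillis, CountDownLatch lockHeld) {
            lock.lock();
            try {
                lockHeld.countDown();      // signal that the lock is now held
                Thread.sleep(loadMillis);  // pretend to read a large disk store
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }

        /** Simulates replying to a create-region request; blocks until the lock is free. */
        void replyToCreateRegion() {
            lock.lock();
            try {
                // A real member would send its reply here.
            } finally {
                lock.unlock();
            }
        }
    }

    /** Returns how long a requester waits for the reply, in milliseconds. */
    static long measureReplyDelayMillis(long loadMillis) {
        Member busyMember = new Member();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CompletableFuture<Void> loader =
            CompletableFuture.runAsync(() -> busyMember.loadKeyFiles(loadMillis, lockHeld));
        try {
            lockHeld.await();              // request arrives only after loading has started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long start = System.nanoTime();
        busyMember.replyToCreateRegion();  // waits until the loader releases the lock
        long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        loader.join();
        return waited;
    }

    public static void main(String[] args) {
        System.out.println("reply delayed by about "
            + measureReplyDelayMillis(300) + " ms");
    }
}
```

In this sketch the requester's wait grows with the simulated load time, which mirrors how a member waiting for CreateRegionMessage replies would stall for longer as the disk store on the replying member gets larger.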


