Posted to oak-issues@jackrabbit.apache.org by "Julian Reschke (JIRA)" <ji...@apache.org> on 2017/01/12 14:46:52 UTC

[jira] [Commented] (OAK-5446) Lease Impossible to Renew once Failure Margin Passed

    [ https://issues.apache.org/jira/browse/OAK-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821151#comment-15821151 ] 

Julian Reschke commented on OAK-5446:
-------------------------------------

Would be good to have a unit test (looking now).

FWIW, another way to avoid this would be to special-case the CLUSTER collection in the {{LeaseCheckDocumentStoreWrapper}}.
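
A rough sketch of what that special-casing could look like. All types here ({{Store}}, {{InMemoryStore}}, {{LeaseCheckingStore}}, the {{Collection}} enum and the {{leaseEnd}} key) are simplified, hypothetical stand-ins, not Oak's actual {{DocumentStore}} API, and the lease check is reduced to a plain timestamp comparison:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical, simplified stand-ins; none of this is Oak's real API.
final class ClusterSpecialCaseSketch {

    enum Collection { NODES, CLUSTER }

    static final String LEASE_END = "leaseEnd";

    interface Store {
        void update(Collection collection, String key, long value);
        Long read(Collection collection, String key);
    }

    /** Plays the role of the "real" store. */
    static final class InMemoryStore implements Store {
        private final Map<String, Long> data = new HashMap<>();

        @Override
        public void update(Collection collection, String key, long value) {
            data.put(collection + "/" + key, value);
        }

        @Override
        public Long read(Collection collection, String key) {
            return data.get(collection + "/" + key);
        }
    }

    /** Wrapper that checks the lease before every call, except for CLUSTER. */
    static final class LeaseCheckingStore implements Store {
        private final Store delegate;
        private final LongSupplier clock;

        LeaseCheckingStore(Store delegate, LongSupplier clock) {
            this.delegate = delegate;
            this.clock = clock;
        }

        private void checkLease(Collection collection) {
            if (collection == Collection.CLUSTER) {
                return; // special case: lease renewal must never be blocked
            }
            Long leaseEnd = delegate.read(Collection.CLUSTER, LEASE_END);
            if (leaseEnd == null || clock.getAsLong() >= leaseEnd) {
                throw new IllegalStateException("lease expired");
            }
        }

        @Override
        public void update(Collection collection, String key, long value) {
            checkLease(collection);
            delegate.update(collection, key, value);
        }

        @Override
        public Long read(Collection collection, String key) {
            checkLease(collection);
            return delegate.read(collection, key);
        }
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock so the example needs no sleeping
        InMemoryStore real = new InMemoryStore();
        real.update(Collection.CLUSTER, LEASE_END, 100L);
        Store wrapped = new LeaseCheckingStore(real, () -> now[0]);

        now[0] = 150L; // the process "skipped" past the lease end
        try {
            wrapped.update(Collection.NODES, "n1", 1L);
        } catch (IllegalStateException e) {
            System.out.println("NODES write refused: " + e.getMessage());
        }

        // Renewal goes through the wrapper but is not subject to the check.
        wrapped.update(Collection.CLUSTER, LEASE_END, now[0] + 100L);
        wrapped.update(Collection.NODES, "n1", 1L);
        System.out.println("n1 = " + real.read(Collection.NODES, "n1"));
    }
}
```

The point of the sketch is only that the exemption lives in one place in the wrapper, so callers of the wrapped store would not need to change.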

> Lease Impossible to Renew once Failure Margin Passed
> ----------------------------------------------------
>
>                 Key: OAK-5446
>                 URL: https://issues.apache.org/jira/browse/OAK-5446
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.4, 1.5.14
>            Reporter: Stefan Eissing
>            Priority: Blocker
>
> We are fighting with cluster nodes losing their lease and shutting down oak-core in a cloud environment. For reasons unknown at this point in time, the whole process seems to skip about two minutes of real time.
> This is a situation from which Oak currently does not recover. Code analysis shows that {{ClusterNodeInfo}} is handed the {{LeaseCheckDocumentStoreWrapper}} instance to use as its store. This is fatal, since any action {{renewLease()}} attempts will first invoke {{performLeaseCheck()}}. Once the {{FailureMargin}} is reached, the lease check will _stall the renewLease() thread_ for 5 retry attempts and then declare the lease to be lost.
> The {{ClusterNodeInfo}} should instead be using the "real" {{DocumentStore}}, not the wrapped one, IMO.
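
The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: {{Store}}, {{RealStore}}, {{LeaseCheckWrapper}} and {{Renewer}} are hypothetical stand-ins for the Oak classes named in the report, the margin and lease values are arbitrary, and the retry loop merely counts attempts instead of waiting. Only the count of 5 retries is taken from the description.

```java
import java.util.function.LongSupplier;

// Hypothetical simulation of the reported problem; not Oak's real classes.
final class LeaseRenewalSketch {

    interface Store {
        void writeLeaseEnd(long leaseEnd);
        long readLeaseEnd();
    }

    /** Stand-in for the "real" DocumentStore. */
    static final class RealStore implements Store {
        private long leaseEnd;

        @Override
        public void writeLeaseEnd(long leaseEnd) { this.leaseEnd = leaseEnd; }

        @Override
        public long readLeaseEnd() { return leaseEnd; }
    }

    /** Stand-in for the lease-checking wrapper: every call checks the lease first. */
    static final class LeaseCheckWrapper implements Store {
        static final int MAX_RETRIES = 5; // retry count taken from the report
        private final Store real;
        private final LongSupplier clock;
        private final long failureMargin; // assumed value, chosen by the caller
        int retriesUsed;

        LeaseCheckWrapper(Store real, LongSupplier clock, long failureMargin) {
            this.real = real;
            this.clock = clock;
            this.failureMargin = failureMargin;
        }

        private void performLeaseCheck() {
            for (int i = 0; i < MAX_RETRIES; i++) {
                if (clock.getAsLong() < real.readLeaseEnd() - failureMargin) {
                    return; // lease still comfortably valid
                }
                retriesUsed++; // a real implementation would presumably pause here
            }
            throw new IllegalStateException("lease lost");
        }

        @Override
        public void writeLeaseEnd(long leaseEnd) {
            performLeaseCheck();
            real.writeLeaseEnd(leaseEnd);
        }

        @Override
        public long readLeaseEnd() {
            performLeaseCheck();
            return real.readLeaseEnd();
        }
    }

    /** Stand-in for ClusterNodeInfo: renews the lease through whatever store it was given. */
    static final class Renewer {
        private final Store store;
        private final LongSupplier clock;
        private final long leaseDuration;

        Renewer(Store store, LongSupplier clock, long leaseDuration) {
            this.store = store;
            this.clock = clock;
            this.leaseDuration = leaseDuration;
        }

        void renewLease() {
            store.writeLeaseEnd(clock.getAsLong() + leaseDuration);
        }
    }

    public static void main(String[] args) {
        long[] now = {110L}; // fake clock, already inside the failure margin
        RealStore real = new RealStore();
        real.writeLeaseEnd(120L);
        LeaseCheckWrapper wrapped = new LeaseCheckWrapper(real, () -> now[0], 20L);

        try {
            new Renewer(wrapped, () -> now[0], 120L).renewLease();
        } catch (IllegalStateException e) {
            System.out.println("renewal via wrapper failed: " + e.getMessage());
        }

        // Handing the renewer the unwrapped store lets the renewal go through.
        new Renewer(real, () -> now[0], 120L).renewLease();
        System.out.println("lease end after direct renewal: " + real.readLeaseEnd());
    }
}
```

In this model the renewal is the only operation that could move the lease end forward, so routing it through the check means the check can never pass again once the margin is reached.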



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)