Posted to issues@geode.apache.org by "Mark Hanson (Jira)" <ji...@apache.org> on 2022/03/25 23:25:00 UTC

[jira] [Commented] (GEODE-9704) When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions

    [ https://issues.apache.org/jira/browse/GEODE-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512627#comment-17512627 ] 

Mark Hanson commented on GEODE-9704:
------------------------------------

PR 7442 is available. I have made changes to fix the behavior that was causing the problems. The core of the problem was that registerInterest should be called before readyForEvents. The order was effectively reversed, and that has been corrected.

LocalRegionUpdateTest.java was created to house two unit tests for the new code.

AuthExpirationDUnitTest has a test by Jinmei that would typically be flaky. It has been uncommented and, with this fix, no longer fails.

I believe this bug is done with the exception of the review phase of the PR and associated changes.

> When durable clients recovers, it sends "ready for event" signal before register for interest, this might cause problem for caching_proxy regions
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-9704
>                 URL: https://issues.apache.org/jira/browse/GEODE-9704
>             Project: Geode
>          Issue Type: Bug
>          Components: regions
>    Affects Versions: 1.15.0
>            Reporter: Jinmei Liao
>            Assignee: Mark Hanson
>            Priority: Major
>              Labels: GeodeOperationAPI, blocks-1.15.1, pull-request-available
>
> This is the old Geode behavior, but may or may not be the correct behavior.
> When a durable client recovers, a queueTimer thread runs the `QueueManagerImp.recoverPrimary` method, which:
>  - makes a new connection to the server
>  - sends readyForEvents (which causes the server to start sending the queued events)
>  - recovers interest
>   - clears the region of keys of interest
>   - re-registers interest
> Because it sends readyForEvents before it clears the region of keys of interest, any events the server sends for those keys in between are cleared, so it appears to the user that the client region doesn't have those keys.
>  
> To reproduce, run the geode-core distributedTest AuthExpirationDUnitTest.registeredInterest_slowReAuth_policyKeys_durableClient() with the InterestResultPolicy changed to NONE; the test fails occasionally. Adding sleep code in QueueManagerImp.recoverPrimary between `createNewPrimary` and `recoverInterest` makes the test fail more consistently.
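
The ordering problem described above can be illustrated with a small, self-contained simulation. This is only a sketch and does not use Geode classes: `RecoveryOrderSketch`, `Client`, `deliverQueuedEvents`, and `recover` are hypothetical names, the client region is modeled as a plain map, and the server's durable queue as a list. Interest is assumed to be re-registered with a policy of NONE, so no initial values are fetched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RecoveryOrderSketch {

    /** Hypothetical stand-in for a client region plus the server-side durable queue. */
    static final class Client {
        final Map<String, String> region = new HashMap<>();
        final List<String[]> serverQueue = new ArrayList<>();
        boolean ready = false;

        // Models the "ready for events" signal sent to the server.
        void readyForEvents() {
            ready = true;
        }

        // Models the server draining its queue; nothing is sent until the client is ready.
        void deliverQueuedEvents() {
            if (!ready) {
                return;
            }
            for (String[] event : serverQueue) {
                region.put(event[0], event[1]);
            }
            serverQueue.clear();
        }

        // Models interest recovery: clear the keys of interest, then re-register.
        // With a result policy of NONE (assumed here), no values are fetched back.
        void recoverInterest(List<String> keysOfInterest) {
            for (String key : keysOfInterest) {
                region.remove(key);
            }
        }
    }

    /** Runs one recovery and returns the resulting client region contents. */
    static Map<String, String> recover(boolean readyForEventsFirst) {
        Client client = new Client();
        client.serverQueue.add(new String[] {"key1", "value1"});
        List<String> interest = Collections.singletonList("key1");

        if (readyForEventsFirst) {
            // Old order: a queued event arrives in the window before the clear.
            client.readyForEvents();
            client.deliverQueuedEvents();
            client.recoverInterest(interest);
        } else {
            // Corrected order: interest is recovered before events start flowing.
            client.recoverInterest(interest);
            client.readyForEvents();
            client.deliverQueuedEvents();
        }
        return client.region;
    }

    public static void main(String[] args) {
        System.out.println("readyForEvents first:  " + recover(true));  // prints {}
        System.out.println("recoverInterest first: " + recover(false)); // prints {key1=value1}
    }
}
```

In the first run the queued event is applied and then wiped by the clear, which matches the symptom of the client region appearing not to have the key. In the real client the delivery is asynchronous, so the loss happens only when an event lands in that window, which is consistent with the test failing occasionally rather than always.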


