Posted to issues@geode.apache.org by "nabarun (JIRA)" <ji...@apache.org> on 2018/10/03 21:38:24 UTC

[jira] [Closed] (GEODE-5513) Clients may miss PR region events due to race during registerInterest

     [ https://issues.apache.org/jira/browse/GEODE-5513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nabarun closed GEODE-5513.
--------------------------

> Clients may miss PR region events due to race during registerInterest
> ---------------------------------------------------------------------
>
>                 Key: GEODE-5513
>                 URL: https://issues.apache.org/jira/browse/GEODE-5513
>             Project: Geode
>          Issue Type: Bug
>          Components: client queues
>    Affects Versions: 1.6.0
>            Reporter: Kenneth Howe
>            Assignee: Kenneth Howe
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Here is the scenario:
>  Consider two servers and two clients:
>  - Server1 hosting the primary bucket
>  - Server2 hosting the secondary bucket and also the primary queue for Client2
>  - Client1 performing a remove-all operation
>  - Client2 registering interest
>  - Client1 starts a remove-all operation.
>  - At the same time, Client2 is registering interest.
>  - Server1 receives the remove-all operation, processes it, and sends the adjunct message to Server2. (Server1 has still not received the interest info from Server2.)
>  - While the remove-all message to Server2 is in flight, Server2 sends the interest profile info for Client2 to Server1 and then, as host of the primary queue, starts building the initial image snapshot for the interest.
>  - When building the initial image for a PR, preference is given to collecting data from the local node. At this point the removal message is still in flight and has not been applied on Server2, so the initial image for the interest registration is computed from local data and sent to the client, missing the remove-all op.
> This can also happen with non-bulk ops, but it is worse with bulk ops because they take longer to replicate.
> The solution is to build the initial register-interest response by getting the data from the primary bucket. This adds a little overhead in building the interest response, but since the register-interest response will in most or all cases involve a remote node anyway, the extra cost may be negligible.
> Clients registering interest in a region
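
The race and the proposed fix described above can be illustrated with a small toy model. This is only a sketch in plain Java, not Geode code: `BucketCopy`, `SnapshotSource`, `buildInitialImage` and `simulate` are hypothetical names invented for illustration, and the keys `k1`..`k3` are made up. It assumes, as in the scenario, that no event for the remove-all reaches Client2 afterwards, so whatever the initial image contains is what the client is left with.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the register-interest race; none of these types are Geode classes. */
public class RegisterInterestRaceSketch {

    /** One copy of a bucket (primary or secondary), reduced to a key/value map. */
    public static final class BucketCopy {
        final Map<String, String> entries = new ConcurrentHashMap<>();
    }

    /** Where the initial image for a register-interest request is read from. */
    public enum SnapshotSource { LOCAL_COPY, PRIMARY_BUCKET }

    /** Builds the initial image (here: just the key set) from the chosen copy. */
    public static Set<String> buildInitialImage(
            SnapshotSource source, BucketCopy primary, BucketCopy local) {
        BucketCopy chosen = (source == SnapshotSource.PRIMARY_BUCKET) ? primary : local;
        return new TreeSet<>(chosen.entries.keySet());
    }

    /** Replays the scenario in a fixed order and returns what Client2 would see. */
    public static Set<String> simulate(SnapshotSource source) {
        BucketCopy primary = new BucketCopy();   // held by Server1
        BucketCopy secondary = new BucketCopy(); // held by Server2 (also Client2's primary queue)
        for (String key : Arrays.asList("k1", "k2", "k3")) {
            primary.entries.put(key, "v");
            secondary.entries.put(key, "v");
        }

        // Client1's remove-all is applied on the primary bucket first.
        primary.entries.keySet().removeAll(Arrays.asList("k1", "k2"));

        // The replication message to Server2 is still in flight when Server2
        // builds the initial image for Client2's interest registration.
        Set<String> initialImage = buildInitialImage(source, primary, secondary);

        // Only now does the remove-all arrive at the secondary copy.
        secondary.entries.keySet().removeAll(Arrays.asList("k1", "k2"));

        return initialImage;
    }

    public static void main(String[] args) {
        // Reading the local (secondary) copy returns the removed keys as well.
        System.out.println("local copy:     " + simulate(SnapshotSource.LOCAL_COPY));
        // Reading the primary bucket reflects the remove-all.
        System.out.println("primary bucket: " + simulate(SnapshotSource.PRIMARY_BUCKET));
    }
}
```

In this model the local-copy snapshot still contains `k1` and `k2`, while the primary-bucket snapshot contains only `k3`, which is the behavior the fix aims for. The model fixes the interleaving by hand; in a real cluster the ordering depends on message timing.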



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)