Posted to dev@geode.apache.org by "Jason Huynh (JIRA)" <ji...@apache.org> on 2016/12/21 22:51:58 UTC

[jira] [Commented] (GEODE-2205) Race condition in startup of ConcurrentSerialGatewaySenderProcessor

    [ https://issues.apache.org/jira/browse/GEODE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768402#comment-15768402 ] 

Jason Huynh commented on GEODE-2205:
------------------------------------

Committed as 164f04fbd85f20de6c7f9edef267d3f48463a954.
The commit was made with the incorrect JIRA number GEODE-2215; it should have been GEODE-2205.

Commit 164f04fbd85f20de6c7f9edef267d3f48463a954 in geode's branch refs/heads/develop from Jason Huynh
[ https://git-wip-us.apache.org/repos/asf?p=geode.git;h=164f04f ]
GEODE-2215: GatewaySenderAdvisor checks the current processor to see if it has started
Previously it was checking the top level sender (possibly a concurrent sender).
This allowed a race condition where the top level sender was still starting up
but the individual processors were ready to process. They would check the flag
and, because the sender was not ready, the processors would act and start initiating
failover, which left the processor in an inconsistent state.
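
As a rough illustration of the change described in the commit message, here is a minimal sketch of a single pass through the advisor's wait loop. It is not the Geode code: WaitForPrimarySketch, Processor, Outcome and poll are hypothetical names, and the lifecycle flag is modelled as a simple "running" boolean whose name and polarity may differ from the real implementation. The only point is that the outcome depends on which processor's flag is consulted.

```java
// Simplified, hypothetical model; none of these types exist in Geode.
final class WaitForPrimarySketch {

    /** Stand-in for an event processor that exposes a lifecycle flag. */
    static final class Processor {
        private volatile boolean running; // false until the processor has started

        void setRunning(boolean value) { running = value; }

        boolean isRunning() { return running; }
    }

    enum Outcome { BECAME_PRIMARY, LEFT_LOOP_EARLY, KEEP_WAITING }

    /**
     * One pass through the wait loop. The caller chooses which processor's
     * flag is consulted: the top level one (behaviour before the commit) or
     * the calling processor itself (behaviour after the commit).
     */
    static Outcome poll(boolean isPrimary, Processor consulted) {
        if (isPrimary) {
            return Outcome.BECAME_PRIMARY;
        }
        if (!consulted.isRunning()) {
            // The consulted processor does not look started, so stop waiting.
            return Outcome.LEFT_LOOP_EARLY;
        }
        return Outcome.KEEP_WAITING;
    }
}
```

With a top level processor that has not finished starting and a child that is already running, poll(false, topLevel) returns LEFT_LOOP_EARLY, while poll(false, child) returns KEEP_WAITING, which is the behaviour the commit is aiming for.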

> Race condition in startup of ConcurrentSerialGatewaySenderProcessor
> -------------------------------------------------------------------
>
>                 Key: GEODE-2205
>                 URL: https://issues.apache.org/jira/browse/GEODE-2205
>             Project: Geode
>          Issue Type: Bug
>          Components: wan
>            Reporter: Jason Huynh
>            Assignee: Jason Huynh
>
> ConcurrentSerialGatewayEventSenderProcessor spins up the individual SerialGatewayEventSenderProcessors. During this time, the individual processors call waitForPrimary on the GatewaySenderAdvisor. The advisor uses the stopped flag from ConcurrentSerialGatewayEventSenderProcessor, which starts off as false (it is only set to true after all serial processors are started).
> This is where the timing issue arises. If the serial processors start up while the GatewaySenderAdvisor is using the flag from the concurrent processor, the serial processors break out of the waitForPrimary loop and then try to handle failover. The concurrent processor eventually sets its flag to true and everything continues to run.
> If a serial processor was not a primary, it stays a secondary and is left in an inconsistent state in which anything enqueued throws an assertion error.
> This issue was introduced by the changes in GEODE-2107: c4ae846aa1689e2c5659b6ecc17e38689dd93976
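
The startup ordering in the description can be replayed single-threaded to make the timing window easier to see. This is only a sketch under assumptions: StartupRaceSketch and strandedSecondaries are hypothetical names, the concurrent processor's flag is reduced to a local boolean, and no real Geode classes or threads are involved.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical replay of the startup order; not Geode code.
final class StartupRaceSketch {

    /**
     * primaryByChild[i] says whether serial processor i already holds
     * primary status when it starts. Returns the indexes of the processors
     * that leave the wait loop while they are still secondary.
     */
    static List<Integer> strandedSecondaries(boolean[] primaryByChild) {
        // Flag owned by the concurrent (parent) processor. It only flips
        // after every serial processor has been started.
        boolean parentFlagSet = false;
        List<Integer> stranded = new ArrayList<>();

        for (int i = 0; i < primaryByChild.length; i++) {
            // Serial processor i is started and immediately waits for primary.
            boolean isPrimary = primaryByChild[i];
            // The advisor consults the parent's flag, which is not set yet,
            // so a non-primary processor stops waiting and tries to fail over.
            boolean leavesLoopEarly = !isPrimary && !parentFlagSet;
            if (leavesLoopEarly) {
                stranded.add(i);
            }
        }

        parentFlagSet = true; // set too late for the processors recorded above
        return stranded;
    }
}
```

For example, with three serial processors of which only the first is primary, the other two are reported as stranded; if every processor is primary, the result is empty.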


