You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Barry Oglesby (JIRA)" <ji...@apache.org> on 2017/10/02 19:20:00 UTC

[jira] [Commented] (GEODE-3730) With gemfire.GatewaySender.REMOVE_FROM_QUEUE_ON_EXCEPTION=false, gateway events can still be removed from the queue when an exception occurs

    [ https://issues.apache.org/jira/browse/GEODE-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188658#comment-16188658 ] 

Barry Oglesby commented on GEODE-3730:
--------------------------------------

I see whats going on with this issue.

A connection is re-established (meaning the remote site members went down and have come back up):
{noformat}
[info 2017/09/11 13:50:24.910 PDT cluster1-server2 <Event Processor for GatewaySender_ny_0> tid=0x56] ny: Using Pooled Connection to server-2:3528: Connection[server-2:3528]@295453478
{noformat}
The {{Event Processor}} sends 2 batches in succession. Batch 1 contains event with tradeId=1768. Batch 2 contains event with tradeId=1770.
{noformat}
[info 2017/09/11 13:50:25.230 PDT cluster1-server2 <Event Processor for GatewaySender_ny_0> tid=0x56] During normal processing, dispatching the following : 1 events (batch #1):
  	Event EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0]:1768->Trade [tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1768.order.1768, executingBroker=executingBroker1768]1781

[info 2017/09/11 13:50:25.444 PDT cluster1-server2 <Event Processor for GatewaySender_ny_0> tid=0x56] During normal processing, dispatching the following : 1 events (batch #2):
  	Event EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10]:1770->Trade [tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1770.order.1770, executingBroker=executingBroker1770]1804
{noformat}
The {{AckReaderThread}} receives the ack for batch 1with tradeId=1768. It contains a {{BatchException}}. Since {{gemfire.GatewaySender.REMOVE_FROM_QUEUE_ON_EXCEPTION=false}}, the event is not removed from the queue.
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] XXX AckReaderThread.run read ack batchId=1; gotBatchException=true

[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] A BatchException occurred processing event 0
com.gemstone.gemfire.internal.cache.wan.BatchException70: Exception occurred while processing a batch on the receiver running on DistributedSystem with Id: 3, DistributedMember on which the receiver is running: server-2(cluster3-server2:64368)<v18>:58108
Caused by: com.gemstone.gemfire.cache.persistence.PartitionOfflineException: Region /trade bucket 0 has persistent data that is no longer online stored at these locations: [/192.168.2.14:/regionData created at timestamp 1505162997703 version 0 diskStoreId ff1a27e2ec154f39-a906b607a314e030 name null]

[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] The event being processed when the BatchException occurred was:  SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade [tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1768.order.1768, executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument [originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2, 3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey= 1781;timeStamp=1505163025216;acked=false;dispatched=false]
{noformat}
The {{AbstractGatewaySenderEventProcessor batchIdToEventsMap}} contains two in-progress events (one for each of the sent batches):
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] XXX AckReaderThread.run events=
  	batchId=1
  		value=[[SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade [tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1768.order.1768, executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument [originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2, 3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey= 1781;timeStamp=1505163025216;acked=false;dispatched=false]], [SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade [tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1768.order.1768, executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument [originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2, 3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey= 1781;timeStamp=1505163025216;acked=false;dispatched=false]]]
  	batchId=2
  		value=[[SenderEventImpl[id=EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10];action=0;operation=CREATE;region=/trade;key=1770;value=Trade [tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1770.order.1770, executingBroker=executingBroker1770];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument [originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2, 3, 4]];possibleDuplicate=false;creationTime=1505163025430;shadowKey= 1804;timeStamp=1505163025428;acked=false;dispatched=false]], [SenderEventImpl[id=EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10];action=0;operation=CREATE;region=/trade;key=1770;value=Trade [tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1770.order.1770, executingBroker=executingBroker1770];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument [originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2, 3, 4]];possibleDuplicate=false;creationTime=1505163025430;shadowKey= 1804;timeStamp=1505163025428;acked=false;dispatched=false]]]
{noformat}
The {{AckReaderThread}} receives the ack for batch 2 with tradeId=1770. Since it does not contain a {{BatchException}}, the {{GatewayEventFilterAdapter afterAcknowledgement}} method is invoked.
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] XXX AckReaderThread.run read ack batchId=2; gotBatchException=false

[info 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] Received ACK for GatewayQueueEvent[key=1770,operation=CREATE]
{noformat}
The event with tradeId=1770 is attempted to be removed from the queue, but since remove just removes the head of the queue (not the key), the wrong event is removed (tradeId=1768) .
{noformat}
[warning 2017/09/11 13:50:25.574 PDT cluster1-server2 <AckReaderThread for : Event Processor for GatewaySender_ny_0> tid=0xfa] XXX ParallelGatewaySenderQueue.remove about to remove event=SenderEventImpl[id=EventID[threadID=0x10000|4;sequenceID=1770;bucketId=0];action=0;operation=CREATE;region=/trade;key=1768;value=Trade [tradeId=1768, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1768.order.1768, executingBroker=executingBroker1768];valueIsObject=1;numberOfParts=9;callbackArgument=GatewaySenderEventCallbackArgument [originalCallbackArg=null;originatingSenderId=1;recipientGatewayReceivers=[2, 3, 4]];possibleDuplicate=false;creationTime=1505163025218;shadowKey= 1781;timeStamp=1505163025216;acked=false;dispatched=false]
{noformat}
The next batch is sent containing the already-processed event with tradeId=1770. The failed event with tradeId=1768 is never processed.
{noformat}
[info 2017/09/11 13:50:25.650 PDT cluster1-server2 <Event Processor for GatewaySender_ny_0> tid=0x56] During normal processing, dispatching the following : 1 events (batch #3):
  	Event EventID[threadID=0x1000a|4;sequenceID=1772;bucketId=10]:1770->Trade [tradeId=1770, tradeTime=Mon Sep 11 13:50:25 PDT 2017, orderNumber=1770.order.1770, executingBroker=executingBroker1770]1804
{noformat}



> With gemfire.GatewaySender.REMOVE_FROM_QUEUE_ON_EXCEPTION=false, gateway events can still be removed from the queue when an exception occurs
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-3730
>                 URL: https://issues.apache.org/jira/browse/GEODE-3730
>             Project: Geode
>          Issue Type: Bug
>          Components: wan
>            Reporter: Barry Oglesby
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)