You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Eric Shu (Jira)" <ji...@apache.org> on 2020/08/27 23:04:00 UTC

[jira] [Assigned] (GEODE-8465) A stray event can resurrect on a client due to a race condition

     [ https://issues.apache.org/jira/browse/GEODE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Shu reassigned GEODE-8465:
-------------------------------

    Assignee: Eric Shu

> A stray event can resurrect on a client due to a race condition
> ---------------------------------------------------------------
>
>                 Key: GEODE-8465
>                 URL: https://issues.apache.org/jira/browse/GEODE-8465
>             Project: Geode
>          Issue Type: Bug
>          Components: client queues
>            Reporter: Eric Shu
>            Assignee: Eric Shu
>            Priority: Major
>
> A member with primary queue could send the QRM for the event before it removes the event from its HARegionQueue – this causes a race that a newly started member can gii and get the events from the member with primary queue after it has already sent the QRM for the event to existing members holding the secondary queue.
> The following analysis showing that the race exists. 
> {noformat}
> The stray create of Object_5234on edge 10634 shows this issue with this cause.
> From edge 10634, we know the stray create is for v3 of the object and from bridge 15141.
> [fine 2020/06/17 16:38:49.621 PDT edgegemfire_2_1_host1_10634 <Cache Client Updater Thread on rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_15141:15141)<ec><v35>:41007(version:UNKNOWN[ordinal=125]) port 26905> tid=0x541] VersionedThinRegionEntryHeapStringKey2@1b56c386 (key=Object_5234; rawValue=VMCachedDeserializable@526996532; version=
> {v3; rv28; mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750)<ec><v27>:41007; ds=2; time=1592434556093}
> ;member=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750)<ec><v27>:41007) dispatching event EntryEventImpl[op=CREATE;region=/DefaultRegion;key=Object_5234;oldValue=null;newValue=VMCachedDeserializable@526996532;callbackArg=Update event originated in pid 10655;originRemote=true;originMember=rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655;version=
> {v3; rv28; mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750)<ec><v27>:41007; ds=2; time=1592434556093; remote}
> ;id=EventIDid=52bytes;threadID=52;sequenceID=108];isFromServer]
> [fine 2020/06/17 16:38:49.621 PDT edgegemfire_2_1_host1_10634 <Cache Client Updater Thread on rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_15141:15141)<ec><v35>:41007(version:UNKNOWN[ordinal=125]) port 26905> tid=0x541] Put entry for region: /DefaultRegion key: Object_5234 callbackArgument: Update event originated in pid 10655
> ***This version (v3) is an update. The primary queue at the time is bridgegemfire_2_2_host1_10538. The event is enqueued and dispatched ***
> [fine 2020/06/17 15:55:56.098 PDT bridgegemfire_2_2_host1_10538 <P2P message reader for rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]) unshared ordered uid=358 dom #1 port=43914> tid=0x16d] DefaultRegion: notifying 1 bridge servers of event: EntryEventImpl[op=UPDATE;region=/DefaultRegion;key=Object_5234;oldValue=VMCachedDeserializable@1538889434;newValue=VMCachedDeserializable@2141181294;callbackArg=Update event originated in pid 10655;originRemote=true;originMember=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]);version=
> {v3; rv28; mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]); ds=2; time=1592434556093; remote}
> ;context=identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655,connection=1;id=EventIDid=52bytes;threadID=52;sequenceID=108];routing=]
> [fine 2020/06/17 15:55:56.098 PDT bridgegemfire_2_2_host1_10538 <P2P message reader for rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]) unshared ordered uid=358 dom #1 port=43914> tid=0x16d] VMStatsRegionEntryHeapLongKey@38ccc0e5 (key=3867; rawValue=HAEventWrapper[refCount=2; msg=ClientUpdateMessageImpl[op=AFTER_UPDATE;region=/DefaultRegion;key=Object_5234;isObject=1;cbArg=Update event originated in pid 10655;memberId=identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655,connection=1;eventId=EventIDid=52bytes;threadID=52;sequenceID=108];shouldConflate=false;versionTag=
> {v3; rv28; mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]); ds=2; time=1592434556093; remote}
> ;hasCqs=false]]) dispatching event EntryEventImpl[op=CREATE;region=/_gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue;key=3867;oldValue=null;newValue=HAEventWrapper[refCount=2; msg=ClientUpdateMessageImpl[op=AFTER_UPDATE;region=/DefaultRegion;key=Object_5234;isObject=1;cbArg=Update event originated in pid 10655;memberId=identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655,connection=1;eventId=EventIDid=52bytes;threadID=52;sequenceID=108];shouldConflate=false;versionTag=
> {v3; rv28; mbr=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125]); ds=2; time=1592434556093; remote}
> ;hasCqs=false]];callbackArg=null;originRemote=false;originMember=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_2_host1_10538:10538)<ec><v2>:41006]
> ***QRM is sent to the 2 peers at the time with the region ***
> [fine 2020/06/17 15:55:56.213 PDT bridgegemfire_2_2_host1_10538 <Queue Removal Thread> tid=0x83] Sending (QueueRemovalMessage[_gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue, 1, EventIDid=52bytes;threadID=52;sequenceID=108], _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_3_host1_10666:10666:loner):60462:26db73c4:edgegemfire_2_3_host1_10666_1_queue, 1, EventIDid=52bytes;threadID=52;sequenceID=108]]) to 2 peers ([rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_1_host1_18750:18750)<ec><v27>:41007(version:UNKNOWN[ordinal=125])@277, rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_4_host1_23574:23574)<ec><v31>:41011(version:UNKNOWN[ordinal=125])@566]) via tcp/ip
> ***the newly started bridgegemfire_2_3_24320 creates the HARegionQueue and gii from the member with primary queue (10538) ***
> [info 2020/06/17 15:56:08.800 PDT <Client Queue Initialization Thread 1> tid=0x76] Initializing region _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [info 2020/06/17 15:56:09.285 PDT <Client Queue Initialization Thread 1> tid=0x76] _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue is done getting image from rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_2_host1_10538:10538)<ec><v2>:41006(version:GEODE 1.3.0). isDeltaGII is false
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> tid=0x76] adding haContainerKey to HARegion at 1:HAEventWrapper[region=/DefaultRegion; key=Object_5234; refCount=1; putInProgress=0; event=EventID[rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655;threadID=52;sequenceID=108]; no message] for _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> tid=0x76] invoking listeners: [org.apache.geode.internal.cache.ha.HARegionQueue$1@2e69472b]
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> tid=0x76] Adding position 1 to available IDs. Region: _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [debug 2020/06/17 15:56:09.377 PDT <Client Queue Initialization Thread 1> tid=0x76] RegionQueue on _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue(backup) done putting GII data into queue
> ***Please note that the event was being removed from its primary queue after QRM has been sent and bridge 24320 gii the events from it ***
> [fine 2020/06/17 15:56:09.666 PDT bridgegemfire_2_2_host1_10538 <Client Message Dispatcher for rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634> tid=0x84] VMStatsRegionEntryHeapLongKey@38ccc0e5 (key=3867; rawValue=REMOVED_PHASE2) dispatching event EntryEventImpl[op=LOCAL_DESTROY;region=/_gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue;key=3867;oldValue=HAEventWrapper[region=/DefaultRegion; key=Object_5234; refCount=1; inContainer=true; putInProgress=false; event=EventIDid=52bytes;threadID=52;sequenceID=108]; no message];newValue=null;callbackArg=null;originRemote=false;originMember=rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_2_host1_10538:10538)<ec><v2>:41006]
> ***Bridge 24320 now has the event for Object_5234(v3) that has been sent to the client and won't get QRM again ***
> ***Later, bridge 15141 gii from 24320 and got the event ***
> [info 2020/06/17 16:17:36.454 PDT <Client Queue Initialization Thread 2> tid=0x78] Initializing region _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue
> [info 2020/06/17 16:17:40.187 PDT <Client Queue Initialization Thread 2> tid=0x78] _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue is done getting image from rs-GEM-2944-1535-hydra-client-1(bridgegemfire_2_3_host1_24320:24320)<ec><v33>:41010. isDeltaGII is false
> [debug 2020/06/17 16:17:40.415 PDT <Client Queue Initialization Thread 2> tid=0x78] _gfe_non_durable_client_with_id_rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634_1_queue processing queue key 1 and value HAEventWrapper[region=/DefaultRegion; key=Object_5234; refCount=1; putInProgress=0; event=EventID[rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_2_host1_10655:loner):60216:b0d873c4:edgegemfire_2_2_host1_10655;threadID=52;sequenceID=108]; no message]
> ***It will send the event to client after it becomes the primary queue holder for client 10634 ***
> [debug 2020/06/17 16:38:49.125 PDT <ServerConnection on port 26905 Thread 5> tid=0xb4] CacheClientProxy[identity(rs-GEM-2944-1535-hydra-client-1(edgegemfire_2_1_host1_10634:10634:loner):51062:4bd673c4:edgegemfire_2_1_host1_10634,connection=1; port=53006; primary=true; version=GEODE 1.3.0]: Queueing marker message. <or...@75bf95c8>. The queue contains 57 entries.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)