Posted to issues@geode.apache.org by "Jens Deppe (JIRA)" <ji...@apache.org> on 2016/01/07 18:33:39 UTC
[jira] [Created] (GEODE-738) DEFAULT diskstore is missing in some
members in cluster config tests
Jens Deppe created GEODE-738:
--------------------------------
Summary: DEFAULT diskstore is missing in some members in cluster config tests
Key: GEODE-738
URL: https://issues.apache.org/jira/browse/GEODE-738
Project: Geode
Issue Type: Bug
Components: management
Reporter: Jens Deppe
This was observed in the 8.1 regression run.
{noformat}
Host name: w1-gst-dev04
OS name: Linux
Architecture: amd64
OS version: 3.10.0-123.el7.x86_64
Java version: 1.7.0_72
Java vm name: Java HotSpot(TM) 64-Bit Server VM
Java vendor: Oracle Corporation
Java home: /export/gcm/where/jdk/1.7.0_72/x86_64.linux/jre
#####################################################
GemFire Version 8.1.0.6
Source Date: Fri, 22 May 2015 10:39:59 -0700
Source Revision: 6aa105334252e99556c4a01c70abacdb20fc033b
Source Repository: gemfire810X_maint
Build Id: build 052215
Build Date: 05/22/2015 11:09:03 PDT
Build Version: 8.1.0.6 build 052215 05/22/2015 11:09:03 PDT javac 1.7.0_72
Build JDK: Java 1.7.0_72
Build Platform: Linux 2.6.32-220.23.1.el6.x86_64 i386
#####################################################
Test was run from /export/snaps/gfe/81/Linux/snapshots.052215/gf810XMaintsancout/tests/classes/management/clusterconfig/clusterConfig.bt
Test:
management/clusterconfig/serialCacheRegionEntryGroupOpsRecycleVM.conf
A=dataStoreGroup1
B=dataStoreGroup2
C=cli
cliHosts=2
cliThreadsPerVM=1
cliVMsPerHost=1
dataStoreGroup1Hosts=3
dataStoreGroup1ThreadsPerVM=1
dataStoreGroup1VMsPerHost=1
dataStoreGroup2Hosts=3
dataStoreGroup2ThreadsPerVM=1
dataStoreGroup2VMsPerHost=1
locatorHosts=2
locatorThreadsPerVM=1
locatorVMsPerHost=1
numDataStoreMembersToHostRegion=0
numMembersCreateCacheOnly=6
numMembersJoinDSOnly=0
redundantCopies=2
No local.conf for this run
//randomSeed extracted from test:
hydra.Prms-randomSeed=1432324011057;
*** Test failed with this error:
CLIENT vm_3_thr_3_dataStoreGroup12_w1-gst-dev04_19748
CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot
ERROR util.TestException: Expected to find diskStores [DEFAULT] defined in cache
util.TestException: Expected to find diskStores [DEFAULT] defined in cache
at management.clusterconfig.ClusterConfigTest.verifyDiskStoreSnapshot(ClusterConfigTest.java:421)
at management.clusterconfig.ClusterConfigTest.verifySnapshot(ClusterConfigTest.java:316)
at management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot(ClusterConfigTest.java:292)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at hydra.MethExecutor.execute(MethExecutor.java:189)
at hydra.MethExecutor.execute(MethExecutor.java:153)
at hydra.TestTask.execute(TestTask.java:194)
at hydra.RemoteTestModule$1.run(RemoteTestModule.java:217)
The test has two groups of members, as listed below.
group1 : vm_2, vm_3, vm_4
group2 : vm_5, vm_6, vm_7
*** vm_4 from group1 wrote the snapshot and wrote the diskstores, including the "DEFAULT" diskstore, to the bb
[info 2015/05/22 12:50:32.387 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Received task: CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot
[info 2015/05/22 12:50:32.391 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Waiting for a period of silence for 60 seconds...
[info 2015/05/22 12:51:32.439 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Done waiting, clients have been silent for 60048 ms
[info 2015/05/22 12:51:32.451 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Writing following senders in snapshot:
[info 2015/05/22 12:51:32.456 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] After writing following senders exist in snapshot:
[info 2015/05/22 12:51:32.456 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Writing following receivers in snapshot:
[info 2015/05/22 12:51:32.457 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] After writing following recievers exist in snapshot:
*** vm_2 and vm_3 compared their diskstores with the snapshot (written by vm_4) and reported a missing "DEFAULT" diskstore
dataStoreGroup1Gemfire1/vm_2_dataStoreGroup11_w1-gst-dev04_18209.log:[severe 2015/05/22 12:51:34.473 PDT <vm_2_thr_2_dataStoreGroup11_w1-gst-dev04_18209> tid=0xc7] Task result: CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot: ERROR util.TestException: Expected to find diskStores [DEFAULT] defined in cache
dataStoreGroup1Gemfire1/vm_2_dataStoreGroup11_w1-gst-dev04_18209.log: util.TestException: Expected to find diskStores [DEFAULT] defined in cache
dataStoreGroup1Gemfire2/vm_3_dataStoreGroup12_w1-gst-dev04_19748.log:[severe 2015/05/22 12:51:34.473 PDT <vm_3_thr_3_dataStoreGroup12_w1-gst-dev04_19748> tid=0x92] Task result: CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot: ERROR util.TestException: Expected to find diskStores [DEFAULT] defined in cache
dataStoreGroup1Gemfire2/vm_3_dataStoreGroup12_w1-gst-dev04_19748.log: util.TestException: Expected to find diskStores [DEFAULT] defined in cache
It looks like the "DEFAULT" diskstore was not created on vm_2 and vm_3.
{noformat}
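For readers unfamiliar with the test, the check that fails above can be pictured roughly as follows. This is only a minimal sketch, assuming the snapshot is a set of diskstore names that one member writes and the other members compare against their own cache. The class and method names ({{DiskStoreSnapshotCheck}}, {{missingDiskStores}}, {{verify}}) are hypothetical and are not taken from {{ClusterConfigTest}}.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the snapshot comparison; not the actual test code.
class DiskStoreSnapshotCheck {

  /** Returns the names recorded in the snapshot that the local member lacks. */
  static Set<String> missingDiskStores(Set<String> snapshot, Set<String> local) {
    Set<String> missing = new TreeSet<>(snapshot);
    missing.removeAll(local);
    return missing;
  }

  /** Throws if any diskstore in the snapshot is not defined locally. */
  static void verify(Set<String> snapshot, Set<String> local) {
    Set<String> missing = missingDiskStores(snapshot, local);
    if (!missing.isEmpty()) {
      throw new IllegalStateException(
          "Expected to find diskStores " + missing + " defined in cache");
    }
  }

  public static void main(String[] args) {
    // Assumed scenario: the snapshot writer saw DEFAULT, this member has none.
    Set<String> snapshot = new TreeSet<>(Arrays.asList("DEFAULT"));
    Set<String> localMember = new TreeSet<>();
    try {
      verify(snapshot, localMember);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under this reading, the failure only says that the member's view differs from the snapshot writer's view; it does not by itself show which of the two is wrong.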
----
Additional comments from Nilkanth
Some of my findings are as follows.
The issue does not reproduce on every run, but it reproduces easily when the test is run repeatedly. For example, it occurred in 2 out of 10 iterations. Running only {{serialCacheRegionEntryGroupOpsRecycleVM.conf}} once (with the other tests commented out) did not fail, but running it 10 or more times in a row may reproduce the issue.
To me, it looks like an issue related to cleanup, so I tried the following in that direction.
Updated {{GemFireCacheImpl.closeDiskStores(..)}} to also close regionOwnedDiskStores.
{noformat}
public void closeDiskStores() {
  // close diskStores (existing behavior)
  // close regionOwnedDiskStores (added by this change)
}
{noformat}
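To make the intent of that change concrete, here is a minimal, self-contained sketch. It assumes the cache tracks two separate collections of diskstores and that both should be closed and emptied during cleanup. Everything in it ({{DiskStoreCleanupSketch}}, {{FakeDiskStore}}, the two map fields) is a hypothetical stand-in, not the Geode implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are not Geode classes.
class DiskStoreCleanupSketch {

  static class FakeDiskStore {
    final String name;
    boolean closed;

    FakeDiskStore(String name) {
      this.name = name;
    }

    void close() {
      closed = true;
    }
  }

  // Assumption: the cache keeps two independent collections of diskstores.
  final Map<String, FakeDiskStore> diskStores = new ConcurrentHashMap<>();
  final Map<String, FakeDiskStore> regionOwnedDiskStores = new ConcurrentHashMap<>();

  /** Closes both collections so that nothing survives into the next cache. */
  void closeDiskStores() {
    closeAll(diskStores);
    closeAll(regionOwnedDiskStores);
  }

  private static void closeAll(Map<String, FakeDiskStore> stores) {
    for (FakeDiskStore store : stores.values()) {
      store.close();
    }
    stores.clear();
  }

  public static void main(String[] args) {
    DiskStoreCleanupSketch cache = new DiskStoreCleanupSketch();
    cache.diskStores.put("DEFAULT", new FakeDiskStore("DEFAULT"));
    cache.regionOwnedDiskStores.put("owned", new FakeDiskStore("owned"));
    cache.closeDiskStores();
    System.out.println(cache.diskStores.isEmpty() && cache.regionOwnedDiskStores.isEmpty());
  }
}
```

Whether the real collections should also be emptied on close is an assumption of this sketch; the change described above only states that regionOwnedDiskStores are closed.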
Added log statements to {{GemFireCacheImpl.addDiskStore()}} and {{GemFireCacheImpl.removeDiskStore()}} to observe the changes.
Still not sure what is causing this failure.