Posted to issues@geode.apache.org by "Jens Deppe (JIRA)" <ji...@apache.org> on 2016/01/07 18:33:39 UTC
[jira] [Created] (GEODE-738) DEFAULT diskstore is missing in some members in cluster config tests

Jens Deppe created GEODE-738:
--------------------------------

             Summary: DEFAULT diskstore is missing in some members in cluster config tests
                 Key: GEODE-738
                 URL: https://issues.apache.org/jira/browse/GEODE-738
             Project: Geode
          Issue Type: Bug
          Components: management
            Reporter: Jens Deppe


This was observed in the 8.1 regression run.
{noformat}
Host name: w1-gst-dev04
OS name: Linux
Architecture: amd64
OS version: 3.10.0-123.el7.x86_64
Java version: 1.7.0_72
Java vm name: Java HotSpot(TM) 64-Bit Server VM
Java vendor: Oracle Corporation
Java home: /export/gcm/where/jdk/1.7.0_72/x86_64.linux/jre

  #####################################################
  
  GemFire Version 8.1.0.6
  Source Date: Fri, 22 May 2015 10:39:59 -0700
  
  Source Revision: 6aa105334252e99556c4a01c70abacdb20fc033b
  Source Repository: gemfire810X_maint
  
  Build Id: build 052215
  Build Date: 05/22/2015 11:09:03 PDT
  Build Version: 8.1.0.6 build 052215 05/22/2015 11:09:03 PDT javac 1.7.0_72
  Build JDK: Java 1.7.0_72
  Build Platform: Linux 2.6.32-220.23.1.el6.x86_64 i386
  
  #####################################################


Test was run from /export/snaps/gfe/81/Linux/snapshots.052215/gf810XMaintsancout/tests/classes/management/clusterconfig/clusterConfig.bt

Test:
management/clusterconfig/serialCacheRegionEntryGroupOpsRecycleVM.conf
   A=dataStoreGroup1
   B=dataStoreGroup2
   C=cli
   cliHosts=2
   cliThreadsPerVM=1
   cliVMsPerHost=1
   dataStoreGroup1Hosts=3
   dataStoreGroup1ThreadsPerVM=1
   dataStoreGroup1VMsPerHost=1
   dataStoreGroup2Hosts=3
   dataStoreGroup2ThreadsPerVM=1
   dataStoreGroup2VMsPerHost=1
   locatorHosts=2
   locatorThreadsPerVM=1
   locatorVMsPerHost=1
   numDataStoreMembersToHostRegion=0
   numMembersCreateCacheOnly=6
   numMembersJoinDSOnly=0
   redundantCopies=2

No local.conf for this run

//randomSeed extracted from test:
hydra.Prms-randomSeed=1432324011057;

*** Test failed with this error:
CLIENT vm_3_thr_3_dataStoreGroup12_w1-gst-dev04_19748
CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot
ERROR util.TestException: Expected to find diskStores [DEFAULT] defined in cache

util.TestException: Expected to find diskStores [DEFAULT] defined in cache
	at management.clusterconfig.ClusterConfigTest.verifyDiskStoreSnapshot(ClusterConfigTest.java:421)
	at management.clusterconfig.ClusterConfigTest.verifySnapshot(ClusterConfigTest.java:316)
	at management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot(ClusterConfigTest.java:292)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at hydra.MethExecutor.execute(MethExecutor.java:189)
	at hydra.MethExecutor.execute(MethExecutor.java:153)
	at hydra.TestTask.execute(TestTask.java:194)
	at hydra.RemoteTestModule$1.run(RemoteTestModule.java:217)


The test has two groups of members, as given below.
group1 : vm_2, vm_3, vm_4
group2 : vm_5, vm_6, vm_7


*** vm_4 from group1 wrote the snapshot and wrote its diskstores to bb, which contains the "DEFAULT" diskstore
[info 2015/05/22 12:50:32.387 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Received task: CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot
[info 2015/05/22 12:50:32.391 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Waiting for a period of silence for 60 seconds...
[info 2015/05/22 12:51:32.439 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Done waiting, clients have been silent for 60048 ms
[info 2015/05/22 12:51:32.451 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Writing following senders in snapshot:
[info 2015/05/22 12:51:32.456 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] After writing following senders exist in snapshot:
[info 2015/05/22 12:51:32.456 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] Writing following receivers in snapshot:
[info 2015/05/22 12:51:32.457 PDT <vm_4_thr_4_dataStoreGroup13_w1-gst-dev04_17678> tid=0xdb] After writing following recievers exist in snapshot:


*** vm_2 and vm_3 compared their diskstores with the snapshot (written by vm_4) and reported a missing "DEFAULT" diskstore
dataStoreGroup1Gemfire1/vm_2_dataStoreGroup11_w1-gst-dev04_18209.log:[severe 2015/05/22 12:51:34.473 PDT <vm_2_thr_2_dataStoreGroup11_w1-gst-dev04_18209> tid=0xc7] Task result: CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot: ERROR util.TestException: Expected to find diskStores [DEFAULT] defined in cache
dataStoreGroup1Gemfire1/vm_2_dataStoreGroup11_w1-gst-dev04_18209.log:  util.TestException: Expected to find diskStores [DEFAULT] defined in cache
dataStoreGroup1Gemfire2/vm_3_dataStoreGroup12_w1-gst-dev04_19748.log:[severe 2015/05/22 12:51:34.473 PDT <vm_3_thr_3_dataStoreGroup12_w1-gst-dev04_19748> tid=0x92] Task result: CLOSETASK[0] management.clusterconfig.ClusterConfigTest.HydraTask_verifySnapshot: ERROR util.TestException: Expected to find diskStores [DEFAULT] defined in cache
dataStoreGroup1Gemfire2/vm_3_dataStoreGroup12_w1-gst-dev04_19748.log:  util.TestException: Expected to find diskStores [DEFAULT] defined in cache

Looks like the "DEFAULT" diskstore was not created on vm_2 and vm_3
{noformat}
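The check that fails above amounts to a set difference between the diskstore names recorded in the snapshot and the names defined in the local cache. The following is a minimal sketch of that comparison, not the actual {{ClusterConfigTest}} code: the class {{DiskStoreSnapshotCheck}} and its methods are hypothetical names, and the diskstore names are passed in as plain sets rather than read from the blackboard or the cache.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of the hydra test framework.
final class DiskStoreSnapshotCheck {

    /** Returns the names present in the snapshot but absent from the local cache. */
    static Set<String> missingDiskStores(Set<String> snapshotNames, Set<String> localNames) {
        Set<String> missing = new TreeSet<>(snapshotNames);
        missing.removeAll(localNames);
        return missing;
    }

    /** Throws if any diskstore named in the snapshot is not defined locally. */
    static void verify(Set<String> snapshotNames, Set<String> localNames) {
        Set<String> missing = missingDiskStores(snapshotNames, localNames);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "Expected to find diskStores " + missing + " defined in cache");
        }
    }

    public static void main(String[] args) {
        // Snapshot as written by the member that has the DEFAULT diskstore.
        Set<String> snapshot = new TreeSet<>(Arrays.asList("DEFAULT"));
        // A member whose cache never created the DEFAULT diskstore.
        Set<String> memberWithoutDefault = new TreeSet<>();

        verify(snapshot, snapshot); // passes: nothing is missing
        try {
            verify(snapshot, memberWithoutDefault);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this reading, the writer of the snapshot always passes its own check, so the failure only shows up on members whose set of diskstores differs from the writer's.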
----
Additional comments from Nilkanth

Some of my findings are as follows.
The issue does not reproduce on every run but is easily reproduced when the test is run in iterations. For example, out of 10 iterations, it occurred 2 times. Running only {{serialCacheRegionEntryGroupOpsRecycleVM.conf}} once (with the others commented out) did not fail, but running it for 10 or more iterations may reproduce the issue.
To me, it seems like an issue related to cleanup, so I tried the following in that direction.
Updated {{GemFireCacheImpl.closeDiskStores(..)}} to close regionOwnedDiskStores.
{noformat}
public void closeDiskStores() {
  // close diskStores

  // close regionOwnedDiskStores
}
{noformat}
Added log statements to {{GemFireCacheImpl.addDiskStore()}} and {{GemFireCacheImpl.removeDiskStore()}} to observe the changes.
Still not sure what is causing this failure.
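
The cleanup experiment described above can be pictured with a small stand-alone sketch. This is not the GemFire implementation: {{DiskStoreRegistrySketch}}, {{FakeDiskStore}}, {{addRegionOwnedDiskStore}} and the in-memory log list are hypothetical stand-ins. It assumes only what the comment states, namely that the cache tracks two collections (diskStores and regionOwnedDiskStores), that closing should cover both, and that add/remove calls are logged.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; the real cache class is far more involved.
final class DiskStoreRegistrySketch {

    /** Stand-in for a diskstore; it only tracks whether it has been closed. */
    static final class FakeDiskStore {
        final String name;
        volatile boolean closed;

        FakeDiskStore(String name) { this.name = name; }

        void close() { closed = true; }
    }

    private final Map<String, FakeDiskStore> diskStores = new ConcurrentHashMap<>();
    private final Map<String, FakeDiskStore> regionOwnedDiskStores = new ConcurrentHashMap<>();

    /** Collected log lines, standing in for the added log statements. */
    final List<String> log = Collections.synchronizedList(new ArrayList<String>());

    void addDiskStore(FakeDiskStore ds) {
        diskStores.put(ds.name, ds);
        log.add("addDiskStore " + ds.name);
    }

    void addRegionOwnedDiskStore(FakeDiskStore ds) {
        regionOwnedDiskStores.put(ds.name, ds);
        log.add("addRegionOwnedDiskStore " + ds.name);
    }

    void removeDiskStore(FakeDiskStore ds) {
        diskStores.remove(ds.name);
        regionOwnedDiskStores.remove(ds.name);
        log.add("removeDiskStore " + ds.name);
    }

    /** Closes and forgets the diskstores in both collections. */
    void closeDiskStores() {
        closeAll(diskStores);
        closeAll(regionOwnedDiskStores);
    }

    private void closeAll(Map<String, FakeDiskStore> stores) {
        for (FakeDiskStore ds : stores.values()) {
            ds.close();
            log.add("closed " + ds.name);
        }
        stores.clear();
    }

    int openCount() {
        return diskStores.size() + regionOwnedDiskStores.size();
    }

    public static void main(String[] args) {
        DiskStoreRegistrySketch registry = new DiskStoreRegistrySketch();
        registry.addDiskStore(new FakeDiskStore("DEFAULT"));
        registry.addRegionOwnedDiskStore(new FakeDiskStore("regionOwned"));
        registry.closeDiskStores();
        for (String line : registry.log) {
            System.out.println(line);
        }
    }
}
```

Logging every add, remove and close in this way makes it possible to line up the diskstore lifecycle against the VM recycle in the test logs, which is the purpose of the log statements mentioned above.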



