You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2021/09/02 08:07:56 UTC

[GitHub] [geode-native] jvarenina opened a new pull request #864: GEODE-9576: Fix for single-hop function execution

jvarenina opened a new pull request #864:
URL: https://github.com/apache/geode-native/pull/864


   Fault:
   "InternalFunctionInvocationTargetException: Multiple target nodes
   found for single hop operation" occurs on server when executing
   function in a single hop manner for all buckets during the period when
   client bucket metadata doesn't contains all buckets locations.
   
   Fix:
   The client will execute function in a non single-hop manner until it
   recevies all buckets locations. This solution is aligned with java
   client.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] echobravopapa commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
echobravopapa commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r705553536



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       A similarly difficult issue was tested and fixed in recent months, regarding incorrect EventIds etc...  which fixed a "fix" that was making native operate at parity with java client, but the initial fix was incomplete... 
   I'm sure @pdxcodemonkey has more to say about this...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pdxcodemonkey commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pdxcodemonkey commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r706251752



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       Sorry, totally missed this test - dunno how that happened.  To confirm your conjecture above, yes my test sleeps because it has to have a metadata update, with partial info, to repro the failure, just as yours does.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pdxcodemonkey merged pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pdxcodemonkey merged pull request #864:
URL: https://github.com/apache/geode-native/pull/864


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pdxcodemonkey commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pdxcodemonkey commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r705589383



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       The `putAllAndVerifyKeysExist` test, added in this PR: https://github.com/apache/geode-native/pull/784/files
   
   (very nearly?) guarantees a state in which not all metadata buckets are present on the client, so I kind of expect executing a single-hop function instead of the `putAll` call in this test code would reproduce.  This test has been _very_ reliable, so I wouldn't expect to have a lot of issues with flaky behavior with respect to metadata updates.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pivotal-jbarrett commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pivotal-jbarrett commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r700973535



##########
File path: tests/javaobject/MultiGetAllFunctionNonHA.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package javaobject;
+
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.Properties;
+import java.util.Vector;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.geode.cache.Cache;
+import org.apache.geode.cache.CacheClosedException;
+import org.apache.geode.cache.CacheFactory;
+import org.apache.geode.cache.Declarable;
+import org.apache.geode.cache.execute.FunctionAdapter;
+import org.apache.geode.cache.execute.FunctionContext;
+import org.apache.geode.cache.Region;
+import org.apache.geode.cache.execute.ResultSender;
+import org.apache.geode.cache.Region;
+import org.apache.geode.cache.execute.Function;
+import org.apache.geode.cache.execute.FunctionContext;
+import org.apache.geode.cache.execute.RegionFunctionContext;
+import org.apache.geode.cache.partition.PartitionRegionHelper;
+import org.apache.geode.distributed.internal.InternalDistributedSystem;
+import org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException;
+
+public class MultiGetAllFunctionNonHA extends FunctionAdapter implements Declarable{

Review comment:
       `FunctionAdapter` is deprecated and should not be used. `Function` has generic type parameters that should be declared.

##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       How is this test asserting that the PR metadata is not complete for region prior to and during the execution of this function?

##########
File path: tests/javaobject/MultiGetAllFunctionNonHA.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package javaobject;
+
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.Properties;
+import java.util.Vector;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.geode.cache.Cache;
+import org.apache.geode.cache.CacheClosedException;
+import org.apache.geode.cache.CacheFactory;
+import org.apache.geode.cache.Declarable;
+import org.apache.geode.cache.execute.FunctionAdapter;
+import org.apache.geode.cache.execute.FunctionContext;
+import org.apache.geode.cache.Region;
+import org.apache.geode.cache.execute.ResultSender;
+import org.apache.geode.cache.Region;
+import org.apache.geode.cache.execute.Function;
+import org.apache.geode.cache.execute.FunctionContext;
+import org.apache.geode.cache.execute.RegionFunctionContext;
+import org.apache.geode.cache.partition.PartitionRegionHelper;
+import org.apache.geode.distributed.internal.InternalDistributedSystem;
+import org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException;
+
+public class MultiGetAllFunctionNonHA extends FunctionAdapter implements Declarable{
+
+  public void execute(FunctionContext context) {

Review comment:
       This and other methods should be declared as `@Override`.

##########
File path: tests/javaobject/MultiGetAllFunctionNonHA.java
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package javaobject;
+
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.Properties;
+import java.util.Vector;
+import java.util.List;
+import java.util.Set;
+
+import org.apache.geode.cache.Cache;
+import org.apache.geode.cache.CacheClosedException;
+import org.apache.geode.cache.CacheFactory;
+import org.apache.geode.cache.Declarable;
+import org.apache.geode.cache.execute.FunctionAdapter;
+import org.apache.geode.cache.execute.FunctionContext;
+import org.apache.geode.cache.Region;
+import org.apache.geode.cache.execute.ResultSender;
+import org.apache.geode.cache.Region;
+import org.apache.geode.cache.execute.Function;
+import org.apache.geode.cache.execute.FunctionContext;
+import org.apache.geode.cache.execute.RegionFunctionContext;
+import org.apache.geode.cache.partition.PartitionRegionHelper;
+import org.apache.geode.distributed.internal.InternalDistributedSystem;
+import org.apache.geode.internal.cache.execute.InternalFunctionInvocationTargetException;
+
+public class MultiGetAllFunctionNonHA extends FunctionAdapter implements Declarable{
+
+  public void execute(FunctionContext context) {
+    RegionFunctionContext regionContext = (RegionFunctionContext) context;
+    final Region<String, String> region =
+        PartitionRegionHelper.getLocalDataForContext(regionContext);
+
+    Set<String> keys = region.keySet();
+    List<String> listKey = new ArrayList<>();

Review comment:
       Why and intermediate list since `ResultSender` can send multiple results, just iterate until you are at the last key and send that using `lastResult`.

##########
File path: cppcache/src/ClientMetadataService.cpp
##########
@@ -578,12 +577,10 @@ ClientMetadataService::pruneNodes(
     const auto locations = metadata->adviseServerLocations(bucketId);
     if (locations.size() == 0) {
       LOGDEBUG(
-          "ClientMetadataService::pruneNodes Since no server location "
-          "available for bucketId = %d  putting it into "
-          "bucketSetWithoutServer ",
+          "ClientMetadataService::pruneNodes Use non single-hop path "

Review comment:
       These changes strike me as unit testable. Please produce a unit test for these changes.

##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",

Review comment:
       We should be leveraging modern configuration APIs, not the legacy cache.xml, whenever possible. This may mean providing `gfsh` commands to configure the servers via cluster config. At the very least the cache.xml files should be located in the new integration tests as resources and not the old tests.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on pull request #864:
URL: https://github.com/apache/geode-native/pull/864#issuecomment-925669664


   Hi all, sorry for bothering. I just want to double check with you all if there is something that I missed in this PR and should be done by me. I still see request change by @echobravopapa , but I think that one is resolved. Am I right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] mreddington commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
mreddington commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r706337802



##########
File path: cppcache/test/mock/ClientMetadataMock.hpp
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#ifndef GEODE_CLIENTMETADATAMOCK_H_
+#define GEODE_CLIENTMETADATAMOCK_H_
+
+#include <gmock/gmock.h>
+
+#include "ClientMetadata.hpp"
+
+namespace apache {
+namespace geode {
+namespace client {
+class ClientMetadataMock : public ClientMetadata {

Review comment:
       I would rather you overload the ClientMetadata constructor to accept a bucket set than burden the system with yet more polymorphism so you can inject a bucket set.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r705952887



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       Thanks for the review! Really appreciate it!
   
   Actually I already did the same thing in my integration test `FunctionExecutionWithIncompleteBucketLocations`. If you inspect it closely you will see following similarities:
   
   1. I do same here as you when execute 50 puts. Additionally, I use hook to check that metadata is updated on client. For this case it is important to have incomplete metadata (number of locations is not important in retrieved metadata), because in case metadata is never refreshed on client the faulty case would not be triggered. I think that you use sleep instead the hook for this reason.
   ```
     // Populate region in a way that not all buckets are created.
     // Servers in this case will create 88 of possible 113 buckets.
     populateRegion(region);
   
     // Check that PR metadata is updated. This is done to be sure
     // that client will execute function in a non single hop manner
     // because metadata doesn't contain all bucket locations.
     // After metadata is refreshed, it will contain at least one
     // bucket location.
     CacheImpl *cacheImpl = CacheRegionHelper::getCacheImpl(&cache);
     waitUntilPRMetadataIsRefreshed(cacheImpl);
   ```
   2. Then I execute function, which then trigger faulty case. You can remove fix and execute this case, and you will get "InternalFunctionInvocationTargetException" every time.
   ```
     auto functionService = FunctionService::onRegion(region);
     auto rc =
         functionService.withCollector(std::make_shared<TestResultCollector>())
             .execute("MultiGetAllFunctionNonHA");
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on pull request #864:
URL: https://github.com/apache/geode-native/pull/864#issuecomment-930893408


   Should we merge this PR or still need to wait on some more reviews?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r701115101



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",

Review comment:
       Ok, I'll change test case to only use `gfsh` configuration.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r701104899



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       In this particular case, the client PR metadata will never contain all 113 bucket locations, because `populateRegion` function will only trigger creation of some buckets on servers, and not all of them. Without the fix, the single hop function execution in this case would always fail with the exception `InternalFunctionInvocationTargetException` on servers. Client would then trigger `ResultCollector::clearResult` and re-execute function in a non single hop manner. So for now, I only checked that that re-execution is not triggered, by checking that `ResultCollector::clearResult` is never triggered.
   
   As you pointed out, maybe it should be good to expand the test case. Maybe this way:
   
   1. Populate region in a way that not all buckets on servers are created
   2. Check that client received partial PR metadata (some bucket locations are missing).
   3. Execute function and check that all data is received
   4. Repopulate region, so that now all buckets are created on servers
   5. Trigger PR metadata refresh (by executing function) and check that client received complete PR metadata (all buckets locations)
   6. Execute function again and check that all data is received
   
   What do you think about above case? Should I add or remove something?
   
   Additionally, I think that client should not try to `clearResult` and re-execute the function when function `isHA` is set to false, but immediately fail with the received exception. What do you think?
   
   Thanks for the review!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pivotal-jbarrett commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pivotal-jbarrett commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r705544017



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       Unfortunately this is notoriously hard to test. PR metadata updates are asynchronous and there isn't any hooks to trigger when an update is complete. You may want to start with adding that function. The downside to that is to use them in an integration tests we would have to export those functions and we try not to export internal/test symbols if we can. Perhaps it might be easier to create a unit test that asserts this part of the function execution logic does the correct thing when the metadata is incomplete. It should be easier to tease out a function that is given the metadata and makes this decisions and then write a test to verify the expected behaviors. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] echobravopapa commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
echobravopapa commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r705531306



##########
File path: cppcache/src/ClientMetadataService.cpp
##########
@@ -578,12 +579,10 @@ ClientMetadataService::pruneNodes(
     const auto locations = metadata->adviseServerLocations(bucketId);
     if (locations.size() == 0) {
       LOGDEBUG(
-          "ClientMetadataService::pruneNodes Since no server location "
-          "available for bucketId = %d  putting it into "
-          "bucketSetWithoutServer ",
+          "ClientMetadataService::pruneNodes Use non single-hop path "
+          "since no server location is available for bucketId = %d",
           bucketId);
-      bucketSetWithoutServer.insert(bucketId);
-      continue;
+      return nullptr;

Review comment:
       I'd like to see an integration test that exposes the issue being fixed...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r709813659



##########
File path: cppcache/src/ClientMetadataService.cpp
##########
@@ -578,12 +579,10 @@ ClientMetadataService::pruneNodes(
     const auto locations = metadata->adviseServerLocations(bucketId);
     if (locations.size() == 0) {
       LOGDEBUG(
-          "ClientMetadataService::pruneNodes Since no server location "
-          "available for bucketId = %d  putting it into "
-          "bucketSetWithoutServer ",
+          "ClientMetadataService::pruneNodes Use non single-hop path "
+          "since no server location is available for bucketId = %d",
           bucketId);
-      bucketSetWithoutServer.insert(bucketId);
-      continue;
+      return nullptr;

Review comment:
       Hi @echobravopapa ,
   
   Is this integration test OK to you? Maybe to remove request change if there is nothing else to add?
   
   Thanks/Jakov 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pdxcodemonkey commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pdxcodemonkey commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r706251752



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +181,58 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < 113; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionSingleHopNonHA) {
+  Cluster cluster{
+      LocatorCount{1}, ServerCount{3},
+      CacheXMLFiles(
+          {std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver1_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver2_pool_nonHA.xml",
+           std::string(getFrameworkString(FrameworkVariable::TestCacheXmlDir)) +
+               "/func_cacheserver3_pool_nonHA.xml"})};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  cluster.applyLocators(poolFactory);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  populateRegion(region);
+
+  for (int i = 0; i < 30; i++) {
+    auto functionService = FunctionService::onRegion(region);
+    auto rc =
+        functionService.withCollector(std::make_shared<TestResultCollector>())

Review comment:
       Sorry, totally missed this was an integration test - dunno how that happened.  To confirm your conjecture above, yes my test sleeps because it has to have a metadata update, with partial info, to repro the failure, just as yours does.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] jvarenina commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
jvarenina commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r705954380



##########
File path: cppcache/src/ClientMetadataService.cpp
##########
@@ -578,12 +579,10 @@ ClientMetadataService::pruneNodes(
     const auto locations = metadata->adviseServerLocations(bucketId);
     if (locations.size() == 0) {
       LOGDEBUG(
-          "ClientMetadataService::pruneNodes Since no server location "
-          "available for bucketId = %d  putting it into "
-          "bucketSetWithoutServer ",
+          "ClientMetadataService::pruneNodes Use non single-hop path "
+          "since no server location is available for bucketId = %d",
           bucketId);
-      bucketSetWithoutServer.insert(bucketId);
-      continue;
+      return nullptr;

Review comment:
       Thank you for the review!
   
   There is already integration "FunctionExecutionTest.FunctionExecutionWithIncompleteBucketLocations" test in this PR that exposes the issue. Please check comment https://github.com/apache/geode-native/pull/864#discussion_r705952887 .




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] pivotal-jbarrett commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
pivotal-jbarrett commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r706508046



##########
File path: cppcache/test/mock/ClientMetadataMock.hpp
##########
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#ifndef GEODE_CLIENTMETADATAMOCK_H_
+#define GEODE_CLIENTMETADATAMOCK_H_
+
+#include <gmock/gmock.h>
+
+#include "ClientMetadata.hpp"
+
+namespace apache {
+namespace geode {
+namespace client {
+class ClientMetadataMock : public ClientMetadata {

Review comment:
       I am not sure I understand @mreddington, this is mock and this is how you mock classes using gmock.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode-native] alb3rtobr commented on a change in pull request #864: GEODE-9576: Fix for single-hop function execution

Posted by GitBox <gi...@apache.org>.
alb3rtobr commented on a change in pull request #864:
URL: https://github.com/apache/geode-native/pull/864#discussion_r719434065



##########
File path: cppcache/integration/test/FunctionExecutionTest.cpp
##########
@@ -175,6 +196,88 @@ TEST(FunctionExecutionTest,
   cache.close();
 }
 
+void populateRegion(const std::shared_ptr<Region> &region) {
+  for (int i = 0; i < PARTITION_REGION_ENTRIES_SIZE; i++) {
+    region->put("KEY--" + std::to_string(i), "VALUE--" + std::to_string(i));
+  }
+}
+
+void waitUntilPRMetadataIsRefreshed(CacheImpl *cacheImpl) {
+  auto end = std::chrono::system_clock::now() + std::chrono::minutes(2);
+  while (!cacheImpl->getAndResetPrMetadataUpdatedFlag()) {
+    std::this_thread::sleep_for(std::chrono::milliseconds(200));
+    ASSERT_FALSE(std::chrono::system_clock::now() > end);
+  }
+}
+
+TEST(FunctionExecutionTest, FunctionExecutionWithIncompleteBucketLocations) {
+  std::vector<uint16_t> serverPorts;
+  serverPorts.push_back(Framework::getAvailablePort());
+  serverPorts.push_back(Framework::getAvailablePort());
+  serverPorts.push_back(Framework::getAvailablePort());
+  Cluster cluster{LocatorCount{1}, ServerCount{3}, serverPorts};
+
+  cluster.start([&]() {
+    cluster.getGfsh()
+        .deploy()
+        .jar(getFrameworkString(FrameworkVariable::JavaObjectJarPath))
+        .execute();
+  });
+
+  cluster.getGfsh()
+      .create()
+      .region()
+      .withName("partition_region")
+      .withType("PARTITION")
+      .execute();
+
+  auto cache = CacheFactory().create();
+  auto poolFactory = cache.getPoolManager().createFactory();
+
+  ServerAddress serverAddress = cluster.getServers()[2].getAddress();
+  cluster.applyServer(poolFactory, serverAddress);
+
+  auto pool =
+      poolFactory.setPRSingleHopEnabled(true).setRetryAttempts(0).create(
+          "pool");
+
+  auto region = cache.createRegionFactory(RegionShortcut::PROXY)
+                    .setPoolName("pool")
+                    .create("partition_region");
+
+  // Populate region in a way that not all buckets are created.
+  // Servers in this case will create 88 of possible 113 buckets.
+  populateRegion(region);
+
+  // Check that PR metadata is updated. This is done to be sure
+  // that client will execute function in a non single hop manner
+  // because metadata doesn't contain all bucket locations.
+  // After metadata is refreshed, it will contain at least one
+  // bucket location.
+  CacheImpl *cacheImpl = CacheRegionHelper::getCacheImpl(&cache);
+  waitUntilPRMetadataIsRefreshed(cacheImpl);
+
+  auto functionService = FunctionService::onRegion(region);
+  auto rc =
+      functionService.withCollector(std::make_shared<TestResultCollector>())
+          .execute("MultiGetAllFunctionNonHA");
+
+  std::shared_ptr<TestResultCollector> resultCollector =
+      std::dynamic_pointer_cast<TestResultCollector>(rc);
+
+  // check that function is not re-executed
+  ASSERT_FALSE(resultCollector->isClearTriggered());
+
+  // check that PR metadata is updated after function is executed
+  waitUntilPRMetadataIsRefreshed(cacheImpl);
+
+  // check that correct nubmer of events is received in function result

Review comment:
       There is a typo here (`nubmer`)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org