Posted to notifications@ignite.apache.org by GitBox <gi...@apache.org> on 2020/05/04 21:29:59 UTC

[GitHub] [ignite] ololo3000 opened a new pull request #7771: IGNITE-12894 Adds ability to wait for the service topology.

ololo3000 opened a new pull request #7771:
URL: https://github.com/apache/ignite/pull/7771


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [ignite] ololo3000 commented on a change in pull request #7771: IGNITE-12894 Adds ability to wait for the service topology initialization.

Posted by GitBox <gi...@apache.org>.
ololo3000 commented on a change in pull request #7771:
URL: https://github.com/apache/ignite/pull/7771#discussion_r425118375



##########
File path: modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java
##########
@@ -823,19 +823,26 @@ else if (prj.predicate() == F.<ClusterNode>alwaysTrue())
 
         long startTime = U.currentTimeMillis();
 
-        Map<UUID, Integer> top;
+        ServiceInfo desc;
 
         while (true) {
-            top = serviceTopology(name);
+             desc = lookupInRegisteredServices(name);

Review comment:
       I updated the PR according to my proposal. Could you please take a look?
   Unfortunately, I find it difficult to test this race condition.







[GitHub] [ignite] daradurvs merged pull request #7771: IGNITE-12894 Adds ability to wait for the service topology initialization.

Posted by GitBox <gi...@apache.org>.
daradurvs merged pull request #7771:
URL: https://github.com/apache/ignite/pull/7771


   





[GitHub] [ignite] ololo3000 commented on a change in pull request #7771: IGNITE-12894 Adds ability to wait for the service topology initialization.

Posted by GitBox <gi...@apache.org>.
ololo3000 commented on a change in pull request #7771:
URL: https://github.com/apache/ignite/pull/7771#discussion_r425044202



##########
File path: modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java
##########
@@ -823,19 +823,26 @@ else if (prj.predicate() == F.<ClusterNode>alwaysTrue())
 
         long startTime = U.currentTimeMillis();
 
-        Map<UUID, Integer> top;
+        ServiceInfo desc;
 
         while (true) {
-            top = serviceTopology(name);
+             desc = lookupInRegisteredServices(name);

Review comment:
       I got it. If the service is deleted after [1] but before [2], and the full message with the requested service topology is received in that window, the topology of the local ServiceDescriptor reference will never be updated. Is that what you meant?
   
   ```
   while (true) {
       desc = lookupInRegisteredServices(name);
   
       if (timeout == 0 && desc == null) // [1]
           return null;
   
       synchronized (servicesTopsUpdateMux) { // [2]
           if (desc != null && desc.topologyInitialized())
               return desc.topologySnapshot();
   
           long wait = 0;
   
           if (timeout != 0) {
               wait = timeout - (U.currentTimeMillis() - startTime);
   
               if (wait <= 0)
                   return desc == null ? null : desc.topologySnapshot();
           }
   
           try {
               servicesTopsUpdateMux.wait(wait);
           }
           catch (InterruptedException e) {
               throw new IgniteInterruptedCheckedException(e);
           }
       }
   }
   ```
   
   It seems it would be enough to move
   
   ```
   desc = lookupInRegisteredServices(name);
   
   if (timeout == 0 && desc == null) // [1]
       return null;
   ```
   
   inside the synchronization block.
   In that case, if the service is removed or the full message is received before the synchronization block is acquired, in whatever order, we will return on
   
   ```
    if (timeout == 0 && desc == null) 
       return null;
   ```
   
   If the service is removed while the synchronization block is executing, we will receive the full message anyway and stop waiting.
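
Editorial sketch of the proposal above, under stated assumptions: `TopologyWaitSketch`, `Desc`, `register`, `unregister` and `initTopology` are hypothetical stand-ins, not Ignite classes or methods, and the sketch assumes every registry update happens under the same mutex that the waiter uses. It only illustrates that refreshing `desc` inside the synchronized block keeps a removal from slipping in between the lookup and the wait.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical stand-in for the service registry; not Ignite's actual code. */
class TopologyWaitSketch {
    /** Hypothetical descriptor: {@code top} stays null until the topology is initialized. */
    static final class Desc {
        Map<UUID, Integer> top; // guarded by mux
    }

    private final Object mux = new Object();
    private final Map<String, Desc> registered = new HashMap<>();

    void register(String name) {
        synchronized (mux) { registered.put(name, new Desc()); mux.notifyAll(); }
    }

    void unregister(String name) {
        synchronized (mux) { registered.remove(name); mux.notifyAll(); }
    }

    void initTopology(String name, Map<UUID, Integer> top) {
        synchronized (mux) {
            Desc desc = registered.get(name);
            if (desc != null)
                desc.top = top;
            mux.notifyAll();
        }
    }

    /** Returns the topology, or null if the service is missing or the timeout expired. */
    Map<UUID, Integer> awaitTopology(String name, long timeout) {
        long startTime = System.currentTimeMillis();

        while (true) {
            synchronized (mux) {
                // Lookup is refreshed under the mutex on every pass, so an
                // unregister cannot be processed between this check and wait().
                Desc desc = registered.get(name);

                if (timeout == 0 && desc == null)
                    return null;

                if (desc != null && desc.top != null)
                    return desc.top;

                long wait = 0; // 0 means "wait until notified"

                if (timeout != 0) {
                    wait = timeout - (System.currentTimeMillis() - startTime);

                    if (wait <= 0)
                        return null; // deadline passed, topology still not initialized
                }

                try {
                    mux.wait(wait);
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}
```

In this sketch a waiter with `timeout == 0` that is blocked when the service is unregistered wakes on the `notifyAll()`, repeats the lookup under the mutex, sees `null` and returns.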







[GitHub] [ignite] daradurvs commented on a change in pull request #7771: IGNITE-12894 Adds ability to wait for the service topology initialization.

Posted by GitBox <gi...@apache.org>.
daradurvs commented on a change in pull request #7771:
URL: https://github.com/apache/ignite/pull/7771#discussion_r425619196



##########
File path: modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java
##########
@@ -823,19 +823,26 @@ else if (prj.predicate() == F.<ClusterNode>alwaysTrue())
 
         long startTime = U.currentTimeMillis();
 
-        Map<UUID, Integer> top;
+        ServiceInfo desc;
 
         while (true) {
-            top = serviceTopology(name);
+             desc = lookupInRegisteredServices(name);

Review comment:
       Yes, you understood correctly.
   Changes look good to me.
   Please rerun tests and update TC bot visa in Jira, just to be sure.







[GitHub] [ignite] daradurvs commented on a change in pull request #7771: IGNITE-12894 Adds ability to wait for the service topology initialization.

Posted by GitBox <gi...@apache.org>.
daradurvs commented on a change in pull request #7771:
URL: https://github.com/apache/ignite/pull/7771#discussion_r424988121



##########
File path: modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java
##########
@@ -823,19 +823,26 @@ else if (prj.predicate() == F.<ClusterNode>alwaysTrue())
 
         long startTime = U.currentTimeMillis();
 
-        Map<UUID, Integer> top;
+        ServiceInfo desc;
 
         while (true) {
-            top = serviceTopology(name);
+             desc = lookupInRegisteredServices(name);

Review comment:
       It looks like this block should be moved outside the loop
   ```
   desc = lookupInRegisteredServices(name);
   
   if (timeout == 0 && desc == null)
       return null;
   ```
   and the variable `desc` should be refreshed under the mutex
   ```
   synchronized (servicesTopsUpdateMux) {
       desc = lookupInRegisteredServices(name);
       ...
   ```
   Because there might be a race: during the checks, the service can be unregistered and `ServiceClusterDeploymentResultBatch` processed.
   That means a hang is possible, because in your case `wait` might be equal to `0`, which means "wait forever".
   
   Refreshing `desc` under the mutex is not meant to give us happens-before guarantees but processing order guarantees, for the case when undeployment is processed before we start waiting for topology initialization.
   
   Alternative solution - do not wait for `0`; use `100`, for example, and retry the cycle.
   But it would be better to provide processing order guarantees.
   
   What do you think?
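
Editorial sketch of the alternative mentioned above (a capped wait with retry), under stated assumptions: `BoundedWaitSketch`, `State`, `nextWait`, `awaitReady` and the 100 ms interval are hypothetical and not taken from Ignite. The lookup is deliberately left outside the mutex to show that a missed notification costs at most one retry interval rather than an indefinite hang.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of "do not wait for 0, retry every 100 ms"; not Ignite's code. */
class BoundedWaitSketch {
    /** Hypothetical service states as seen by the waiter. */
    enum State { MISSING, PENDING, READY }

    /** Assumed retry interval, taken from the "100" suggested in the comment. */
    static final long RETRY_INTERVAL_MS = 100;

    /**
     * Returns how long to block on this pass, or -1 if the deadline has passed.
     * Never returns 0, because Object.wait(0) blocks until notified.
     */
    static long nextWait(long timeout, long elapsed) {
        if (timeout == 0)
            return RETRY_INTERVAL_MS; // no deadline: poll in bounded slices

        long remaining = timeout - elapsed;

        return remaining <= 0 ? -1 : Math.min(remaining, RETRY_INTERVAL_MS);
    }

    /** Waits until the service is READY, is MISSING with no deadline, or the timeout expires. */
    static State awaitReady(Object mux, Supplier<State> lookup, long timeout) {
        long startTime = System.currentTimeMillis();

        while (true) {
            // Lookup outside the mutex: an update may be missed here,
            // but the capped wait below guarantees another pass.
            State state = lookup.get();

            if (state == State.READY || (timeout == 0 && state == State.MISSING))
                return state;

            long wait = nextWait(timeout, System.currentTimeMillis() - startTime);

            if (wait < 0)
                return state;

            synchronized (mux) {
                try {
                    mux.wait(wait);
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}
```

The trade-off matches the comment: this variant tolerates a lost notification at the price of periodic wake-ups, whereas refreshing the lookup under the mutex avoids the lost notification in the first place.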












