You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@uniffle.apache.org by GitBox <gi...@apache.org> on 2023/01/13 03:20:45 UTC

[GitHub] [incubator-uniffle] zuston opened a new pull request, #481: Fix potential race condition when registering remote storage info

zuston opened a new pull request, #481:
URL: https://github.com/apache/incubator-uniffle/pull/481

   
   
   ### What changes were proposed in this pull request?
   
   1. Use the concurrentHashmap's computeIfAbsent to keep thread safe
   
   ### Why are the changes needed?
   1. To fix potential race condition when registering remote storage info
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Existing UTs
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] zuston commented on pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
zuston commented on PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#issuecomment-1383527135

   Thanks for your review @advancedxy . Merged


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] advancedxy commented on a diff in pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
advancedxy commented on code in PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#discussion_r1069559881


##########
server/src/main/java/org/apache/uniffle/server/storage/HdfsStorageManager.java:
##########
@@ -119,23 +119,20 @@ public Checker getStorageChecker() {
   @Override
   public void registerRemoteStorage(String appId, RemoteStorageInfo remoteStorageInfo) {
     String remoteStorage = remoteStorageInfo.getPath();
-    Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
-    if (!pathToStorages.containsKey(remoteStorage)) {
+    pathToStorages.computeIfAbsent(remoteStorage, key -> {
+      Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
       Configuration remoteStorageHadoopConf = new Configuration(hadoopConf);
       if (remoteStorageConf != null && remoteStorageConf.size() > 0) {
         for (Map.Entry<String, String> entry : remoteStorageConf.entrySet()) {
           remoteStorageHadoopConf.setStrings(entry.getKey(), entry.getValue());
         }
       }
-      pathToStorages.putIfAbsent(remoteStorage, new HdfsStorage(remoteStorage, remoteStorageHadoopConf));
-      // registerRemoteStorage may be called in different threads,
-      // make sure metrics won't be created duplicated
-      // there shouldn't have performance issue because
-      // it will be called only few times according to the number of remote storage
-      String storageHost = pathToStorages.get(remoteStorage).getStorageHost();
+      HdfsStorage hdfsStorage = new HdfsStorage(remoteStorage, remoteStorageHadoopConf);
+      String storageHost = hdfsStorage.getStorageHost();
       ShuffleServerMetrics.addDynamicCounterForRemoteStorage(storageHost);
-    }
-    appIdToStorages.putIfAbsent(appId, pathToStorages.get(remoteStorage));
+      return hdfsStorage;
+    });
+    appIdToStorages.computeIfAbsent(appId, key -> pathToStorages.get(remoteStorage));

Review Comment:
   this line doesn't have race condition. `putIfAbsent` should be just fine?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] smallzhongfeng commented on pull request #481: Fix potential race condition when registering remote storage info

Posted by "smallzhongfeng (via GitHub)" <gi...@apache.org>.
smallzhongfeng commented on PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#issuecomment-1415232095

   Late LGTM.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] zuston commented on a diff in pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
zuston commented on code in PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#discussion_r1070751149


##########
server/src/main/java/org/apache/uniffle/server/storage/HdfsStorageManager.java:
##########
@@ -119,23 +119,20 @@ public Checker getStorageChecker() {
   @Override
   public void registerRemoteStorage(String appId, RemoteStorageInfo remoteStorageInfo) {
     String remoteStorage = remoteStorageInfo.getPath();
-    Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
-    if (!pathToStorages.containsKey(remoteStorage)) {
+    pathToStorages.computeIfAbsent(remoteStorage, key -> {
+      Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
       Configuration remoteStorageHadoopConf = new Configuration(hadoopConf);
       if (remoteStorageConf != null && remoteStorageConf.size() > 0) {
         for (Map.Entry<String, String> entry : remoteStorageConf.entrySet()) {
           remoteStorageHadoopConf.setStrings(entry.getKey(), entry.getValue());
         }
       }
-      pathToStorages.putIfAbsent(remoteStorage, new HdfsStorage(remoteStorage, remoteStorageHadoopConf));
-      // registerRemoteStorage may be called in different threads,
-      // make sure metrics won't be created duplicated
-      // there shouldn't have performance issue because
-      // it will be called only few times according to the number of remote storage
-      String storageHost = pathToStorages.get(remoteStorage).getStorageHost();
+      HdfsStorage hdfsStorage = new HdfsStorage(remoteStorage, remoteStorageHadoopConf);
+      String storageHost = hdfsStorage.getStorageHost();
       ShuffleServerMetrics.addDynamicCounterForRemoteStorage(storageHost);
-    }
-    appIdToStorages.putIfAbsent(appId, pathToStorages.get(remoteStorage));
+      return hdfsStorage;
+    });
+    appIdToStorages.computeIfAbsent(appId, key -> pathToStorages.get(remoteStorage));

Review Comment:
   If using `putIfAbsent`, the `pathToStorages.get(remoteStorage)` will be invoked everytime. Although it looks unrelated with this PR. Anyway, this is a performance improvement



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] zuston commented on a diff in pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
zuston commented on code in PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#discussion_r1070752457


##########
server/src/main/java/org/apache/uniffle/server/storage/HdfsStorageManager.java:
##########
@@ -119,23 +119,20 @@ public Checker getStorageChecker() {
   @Override
   public void registerRemoteStorage(String appId, RemoteStorageInfo remoteStorageInfo) {
     String remoteStorage = remoteStorageInfo.getPath();
-    Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
-    if (!pathToStorages.containsKey(remoteStorage)) {
+    pathToStorages.computeIfAbsent(remoteStorage, key -> {
+      Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
       Configuration remoteStorageHadoopConf = new Configuration(hadoopConf);
       if (remoteStorageConf != null && remoteStorageConf.size() > 0) {
         for (Map.Entry<String, String> entry : remoteStorageConf.entrySet()) {
           remoteStorageHadoopConf.setStrings(entry.getKey(), entry.getValue());
         }
       }
-      pathToStorages.putIfAbsent(remoteStorage, new HdfsStorage(remoteStorage, remoteStorageHadoopConf));
-      // registerRemoteStorage may be called in different threads,
-      // make sure metrics won't be created duplicated
-      // there shouldn't have performance issue because
-      // it will be called only few times according to the number of remote storage
-      String storageHost = pathToStorages.get(remoteStorage).getStorageHost();
+      HdfsStorage hdfsStorage = new HdfsStorage(remoteStorage, remoteStorageHadoopConf);
+      String storageHost = hdfsStorage.getStorageHost();
       ShuffleServerMetrics.addDynamicCounterForRemoteStorage(storageHost);
-    }
-    appIdToStorages.putIfAbsent(appId, pathToStorages.get(remoteStorage));
+      return hdfsStorage;
+    });
+    appIdToStorages.computeIfAbsent(appId, key -> pathToStorages.get(remoteStorage));

Review Comment:
   Similar above problems exist in current codebase, maybe we could create some issues as `good first issue`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] jerqi commented on pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
jerqi commented on PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#issuecomment-1381272802

   @smallzhongfeng Could you help me review this pr?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] advancedxy commented on a diff in pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
advancedxy commented on code in PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481#discussion_r1070798646


##########
server/src/main/java/org/apache/uniffle/server/storage/HdfsStorageManager.java:
##########
@@ -119,23 +119,20 @@ public Checker getStorageChecker() {
   @Override
   public void registerRemoteStorage(String appId, RemoteStorageInfo remoteStorageInfo) {
     String remoteStorage = remoteStorageInfo.getPath();
-    Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
-    if (!pathToStorages.containsKey(remoteStorage)) {
+    pathToStorages.computeIfAbsent(remoteStorage, key -> {
+      Map<String, String> remoteStorageConf = remoteStorageInfo.getConfItems();
       Configuration remoteStorageHadoopConf = new Configuration(hadoopConf);
       if (remoteStorageConf != null && remoteStorageConf.size() > 0) {
         for (Map.Entry<String, String> entry : remoteStorageConf.entrySet()) {
           remoteStorageHadoopConf.setStrings(entry.getKey(), entry.getValue());
         }
       }
-      pathToStorages.putIfAbsent(remoteStorage, new HdfsStorage(remoteStorage, remoteStorageHadoopConf));
-      // registerRemoteStorage may be called in different threads,
-      // make sure metrics won't be created duplicated
-      // there shouldn't have performance issue because
-      // it will be called only few times according to the number of remote storage
-      String storageHost = pathToStorages.get(remoteStorage).getStorageHost();
+      HdfsStorage hdfsStorage = new HdfsStorage(remoteStorage, remoteStorageHadoopConf);
+      String storageHost = hdfsStorage.getStorageHost();
       ShuffleServerMetrics.addDynamicCounterForRemoteStorage(storageHost);
-    }
-    appIdToStorages.putIfAbsent(appId, pathToStorages.get(remoteStorage));
+      return hdfsStorage;
+    });
+    appIdToStorages.computeIfAbsent(appId, key -> pathToStorages.get(remoteStorage));

Review Comment:
   > Anyway, this is a performance improvement
   
   Good to know. I was forgetting the performance difference. Although, `putIfAbsent` is more cleaner in most cases.
   
   >Similar above problems exist in current codebase, maybe we could create some issues as good first issue.
   
   Yes, please create some new issues.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org


[GitHub] [incubator-uniffle] zuston merged pull request #481: Fix potential race condition when registering remote storage info

Posted by GitBox <gi...@apache.org>.
zuston merged PR #481:
URL: https://github.com/apache/incubator-uniffle/pull/481


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@uniffle.apache.org
For additional commands, e-mail: issues-help@uniffle.apache.org