You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/10/12 08:38:54 UTC

[GitHub] [spark] panbingkun opened a new pull request, #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

panbingkun opened a new pull request, #38220:
URL: https://github.com/apache/spark/pull/38220

   ### What changes were proposed in this pull request?
   
   
   ### Why are the changes needed?
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Pass GA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38220:
URL: https://github.com/apache/spark/pull/38220#discussion_r993173731


##########
common/network-common/src/main/java/org/apache/spark/network/util/DBProvider.java:
##########
@@ -57,8 +64,29 @@ public static DB initDB(DBBackend dbBackend, File file) throws IOException {
           case ROCKSDB: return new RocksDB(RocksDBProvider.initRocksDB(file));
           default:
             throw new IllegalArgumentException("Unsupported DBBackend: " + dbBackend);
+          }
+        }
+        return null;
+    }
+
+    public static void migrate(
+        DBBackend fromDbBackend,
+        File fromFile,
+        DBBackend toDbBackend,
+        File toFile,
+        StoreVersion version,
+        ObjectMapper mapper) throws IOException {
+      LOGGER.warn("Migrate DBBackend from {}({}) to {}({})",
+        fromDbBackend.name(), fromFile.getCanonicalPath(),
+        toDbBackend.name(), toFile.getCanonicalPath());
+      DB fromDb = initDB(fromDbBackend, fromFile, version, mapper);

Review Comment:
   Should close the `fromDb` and `toDb` after used



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38220:
URL: https://github.com/apache/spark/pull/38220#issuecomment-1278450069

   @panbingkun Any updates of this one ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #38220:
URL: https://github.com/apache/spark/pull/38220#issuecomment-1278456119

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38220:
URL: https://github.com/apache/spark/pull/38220#discussion_r993205188


##########
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:
##########
@@ -493,6 +496,49 @@ protected Path getRecoveryPath(String fileName) {
     return _recoveryPath;
   }
 
+  private void migrateRecoveryDb() throws IOException {
+    Preconditions.checkNotNull(_recoveryPath,

Review Comment:
   should not be a `checkNotNull`, I think it should be  `_recoveryPath`
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] panbingkun commented on pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
panbingkun commented on PR #38220:
URL: https://github.com/apache/spark/pull/38220#issuecomment-1278899914

   > @panbingkun Any updates of this one ?
   
   I will update it weekend.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #38220:
URL: https://github.com/apache/spark/pull/38220#issuecomment-1275843768

   1. How to deal with the old database after migration? Delete or Rename?
   2. If there are multiple old databases at the same time, what should we do?
   3. Add configuration to control migration behavior?
   4. How to test?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #38220: [WIP][SPARK-40464][YARN] Support automatic data format conversion for shuffle state db

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #38220:
URL: https://github.com/apache/spark/pull/38220#discussion_r993201467


##########
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java:
##########
@@ -493,6 +496,49 @@ protected Path getRecoveryPath(String fileName) {
     return _recoveryPath;
   }
 
+  private void migrateRecoveryDb() throws IOException {
+    Preconditions.checkNotNull(_recoveryPath,
+            "recovery path should not be null if NM recovery is enabled");
+
+    DBBackend registeredExecutorsDbBackend = findExistedDbBackend(RECOVERY_FILE_NAME);

Review Comment:
   line 503 ~ line 511 can be extract  into a general method and put into the helper class



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org