Posted to issues@celeborn.apache.org by "waitinfuture (via GitHub)" <gi...@apache.org> on 2023/07/05 02:15:36 UTC

[GitHub] [incubator-celeborn] waitinfuture commented on a diff in pull request #1678: [CELEBORN-764] Fix celeborn on HDFS might clean using app directories.

waitinfuture commented on code in PR #1678:
URL: https://github.com/apache/incubator-celeborn/pull/1678#discussion_r1252467658


##########
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/storage/StorageManager.scala:
##########
@@ -534,7 +534,8 @@ final private[worker] class StorageManager(conf: CelebornConf, workerSource: Abs
         val iter = hadoopFs.listStatusIterator(hdfsWorkPath)
         while (iter.hasNext) {
           val fileStatus = iter.next()
-          if (!appIds.contains(fileStatus.getPath.getName)) {
+          if (!appIds.contains(fileStatus.getPath.getName)

Review Comment:
   getModificationTime does not reflect changes made inside nested directories. For example, given the path /tmp/test/, uploading a new file into /tmp/test does not change the modification time of /tmp.
   IMO, the HDFS directory does not belong to the worker, so maybe we should let the Master clean HDFS. cc @pan3793 @RexXiong @AngersZhuuuu
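
   To illustrate the point about nested changes, here is a minimal sketch that walks a whole directory tree and returns the newest modification time found anywhere beneath it, rather than trusting the top-level directory's own timestamp. It uses `java.nio.file` on a local filesystem purely as a stand-in for HDFS; the object name `LatestModification` and the method `latestModifiedMillis` are hypothetical and not part of Celeborn or this PR.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper, not Celeborn code: local-filesystem stand-in for HDFS.
object LatestModification {

  /** Newest modification time (epoch millis) of `root` or anything beneath it. */
  def latestModifiedMillis(root: Path): Long = {
    // Files.walk visits `root` itself and every nested file and directory.
    val stream = Files.walk(root)
    try {
      val it = stream.iterator()
      var latest = Long.MinValue
      while (it.hasNext) {
        val modified = Files.getLastModifiedTime(it.next()).toMillis
        latest = math.max(latest, modified)
      }
      latest
    } finally {
      // The stream holds open directory handles, so always close it.
      stream.close()
    }
  }
}
```

   A full walk like this costs one metadata lookup per entry, which may be expensive for large application directories. That cost is one more reason to consider moving the cleanup decision to the Master, which already knows which applications are alive.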


