You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2021/04/11 17:50:12 UTC

[GitHub] [hadoop] virajjasani opened a new pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

virajjasani opened a new pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] virajjasani commented on a change in pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#discussion_r611354463



##########
File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
##########
@@ -635,7 +635,10 @@ private void concatFileChunks(Configuration conf, Path sourceFile,
         ++i;
       }
     }
+    long firstChunkLastModifiedTs = dstfs.getFileStatus(firstChunkFile)
+        .getModificationTime();
     dstfs.concat(firstChunkFile, restChunkFiles);
+    dstfs.setTimes(firstChunkFile, firstChunkLastModifiedTs, -1);

Review comment:
       Sure @jojochuang , I can update mtime after rename(). However, for the better understanding, when I looked into it, it seems that mtime is updated for parent dirs rather than file itself.
   For instance, renameTo updates mtime [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java#L222) and it internally updates mtime of srcParent and dstParent [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java#L747-L750)
   Does this seem correct (unless I am missing something)? I also manually verified that mtime is not updated for the file after it's rename(), however, if we want to update mtime after rename() for safety purpose, that should be good I believe.
   
   Thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] virajjasani closed pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
virajjasani closed pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] virajjasani commented on pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#issuecomment-818959488


   Closing this PR as this is being tracked actively on #2897 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] jojochuang commented on a change in pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
jojochuang commented on a change in pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#discussion_r611335147



##########
File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
##########
@@ -635,7 +635,10 @@ private void concatFileChunks(Configuration conf, Path sourceFile,
         ++i;
       }
     }
+    long firstChunkLastModifiedTs = dstfs.getFileStatus(firstChunkFile)
+        .getModificationTime();
     dstfs.concat(firstChunkFile, restChunkFiles);
+    dstfs.setTimes(firstChunkFile, firstChunkLastModifiedTs, -1);

Review comment:
       I think you should update mtime after rename() (below) instead. The HDFS rename updates mtime too. Not sure about other file system implementations but it's safe to assume that being the case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] jojochuang commented on a change in pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
jojochuang commented on a change in pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#discussion_r611367826



##########
File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
##########
@@ -635,7 +635,10 @@ private void concatFileChunks(Configuration conf, Path sourceFile,
         ++i;
       }
     }
+    long firstChunkLastModifiedTs = dstfs.getFileStatus(firstChunkFile)
+        .getModificationTime();
     dstfs.concat(firstChunkFile, restChunkFiles);
+    dstfs.setTimes(firstChunkFile, firstChunkLastModifiedTs, -1);

Review comment:
       Ah. interesting. Thanks for pointing out. I wasn't aware of that.
   But I guess that makes sense for HDFS, since after rename, we update the quota for src and dest parent directory, so we do need to update mtime for them.
   
   however, we need to support other file systems and it's not clear how they implement it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] virajjasani commented on a change in pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#discussion_r611354463



##########
File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
##########
@@ -635,7 +635,10 @@ private void concatFileChunks(Configuration conf, Path sourceFile,
         ++i;
       }
     }
+    long firstChunkLastModifiedTs = dstfs.getFileStatus(firstChunkFile)
+        .getModificationTime();
     dstfs.concat(firstChunkFile, restChunkFiles);
+    dstfs.setTimes(firstChunkFile, firstChunkLastModifiedTs, -1);

Review comment:
       Sure @jojochuang , I can update mtime after rename(). However, for the better understanding, when I looked into it, it seems that mtime is updated for parent dirs rather than file itself.
   For instance, renameTo updates mtime [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java#L222) and it internally updates mtime of srcParent and dstParent [here](https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirRenameOp.java#L747-L750)
   Does this seem correct (unless I am missing something)?
   Thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#issuecomment-817623790


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 40s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 27s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 12s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 26s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 20s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 16s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 52s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m 15s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  24m 47s |  |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  96m  8s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2892 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 5f0fea497640 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 557a74ec0c56ef206b5c5eb69c22079ce295892e |
   | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/3/testReport/ |
   | Max. process+thread count | 747 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#issuecomment-817359032


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m  3s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 49s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  13m 55s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 25s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 24s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 21s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 21s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 24s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 48s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  13m 35s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m 42s |  |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  92m 16s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2892 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 6bb31b0d0a0d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6c083276ad81798b7ab2cdc760753de60b67283e |
   | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/1/testReport/ |
   | Max. process+thread count | 542 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] hadoop-yetus commented on pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#issuecomment-817622166


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 13s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   0m 24s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m  3s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 26s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 23s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  javac  |   0m 23s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 17s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 23s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 16s |  |  the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m  6s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  23m 30s |  |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 36s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  94m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2892 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux e048a2a270fc 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 557a74ec0c56ef206b5c5eb69c22079ce295892e |
   | Default Java | Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/2/testReport/ |
   | Max. process+thread count | 614 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2892/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] virajjasani commented on a change in pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#discussion_r611370975



##########
File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
##########
@@ -635,7 +635,10 @@ private void concatFileChunks(Configuration conf, Path sourceFile,
         ++i;
       }
     }
+    long firstChunkLastModifiedTs = dstfs.getFileStatus(firstChunkFile)
+        .getModificationTime();
     dstfs.concat(firstChunkFile, restChunkFiles);
+    dstfs.setTimes(firstChunkFile, firstChunkLastModifiedTs, -1);

Review comment:
       Makes sense, let me update this after rename()




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] amaroti commented on pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
amaroti commented on pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#issuecomment-817942490


   Hey @virajjasani take a look at this: https://github.com/apache/hadoop/pull/2897. This also restores the parent directories modification time access time. I have tested this myself by copying a 13 GB file between two clusters.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] jojochuang commented on a change in pull request #2892: HADOOP-17611. Distcp parallel file copy should retain first chunk modifiedTime after concat

Posted by GitBox <gi...@apache.org>.
jojochuang commented on a change in pull request #2892:
URL: https://github.com/apache/hadoop/pull/2892#discussion_r611335147



##########
File path: hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
##########
@@ -635,7 +635,10 @@ private void concatFileChunks(Configuration conf, Path sourceFile,
         ++i;
       }
     }
+    long firstChunkLastModifiedTs = dstfs.getFileStatus(firstChunkFile)
+        .getModificationTime();
     dstfs.concat(firstChunkFile, restChunkFiles);
+    dstfs.setTimes(firstChunkFile, firstChunkLastModifiedTs, -1);

Review comment:
       I think you should update mtime after rename(). The HDFS rename updates mtime too. Not sure about other file system implementations but it's safe to assume that being the case.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org