Posted to common-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/10/02 15:58:00 UTC

[jira] [Work logged] (HADOOP-16932) distcp copy calls getFileStatus() needlessly and can fail against S3

     [ https://issues.apache.org/jira/browse/HADOOP-16932?focusedWorklogId=494027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-494027 ]

ASF GitHub Bot logged work on HADOOP-16932:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Oct/20 15:57
            Start Date: 02/Oct/20 15:57
    Worklog Time Spent: 10m 
      Work Description: hadoop-yetus removed a comment on pull request #1936:
URL: https://github.com/apache/hadoop/pull/1936#issuecomment-609805287


   :confetti_ball: **+1 overall**
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 51s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  22m 31s |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 25s |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   0m 22s |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 26s |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 25s |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  trunk passed  |
   | +0 :ok: |  spotbugs  |   0m 38s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   0m 37s |  trunk passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 24s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 19s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 16s |  hadoop-tools/hadoop-distcp: The patch generated 2 new + 158 unchanged - 12 fixed = 160 total (was 170)  |
   | +1 :green_heart: |  mvnsite  |   0m 22s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 25s |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 19s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   0m 43s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m 57s |  hadoop-distcp in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  The patch does not generate ASF License warnings.  |
   |  |   |  73m 38s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1936/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1936 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux cdc0443a8acb 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / ab7495d |
   | Default Java | 1.8.0_242 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1936/3/artifact/out/diff-checkstyle-hadoop-tools_hadoop-distcp.txt |
   |  Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1936/3/testReport/ |
   | Max. process+thread count | 350 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1936/3/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 494027)
    Remaining Estimate: 0h
            Time Spent: 10m

> distcp copy calls getFileStatus() needlessly and can fail against S3
> --------------------------------------------------------------------
>
>                 Key: HADOOP-16932
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16932
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3, tools/distcp
>    Affects Versions: 3.0.0, 3.1.2, 3.2.1
>         Environment: Hadoop CDH 6.3
>            Reporter: Thangamani Murugasamy
>            Assignee: Steve Loughran
>            Priority: Minor
>             Fix For: 3.3.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> DistCp to AWS S3 worked fine on CDH 5.16 with distcp 2.6.0, but after upgrading to CDH 6.3, which ships the distcp-3.0 JAR, it throws the error below.
> The same error repeats with Hadoop-distcp-3.2.1.jar as well. Tried the -direct option in 3.2.1; still the same error.
>  
> Error: java.io.FileNotFoundException: No such file or directory: s3a://XXXXXXXXXXXXX/part-00012-baa6a706-3816-4dfa-ba07-0fb56fd38178-c000.snappy.parquet
>  at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2255)
>  at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2149)
>  at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2088)
>  at org.apache.hadoop.tools.util.DistCpUtils.preserve(DistCpUtils.java:203)
>  at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:220)
>  at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
>  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org