Posted to common-dev@hadoop.apache.org by "Murtaza A. Basrai (JIRA)" <ji...@apache.org> on 2007/10/31 00:41:50 UTC

[jira] Created: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

distcp between two clusters does not work if it is run on the target cluster
----------------------------------------------------------------------------

                 Key: HADOOP-2129
                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
             Project: Hadoop
          Issue Type: Bug
          Components: util
    Affects Versions: 0.16.0
         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
With patches for HADOOP-2033 and HADOOP-2048.

            Reporter: Murtaza A. Basrai
            Priority: Critical


I am trying to copy a directory (~100k files, ~500GB) between two clusters A and B (~70 nodes), using a command like:

hadoop distcp -log /logdir hdfs://namenode-of-A:8600/srcdir hdfs://namenode-of-B:8600/targetdir


I tried 4 ways of doing it:

1) Copy from A to B, by running distcp on A
2) Copy from A to B, by running distcp on B
3) Copy from B to A, by running distcp on B
4) Copy from B to A, by running distcp on A

Invocations 1 and 3 succeeded, but 2 and 4 failed.

I got a lot of errors of the type below:

07/10/30 20:52:11 INFO mapred.JobClient: Running job: job_200710180049_0115
07/10/30 20:52:12 INFO mapred.JobClient:  map 0% reduce 0%
07/10/30 20:54:41 INFO mapred.JobClient:  map 1% reduce 0%
07/10/30 20:56:52 INFO mapred.JobClient:  map 2% reduce 0%
07/10/30 20:57:41 INFO mapred.JobClient: Task Id : task_200710180049_0115_m_000184_0, Status : FAILED
java.io.IOException: Some copies could not complete. See log for details.
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.close(CopyFiles.java:407)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:53)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

followed by the job failing:

07/10/30 22:07:41 INFO mapred.JobClient:  map 99% reduce 100%
Copy failed: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:688)
        at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:481)
        at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:555)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:67)
        at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:566)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Fix Version/s:     (was: 0.16.0)
                   0.15.2

> When/if we roll a 0.15.2 should this be included?

+1  I'll merge it to the branch.



> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.15.2
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Fix Version/s: 0.16.0
         Assignee: Doug Cutting  (was: Chris Douglas)
           Status: Patch Available  (was: Open)

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129.patch
>
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Attachment: HADOOP-2129-3.patch

New version that actually compiles!

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Murtaza A. Basrai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538952 ] 

Murtaza A. Basrai commented on HADOOP-2129:
-------------------------------------------

BTW, when the failures happen, the distcp log directory (/logdir) is empty, and the map-reduce log files do not have any error messages except the ones given above.

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Priority: Critical
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545616 ] 

Doug Cutting commented on HADOOP-2129:
--------------------------------------

> It looks like o.a.h.dfs.DistributedFileSystem::getPathName discards the scheme, authority, and port [...]

That's actually appropriate: getPathName is supposed to do that.  I think the bug is that DistributedFileSystem#listStatus() does not qualify the paths it returns, whereas DistributedFileSystem#listPaths() does, by using DfsPath, whose constructor qualifies.
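
To illustrate why an unqualified path matters here, the following is a rough sketch only, not the actual Hadoop code. The class QualifyPathSketch and its qualify method are hypothetical names, and only java.net.URI is used. It assumes that a path with no scheme or authority is resolved against whichever filesystem is the default where the task runs. Under that assumption, a path listed from cluster A would point at cluster B when the job runs on B.

```java
import java.net.URI;

// Hypothetical illustration of path qualification; not Hadoop's implementation.
public class QualifyPathSketch {

    /**
     * Returns the path unchanged if it already carries a scheme; otherwise
     * attaches the scheme and authority of the given filesystem URI.
     */
    public static URI qualify(URI fsUri, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return p; // already qualified, e.g. hdfs://host:port/dir/file
        }
        return URI.create(fsUri.getScheme() + "://" + fsUri.getAuthority() + p.getPath());
    }

    public static void main(String[] args) {
        URI clusterA = URI.create("hdfs://namenode-of-A:8600");
        URI clusterB = URI.create("hdfs://namenode-of-B:8600");

        // A listing that drops scheme and authority yields only this:
        String listed = "/srcdir/file1";

        // Qualified at listing time against the source cluster (intended).
        System.out.println(qualify(clusterA, listed));

        // Resolved later against the default filesystem of the cluster
        // running the job (assumed here to be B): the wrong namenode.
        System.out.println(qualify(clusterB, listed));
    }
}
```

If the assumption holds, this would be consistent with the report: the copy works when distcp runs on the source cluster, where the default filesystem happens to be the right one, and fails when it runs on the target cluster.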

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546007 ] 

Hadoop QA commented on HADOOP-2129:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370335/HADOOP-2129-3.patch
against trunk revision r598699.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1177/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1177/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1177/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1177/console

This message is automatically generated.

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Issue Comment Edited: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541447 ] 

chris.douglas edited comment on HADOOP-2129 at 11/9/07 1:38 PM:
----------------------------------------------------------------

Copy from A to B, by running distcp on B with -i (ignore read failures) generated the following exception trace (prolifically):

{noformat}
FAIL hdfs://namenode-of-B:8600/targetdir/targetfile : org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot open filename /targetdir/targetfile
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:238)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

        at org.apache.hadoop.ipc.Client.call(Client.java:482)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
        at org.apache.hadoop.dfs.$Proxy1.open(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy1.open(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:848)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:840)
        at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:285)
        at org.apache.hadoop.dfs.DistributedFileSystem.open(DistributedFileSystem.java:114)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:244)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:289)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:367)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:218)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
{noformat}

All the directories were created successfully, so the src file list is readable at the destination, but none of the files could be opened at src.

The trace through o.a.h.u.CopyFiles doesn't match what's in the repository, though. Is this running with any custom patches?

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546321 ] 

Owen O'Malley commented on HADOOP-2129:
---------------------------------------

When/if we roll a 0.15.2 should this be included?

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Status: Open  (was: Patch Available)

Sigh.  I renamed a variable after testing this, and then submitted a broken, untested patch...

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129.patch
>
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12541447 ] 

Chris Douglas commented on HADOOP-2129:
---------------------------------------

Copy from A to B, by running distcp on B with -i (ignore read failures) generated the following exception trace (prolifically):

{{noformat}}
FAIL hdfs://namenode-of-B:8600/targetdir/targetfile : org.apache.hadoop.ipc.RemoteException: java.io.IOException: Cannot open filename /targetdir/targetfile
        at org.apache.hadoop.dfs.NameNode.open(NameNode.java:238)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

        at org.apache.hadoop.ipc.Client.call(Client.java:482)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
        at org.apache.hadoop.dfs.$Proxy1.open(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy1.open(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:848)
        at org.apache.hadoop.dfs.DFSClient$DFSInputStream.<init>(DFSClient.java:840)
        at org.apache.hadoop.dfs.DFSClient.open(DFSClient.java:285)
        at org.apache.hadoop.dfs.DistributedFileSystem.open(DistributedFileSystem.java:114)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:244)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:289)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:367)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:218)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
{{noformat}}

All the directories were created successfully, so the src file list is readable at the destination, but none of the files could be opened at src.

The trace through o.a.h.u.CopyFiles doesn't match what's in the repository, though. Is this running with any custom patches?

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I committed this.

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546231 ] 

Hudson commented on HADOOP-2129:
--------------------------------

Integrated in Hadoop-Nightly #316 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/316/])

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Attachment: HADOOP-2129-2.patch

Here's a version that doesn't add a new public FileStatus#getPath(), but rather fixes this entirely in DistributedFileSystem.java.  Is that better?

It's important that, in addition to returning a fully-qualified path, we return an instance of DfsPath, so that future RPCs with that path are cached.
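
As a rough illustration of what "fully-qualified" means here, the sketch below re-attaches a file system's scheme and authority to a bare path using only java.net.URI. It is not the code from the patch: the class QualifyDemo, the method qualify, and the file name part-00000 are hypothetical, and the DfsPath caching aspect is not modelled at all.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch, not the HADOOP-2129 patch: shows how a bare path can be
// re-qualified with the scheme and authority of the file system it came from.
final class QualifyDemo {

    // fsUri: URI of the owning file system; barePath: absolute path with no scheme.
    static URI qualify(URI fsUri, String barePath) {
        try {
            return new URI(fsUri.getScheme(), fsUri.getAuthority(), barePath, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot qualify " + barePath, e);
        }
    }

    public static void main(String[] args) {
        URI clusterA = URI.create("hdfs://namenode-of-A:8600");
        // part-00000 is an invented file name used only for this example.
        System.out.println(qualify(clusterA, "/srcdir/part-00000"));
    }
}
```

With the scheme and authority restored, a reader of the path can tell which cluster it belongs to regardless of where the job runs.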


> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129.patch
>
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Oscar Stiffelman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545256 ] 

Oscar Stiffelman commented on HADOOP-2129:
------------------------------------------

I experienced the same problem.  Adding some logging revealed that the source paths no longer had the "hdfs://host:port" portion of the URI when they were used in the copy() method of the mapper.  This caused the source file-system to be incorrectly identified (it fell back to the default file system, i.e. the one named in the config file).
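
The fallback described above can be sketched with plain java.net.URI. This is only a model of the behaviour, not Hadoop code: the class DefaultFsDemo, the helper owningFs, and the file name part-00000 are hypothetical, and the default file system is passed in as a string instead of being read from a config file.

```java
import java.net.URI;

// Hypothetical model of the symptom: a path that has lost its scheme is
// attributed to whatever default file system the running cluster is configured with.
final class DefaultFsDemo {

    // Returns the file system a path would be opened on, given a configured default.
    static String owningFs(String path, String defaultFs) {
        URI u = URI.create(path);
        if (u.getScheme() == null) {
            return defaultFs; // no scheme: fall back to the configured default
        }
        return u.getScheme() + "://" + u.getAuthority();
    }

    public static void main(String[] args) {
        String defaultOnB = "hdfs://namenode-of-B:8600"; // job runs on cluster B
        // Qualified source path: correctly attributed to cluster A.
        System.out.println(owningFs("hdfs://namenode-of-A:8600/srcdir/part-00000", defaultOnB));
        // Same path after losing its scheme: wrongly attributed to cluster B.
        System.out.println(owningFs("/srcdir/part-00000", defaultOnB));
    }
}
```

This would also be consistent with the copy succeeding when distcp runs on the source cluster, since the default file system and the source then coincide.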

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545704 ] 

Hadoop QA commented on HADOOP-2129:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370250/HADOOP-2129-2.patch
against trunk revision r598469.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs -1.  The patch appears to cause Findbugs to fail.

    core tests -1.  The patch failed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1166/testReport/
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1166/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1166/console

This message is automatically generated.

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129.patch
>
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Status: Patch Available  (was: Open)

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Doug Cutting
>            Priority: Critical
>             Fix For: 0.16.0
>
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129-3.patch, HADOOP-2129.patch
>
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545600 ] 

Chris Douglas commented on HADOOP-2129:
---------------------------------------

It looks like o.a.h.dfs.DistributedFileSystem::getPathName discards the scheme, authority, and port from the URI here:

{code}
    String result = makeAbsolute(file).toUri().getPath();
{code}

This is called from o.a.h.dfs.DistributedFileSystem::listStatus(Path), used to build the source list. The source is written as a FileStatus object, not a Path, so this information is never restored.
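To illustrate the loss described above, here is a minimal sketch using only java.net.URI rather than Hadoop's Path class. The class name PathNameDemo and the helper pathOnly are hypothetical, and the file name is made up; the point is only that taking the path component of a URI drops the scheme, host, and port, so a listing built from it can no longer say which cluster the file lives on.

```java
import java.net.URI;

public class PathNameDemo {
    // Hypothetical helper: keeps only the path component, as a call to
    // toUri().getPath() would. Scheme and authority are not carried along.
    static String pathOnly(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // Assumed example file under the source directory from the report.
        String src = "hdfs://namenode-of-A:8600/srcdir/part-00000";
        URI parsed = URI.create(src);

        System.out.println("scheme    = " + parsed.getScheme());
        System.out.println("authority = " + parsed.getAuthority());
        // Only "/srcdir/part-00000" survives; a reader of this string alone
        // would resolve it against whatever its default filesystem is.
        System.out.println("path only = " + pathOnly(src));
    }
}
```

If the map tasks resolve such a bare path against their own default filesystem, that would be consistent with the copy failing only when distcp runs on the target cluster.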

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>


[jira] Commented: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545659 ] 

Chris Douglas commented on HADOOP-2129:
---------------------------------------

+1

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>         Attachments: 2129-0.patch, HADOOP-2129-2.patch, HADOOP-2129.patch
>
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-2129:
----------------------------------

    Attachment: 2129-0.patch

Sorry, I didn't mean to imply that getPathName was doing the wrong thing, only that the information was discarded there.

I'd been picturing something closer to Path::makeQualified. Since the method on FileStatus must be public, it might as well be restricted to adding information instead of changing it.
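A rough sketch of the "add information, never change it" idea, written against java.net.URI only. This is not the attached patch and not Hadoop's Path::makeQualified; the class QualifyDemo and the method qualify are hypothetical names, and the rule shown (leave any path that already has a scheme untouched) is an assumption made for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class QualifyDemo {
    // Hypothetical helper: fills in scheme and authority from the owning
    // filesystem's URI only when the path has no scheme of its own.
    // A path that is already qualified is returned unchanged.
    static URI qualify(URI path, URI fsUri) {
        if (path.getScheme() != null) {
            return path; // never override existing information
        }
        try {
            return new URI(fsUri.getScheme(), fsUri.getAuthority(),
                           path.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI clusterA = URI.create("hdfs://namenode-of-A:8600");
        // Bare path, as it would come back from a listing: gets qualified.
        System.out.println(qualify(URI.create("/srcdir/f"), clusterA));
        // Already points at cluster B: left alone.
        System.out.println(qualify(URI.create("hdfs://namenode-of-B:8600/x"), clusterA));
    }
}
```

Restricting the method this way means a caller cannot use it to silently redirect a status entry from one cluster to another.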

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>         Attachments: 2129-0.patch, HADOOP-2129.patch
>
>


[jira] Assigned: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Christian Kunz (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kunz reassigned HADOOP-2129:
--------------------------------------

    Assignee: Chris Douglas

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>


[jira] Updated: (HADOOP-2129) distcp between two clusters does not work if it is run on the target cluster

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HADOOP-2129:
---------------------------------

    Attachment: HADOOP-2129.patch

Here's a patch that should cause DFS#listStatus() to return fully-qualified paths.  Does this fix the distcp issue?

> distcp between two clusters does not work if it is run on the target cluster
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2129
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2129
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/718/
> With patches for HADOOP-2033 and HADOOP-2048.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>         Attachments: HADOOP-2129.patch
>
>