Posted to common-user@hadoop.apache.org by Siddhartha Reddy <si...@grok.in> on 2008/04/04 10:32:33 UTC
distcp fails when copying from s3 to hdfs
I am trying to run a Hadoop cluster on Amazon EC2 and backup all the data on
Amazon S3 between the runs. I am using Hadoop 0.16.1 on a cluster made up of
CentOS 5 images (ami-08f41161).
I am able to copy from hdfs to S3 using the following command:
bin/hadoop distcp file.txt s3://id:secret@bucket-name/file.txt
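(Aside, not part of the original message: the s3:// URI above carries the AWS credentials in the authority component, like userinfo in an ordinary URL. A minimal Python sketch of how such a URI decomposes — the id/secret/bucket values are placeholders, and note that a real secret key containing "/" would have to be percent-encoded as %2F to survive parsing:)

```python
from urllib.parse import urlparse

# Placeholder credentials and bucket name; a secret key containing '/'
# must be escaped as %2F before being embedded in the URI.
uri = "s3://AKIDEXAMPLE:SECRETexample@bucket-name/file.txt"
parsed = urlparse(uri)

print(parsed.username)  # access key id
print(parsed.password)  # secret key
print(parsed.hostname)  # bucket name
print(parsed.path)      # key within the bucket
```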
But copying from S3 to hdfs with the following command fails:
bin/hadoop distcp s3://id:secret@bucket-name/file.txt file2.txt
with the following error:
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.lang.IllegalArgumentException: Hook previously registered
        at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:45)
        at java.lang.Runtime.addShutdownHook(Runtime.java:192)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1194)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
        at org.apache.hadoop.fs.s3.S3FileSystem.initialize(S3FileSystem.java:81)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1180)
        at org.apache.hadoop.fs.FileSystem.access$400(FileSystem.java:53)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1197)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.hadoop.util.CopyFiles.checkSrcPath(CopyFiles.java:482)
        at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:504)
        at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:580)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:596)
Can someone please point out what, if anything, I am doing wrong?
Thanks,
Siddhartha Reddy
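(Aside, not from the thread: an alternative to embedding the credentials in every URI, assuming the fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey properties supported by Hadoop's S3 filesystem of this era, is to set them once in conf/hadoop-site.xml; values here are placeholders:)

```xml
<!-- conf/hadoop-site.xml (placeholder values) -->
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY_ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>YOUR_SECRET_ACCESS_KEY</value>
</property>
```

With these set, the URI can omit the credentials, e.g. bin/hadoop distcp file.txt s3://bucket-name/file.txt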
Re: distcp fails when copying from s3 to hdfs
Posted by Siddhartha Reddy <si...@grok.in>.
Thanks for the quick response, Tom.
I have just switched to Hadoop 0.16.2 and tried this again. Now I am getting
the following error:
Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input source
s3://id:secret@bucket-name/file.txt does not exist.
I copied the file to S3 using the following command:
bin/hadoop distcp file.txt s3://id:secret@bucket-name/file.txt
To check that the file actually exists on S3, I tried the following
commands:
bin/hadoop fs -fs s3://id:secret@bucket-name -ls
bin/hadoop fs -fs s3://id:secret@bucket-name -ls /
The first returned nothing, while the second returned the following:
Found 1 items
/_distcp_logs_5vzva5 <dir> 1969-12-31 19:00 rwxrwxrwx
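(A hedged aside, not from the thread: hadoop fs resolves relative paths against a per-user working directory, so a listing with no path argument may look at a different location than the bucket root. Listing both explicitly would distinguish the two; the paths below are illustrative, and the credentials are elided as in the thread.)

```shell
# List the bucket root explicitly (note the trailing "/")
bin/hadoop fs -fs s3://id:secret@bucket-name -ls /

# List the default per-user working directory (e.g. /user/<username>)
bin/hadoop fs -fs s3://id:secret@bucket-name -ls
```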
And when I tried to copy it back to hdfs using the following command:
bin/hadoop distcp s3://id:secret@bucket-name/file.txt file2.txt
I got this error:
Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input source s3://id:secret@bucket-name/file.txt does not exist.
        at org.apache.hadoop.util.CopyFiles.checkSrcPath(CopyFiles.java:504)
        at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:520)
        at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:596)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:612)
Any pointers on why this could be happening?
Thanks,
Siddhartha
On Fri, Apr 4, 2008 at 2:13 PM, Tom White <to...@gmail.com> wrote:
> Hi Siddhartha,
>
> This is a problem in 0.16.1
> (https://issues.apache.org/jira/browse/HADOOP-3027) that is fixed in
> 0.16.2, which was released yesterday.
>
> Tom
>
> On 04/04/2008, Siddhartha Reddy <si...@grok.in> wrote:
> > I am trying to run a Hadoop cluster on Amazon EC2 and backup all the
> > data on Amazon S3 between the runs. I am using Hadoop 0.16.1 on a
> > cluster made up of CentOS 5 images (ami-08f41161).
> >
> > I am able to copy from hdfs to S3 using the following command:
> >
> > bin/hadoop distcp file.txt s3://id:secret@bucket-name/file.txt
> >
> > But copying from S3 to hdfs with the following command fails:
> >
> > bin/hadoop distcp s3://id:secret@bucket-name/file.txt file2.txt
> >
> > with the following error:
> >
> > With failures, global counters are inaccurate; consider running with -i
> > Copy failed: java.lang.IllegalArgumentException: Hook previously registered
> >         at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:45)
> >         at java.lang.Runtime.addShutdownHook(Runtime.java:192)
> >         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1194)
> >         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
> >         at org.apache.hadoop.fs.s3.S3FileSystem.initialize(S3FileSystem.java:81)
> >         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1180)
> >         at org.apache.hadoop.fs.FileSystem.access$400(FileSystem.java:53)
> >         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1197)
> >         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
> >         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
> >         at org.apache.hadoop.util.CopyFiles.checkSrcPath(CopyFiles.java:482)
> >         at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:504)
> >         at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:580)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >         at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:596)
> >
> > Can someone please point out if and what I am doing wrong?
> >
> > Thanks,
> >
> > Siddhartha Reddy
> >
>
--
http://sids.in
"If you are not having fun, you are not doing it right."
Re: distcp fails when copying from s3 to hdfs
Posted by Tom White <to...@gmail.com>.
Hi Siddhartha,
This is a problem in 0.16.1
(https://issues.apache.org/jira/browse/HADOOP-3027) that is fixed in
0.16.2, which was released yesterday.
Tom
On 04/04/2008, Siddhartha Reddy <si...@grok.in> wrote:
> I am trying to run a Hadoop cluster on Amazon EC2 and backup all the data on
> Amazon S3 between the runs. I am using Hadoop 0.16.1 on a cluster made up of
> CentOS 5 images (ami-08f41161).
>
>
> I am able to copy from hdfs to S3 using the following command:
>
> bin/hadoop distcp file.txt s3://id:secret@bucket-name/file.txt
>
>
> But copying from S3 to hdfs with the following command fails:
>
> bin/hadoop distcp s3://id:secret@bucket-name/file.txt file2.txt
>
>
> with the following error:
>
> With failures, global counters are inaccurate; consider running with -i
> Copy failed: java.lang.IllegalArgumentException: Hook previously registered
>         at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:45)
>         at java.lang.Runtime.addShutdownHook(Runtime.java:192)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1194)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
>         at org.apache.hadoop.fs.s3.S3FileSystem.initialize(S3FileSystem.java:81)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1180)
>         at org.apache.hadoop.fs.FileSystem.access$400(FileSystem.java:53)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1197)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:148)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
>         at org.apache.hadoop.util.CopyFiles.checkSrcPath(CopyFiles.java:482)
>         at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:504)
>         at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:580)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:596)
>
>
> Can someone please point out if and what I am doing wrong?
>
> Thanks,
>
> Siddhartha Reddy
>