Posted to mapreduce-user@hadoop.apache.org by Stephen Armstrong <st...@linqia.com> on 2015/05/09 00:33:15 UTC

distcp fails with s3n or s3a in 2.6.0

Hello,

I've been stuck on a classpath problem for a while now, and am hoping for
some help. Hadoop 2.6.0 no longer seems to support distcp to S3 for me. I
posted this on Stack Overflow (
http://stackoverflow.com/questions/30108760/s3n-and-s3a-distcp-not-working-in-hadoop-2-6-0)
but the only advice I got was something I'd already tried that wasn't
working for me.

It's a stock install (unpacked the Hadoop archive and set HADOOP_PREFIX), and
it runs Java and Pig jobs just fine. For distcp, though, I get "No FileSystem
for scheme". I set HADOOP_CLASSPATH to /opt/hadoop/share/hadoop/tools/lib/*,
and hadoop classpath now shows that folder, but now I get
ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem

Could anyone give me some clues about where to look? Since the error changed, I
think the aws jar is now being included on the client (otherwise, what would
have registered the filesystem?), but when the job runs on the actual cluster
it seems the right jars aren't being shipped with it.

Thanks
Steve
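The setup Steve describes can be sketched as a few shell commands. This is an
illustrative reproduction, not his exact session: the /opt/hadoop path comes
from his message, but the bucket name and source path are made up, and the
exact jar names depend on the Hadoop build.

```shell
# Make the S3 connector jars visible to the client-side tools
# (path taken from Steve's install; adjust for your layout):
export HADOOP_CLASSPATH='/opt/hadoop/share/hadoop/tools/lib/*'

# Verify that the hadoop-aws jar actually appears on the classpath:
hadoop classpath | tr ':' '\n' | grep -i aws

# The kind of distcp invocation that fails with
# "No FileSystem for scheme: s3a" when the jar is missing
# (bucket and source path are hypothetical):
hadoop distcp hdfs:///data/input s3a://example-bucket/backup/
```

Note that HADOOP_CLASSPATH only affects the client JVM that launches the job,
which is consistent with the symptom Steve reports: the error changes locally,
but the MapReduce tasks on the cluster still can't load S3AFileSystem.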

Re: distcp fails with s3n or s3a in 2.6.0

Posted by Stephen Armstrong <st...@linqia.com>.
Thanks Chris,

I don't know why I couldn't find that e-mail chain, but the
"mapreduce.application.classpath" property is what I needed to change.
Thanks for the help.

Steve
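The fix Steve describes lives in mapred-site.xml. A minimal sketch of what
such an entry might look like follows; the value shown is an assumption based
on the default MapReduce classpath plus the tools/lib directory from his
install, so the exact list should be checked against your distribution.

```xml
<!-- mapred-site.xml: append tools/lib (which holds hadoop-aws and the
     AWS SDK jars) to the classpath YARN builds for MapReduce tasks,
     so distcp mappers can load org.apache.hadoop.fs.s3a.S3AFileSystem. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/lib/*</value>
</property>
```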

On Mon, May 11, 2015 at 9:59 PM, Chris Nauroth <cn...@hortonworks.com>
wrote:

>  Hello Steve,
>
>  There was a similar discussion about this on the mailing list last month:
>
>
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201504.mbox/%3CD15D2069.20165%25cnauroth%40hortonworks.com%3E
>
>  I recommend reading through all of the information in that thread and
> implementing the suggestions.  You'll also see reference to a jira issue
> that tracks improvements we need to make in the documentation to make this
> more obvious.  Feel free to watch that issue if you'd like to receive
> status updates as the documentation progresses.
>
>  I hope this helps.
>
>  --Chris Nauroth
>
>   From: Stephen Armstrong <st...@linqia.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Date: Friday, May 8, 2015 at 3:33 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: distcp fails with s3n or s3a in 2.6.0
>
>     Hello,
>
>  I've been stuck on a classpath problem for a while now, and am hoping for
> some help. Hadoop 2.6.0 no longer seems to support distcp to S3 for me. I
> posted this on Stack Overflow (
> http://stackoverflow.com/questions/30108760/s3n-and-s3a-distcp-not-working-in-hadoop-2-6-0)
> but the only advice I got was something I'd already tried that wasn't
> working for me.
>
>  It's a stock install (unpacked the hadoop archive, and set HADOOP_PREFIX)
> and is running Java and Pig jobs just fine. For distcp, I get no filesystem
> for scheme. I set HADOOP_CLASSPATH to /opt/hadoop/share/hadoop/tools/lib/*
> and now hadoop classpath shows that folder, but I now get
> ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem
>
>  Could anyone give me some clues where to look? Since the error changed, I
> think the aws jar is being included (otherwise what caused the filesystem
> to be registered), but when run on the actual cluster it seems like it's
> not packaging the right jars.
>
>  Thanks
>  Steve
>



-- 
Stephen Armstrong
Senior Software Engineer
Linqia, Inc - Matching Marketers with Storytellers

www.linqia.com
Like <http://www.facebook.com/linqia> | Follow
<https://twitter.com/#%21/linqia> | Blog <http://www.linqia.com/blog/>

Email. stephen.armstrong@linqia.com

The information in this e-mail and in any attachments is confidential and
solely for the attention and use of the named addressee(s). You are hereby
notified that any dissemination, distribution or copy of this communication
is prohibited without the prior written consent of Linqia, Inc. If you have
received this communication in error, please, notify the sender immediately
by reply e-mail and delete this e-mail from your system as well as any
files transmitted with it.


Re: distcp fails with s3n or s3a in 2.6.0

Posted by Chris Nauroth <cn...@hortonworks.com>.
Hello Steve,

There was a similar discussion about this on the mailing list last month:

http://mail-archives.apache.org/mod_mbox/hadoop-user/201504.mbox/%3CD15D2069.20165%25cnauroth%40hortonworks.com%3E

I recommend reading through all of the information in that thread and implementing the suggestions.  You'll also see reference to a jira issue that tracks improvements we need to make in the documentation to make this more obvious.  Feel free to watch that issue if you'd like to receive status updates as the documentation progresses.

I hope this helps.

--Chris Nauroth

From: Stephen Armstrong <st...@linqia.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Date: Friday, May 8, 2015 at 3:33 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: distcp fails with s3n or s3a in 2.6.0

Hello,

I've been stuck on a classpath problem for a while now, and am hoping for some help. Hadoop 2.6.0 no longer seems to support distcp to S3 for me. I posted this on Stack Overflow (http://stackoverflow.com/questions/30108760/s3n-and-s3a-distcp-not-working-in-hadoop-2-6-0) but the only advice I got was something I'd already tried that wasn't working for me.

It's a stock install (unpacked the hadoop archive, and set HADOOP_PREFIX) and is running Java and Pig jobs just fine. For distcp, I get no filesystem for scheme. I set HADOOP_CLASSPATH to /opt/hadoop/share/hadoop/tools/lib/* and now hadoop classpath shows that folder, but I now get ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem

Could anyone give me some clues where to look? Since the error changed, I think the aws jar is being included (otherwise what caused the filesystem to be registered), but when run on the actual cluster it seems like it's not packaging the right jars.

Thanks
Steve
