Posted to hdfs-user@hadoop.apache.org by Dhiraj <ja...@gmail.com> on 2014/09/10 09:45:07 UTC

S3 with Hadoop 2.5.0 - Not working

Hi,

I have downloaded hadoop-2.5.0 and am trying to get it working with an S3
backend *(single-node, in pseudo-distributed mode)*.
I have made changes to the core-site.xml according to
https://wiki.apache.org/hadoop/AmazonS3

I have a backend object store running on my machine that supports the S3 API.

I get the following message when I try to start the daemons:
*Incorrect configuration: namenode address dfs.namenode.servicerpc-address
or dfs.namenode.rpc-address is not configured.*


root@ubuntu:/build/hadoop/hadoop-2.5.0# ./sbin/start-dfs.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address
or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
localhost: starting namenode, logging to
/build/hadoop/hadoop-2.5.0/logs/hadoop-root-namenode-ubuntu.out
localhost: starting datanode, logging to
/build/hadoop/hadoop-2.5.0/logs/hadoop-root-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to
/build/hadoop/hadoop-2.5.0/logs/hadoop-root-secondarynamenode-ubuntu.out
root@ubuntu:/build/hadoop/hadoop-2.5.0#

The daemons don't start after the above.
I get the same error if I add the property "fs.defaultFS" and set its value
to the S3 bucket, but if I change fs.defaultFS to *hdfs://* it works fine -
I am able to launch the daemons.

my core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>s3://bucket1</value>
    </property>
    <property>
        <name>fs.s3.awsAccessKeyId</name>
        <value>abcd</value>
    </property>
    <property>
        <name>fs.s3.awsSecretAccessKey</name>
        <value>1234</value>
    </property>
</configuration>


I am able to list the buckets and their contents via s3cmd and boto, but I am
unable to get an S3 configuration working via Hadoop.

Also, in the core-default.xml listed on the website, I don't see an
implementation for S3:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml

There was an fs.s3.impl entry up to the 1.2.1 release. So does the 2.5.0
release support S3, or do I need to do anything else?

cheers,
Dhiraj

Re: S3 with Hadoop 2.5.0 - Not working

Posted by Dhiraj <ja...@gmail.com>.
Hi Harsh,

I am a newbie to Hadoop.
I am able to start the nodes with the Hadoop 1.1.x and 1.2.x releases using
the following property, but not with 2.5.0 (fs.defaultFS).
For the 1.x releases I don't need to specify hdfs:// like you suggested; it
works with s3://:

<property>
    <name>fs.default.name</name>
    <value>s3://bucket1</value>
</property>

Is there any configuration that I need to do with the 2.5.0 release
(classpath, etc.)?

Also, how do I debug a command like "hadoop fs -ls s3://bucket1/" - is there
any way of increasing the log level? I want to know what address and port
the command resolves to. I have my S3 object store server running on some
address and port, so I would like to know whether "hadoop fs -ls s3://"
connects to Amazon or to my server.
I don't see any information in the namenode/datanode logs.
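
One way to get that detail (a minimal sketch, assuming the stock log4j
console configuration that ships with the release) is to raise the
client-side log level for a single command via HADOOP_ROOT_LOGGER:

# assumes the default log4j console appender; DEBUG is very verbose
HADOOP_ROOT_LOGGER=DEBUG,console bin/hadoop fs -ls s3://bucket1/

With DEBUG enabled, the FileSystem classes and the underlying jets3t HTTP
client should log the host each request is sent to, which shows whether the
command goes to Amazon or to the local object store. A plain "hadoop fs -ls
s3://..." never contacts the NameNode or DataNode, which is why nothing shows
up in their logs.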

cheers,
Dhiraj

On Wed, Sep 10, 2014 at 3:13 PM, Harsh J <ha...@cloudera.com> wrote:

> > Incorrect configuration: namenode address
> dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not
> configured.
> > Starting namenodes on []
>
> NameNode/DataNode are part of a HDFS service. It makes no sense to try
> and run them over an S3 URL default, which is a distributed filesystem
> in itself. The services need fs.defaultFS to be set to a HDFS URI to
> be able to start up.
>
> > but unable to get an s3 config started via hadoop
>
> You can run jobs over S3 input and output data by running a regular MR
> cluster on HDFS - just pass the right URI as input and output
> parameters of the job. Set your S3 properties in core-site.xml but let
> the fs.defaultFS be of HDFS type, to do this.
>
> > There is an s3.impl until 1.2.1 release. So does the 2.5.0 release
> support s3 or do i need to do anything else.
>
> In Apache Hadoop 2 we dynamically load the FS classes, so we do not
> need the fs.NAME.impl configs anymore as we did in Apache Hadoop 1.
>
> On Wed, Sep 10, 2014 at 1:15 PM, Dhiraj <ja...@gmail.com> wrote:
> > Hi,
> >
> > I have downloaded hadoop-2.5.0 and am trying to get it working for s3
> > backend (single-node in a pseudo-distributed mode).
> > I have made changes to the core-site.xml according to
> > https://wiki.apache.org/hadoop/AmazonS3
> >
> > I have an backend object store running on my machine that supports S3.
> >
> > I get the following message when i try to start the daemons
> > Incorrect configuration: namenode address
> dfs.namenode.servicerpc-address or
> > dfs.namenode.rpc-address is not configured.
> >
> >
> > root@ubuntu:/build/hadoop/hadoop-2.5.0# ./sbin/start-dfs.sh
> > Incorrect configuration: namenode address
> dfs.namenode.servicerpc-address or
> > dfs.namenode.rpc-address is not configured.
> > Starting namenodes on []
> > localhost: starting namenode, logging to
> > /build/hadoop/hadoop-2.5.0/logs/hadoop-root-namenode-ubuntu.out
> > localhost: starting datanode, logging to
> > /build/hadoop/hadoop-2.5.0/logs/hadoop-root-datanode-ubuntu.out
> > Starting secondary namenodes [0.0.0.0]
> > 0.0.0.0: starting secondarynamenode, logging to
> > /build/hadoop/hadoop-2.5.0/logs/hadoop-root-secondarynamenode-ubuntu.out
> > root@ubuntu:/build/hadoop/hadoop-2.5.0#
> >
> > The deamons dont start after the above.
> > i get the same error if i add the property "fs.defaultFS" and set its
> value
> > to the s3 bucket but if i change the defaultFS to hdfs:// it works fine
> - am
> > able to launch the daemons.
> >
> > my core-site.xml:
> > <configuration>
> >     <property>
> >         <name>fs.defaultFS</name>
> >         <value>s3://bucket1</value>
> >     </property>
> >     <property>
> >         <name>fs.s3.awsAccessKeyId</name>
> >         <value>abcd</value>
> >     </property>
> >     <property>
> >         <name>fs.s3.awsSecretAccessKey</name>
> >         <value>1234</value>
> >     </property>
> > </configuration>
> >
> >
> > I am able to list the buckets and its contents via s3cmd and boto; but
> > unable to get an s3 config started via hadoop
> >
> > Also from the following core-file.xml listed on the website; i dont see
> an
> > implementation for s3
> >
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
> >
> > There is an s3.impl until 1.2.1 release. So does the 2.5.0 release
> support
> > s3 or do i need to do anything else.
> >
> > cheers,
> > Dhiraj
> >
> >
> >
>
>
>
> --
> Harsh J
>

Re: S3 with Hadoop 2.5.0 - Not working

Posted by Harsh J <ha...@cloudera.com>.
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
> Starting namenodes on []

NameNode/DataNode are part of an HDFS service. It makes no sense to try
to run them over an S3 URL default - S3 is a distributed filesystem in
its own right. The services need fs.defaultFS to be set to an HDFS URI
to be able to start up.

> but unable to get an s3 config started via hadoop

You can run jobs over S3 input and output data by running a regular MR
cluster on HDFS - just pass the right URIs as the input and output
parameters of the job. To do this, set your S3 properties in core-site.xml
but leave fs.defaultFS as an HDFS URI.
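
A minimal core-site.xml along those lines might look like the sketch below;
the hdfs://localhost:9000 address is an assumed pseudo-distributed NameNode
address, and the key/secret values are just the placeholders from the thread:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <!-- assumed pseudo-distributed NameNode address -->
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>fs.s3.awsAccessKeyId</name>
        <value>abcd</value>
    </property>
    <property>
        <name>fs.s3.awsSecretAccessKey</name>
        <value>1234</value>
    </property>
</configuration>

With fs.defaultFS pointing at HDFS, start-dfs.sh can derive the NameNode RPC
address from it, so the "namenode address ... is not configured" message goes
away.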

> There is an s3.impl until 1.2.1 release. So does the 2.5.0 release support s3 or do i need to do anything else.

In Apache Hadoop 2 we dynamically load the FS classes, so we do not
need the fs.NAME.impl configs anymore as we did in Apache Hadoop 1.
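
As an illustration (a sketch only; the example jar path and the input/output
directories below are assumptions, not taken from the thread), once the S3
credentials are in core-site.xml the bucket can be referenced directly by URI,
both from the shell and as job input/output:

# list the bucket (client-side only; the NameNode is not involved)
bin/hadoop fs -ls s3://bucket1/

# run the stock wordcount example over S3 input, writing to HDFS
# (jar path and directories assumed for illustration)
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar \
    wordcount s3://bucket1/input /user/root/output

Note that s3:// is Hadoop's block-based S3FileSystem, which stores data in its
own block format; objects written by external tools such as s3cmd are normally
read through the s3n:// (native) scheme instead, with the matching
fs.s3n.awsAccessKeyId/fs.s3n.awsSecretAccessKey properties.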

On Wed, Sep 10, 2014 at 1:15 PM, Dhiraj <ja...@gmail.com> wrote:
> Hi,
>
> I have downloaded hadoop-2.5.0 and am trying to get it working for s3
> backend (single-node in a pseudo-distributed mode).
> I have made changes to the core-site.xml according to
> https://wiki.apache.org/hadoop/AmazonS3
>
> I have an backend object store running on my machine that supports S3.
>
> I get the following message when i try to start the daemons
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or
> dfs.namenode.rpc-address is not configured.
>
>
> root@ubuntu:/build/hadoop/hadoop-2.5.0# ./sbin/start-dfs.sh
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or
> dfs.namenode.rpc-address is not configured.
> Starting namenodes on []
> localhost: starting namenode, logging to
> /build/hadoop/hadoop-2.5.0/logs/hadoop-root-namenode-ubuntu.out
> localhost: starting datanode, logging to
> /build/hadoop/hadoop-2.5.0/logs/hadoop-root-datanode-ubuntu.out
> Starting secondary namenodes [0.0.0.0]
> 0.0.0.0: starting secondarynamenode, logging to
> /build/hadoop/hadoop-2.5.0/logs/hadoop-root-secondarynamenode-ubuntu.out
> root@ubuntu:/build/hadoop/hadoop-2.5.0#
>
> The deamons dont start after the above.
> i get the same error if i add the property "fs.defaultFS" and set its value
> to the s3 bucket but if i change the defaultFS to hdfs:// it works fine - am
> able to launch the daemons.
>
> my core-site.xml:
> <configuration>
>     <property>
>         <name>fs.defaultFS</name>
>         <value>s3://bucket1</value>
>     </property>
>     <property>
>         <name>fs.s3.awsAccessKeyId</name>
>         <value>abcd</value>
>     </property>
>     <property>
>         <name>fs.s3.awsSecretAccessKey</name>
>         <value>1234</value>
>     </property>
> </configuration>
>
>
> I am able to list the buckets and its contents via s3cmd and boto; but
> unable to get an s3 config started via hadoop
>
> Also from the following core-file.xml listed on the website; i dont see an
> implementation for s3
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
>
> There is an s3.impl until 1.2.1 release. So does the 2.5.0 release support
> s3 or do i need to do anything else.
>
> cheers,
> Dhiraj
>
>
>



-- 
Harsh J
