Posted to common-user@hadoop.apache.org by Divij Durve <di...@gmail.com> on 2009/06/17 18:30:57 UTC

Trying to setup Cluster

I'm trying to set up a cluster of 3 different machines running Fedora. I
can't get them to log into localhost without a password, but that's the
least of my worries at the moment.

I am posting my config files and the master and slave files; let me know if
anyone can spot a problem with the configs...


hadoop-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
  <name>dfs.data.dir</name>
  <value>$HADOOP_HOME/dfs-data</value>
  <final>true</final>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>$HADOOP_HOME/dfs-name</value>
  <final>true</final>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>$HADOOP_HOME/hadoop-tmp</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://gobi.<something>.<something>:54310</value>
  <description>The name of the default file system. A URI whose scheme
    and authority determine the FileSystem implementation. The uri's
    scheme determines the config property (fs.SCHEME.impl) naming the
    FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a FileSystem.</description>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>kalahari.<something>.<something>:54311</value>
  <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.</description>
</property>

<property>
  <name>mapred.system.dir</name>
  <value>$HADOOP_HOME/mapred-system</value>
  <final>true</final>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication. The actual number of
    replications can be specified when the file is created. The default
    is used if replication is not specified in create time.</description>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>$HADOOP_HOME/mapred-local</value>
</property>

</configuration>


Slave:
kongur.something.something

master:
kalahari.something.something

I execute the dfs-start.sh command from gobi.something.something.

Is there any other info that I should provide? Also, Kongur is where I'm
running the datanode; the master file on Kongur should have localhost in
it, right? Thanks for the help.

Divij

Re: Trying to setup Cluster

Posted by Dmitriy Ryaboy <dv...@cloudera.com>.
Divij,
Regarding your ssh problem --
1) make sure that your authorized_keys file contains the public key (not the
private key).
2) make sure the permissions on the .ssh directory and the files within it
are correct. They should look something like this:

dvryaboy@ubuntu:~$ ls -la .ssh/
total 24
drwx------  2 dvryaboy dvryaboy 4096 2009-06-18 09:28 .
drwxr-xr-x 58 dvryaboy dvryaboy 4096 2009-06-18 17:18 ..
-rw-r--r--  1 dvryaboy dvryaboy  397 2009-06-18 09:28 authorized_keys
-rw-------  1 dvryaboy dvryaboy 1675 2009-06-18 09:27 id_rsa
-rw-r--r--  1 dvryaboy dvryaboy  397 2009-06-18 09:27 id_rsa.pub
-rw-r--r--  1 dvryaboy dvryaboy 1768 2009-06-18 09:31 known_hosts

(note that the private key and the directory are restricted to my user).
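
For reference, a rough sketch of the usual setup, assuming the default
~/.ssh paths and an RSA key (these filenames are just the ssh-keygen
defaults, nothing Hadoop-specific):

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa        # key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys # append the PUBLIC key
chmod 700 ~/.ssh                                # sshd rejects keys if these are too open
chmod 600 ~/.ssh/id_rsa
chmod 644 ~/.ssh/authorized_keys
ssh localhost                                   # should no longer prompt for a password

The same public key also has to end up in ~/.ssh/authorized_keys on each
slave machine for the start scripts to reach them without a password.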

-D

Re: Trying to setup Cluster

Posted by Divij Durve <di...@gmail.com>.
Thanks for the info, Aaron. I think $HADOOP_HOME does get resolved, but I
will change it anyway. I have tried all possible methods of getting
passwordless ssh to work, even done cat <file where generated key is saved>
>> <authorized keys file>

It still asks for the password on "ssh localhost". I moved the job tracker
to the main node and then the cluster started working. I did a data load,
but when I sent out a query like select count(1) from <table name> it gave
me an error; the query select * from <table name> worked just fine. I really
can't figure out what's going wrong. I sent a mail out with the error after
this mail. Also, let me know if there is any added info I need to give to
help with a solution.

Thanks
Divij


Re: Trying to setup Cluster

Posted by Aaron Kimball <aa...@cloudera.com>.
Are you encountering specific problems?

I don't think that Hadoop's config files will evaluate environment
variables, so $HADOOP_HOME won't be interpreted correctly.
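
For example, assuming Hadoop is unpacked under /usr/local/hadoop (a made-up
path -- substitute whatever directory you actually want the data in), the
entries would need the literal path spelled out:

<property>
  <name>dfs.data.dir</name>
  <value>/usr/local/hadoop/dfs-data</value>
  <final>true</final>
</property>

and likewise for dfs.name.dir, hadoop.tmp.dir, mapred.system.dir, and
mapred.local.dir.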

For passwordless ssh, see
http://rcsg-gsir.imsb-dsgi.nrc-cnrc.gc.ca/documents/internet/node31.html or
just check the manpage for ssh-keygen.

- Aaron
