Posted to user@hadoop.apache.org by Ivan Ryndin <ir...@gmail.com> on 2012/12/17 18:04:49 UTC

Is it necessary to run secondary namenode when starting HDFS?

Hi all,

Is it necessary to run the secondary namenode when starting HDFS?
I am running Hadoop 1.1.1.
Looking at the script $HADOOP_HOME/bin/start-dfs.sh, I see the following lines:

# start dfs daemons
# start namenode after datanodes, to minimize time namenode is up w/o data
# note: datanodes will log connection errors until namenode starts
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
$nameStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
$dataStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
secondarynamenode

So, will HDFS still work if I disable starting the secondarynamenode?

I ask because I am only experimenting with Hadoop on a two-node cluster (and
the machines do not have much RAM or disk space), so I don't want to run
unnecessary processes.

-- 
Best regards,
Ivan P. Ryndin,

Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Michael Segel <mi...@hotmail.com>.
Hi, 

Just a reminder: just because you can do something (or in this case, not do something) doesn't mean it's a good idea. 

The SNN is there for a reason. If you're on an EMR cluster that will be taken down at the end of the job or end of the day, not having the SNN running may be OK. 
Outside of that, it's pretty much a good idea to run it. 

-Just saying...



On Dec 17, 2012, at 11:23 AM, Ivan Ryndin <ir...@gmail.com> wrote:

> Thank you very much, Bryan!
> 
> It is now clear for me, that in development mode I'll not start secondary namenode.
> But in production it's better to have it.
> Thanks!
> 
> Regards,
> Ivan
> 
> 
> 2012/12/17 Bryan Beaudreault <bb...@hubspot.com>
> You don't need a secondary name node.  It creates snapshots of the name node metadata periodically, which helps to keep down the size of the edits files.  If you don't run one, over time your edits files will grow.  The next time you go to restart your namenode, it could take a very long time to start up if your edits are large.  I recommend running one in production, to reduce the amount of downtime if you need to replace or restart your namenode.  If that isn't a concern for you then you don't need it.
> 
> 
> On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:
> Hi all,
> 
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file: 
> 
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode
> 
>  So, will HDFS work if I turn off starting of secondarynamenode ?
> 
> I do ask this because I am playing with Hadoop on two-node cluster only (and machines in cluster do not have much RAM and disk space), and thus don't want to run unnecessary processes.
> 
> -- 
> Best regards,
> Ivan P. Ryndin,
> 
> 
> 
> 
> 
> -- 
> Best regards,
> Ivan P. Ryndin



Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Ivan Ryndin <ir...@gmail.com>.
Thank you very much, Bryan!

It is now clear to me that in development I won't start the secondary
namenode, but in production it's better to have one.
Thanks!

Regards,
Ivan


2012/12/17 Bryan Beaudreault <bb...@hubspot.com>

> You don't need a secondary name node.  It creates snapshots of the name
> node metadata periodically, which helps to keep down the size of the edits
> files.  If you don't run one, over time your edits files will grow.  The
> next time you go to restart your namenode, it could take a very long time
> to start up if your edits are large.  I recommend running one in
> production, to reduce the amount of downtime if you need to replace or
> restart your namenode.  If that isn't a concern for you then you don't need
> it.
>
>
> On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:
>
>> Hi all,
>>
>> is it necessary to run secondary namenode when starting HDFS?
>> I am dealing with Hadoop 1.1.1.
>> Looking at script $HADOOP_HOME/bin/start_dfs.sh
>> There are next lines in this file:
>>
>> # start dfs daemons
>> # start namenode after datanodes, to minimize time namenode is up w/o data
>> # note: datanodes will log connection errors until namenode starts
>> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
>> $nameStartOpt
>> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
>> $dataStartOpt
>> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
>> secondarynamenode
>>
>>  So, will HDFS work if I turn off starting of secondarynamenode ?
>>
>> I do ask this because I am playing with Hadoop on two-node cluster only
>> (and machines in cluster do not have much RAM and disk space), and thus
>> don't want to run unnecessary processes.
>>
>> --
>> Best regards,
>> Ivan P. Ryndin,
>>
>>
>


-- 
Best regards,
Ivan P. Ryndin


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Bryan Beaudreault <bb...@hubspot.com>.
You don't need a secondary namenode.  It periodically checkpoints the namenode
metadata (merging the edit log into the fsimage), which keeps the size of the
edits files down.  If you don't run one, your edits files will grow over time,
and the next time you restart your namenode it could take a very long time to
start up if your edits are large.  I recommend running one in production, to
reduce the amount of downtime if you need to replace or restart your namenode.
If that isn't a concern for you, then you don't need it.
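For reference, the checkpoint cadence described above is tunable. On Hadoop 1.x the relevant properties are, as far as I recall, the two below (treat the exact names and defaults as assumptions to verify against your release's core-default.xml):

```xml
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value> <!-- seconds between SNN checkpoints; default one hour -->
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value> <!-- also checkpoint once edits reaches ~64 MB -->
</property>
```

Lowering either value makes the SNN checkpoint more often, so the edit log the namenode must replay on restart stays smaller.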


On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:

> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>  So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only
> (and machines in cluster do not have much RAM and disk space), and thus
> don't want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>
>

Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Ivan Ryndin <ir...@gmail.com>.
Thank you very much!

It is now clear to me that in development I won't start the secondary
namenode, but in production it's better to have one.
Thanks!

Regards,
Ivan


2012/12/17 Harsh J <ha...@cloudera.com>

> The SecondaryNameNode is necessary for automatic maintenance in
> long-running clusters (read: production), but is not necessary for,
> nor tied into the basic functions/operations of HDFS.
>
> On 1.x, you can remove the script's startup of SNN by removing its
> host entry from the conf/masters file.
> On 2.x, you can selectively start the NN and DNs by using the
> hadoop-daemon.sh script commands.
>
> On Mon, Dec 17, 2012 at 10:34 PM, Ivan Ryndin <ir...@gmail.com> wrote:
> > Hi all,
> >
> > is it necessary to run secondary namenode when starting HDFS?
> > I am dealing with Hadoop 1.1.1.
> > Looking at script $HADOOP_HOME/bin/start_dfs.sh
> > There are next lines in this file:
> >
> > # start dfs daemons
> > # start namenode after datanodes, to minimize time namenode is up w/o
> data
> > # note: datanodes will log connection errors until namenode starts
> > "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> > $nameStartOpt
> > "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> > $dataStartOpt
> > "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> > secondarynamenode
> >
> >  So, will HDFS work if I turn off starting of secondarynamenode ?
> >
> > I do ask this because I am playing with Hadoop on two-node cluster only
> (and
> > machines in cluster do not have much RAM and disk space), and thus don't
> > want to run unnecessary processes.
> >
> > --
> > Best regards,
> > Ivan P. Ryndin,
> >
>
>
>
> --
> Harsh J
>



-- 

Best regards,
Ivan P. Ryndin,


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Harsh J <ha...@cloudera.com>.
The SecondaryNameNode is necessary for automatic maintenance in
long-running clusters (read: production), but is not necessary for,
nor tied into the basic functions/operations of HDFS.

On 1.x, you can remove the script's startup of SNN by removing its
host entry from the conf/masters file.
On 2.x, you can selectively start the NN and DNs by using the
hadoop-daemon.sh script commands.

On Mon, Dec 17, 2012 at 10:34 PM, Ivan Ryndin <ir...@gmail.com> wrote:
> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>  So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only (and
> machines in cluster do not have much RAM and disk space), and thus don't
> want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>



-- 
Harsh J


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Bryan Beaudreault <bb...@hubspot.com>.
You don't need a secondary name node.  It creates snapshots of the name
node metadata periodically, which helps to keep down the size of the edits
files.  If you don't run one, over time your edits files will grow.  The
next time you go to restart your namenode, it could take a very long time
to start up if your edits are large.  I recommend running one in
production, to reduce the amount of downtime if you need to replace or
restart your namenode.  If that isn't a concern for you then you don't need
it.
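For reference, the checkpoint behaviour described here is tunable. A sketch of the relevant Hadoop 1.x properties (these are the 1.x names with their shipped defaults; verify against the core-default.xml of your release):

```xml
<!-- Hadoop 1.x checkpoint tuning (core-site.xml); defaults shown. -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value> <!-- seconds between SecondaryNameNode checkpoints -->
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value> <!-- edits file size (bytes) that forces an early checkpoint -->
</property>
```

Lowering the period keeps the edits file smaller between checkpoints, at the cost of more frequent checkpoint traffic.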


On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:

> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>  So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only
> (and machines in cluster do not have much RAM and disk space), and thus
> don't want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>
>


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Mohammad Tariq <do...@gmail.com>.
I agree with Michael. Skipping the SNN daemon is really a bad idea when you
are dealing with something real.

Best Regards,
Tariq
+91-9741563634



On Tue, Dec 18, 2012 at 12:22 AM, Patai Sangbutsarakum <
Patai.Sangbutsarakum@turn.com> wrote:

>  > is it necessary to run secondary namenode when starting HDFS?
> I would say it's not necessary. I did skip it when I first played with
> Hadoop.
>
>   From: Ivan Ryndin <ir...@gmail.com>
> Reply-To: <us...@hadoop.apache.org>
> Date: Mon, 17 Dec 2012 21:04:49 +0400
> To: <us...@hadoop.apache.org>
> Subject: Is it necessary to run secondary namenode when starting HDFS?
>
>  Hi all,
>
>  is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
>  # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>   So, will HDFS work if I turn off starting of secondarynamenode ?
>
>  I do ask this because I am playing with Hadoop on two-node cluster only
> (and machines in cluster do not have much RAM and disk space), and thus
> don't want to run unnecessary processes.
>
>  --
> Best regards,
> Ivan P. Ryndin,
>
>


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Patai Sangbutsarakum <Pa...@turn.com>.
> is it necessary to run secondary namenode when starting HDFS?
I would say it's not necessary. I did skip it when I first played with Hadoop.

From: Ivan Ryndin <ir...@gmail.com>
Reply-To: <us...@hadoop.apache.org>
Date: Mon, 17 Dec 2012 21:04:49 +0400
To: <us...@hadoop.apache.org>
Subject: Is it necessary to run secondary namenode when starting HDFS?

Hi all,

is it necessary to run secondary namenode when starting HDFS?
I am dealing with Hadoop 1.1.1.
Looking at script $HADOOP_HOME/bin/start_dfs.sh
There are next lines in this file:

# start dfs daemons
# start namenode after datanodes, to minimize time namenode is up w/o data
# note: datanodes will log connection errors until namenode starts
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode

 So, will HDFS work if I turn off starting of secondarynamenode ?

I do ask this because I am playing with Hadoop on two-node cluster only (and machines in cluster do not have much RAM and disk space), and thus don't want to run unnecessary processes.

--
Best regards,
Ivan P. Ryndin,

