Posted to user@hadoop.apache.org by Ivan Ryndin <ir...@gmail.com> on 2012/12/17 18:04:49 UTC

Is it necessary to run secondary namenode when starting HDFS?

Hi all,

Is it necessary to run the secondary namenode when starting HDFS?
I am running Hadoop 1.1.1.
Looking at the script $HADOOP_HOME/bin/start-dfs.sh, I see the following lines:

# start dfs daemons
# start namenode after datanodes, to minimize time namenode is up w/o data
# note: datanodes will log connection errors until namenode starts
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
$nameStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
$dataStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
secondarynamenode

So, will HDFS still work if I disable starting the secondarynamenode?

I ask because I am only experimenting with Hadoop on a two-node cluster (and
the machines do not have much RAM or disk space), so I don't want to run
unnecessary processes.

-- 
Best regards,
Ivan P. Ryndin,

Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Michael Segel <mi...@hotmail.com>.
Hi, 

Just a reminder: just because you can do something (or in this case, not do something) doesn't mean it's a good idea. 

The SNN is there for a reason. If you're on an EMR cluster that will be taken down at the end of the job or end of the day, not having the SNN running may be OK. 
Outside of that, it's pretty much a good idea to run it. 

-Just saying...



On Dec 17, 2012, at 11:23 AM, Ivan Ryndin <ir...@gmail.com> wrote:

> Thank you very much, Bryan!
> 
> It is now clear for me, that in development mode I'll not start secondary namenode.
> But in production it's better to have it.
> Thanks!
> 
> Regards,
> Ivan
> 
> 
> 2012/12/17 Bryan Beaudreault <bb...@hubspot.com>
> You don't need a secondary name node.  It creates snapshots of the name node metadata periodically, which helps to keep down the size of the edits files.  If you don't run one, over time your edits files will grow.  The next time you go to restart your namenode, it could take a very long time to start up if your edits are large.  I recommend running one in production, to reduce the amount of downtime if you need to replace or restart your namenode.  If that isn't a concern for you then you don't need it.
> 
> 
> On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:
> Hi all,
> 
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file: 
> 
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode
> 
>  So, will HDFS work if I turn off starting of secondarynamenode ?
> 
> I do ask this because I am playing with Hadoop on two-node cluster only (and machines in cluster do not have much RAM and disk space), and thus don't want to run unnecessary processes.
> 
> -- 
> Best regards,
> Ivan P. Ryndin,
> 
> 
> 
> 
> 
> -- 
> Best regards,
> Ivan P. Ryndin



Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Ivan Ryndin <ir...@gmail.com>.
Thank you very much, Bryan!

It is now clear to me that in development I won't start the secondary
namenode, but in production it's better to have one.
Thanks!

Regards,
Ivan


2012/12/17 Bryan Beaudreault <bb...@hubspot.com>

> You don't need a secondary name node.  It creates snapshots of the name
> node metadata periodically, which helps to keep down the size of the edits
> files.  If you don't run one, over time your edits files will grow.  The
> next time you go to restart your namenode, it could take a very long time
> to start up if your edits are large.  I recommend running one in
> production, to reduce the amount of downtime if you need to replace or
> restart your namenode.  If that isn't a concern for you then you don't need
> it.
>
>
> On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:
>
>> Hi all,
>>
>> is it necessary to run secondary namenode when starting HDFS?
>> I am dealing with Hadoop 1.1.1.
>> Looking at script $HADOOP_HOME/bin/start_dfs.sh
>> There are next lines in this file:
>>
>> # start dfs daemons
>> # start namenode after datanodes, to minimize time namenode is up w/o data
>> # note: datanodes will log connection errors until namenode starts
>> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
>> $nameStartOpt
>> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
>> $dataStartOpt
>> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
>> secondarynamenode
>>
>>  So, will HDFS work if I turn off starting of secondarynamenode ?
>>
>> I do ask this because I am playing with Hadoop on two-node cluster only
>> (and machines in cluster do not have much RAM and disk space), and thus
>> don't want to run unnecessary processes.
>>
>> --
>> Best regards,
>> Ivan P. Ryndin,
>>
>>
>


-- 
Best regards,
Ivan P. Ryndin


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Bryan Beaudreault <bb...@hubspot.com>.
You don't need a secondary namenode.  It periodically checkpoints the namenode
metadata (merging the edit log into the fsimage), which keeps the size of the
edits files down.  If you don't run one, your edits files will grow over time,
and the next time you restart your namenode it could take a very long time to
start up if your edits are large.  I recommend running one in production, to
reduce the amount of downtime if you need to replace or restart your namenode.
If that isn't a concern for you, then you don't need it.
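For reference, the checkpoint cadence described above is tunable. On Hadoop 1.x the relevant properties are, as far as I recall, the two below (treat the exact names and defaults as assumptions to verify against your release's core-default.xml):

```xml
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value> <!-- seconds between SNN checkpoints; default one hour -->
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value> <!-- also checkpoint once edits reaches ~64 MB -->
</property>
```

Lowering either value makes the SNN checkpoint more often, so the edit log the namenode must replay on restart stays smaller.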


On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:

> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>  So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only
> (and machines in cluster do not have much RAM and disk space), and thus
> don't want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>
>

Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Ivan Ryndin <ir...@gmail.com>.
Thank you very much!

It is now clear to me that in development I won't start the secondary
namenode, but in production it's better to have one.
Thanks!

Regards,
Ivan


2012/12/17 Harsh J <ha...@cloudera.com>

> The SecondaryNameNode is necessary for automatic maintenance in
> long-running clusters (read: production), but is not necessary for,
> nor tied into the basic functions/operations of HDFS.
>
> On 1.x, you can remove the script's startup of SNN by removing its
> host entry from the conf/masters file.
> On 2.x, you can selectively start the NN and DNs by using the
> hadoop-daemon.sh script commands.
>
> On Mon, Dec 17, 2012 at 10:34 PM, Ivan Ryndin <ir...@gmail.com> wrote:
> > Hi all,
> >
> > is it necessary to run secondary namenode when starting HDFS?
> > I am dealing with Hadoop 1.1.1.
> > Looking at script $HADOOP_HOME/bin/start_dfs.sh
> > There are next lines in this file:
> >
> > # start dfs daemons
> > # start namenode after datanodes, to minimize time namenode is up w/o
> data
> > # note: datanodes will log connection errors until namenode starts
> > "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> > $nameStartOpt
> > "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> > $dataStartOpt
> > "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> > secondarynamenode
> >
> >  So, will HDFS work if I turn off starting of secondarynamenode ?
> >
> > I do ask this because I am playing with Hadoop on two-node cluster only
> (and
> > machines in cluster do not have much RAM and disk space), and thus don't
> > want to run unnecessary processes.
> >
> > --
> > Best regards,
> > Ivan P. Ryndin,
> >
>
>
>
> --
> Harsh J
>



-- 

Best regards,
Ivan P. Ryndin,


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Harsh J <ha...@cloudera.com>.
The SecondaryNameNode is necessary for automatic maintenance in
long-running clusters (read: production), but is not necessary for,
nor tied into the basic functions/operations of HDFS.

On 1.x, you can remove the script's startup of SNN by removing its
host entry from the conf/masters file.
On 2.x, you can selectively start the NN and DNs by using the
hadoop-daemon.sh script commands.

On Mon, Dec 17, 2012 at 10:34 PM, Ivan Ryndin <ir...@gmail.com> wrote:
> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>  So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only (and
> machines in cluster do not have much RAM and disk space), and thus don't
> want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>



-- 
Harsh J


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Bryan Beaudreault <bb...@hubspot.com>.
You don't need a secondary name node.  It creates snapshots of the name
node metadata periodically, which helps to keep down the size of the edits
files.  If you don't run one, over time your edits files will grow.  The
next time you go to restart your namenode, it could take a very long time
to start up if your edits are large.  I recommend running one in
production, to reduce the amount of downtime if you need to replace or
restart your namenode.  If that isn't a concern for you then you don't need
it.
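For reference, the checkpoint behaviour described here is tunable. A sketch of the relevant Hadoop 1.x properties (these are the 1.x names with their shipped defaults; verify against the core-default.xml of your release):

```xml
<!-- Hadoop 1.x checkpoint tuning (core-site.xml); defaults shown. -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value> <!-- seconds between SecondaryNameNode checkpoints -->
</property>
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value> <!-- edits file size (bytes) that forces an early checkpoint -->
</property>
```

Lowering the period keeps the edits file smaller between checkpoints, at the cost of more frequent checkpoint traffic.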


On Mon, Dec 17, 2012 at 12:04 PM, Ivan Ryndin <ir...@gmail.com> wrote:

> Hi all,
>
> is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
> # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>  So, will HDFS work if I turn off starting of secondarynamenode ?
>
> I do ask this because I am playing with Hadoop on two-node cluster only
> (and machines in cluster do not have much RAM and disk space), and thus
> don't want to run unnecessary processes.
>
> --
> Best regards,
> Ivan P. Ryndin,
>
>


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Mohammad Tariq <do...@gmail.com>.
I agree with Michael. Skipping the SNN daemon is really a bad idea when you
are dealing with something real.

Best Regards,
Tariq
+91-9741563634



On Tue, Dec 18, 2012 at 12:22 AM, Patai Sangbutsarakum <
Patai.Sangbutsarakum@turn.com> wrote:

>  > is it necessary to run secondary namenode when starting HDFS?
> I would say it's not necessary. I did skip it when I first played with
> Hadoop.
>
>   From: Ivan Ryndin <ir...@gmail.com>
> Reply-To: <us...@hadoop.apache.org>
> Date: Mon, 17 Dec 2012 21:04:49 +0400
> To: <us...@hadoop.apache.org>
> Subject: Is it necessary to run secondary namenode when starting HDFS?
>
>  Hi all,
>
>  is it necessary to run secondary namenode when starting HDFS?
> I am dealing with Hadoop 1.1.1.
> Looking at script $HADOOP_HOME/bin/start_dfs.sh
> There are next lines in this file:
>
>  # start dfs daemons
> # start namenode after datanodes, to minimize time namenode is up w/o data
> # note: datanodes will log connection errors until namenode starts
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
> $nameStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode
> $dataStartOpt
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start
> secondarynamenode
>
>   So, will HDFS work if I turn off starting of secondarynamenode ?
>
>  I do ask this because I am playing with Hadoop on two-node cluster only
> (and machines in cluster do not have much RAM and disk space), and thus
> don't want to run unnecessary processes.
>
>  --
> Best regards,
> Ivan P. Ryndin,
>
>


Re: Is it necessary to run secondary namenode when starting HDFS?

Posted by Patai Sangbutsarakum <Pa...@turn.com>.
> is it necessary to run secondary namenode when starting HDFS?
I would say it's not necessary. I did skip it when I first played with Hadoop.

From: Ivan Ryndin <ir...@gmail.com>
Reply-To: <us...@hadoop.apache.org>
Date: Mon, 17 Dec 2012 21:04:49 +0400
To: <us...@hadoop.apache.org>
Subject: Is it necessary to run secondary namenode when starting HDFS?

Hi all,

is it necessary to run secondary namenode when starting HDFS?
I am dealing with Hadoop 1.1.1.
Looking at script $HADOOP_HOME/bin/start_dfs.sh
There are next lines in this file:

# start dfs daemons
# start namenode after datanodes, to minimize time namenode is up w/o data
# note: datanodes will log connection errors until namenode starts
"$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode $nameStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start datanode $dataStartOpt
"$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR --hosts masters start secondarynamenode

 So, will HDFS work if I turn off starting of secondarynamenode ?

I do ask this because I am playing with Hadoop on two-node cluster only (and machines in cluster do not have much RAM and disk space), and thus don't want to run unnecessary processes.

--
Best regards,
Ivan P. Ryndin,

