Posted to user@hadoop.apache.org by Jason Huang <ja...@icare.com> on 2012/09/14 17:01:26 UTC

HDFS Namenode Format Question.

Hello,

I am trying to set up Hadoop 1.0.3 on my MacBook Pro in
pseudo-distributed mode.

After downloading, installing, and setting up the config files, I ran
the following namenode format command as suggested in the user guide:

$bin/hadoop namenode -format

Here is the output:
************************************************************/
12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/09/14 10:46:42 INFO namenode.FSNamesystem:
isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
accessTokenLifetime=0 min(s)
12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
more than 10 times
12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
0 seconds.
12/09/14 10:46:42 INFO common.Storage: Storage directory
/tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************

It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name

However, in my config file I've assigned a different directory (see
hdfs-site.xml below):
<configuration>
  <property>
     <name>dfs.replication</name>
     <value>1</value>
  </property>
  <property>
     <name>dfs.namenode.name.dir</name>
     <value>/Users/jasonhuang/hdfs/name</value>
  </property>
  <property>
     <name>dfs.datanode.data.dir</name>
     <value>/Users/jasonhuang/hdfs/data</value>
  </property>
</configuration>

Does anyone know why the hdfs-site.xml might not be respected?

Also, after formatting the name node, I searched my local filesystem
(from the root dir) for the fsimage file, and here is what I found:
$ sudo find / -name fsimage
/private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
/private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage

I don't understand why the name node format picked (and created) these
two directories...

Any thoughts?

Thanks!

Jason

Re: HDFS Namenode Format Question.

Posted by Shumin Wu <sh...@gmail.com>.
It looks like your Hadoop picked up a different configuration directory
(most likely the default one) rather than the one you customized. You
could print the path to your Hadoop conf directory when you start the
namenode, and check which conf it actually reads from.
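
For example, a quick way to check which conf directory is being picked
up (just a sketch, assuming a standard 1.x tarball layout with bin/ and
conf/ under the install directory; HADOOP_CONF_DIR may well be unset):

  echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"         # explicit override, if any
  bin/hadoop classpath | tr ':' '\n' | grep conf  # the conf dir comes first on the classpath

If the conf directory shown is not the one holding your edited
hdfs-site.xml, point HADOOP_CONF_DIR at the right place (or pass
--config <dir> to bin/hadoop) and format again.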

-Shumin

On Fri, Sep 14, 2012 at 8:01 AM, Jason Huang <ja...@icare.com> wrote:

> Hello,
>
> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
> pseudo-distributed mode.
>
> After download / install / setup config files I ran the following
> namenode format command as suggested in the user guide:
>
> $bin/hadoop namenode -format
>
> Here is the output:
> ************************************************************/
> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
> dfs.block.invalidate.limit=100
> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
> accessTokenLifetime=0 min(s)
> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
> more than 10 times
> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
> 0 seconds.
> 12/09/14 10:46:42 INFO common.Storage: Storage directory
> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
>
> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>
> However, in my config file I've assigned a different directory (see
> hdfs-site.xml below):
> <configuration>
>   <property>
>      <name>dfs.replication</name>
>      <value>1</value>
>   </property>
>   <property>
>      <name>dfs.namenode.name.dir</name>
>      <value>/Users/jasonhuang/hdfs/name</value>
>   </property>
>   <property>
>      <name>dfs.datanode.data.dir</name>
>      <value>/Users/jasonhuang/hdfs/data</value>
>   </property>
>
> Does anyone know why the hdfs-site.xml might not be respected?
>
> Also, after formatting the name node, I did a search for the fsimage
> file in my local file directories (from root dir) and here is what I
> found:
> $ sudo find / -name fsimage
> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>
> I don't understand why the name node format picked (and created) these
> two directories...
>
> Any thoughts?
>
> Thanks!
>
> Jason
>

Re: HDFS Namenode Format Question.

Posted by Jason Huang <ja...@icare.com>.
Thanks for the clear explanation, Harsh!

Jason

On Fri, Sep 14, 2012 at 10:08 PM, Harsh J <ha...@cloudera.com> wrote:
> Jason,
>
> So far you've made sure your HDFS data is securely placed that it
> doesn't get wiped. This much is sufficient for going ahead with
> running HBase.
>
> For the rest of the files that are going to /tmp, you will need to
> tweak the config of "hadoop.tmp.dir" to make it not do so, and also
> change HADOOP_OPTS in hadoop-env.sh to include a
> -Djava.io.tmpdir=$HOME/tmp to move the temporary file requests off to
> a new path under your $HOME.
>
> However, doing this is not absolutely necessary. For HDFS, all that
> really matters is the name and data directories, which you have
> already moved to a persistent zone.
>
> On Sat, Sep 15, 2012 at 12:33 AM, Jason Huang <ja...@icare.com> wrote:
>> Thanks.
>>
>> This makes sense - checking hdfs-default.xml found the same property
>> named dfs.name.dir and dfs.data.dir.
>>
>> Now I am no longer formatting the default tmp folders taken from
>> hdfs-default.xml.
>>
>> However, after formatting the name node, hadoop automatically created
>> another folder:
>> /tmp/hsperfdata_jasonhuang
>>
>> Does anyone know what that directory is for?
>>
>> And after I started hadoop (running ./start-all.sh), another folder
>> /tmp/hadoop-jasonhuang was created, together with a few files:
>> /tmp/hadoop-jasonhuang-datanode.pid
>> /tmp/hadoop-jasonhuang-jobtracker.pid
>> /tmp/hadoop-jasonhuang-namenode.pid
>> /tmp/hadoop-jasonhuang-secondarynamenode.pid
>> /tmp/hadoop-jasonhuang-tasktracker.pid
>>
>> Are those files generated at the correct location?
>>
>> I've looked at the logs for both name node and master node and there
>> seemed to be no error. However, I am not sure if these files are
>> generated at the correct place or not. I am installing HBase on top of
>> this and want to make sure Hadoop is working correctly before going
>> further.
>>
>> thanks!
>>
>> Jason
>>
>> On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <ha...@cloudera.com> wrote:
>>> If you are using 1.0.3, then the config names are wrong. You need
>>> dfs.name.dir and dfs.data.dir instead. Those configs you have are for
>>> 2.x based releases.
>>>
>>> Also, I'd make that look like ${user.home}/hdfs/name, etc. for a slightly more
>>> portable/templatey config :)
>>>
>>> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <ja...@icare.com> wrote:
>>>> Hello,
>>>>
>>>> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
>>>> pseudo-distributed mode.
>>>>
>>>> After download / install / setup config files I ran the following
>>>> namenode format command as suggested in the user guide:
>>>>
>>>> $bin/hadoop namenode -format
>>>>
>>>> Here is the output:
>>>> ************************************************************/
>>>> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
>>>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>>>> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
>>>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
>>>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>>>> accessTokenLifetime=0 min(s)
>>>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
>>>> more than 10 times
>>>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
>>>> 0 seconds.
>>>> 12/09/14 10:46:42 INFO common.Storage: Storage directory
>>>> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>>>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>>
>>>> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>>>>
>>>> However, in my config file I've assigned a different directory (see
>>>> hdfs-site.xml below):
>>>> <configuration>
>>>>   <property>
>>>>      <name>dfs.replication</name>
>>>>      <value>1</value>
>>>>   </property>
>>>>   <property>
>>>>      <name>dfs.namenode.name.dir</name>
>>>>      <value>/Users/jasonhuang/hdfs/name</value>
>>>>   </property>
>>>>   <property>
>>>>      <name>dfs.datanode.data.dir</name>
>>>>      <value>/Users/jasonhuang/hdfs/data</value>
>>>>   </property>
>>>>
>>>> Does anyone know why the hdfs-site.xml might not be respected?
>>>>
>>>> Also, after formatting the name node, I did a search for the fsimage
>>>> file in my local file directories (from root dir) and here is what I
>>>> found:
>>>> $ sudo find / -name fsimage
>>>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>>>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>>>
>>>> I don't understand why the name node format picked (and created) these
>>>> two directories...
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks!
>>>>
>>>> Jason
>>>
>>>
>>>
>>> --
>>> Harsh J
>
>
>
> --
> Harsh J

Re: HDFS Namenode Format Question.

Posted by Harsh J <ha...@cloudera.com>.
Jason,

So far you've made sure your HDFS data is stored somewhere it won't
get wiped. That much is sufficient for going ahead with running HBase.

For the rest of the files that go to /tmp, you will need to tweak the
"hadoop.tmp.dir" config so they no longer land there, and also change
HADOOP_OPTS in hadoop-env.sh to include -Djava.io.tmpdir=$HOME/tmp to
move temporary file requests to a new path under your $HOME.
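
As a rough sketch (the tmp path below is just an example, and the
directory has to exist before the daemons start), that could look like
this in core-site.xml:

  <property>
     <name>hadoop.tmp.dir</name>
     <value>/Users/jasonhuang/hdfs/tmp</value>
  </property>

and in conf/hadoop-env.sh:

  export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=$HOME/tmp"

(The .pid files are governed separately; if memory serves, the
hadoop-env.sh template has a commented-out HADOOP_PID_DIR entry you can
set to move those off /tmp as well.)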

However, doing this is not absolutely necessary. For HDFS, all that
really matters is the name and data directories, which you have
already moved to a persistent zone.

On Sat, Sep 15, 2012 at 12:33 AM, Jason Huang <ja...@icare.com> wrote:
> Thanks.
>
> This makes sense - checking hdfs-default.xml found the same property
> named dfs.name.dir and dfs.data.dir.
>
> Now I am no longer formatting the default tmp folders taken from
> hdfs-default.xml.
>
> However, after formatting the name node, hadoop automatically created
> another folder:
> /tmp/hsperfdata_jasonhuang
>
> Does anyone know what that directory is for?
>
> And after I started hadoop (running ./start-all.sh), another folder
> /tmp/hadoop-jasonhuang was created, together with a few files:
> /tmp/hadoop-jasonhuang-datanode.pid
> /tmp/hadoop-jasonhuang-jobtracker.pid
> /tmp/hadoop-jasonhuang-namenode.pid
> /tmp/hadoop-jasonhuang-secondarynamenode.pid
> /tmp/hadoop-jasonhuang-tasktracker.pid
>
> Are those files generated at the correct location?
>
> I've looked at the logs for both name node and master node and there
> seemed to be no error. However, I am not sure if these files are
> generated at the correct place or not. I am installing HBase on top of
> this and want to make sure Hadoop is working correctly before going
> further.
>
> thanks!
>
> Jason
>
> On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <ha...@cloudera.com> wrote:
>> If you are using 1.0.3, then the config names are wrong. You need
>> dfs.name.dir and dfs.data.dir instead. Those configs you have are for
>> 2.x based releases.
>>
>> Also, I'd make that look like ${user.home}/hdfs/name, etc. for a slightly more
>> portable/templatey config :)
>>
>> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <ja...@icare.com> wrote:
>>> Hello,
>>>
>>> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
>>> pseudo-distributed mode.
>>>
>>> After download / install / setup config files I ran the following
>>> namenode format command as suggested in the user guide:
>>>
>>> $bin/hadoop namenode -format
>>>
>>> Here is the output:
>>> ************************************************************/
>>> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
>>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>>> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
>>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
>>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>>> accessTokenLifetime=0 min(s)
>>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
>>> more than 10 times
>>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
>>> 0 seconds.
>>> 12/09/14 10:46:42 INFO common.Storage: Storage directory
>>> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>>> /************************************************************
>>>
>>> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>>>
>>> However, in my config file I've assigned a different directory (see
>>> hdfs-site.xml below):
>>> <configuration>
>>>   <property>
>>>      <name>dfs.replication</name>
>>>      <value>1</value>
>>>   </property>
>>>   <property>
>>>      <name>dfs.namenode.name.dir</name>
>>>      <value>/Users/jasonhuang/hdfs/name</value>
>>>   </property>
>>>   <property>
>>>      <name>dfs.datanode.data.dir</name>
>>>      <value>/Users/jasonhuang/hdfs/data</value>
>>>   </property>
>>>
>>> Does anyone know why the hdfs-site.xml might not be respected?
>>>
>>> Also, after formatting the name node, I did a search for the fsimage
>>> file in my local file directories (from root dir) and here is what I
>>> found:
>>> $ sudo find / -name fsimage
>>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>>
>>> I don't understand why the name node format picked (and created) these
>>> two directories...
>>>
>>> Any thoughts?
>>>
>>> Thanks!
>>>
>>> Jason
>>
>>
>>
>> --
>> Harsh J



-- 
Harsh J

Re: HDFS Namenode Format Question.

Posted by Jason Huang <ja...@icare.com>.
Thanks.

This makes sense - checking hdfs-default.xml, I found the properties
named dfs.name.dir and dfs.data.dir.

Now the namenode format no longer writes to the default tmp folders
taken from hdfs-default.xml.
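
For reference, here is roughly what the corrected hdfs-site.xml looks
like now (same local paths as before, just with the 1.x property names;
the ${user.home} form is Harsh's suggestion):

  <property>
     <name>dfs.name.dir</name>
     <value>${user.home}/hdfs/name</value>
  </property>
  <property>
     <name>dfs.data.dir</name>
     <value>${user.home}/hdfs/data</value>
  </property>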

However, after formatting the name node, hadoop automatically created
another folder:
/tmp/hsperfdata_jasonhuang

Does anyone know what that directory is for?

And after I started hadoop (running ./start-all.sh), another folder
/tmp/hadoop-jasonhuang was created, together with a few files:
/tmp/hadoop-jasonhuang-datanode.pid
/tmp/hadoop-jasonhuang-jobtracker.pid
/tmp/hadoop-jasonhuang-namenode.pid
/tmp/hadoop-jasonhuang-secondarynamenode.pid
/tmp/hadoop-jasonhuang-tasktracker.pid

Are those files generated at the correct location?

I've looked at the logs for both name node and master node and there
seemed to be no error. However, I am not sure if these files are
generated at the correct place or not. I am installing HBase on top of
this and want to make sure Hadoop is working correctly before going
further.

thanks!

Jason

On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <ha...@cloudera.com> wrote:
> If you are using 1.0.3, then the config names are wrong. You need
> dfs.name.dir and dfs.data.dir instead. Those configs you have are for
> 2.x based releases.
>
> Also, I'd make that look like ${user.home}/hdfs/name, etc. for a slightly more
> portable/templatey config :)
>
> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <ja...@icare.com> wrote:
>> Hello,
>>
>> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
>> pseudo-distributed mode.
>>
>> After download / install / setup config files I ran the following
>> namenode format command as suggested in the user guide:
>>
>> $bin/hadoop namenode -format
>>
>> Here is the output:
>> ************************************************************/
>> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>> accessTokenLifetime=0 min(s)
>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
>> more than 10 times
>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
>> 0 seconds.
>> 12/09/14 10:46:42 INFO common.Storage: Storage directory
>> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>> /************************************************************
>>
>> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>>
>> However, in my config file I've assigned a different directory (see
>> hdfs-site.xml below):
>> <configuration>
>>   <property>
>>      <name>dfs.replication</name>
>>      <value>1</value>
>>   </property>
>>   <property>
>>      <name>dfs.namenode.name.dir</name>
>>      <value>/Users/jasonhuang/hdfs/name</value>
>>   </property>
>>   <property>
>>      <name>dfs.datanode.data.dir</name>
>>      <value>/Users/jasonhuang/hdfs/data</value>
>>   </property>
>>
>> Does anyone know why the hdfs-site.xml might not be respected?
>>
>> Also, after formatting the name node, I did a search for the fsimage
>> file in my local file directories (from root dir) and here is what I
>> found:
>> $ sudo find / -name fsimage
>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>
>> I don't understand why the name node format picked (and created) these
>> two directories...
>>
>> Any thoughts?
>>
>> Thanks!
>>
>> Jason
>
>
>
> --
> Harsh J

Re: HDFS Namenode Format Question.

Posted by Jason Huang <ja...@icare.com>.
Thanks.

This makes sense - checking hdfs-default.xml found the same property
named dfs.name.dir and dfs.data.dir.

Now I am no longer formatting the default tmp folders taken from
hdfs-default.xml.

However, after formatting the name node, hadoop automatically created
another folder:
/tmp/hsperfdata_jasonhuang

Does anyone know what that directory is for?

And after I started hadoop (running ./start-all.sh), another folder
/tmp/hadoop-jasonhuang was created, together with a few files:
/tmp/hadoop-jasonhuang-datanode.pid
/tmp/hadoop-jasonhuang-jobtracker.pid
/tmp/hadoop-jasonhuang-namenode.pid
/tmp/hadoop-jasonhuang-secondarynamenode.pid
/tmp/hadoop-jasonhuang-tasktracker.pid

Are those files generated at the correct location?

I've looked at the logs for both name node and master node and there
seemed to be no error. However, I am not sure if these files are
generated at the correct place or not. I am installing HBase on top of
this and want to make sure Hadoop is working correctly before going
further.

thanks!

Jason

On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <ha...@cloudera.com> wrote:
> If you are using 1.0.3, then the config names are wrong. You need
> dfs.name.dir and dfs.data.dir instead. Those configs you have are for
> 2.x based releases.
>
> Also, I'd make that look like ${user.home}/hdfs/name, etc. for a slightly more
> portable/templatey config :)
>
> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <ja...@icare.com> wrote:
>> Hello,
>>
>> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
>> pseudo-distributed mode.
>>
>> After download / install / setup config files I ran the following
>> namenode format command as suggested in the user guide:
>>
>> $bin/hadoop namenode -format
>>
>> Here is the output:
>> ************************************************************/
>> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>> accessTokenLifetime=0 min(s)
>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
>> more than 10 times
>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
>> 0 seconds.
>> 12/09/14 10:46:42 INFO common.Storage: Storage directory
>> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>> /************************************************************
>>
>> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>>
>> However, in my config file I've assigned a different directory (see
>> hdfs-site.xml below):
>> <configuration>
>>   <property>
>>      <name>dfs.replication</name>
>>      <value>1</value>
>>   </property>
>>   <property>
>>      <name>dfs.namenode.name.dir</name>
>>      <value>/Users/jasonhuang/hdfs/name</value>
>>   </property>
>>   <property>
>>      <name>dfs.datanode.data.dir</name>
>>      <value>/Users/jasonhuang/hdfs/data</value>
>>   </property>
>>
>> Does anyone know why the hdfs-site.xml might not be respected?
>>
>> Also, after formatting the name node, I did a search for the fsimage
>> file in my local file directories (from root dir) and here is what I
>> found:
>> $ sudo find / -name fsimage
>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>
>> I don't understand why the name node format picked (and created) these
>> two directories...
>>
>> Any thoughts?
>>
>> Thanks!
>>
>> Jason
>
>
>
> --
> Harsh J

Re: HDFS Namenode Format Question.

Posted by Harsh J <ha...@cloudera.com>.
If you are using 1.0.3, then those config names are wrong. You need
dfs.name.dir and dfs.data.dir instead. The names you have are for the
2.x-based releases.

Also, I'd write those values as ${user.home}/hdfs/name, etc., for a slightly
more portable, template-like config :)
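
For reference, a minimal hdfs-site.xml sketch for a 1.0.3 pseudo-distributed
setup with the 1.x property names might look like the following (the
${user.home}-based paths are only an example; any local directory writable
by the user running the daemons will do):

<?xml version="1.0"?>
<!-- Sketch of hdfs-site.xml for Hadoop 1.x; property names differ from 2.x -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <!-- 1.x name; the 2.x equivalent is dfs.namenode.name.dir -->
    <name>dfs.name.dir</name>
    <value>${user.home}/hdfs/name</value>
  </property>
  <property>
    <!-- 1.x name; the 2.x equivalent is dfs.datanode.data.dir -->
    <name>dfs.data.dir</name>
    <value>${user.home}/hdfs/data</value>
  </property>
</configuration>

After renaming the properties, re-running bin/hadoop namenode -format should
report the dfs.name.dir location as the formatted storage directory instead
of the /tmp/hadoop-<user> default.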

On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <ja...@icare.com> wrote:
> Hello,
>
> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
> pseudo-distributed mode.
>
> After download / install / setup config files I ran the following
> namenode format command as suggested in the user guide:
>
> $bin/hadoop namenode -format
>
> Here is the output:
> ************************************************************/
> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
> accessTokenLifetime=0 min(s)
> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
> more than 10 times
> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
> 0 seconds.
> 12/09/14 10:46:42 INFO common.Storage: Storage directory
> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
>
> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>
> However, in my config file I've assigned a different directory (see
> hdfs-site.xml below):
> <configuration>
>   <property>
>      <name>dfs.replication</name>
>      <value>1</value>
>   </property>
>   <property>
>      <name>dfs.namenode.name.dir</name>
>      <value>/Users/jasonhuang/hdfs/name</value>
>   </property>
>   <property>
>      <name>dfs.datanode.data.dir</name>
>      <value>/Users/jasonhuang/hdfs/data</value>
>   </property>
>
> Does anyone know why the hdfs-site.xml might not be respected?
>
> Also, after formatting the name node, I did a search for the fsimage
> file in my local file directories (from root dir) and here is what I
> found:
> $ sudo find / -name fsimage
> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>
> I don't understand why the name node format picked (and created) these
> two directories...
>
> Any thoughts?
>
> Thanks!
>
> Jason



-- 
Harsh J

Re: HDFS Namenode Format Question.

Posted by Shumin Wu <sh...@gmail.com>.
It looks like your Hadoop picked up a different configuration directory
(most likely the default one) rather than the one you customized. You can
print the path to your Hadoop conf directory when you start the namenode and
check which conf it actually reads from.
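
As a rough sketch (assuming a plain 1.0.3 tarball install, where the bin/
scripts fall back to the conf/ directory next to bin/ whenever
HADOOP_CONF_DIR is not set), something like this can confirm which conf
directory is in effect:

# Empty output here means the scripts will fall back to $HADOOP_HOME/conf
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"

# Verify the customized hdfs-site.xml actually lives in the directory in use
ls -l ${HADOOP_CONF_DIR:-$HADOOP_HOME/conf}/hdfs-site.xml

# Or point the command at the intended conf directory explicitly
bin/hadoop --config /path/to/your/conf namenode -format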

-Shumin

On Fri, Sep 14, 2012 at 8:01 AM, Jason Huang <ja...@icare.com> wrote:

> Hello,
>
> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
> pseudo-distributed mode.
>
> After download / install / setup config files I ran the following
> namenode format command as suggested in the user guide:
>
> $bin/hadoop namenode -format
>
> Here is the output:
> ************************************************************/
> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
> dfs.block.invalidate.limit=100
> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
> accessTokenLifetime=0 min(s)
> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
> more than 10 times
> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
> 0 seconds.
> 12/09/14 10:46:42 INFO common.Storage: Storage directory
> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
>
> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>
> However, in my config file I've assigned a different directory (see
> hdfs-site.xml below):
> <configuration>
>   <property>
>      <name>dfs.replication</name>
>      <value>1</value>
>   </property>
>   <property>
>      <name>dfs.namenode.name.dir</name>
>      <value>/Users/jasonhuang/hdfs/name</value>
>   </property>
>   <property>
>      <name>dfs.datanode.data.dir</name>
>      <value>/Users/jasonhuang/hdfs/data</value>
>   </property>
>
> Does anyone know why the hdfs-site.xml might not be respected?
>
> Also, after formatting the name node, I did a search for the fsimage
> file in my local file directories (from root dir) and here is what I
> found:
> $ sudo find / -name fsimage
> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>
> I don't understand why the name node format picked (and created) these
> two directories...
>
> Any thoughts?
>
> Thanks!
>
> Jason
>
