Posted to hdfs-user@hadoop.apache.org by Martinus Martinus <ma...@gmail.com> on 2012/01/02 04:23:49 UTC

Why is the total node count just 1?

Hi,

I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
checked every node and a TaskTracker and a DataNode are running on each,
but when I run hadoop dfsadmin -report it says this:

Configured Capacity: 30352158720 (28.27 GB)
Present Capacity: 3756392448 (3.5 GB)
DFS Remaining: 3756355584 (3.5 GB)
DFS Used: 36864 (36 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Name: 192.168.1.1:50010
Decommission Status : Normal
Configured Capacity: 30352158720 (28.27 GB)
DFS Used: 36864 (36 KB)
Non DFS Used: 26595766272 (24.77 GB)
DFS Remaining: 3756355584(3.5 GB)
DFS Used%: 0%
DFS Remaining%: 12.38%
Last contact: Mon Jan 02 11:19:44 CST 2012

Why is only 1 node reported in total? How can I fix this problem?

Thanks.

Re: Why is the total node count just 1?

Posted by Martinus Martinus <ma...@gmail.com>.
Hi Prashant,

Thanks also for your advice. It works now: I deleted the data folder
inside hadoop.tmp.dir, started the cluster again, and it now reports 4
nodes in total.
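
For anyone who hits the same problem, the steps were roughly these (a
sketch assuming the default hadoop.tmp.dir of /tmp/hadoop-${user.name};
note that deleting the data directory wipes that DataNode's block data,
so it is only safe on a fresh or disposable cluster):

    bin/stop-all.sh                    # on the master
    # on each affected DataNode: remove the stale data directory; this
    # clears the "Incompatible namespaceIDs" state that a NameNode
    # re-format commonly leaves behind
    rm -rf /tmp/hadoop-$USER/dfs/data
    bin/start-all.sh                   # on the master; DataNodes re-register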

Thanks and Happy New Year 2012.

On Mon, Jan 2, 2012 at 2:15 PM, Martinus Martinus <ma...@gmail.com> wrote:

> Hi Harsh J,
>
> Thanks for your advice. It works now: I deleted the data folder inside
> hadoop.tmp.dir, started the cluster again, and it now reports 4 nodes in
> total.
>
> Thanks and Happy New Year 2012.
>
>
> On Mon, Jan 2, 2012 at 11:34 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Check your other 3 DNs' logs. It could be that you have not propagated
>> the configuration properly, or that you have a firewall you need to turn
>> off/configure to let the DataNodes communicate with the NameNode.
>>
>> On 02-Jan-2012, at 8:53 AM, Martinus Martinus wrote:
>>
>> Hi,
>>
>> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
>> checked every node and a TaskTracker and a DataNode are running on each,
>> but when I run hadoop dfsadmin -report it says this:
>>
>> Configured Capacity: 30352158720 (28.27 GB)
>> Present Capacity: 3756392448 (3.5 GB)
>> DFS Remaining: 3756355584 (3.5 GB)
>> DFS Used: 36864 (36 KB)
>> DFS Used%: 0%
>> Under replicated blocks: 1
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>>
>> -------------------------------------------------
>> Datanodes available: 1 (1 total, 0 dead)
>>
>> Name: 192.168.1.1:50010
>> Decommission Status : Normal
>> Configured Capacity: 30352158720 (28.27 GB)
>> DFS Used: 36864 (36 KB)
>> Non DFS Used: 26595766272 (24.77 GB)
>> DFS Remaining: 3756355584(3.5 GB)
>> DFS Used%: 0%
>> DFS Remaining%: 12.38%
>> Last contact: Mon Jan 02 11:19:44 CST 2012
>>
>> Why is only 1 node reported in total? How can I fix this problem?
>>
>> Thanks.
>>
>>
>>
>

Re: Why is the total node count just 1?

Posted by Martinus Martinus <ma...@gmail.com>.
Hi Harsh J,

Thanks for your advice. It works now: I deleted the data folder inside
hadoop.tmp.dir, started the cluster again, and it now reports 4 nodes in
total.

Thanks and Happy New Year 2012.

On Mon, Jan 2, 2012 at 11:34 AM, Harsh J <ha...@cloudera.com> wrote:

> Check your other 3 DNs' logs. It could be that you have not propagated
> the configuration properly, or that you have a firewall you need to turn
> off/configure to let the DataNodes communicate with the NameNode.
>
> On 02-Jan-2012, at 8:53 AM, Martinus Martinus wrote:
>
> Hi,
>
> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
> checked every node and a TaskTracker and a DataNode are running on each,
> but when I run hadoop dfsadmin -report it says this:
>
> Configured Capacity: 30352158720 (28.27 GB)
> Present Capacity: 3756392448 (3.5 GB)
> DFS Remaining: 3756355584 (3.5 GB)
> DFS Used: 36864 (36 KB)
> DFS Used%: 0%
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 192.168.1.1:50010
> Decommission Status : Normal
> Configured Capacity: 30352158720 (28.27 GB)
> DFS Used: 36864 (36 KB)
> Non DFS Used: 26595766272 (24.77 GB)
> DFS Remaining: 3756355584(3.5 GB)
> DFS Used%: 0%
> DFS Remaining%: 12.38%
> Last contact: Mon Jan 02 11:19:44 CST 2012
>
> Why is only 1 node reported in total? How can I fix this problem?
>
> Thanks.
>
>
>

Re: Why is the total node count just 1?

Posted by Harsh J <ha...@cloudera.com>.
Check your other 3 DNs' logs. It could be that you have not propagated the configuration properly, or that you have a firewall you need to turn off/configure to let the DataNodes communicate with the NameNode.
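
A quick way to check both suspects from each DataNode host (a sketch; the
log path and NameNode RPC port depend on your install, with port 9000
being a common fs.default.name choice):

    # look for connection errors in the DataNode log (path is illustrative)
    tail -n 100 $HADOOP_HOME/logs/hadoop-$USER-datanode-$(hostname).log

    # verify this host can reach the NameNode's RPC port at all
    telnet namenode-host 9000

    # on Linux, inspect firewall rules as root; prefer opening the Hadoop
    # ports over disabling the firewall outright
    iptables -L -n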

On 02-Jan-2012, at 8:53 AM, Martinus Martinus wrote:

> Hi,
> 
> I have set up a Hadoop cluster with 4 nodes and run start-all.sh; I checked every node and a TaskTracker and a DataNode are running on each, but when I run hadoop dfsadmin -report it says this:
> 
> Configured Capacity: 30352158720 (28.27 GB)
> Present Capacity: 3756392448 (3.5 GB)
> DFS Remaining: 3756355584 (3.5 GB)
> DFS Used: 36864 (36 KB)
> DFS Used%: 0%
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> 
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
> 
> Name: 192.168.1.1:50010
> Decommission Status : Normal
> Configured Capacity: 30352158720 (28.27 GB)
> DFS Used: 36864 (36 KB)
> Non DFS Used: 26595766272 (24.77 GB)
> DFS Remaining: 3756355584(3.5 GB)
> DFS Used%: 0%
> DFS Remaining%: 12.38%
> Last contact: Mon Jan 02 11:19:44 CST 2012
> 
> Why is only 1 node reported in total? How can I fix this problem?
> 
> Thanks.


dfs.name.dir and fs.checkpoint.dir

Posted by Dave Shine <Da...@channelintelligence.com>.
Per recommendations I received in Cloudera's Hadoop Administrator training, I configured our dfs.name.dir property with 3 directories: one on the NN, one on an NFS mount to a Hadoop client machine (in the same rack as the NN), and one on an NFS mount to a NAS (different rack, same datacenter).  I also configured fs.checkpoint.dir with 3 directories: one on the 2NN (the NN is on one machine; the JT and 2NN are on a second machine), one on an NFS mount to the same Hadoop client machine, and one on an NFS mount to the same NAS.

With this configuration we experienced severe delays in the delivery of updated fsimage files from the 2NN to the NN (several hours for an fsimage file under 2 GB).  I've since removed the NAS from the fs.checkpoint.dir property, our network guys "optimized the hell out of the NFS mount", and the updated fsimage files now get delivered to the NN in minutes.

My question is: is there really any reason to specify more than one directory in fs.checkpoint.dir?  I probably did it out of paranoia when I was first configuring the cluster.  How is this property configured in other Hadoop environments?

Thanks,
Dave Shine

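For reference, fs.checkpoint.dir takes a comma-separated list of
directories, so the three-directory setup described above amounts to
something like this (paths are illustrative):

    <property>
      <name>fs.checkpoint.dir</name>
      <value>/data/1/dfs/snn,/mnt/nfs-client/dfs/snn,/mnt/nfs-nas/dfs/snn</value>
    </property>

The 2NN writes a copy of each checkpoint to every listed directory, which
is consistent with a slow NAS mount in the list stalling checkpoint
delivery.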

Re: Why is the total node count just 1?

Posted by Martinus Martinus <ma...@gmail.com>.
Hi Bharath / Harsh,

How about this facebook-hadoop:

https://github.com/facebook/hadoop-20

or

https://github.com/gnawux/hadoop-cmri/tree/master/bin

or

http://de-de.facebook.com/note.php?note_id=106157472002

Have you tried any of these? I don't understand Hadoop very deeply, so I
would appreciate your suggestions about the links above.

Thanks.

On Thu, Jan 5, 2012 at 1:45 AM, Bharath Mundlapudi <mu...@gmail.com> wrote:

> Hi Martinus,
>
> As Harsh mentioned, HA is under development.
>
> A couple of things you can do for a hot-cold setup are:
>
> 1. Multiple dirs for ${dfs.name.dir}
> 2. Place ${dfs.name.dir} on a RAID 1 mirror setup
> 3. NFS as one of the ${dfs.name.dir}
>
>
> -Bharath
>
>
>
>
> On Wed, Jan 4, 2012 at 1:19 AM, Harsh J <ha...@cloudera.com> wrote:
>
>> Martinus,
>>
>> High-Availability NameNode is being worked on and an initial version
>> will be out soon. Check out the
>> https://issues.apache.org/jira/browse/HDFS-1623 JIRA for its
>> state/discussions.
>>
>> You can also clone the Hadoop repo and switch to the 'HDFS-1623' branch
>> to give it a whirl, although it is still under active development.
>>
>> For now, we recommend using multiple ${dfs.name.dir} directories
>> (across mounts), preferably with one of them being a reliable-enough
>> NFS mount point.
>>
>> On Wed, Jan 4, 2012 at 2:26 PM, Martinus Martinus <ma...@gmail.com>
>> wrote:
>> > Hi Bharath,
>> >
>> > Thanks for your answer. I remember Hadoop has a single point of
>> > failure, its NameNode. Is there a way to make my Hadoop cluster fault
>> > tolerant even when the master node (the NameNode) fails?
>> >
>> >
>> > Thanks and Happy New Year 2012.
>> >
>> > On Tue, Jan 3, 2012 at 2:20 AM, Bharath Mundlapudi <mundlapudi@gmail.com> wrote:
>> >>
>> >> You might want to check the DataNode logs. Go to the 3 remaining
>> >> nodes whose DataNodes didn't start and restart the DataNode on each.
>> >>
>> >> -Bharath
>> >>
>> >>
>> >> On Sun, Jan 1, 2012 at 7:23 PM, Martinus Martinus <martinus787@gmail.com> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
>> >>> checked every node and a TaskTracker and a DataNode are running on
>> >>> each, but when I run hadoop dfsadmin -report it says this:
>> >>>
>> >>> Configured Capacity: 30352158720 (28.27 GB)
>> >>> Present Capacity: 3756392448 (3.5 GB)
>> >>> DFS Remaining: 3756355584 (3.5 GB)
>> >>> DFS Used: 36864 (36 KB)
>> >>> DFS Used%: 0%
>> >>> Under replicated blocks: 1
>> >>> Blocks with corrupt replicas: 0
>> >>> Missing blocks: 0
>> >>>
>> >>> -------------------------------------------------
>> >>> Datanodes available: 1 (1 total, 0 dead)
>> >>>
>> >>> Name: 192.168.1.1:50010
>> >>> Decommission Status : Normal
>> >>> Configured Capacity: 30352158720 (28.27 GB)
>> >>> DFS Used: 36864 (36 KB)
>> >>> Non DFS Used: 26595766272 (24.77 GB)
>> >>> DFS Remaining: 3756355584(3.5 GB)
>> >>> DFS Used%: 0%
>> >>> DFS Remaining%: 12.38%
>> >>> Last contact: Mon Jan 02 11:19:44 CST 2012
>> >>>
>> >>> Why is only 1 node reported in total? How can I fix this problem?
>> >>>
>> >>> Thanks.
>> >>
>> >>
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: Why is the total node count just 1?

Posted by Bharath Mundlapudi <mu...@gmail.com>.
Hi Martinus,

As Harsh mentioned, HA is under development.

A couple of things you can do for a hot-cold setup (see the sketch after
this list) are:

1. Multiple dirs for ${dfs.name.dir}
2. Place ${dfs.name.dir} on a RAID 1 mirror setup
3. NFS as one of the ${dfs.name.dir}
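
A sketch of that list as hdfs-site.xml (paths are illustrative; each
comma-separated directory holds a full copy of the namespace image, so
the loss of any one copy is survivable):

    <property>
      <name>dfs.name.dir</name>
      <value>/data/1/dfs/nn,/data/raid1/dfs/nn,/mnt/nfs/dfs/nn</value>
    </property>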


-Bharath



On Wed, Jan 4, 2012 at 1:19 AM, Harsh J <ha...@cloudera.com> wrote:

> Martinus,
>
> High-Availability NameNode is being worked on and an initial version
> will be out soon. Check out the
> https://issues.apache.org/jira/browse/HDFS-1623 JIRA for its
> state/discussions.
>
> You can also clone the Hadoop repo and switch to the 'HDFS-1623' branch
> to give it a whirl, although it is still under active development.
>
> For now, we recommend using multiple ${dfs.name.dir} directories
> (across mounts), preferably with one of them being a reliable-enough
> NFS mount point.
>
> On Wed, Jan 4, 2012 at 2:26 PM, Martinus Martinus <ma...@gmail.com>
> wrote:
> > Hi Bharath,
> >
> > Thanks for your answer. I remember Hadoop has a single point of
> > failure, its NameNode. Is there a way to make my Hadoop cluster fault
> > tolerant even when the master node (the NameNode) fails?
> >
> >
> > Thanks and Happy New Year 2012.
> >
> > On Tue, Jan 3, 2012 at 2:20 AM, Bharath Mundlapudi <mundlapudi@gmail.com> wrote:
> >>
> >> You might want to check the DataNode logs. Go to the 3 remaining
> >> nodes whose DataNodes didn't start and restart the DataNode on each.
> >>
> >> -Bharath
> >>
> >>
> >> On Sun, Jan 1, 2012 at 7:23 PM, Martinus Martinus <martinus787@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
> >>> checked every node and a TaskTracker and a DataNode are running on
> >>> each, but when I run hadoop dfsadmin -report it says this:
> >>>
> >>> Configured Capacity: 30352158720 (28.27 GB)
> >>> Present Capacity: 3756392448 (3.5 GB)
> >>> DFS Remaining: 3756355584 (3.5 GB)
> >>> DFS Used: 36864 (36 KB)
> >>> DFS Used%: 0%
> >>> Under replicated blocks: 1
> >>> Blocks with corrupt replicas: 0
> >>> Missing blocks: 0
> >>>
> >>> -------------------------------------------------
> >>> Datanodes available: 1 (1 total, 0 dead)
> >>>
> >>> Name: 192.168.1.1:50010
> >>> Decommission Status : Normal
> >>> Configured Capacity: 30352158720 (28.27 GB)
> >>> DFS Used: 36864 (36 KB)
> >>> Non DFS Used: 26595766272 (24.77 GB)
> >>> DFS Remaining: 3756355584(3.5 GB)
> >>> DFS Used%: 0%
> >>> DFS Remaining%: 12.38%
> >>> Last contact: Mon Jan 02 11:19:44 CST 2012
> >>>
> >>> Why is only 1 node reported in total? How can I fix this problem?
> >>>
> >>> Thanks.
> >>
> >>
> >
>
>
>
> --
> Harsh J
>

Re: Why is the total node count just 1?

Posted by Harsh J <ha...@cloudera.com>.
Martinus,

High-Availability NameNode is being worked on and an initial version
will be out soon. Check out the
https://issues.apache.org/jira/browse/HDFS-1623 JIRA for its
state/discussions.

You can also clone the Hadoop repo and switch to the 'HDFS-1623' branch
to give it a whirl, although it is still under active development.

For now, we recommend using multiple ${dfs.name.dir} directories
(across mounts), preferably with one of them being a reliable-enough
NFS mount point.
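
Checking out that branch would look roughly like this (the mirror URL is
an assumption; use whichever Apache git mirror you normally clone from):

    git clone git://git.apache.org/hadoop-common.git
    cd hadoop-common
    git checkout HDFS-1623    # the HA development branch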

On Wed, Jan 4, 2012 at 2:26 PM, Martinus Martinus <ma...@gmail.com> wrote:
> Hi Bharath,
>
> Thanks for your answer. I remember Hadoop has a single point of failure,
> its NameNode. Is there a way to make my Hadoop cluster fault tolerant
> even when the master node (the NameNode) fails?
>
>
> Thanks and Happy New Year 2012.
>
> On Tue, Jan 3, 2012 at 2:20 AM, Bharath Mundlapudi <mu...@gmail.com>
> wrote:
>>
>> You might want to check the DataNode logs. Go to the 3 remaining nodes
>> whose DataNodes didn't start and restart the DataNode on each.
>>
>> -Bharath
>>
>>
>> On Sun, Jan 1, 2012 at 7:23 PM, Martinus Martinus <ma...@gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
>>> checked every node and a TaskTracker and a DataNode are running on each,
>>> but when I run hadoop dfsadmin -report it says this:
>>>
>>> Configured Capacity: 30352158720 (28.27 GB)
>>> Present Capacity: 3756392448 (3.5 GB)
>>> DFS Remaining: 3756355584 (3.5 GB)
>>> DFS Used: 36864 (36 KB)
>>> DFS Used%: 0%
>>> Under replicated blocks: 1
>>> Blocks with corrupt replicas: 0
>>> Missing blocks: 0
>>>
>>> -------------------------------------------------
>>> Datanodes available: 1 (1 total, 0 dead)
>>>
>>> Name: 192.168.1.1:50010
>>> Decommission Status : Normal
>>> Configured Capacity: 30352158720 (28.27 GB)
>>> DFS Used: 36864 (36 KB)
>>> Non DFS Used: 26595766272 (24.77 GB)
>>> DFS Remaining: 3756355584(3.5 GB)
>>> DFS Used%: 0%
>>> DFS Remaining%: 12.38%
>>> Last contact: Mon Jan 02 11:19:44 CST 2012
>>>
>>> Why is only 1 node reported in total? How can I fix this problem?
>>>
>>> Thanks.
>>
>>
>



-- 
Harsh J

Re: Why is the total node count just 1?

Posted by Martinus Martinus <ma...@gmail.com>.
Hi Bharath,

Thanks for your answer. I remember Hadoop has a single point of failure,
its NameNode. Is there a way to make my Hadoop cluster fault tolerant
even when the master node (the NameNode) fails?

Thanks and Happy New Year 2012.

On Tue, Jan 3, 2012 at 2:20 AM, Bharath Mundlapudi <mu...@gmail.com> wrote:

> You might want to check the DataNode logs. Go to the 3 remaining nodes
> whose DataNodes didn't start and restart the DataNode on each.
>
> -Bharath
>
>
> On Sun, Jan 1, 2012 at 7:23 PM, Martinus Martinus <ma...@gmail.com> wrote:
>
>> Hi,
>>
>> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
>> checked every node and a TaskTracker and a DataNode are running on each,
>> but when I run hadoop dfsadmin -report it says this:
>>
>> Configured Capacity: 30352158720 (28.27 GB)
>> Present Capacity: 3756392448 (3.5 GB)
>> DFS Remaining: 3756355584 (3.5 GB)
>> DFS Used: 36864 (36 KB)
>> DFS Used%: 0%
>> Under replicated blocks: 1
>> Blocks with corrupt replicas: 0
>> Missing blocks: 0
>>
>> -------------------------------------------------
>> Datanodes available: 1 (1 total, 0 dead)
>>
>> Name: 192.168.1.1:50010
>> Decommission Status : Normal
>> Configured Capacity: 30352158720 (28.27 GB)
>> DFS Used: 36864 (36 KB)
>> Non DFS Used: 26595766272 (24.77 GB)
>> DFS Remaining: 3756355584(3.5 GB)
>> DFS Used%: 0%
>> DFS Remaining%: 12.38%
>> Last contact: Mon Jan 02 11:19:44 CST 2012
>>
>> Why is only 1 node reported in total? How can I fix this problem?
>>
>> Thanks.
>>
>
>

Re: Why is the total node count just 1?

Posted by Bharath Mundlapudi <mu...@gmail.com>.
You might want to check the DataNode logs. Go to the 3 remaining nodes
whose DataNodes didn't start and restart the DataNode on each.
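
Restarting just the DataNode on a single machine (rather than bouncing
the whole cluster) would look like this, assuming the stock scripts under
$HADOOP_HOME:

    # on each of the 3 machines whose DataNode is missing
    bin/hadoop-daemon.sh stop datanode
    bin/hadoop-daemon.sh start datanode
    # then watch logs/hadoop-$USER-datanode-$(hostname).log for errors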

-Bharath

On Sun, Jan 1, 2012 at 7:23 PM, Martinus Martinus <ma...@gmail.com> wrote:

> Hi,
>
> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
> checked every node and a TaskTracker and a DataNode are running on each,
> but when I run hadoop dfsadmin -report it says this:
>
> Configured Capacity: 30352158720 (28.27 GB)
> Present Capacity: 3756392448 (3.5 GB)
> DFS Remaining: 3756355584 (3.5 GB)
> DFS Used: 36864 (36 KB)
> DFS Used%: 0%
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 192.168.1.1:50010
> Decommission Status : Normal
> Configured Capacity: 30352158720 (28.27 GB)
> DFS Used: 36864 (36 KB)
> Non DFS Used: 26595766272 (24.77 GB)
> DFS Remaining: 3756355584(3.5 GB)
> DFS Used%: 0%
> DFS Remaining%: 12.38%
> Last contact: Mon Jan 02 11:19:44 CST 2012
>
> Why is only 1 node reported in total? How can I fix this problem?
>
> Thanks.
>

Re: Why is the total node count just 1?

Posted by Prashant Sharma <pr...@imaginea.com>.
You can check the (DataNode) logs on every system. Most probably the
DataNodes are not able to join the NameNode.
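
A typical symptom in a DataNode log is an endless IPC retry loop against
the NameNode address; something like this will surface it (the log path
is illustrative):

    # a DataNode that cannot join the NameNode logs repeated retries, e.g.
    # "Retrying connect to server: namenode/192.168.1.1:9000. Already tried 9 time(s)."
    grep -i "retrying connect to server" $HADOOP_HOME/logs/*datanode*.log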

-P

On Mon, Jan 2, 2012 at 8:53 AM, Martinus Martinus <ma...@gmail.com> wrote:

> Hi,
>
> I have set up a Hadoop cluster with 4 nodes and run start-all.sh. I
> checked every node and a TaskTracker and a DataNode are running on each,
> but when I run hadoop dfsadmin -report it says this:
>
> Configured Capacity: 30352158720 (28.27 GB)
> Present Capacity: 3756392448 (3.5 GB)
> DFS Remaining: 3756355584 (3.5 GB)
> DFS Used: 36864 (36 KB)
> DFS Used%: 0%
> Under replicated blocks: 1
> Blocks with corrupt replicas: 0
> Missing blocks: 0
>
> -------------------------------------------------
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 192.168.1.1:50010
> Decommission Status : Normal
> Configured Capacity: 30352158720 (28.27 GB)
> DFS Used: 36864 (36 KB)
> Non DFS Used: 26595766272 (24.77 GB)
> DFS Remaining: 3756355584(3.5 GB)
> DFS Used%: 0%
> DFS Remaining%: 12.38%
> Last contact: Mon Jan 02 11:19:44 CST 2012
>
> Why is only 1 node reported in total? How can I fix this problem?
>
> Thanks.
>