Posted to common-user@hadoop.apache.org by Arv Mistry <ar...@kindsight.net> on 2010/01/21 17:25:29 UTC

HDFS data storage across multiple disks on slave

Hi,

I'm using Hadoop 0.20 and trying to understand how Hadoop stores the
data.
The setup I have is a single slave with two disks, 500G each. In the
hdfs-site.xml file I specify the two disks for dfs.data.dir, i.e.
/opt/dfs/data,opt1/dfs/data.

Now, a couple of things. When I run a report, i.e. ./hadoop dfsadmin
-report, it says I have a configured capacity of only 500G. Should that
not be twice that, since there are two 500G disks?

And when I look at the data being written, it's only written to
/opt/dfs/data. There is no directory /opt1/dfs/data. Should that not
have been created when I formatted HDFS?

Could anyone tell me: is there an easy way to add this second disk to
HDFS and preserve the existing data? And any ideas what I did wrong
such that it didn't get created/used?

Any insight would be appreciated.

Cheers Arv 

RE: HDFS data storage across multiple disks on slave

Posted by Arv Mistry <ar...@kindsight.net>.
Sorry, that was a typo; it is actually "/opt1/dfs/data".

Cheers Arv

-----Original Message-----
From: Wang Xu [mailto:gnawux@gmail.com] 
Sent: January 21, 2010 11:36 AM
To: common-user@hadoop.apache.org
Subject: Re: HDFS data storage across multiple disks on slave

On Fri, Jan 22, 2010 at 12:25 AM, Arv Mistry <ar...@kindsight.net> wrote:
> The setup I have is a single slave with two disks, 500G each. In the
> hdfs-site.xml file I specify the two disks for dfs.data.dir, i.e.
> /opt/dfs/data,opt1/dfs/data.

It looks like you should configure "/opt1/dfs/data" rather than "opt1/dfs/data".

-- 
Wang Xu
Samuel Goldwyn  - "I'm willing to admit that I may not always be
right, but I am never wrong." -
http://www.brainyquote.com/quotes/authors/s/samuel_goldwyn.html

Re: HDFS data storage across multiple disks on slave

Posted by Wang Xu <gn...@gmail.com>.
On Fri, Jan 22, 2010 at 12:25 AM, Arv Mistry <ar...@kindsight.net> wrote:
> The setup I have is a single slave with two disks, 500G each. In the
> hdfs-site.xml file I specify the two disks for dfs.data.dir, i.e.
> /opt/dfs/data,opt1/dfs/data.

It looks like you should configure "/opt1/dfs/data" rather than "opt1/dfs/data".
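
For reference, a minimal hdfs-site.xml entry with both paths absolute
(just a sketch, untested; the paths are the ones from your mail, so
adjust them to your actual mount points):

    <property>
      <!-- comma-separated list of local directories; the datanode
           spreads blocks across all of them -->
      <name>dfs.data.dir</name>
      <value>/opt/dfs/data,/opt1/dfs/data</value>
    </property>

Without the leading slash the second entry is relative, and the datanode
presumably resolves it against its working directory (or drops it as
invalid), which would explain why only /opt/dfs/data was created and
used.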

-- 
Wang Xu
Samuel Goldwyn  - "I'm willing to admit that I may not always be
right, but I am never wrong." -
http://www.brainyquote.com/quotes/authors/s/samuel_goldwyn.html

Re: HDFS data storage across multiple disks on slave

Posted by Allen Wittenauer <aw...@linkedin.com>.

You should be able to modify the dfs.data.dir property and bounce the
datanode process.
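
Something like the following, run on the slave, should do it. This is
only a sketch: it assumes a stock 0.20 tarball install with HADOOP_HOME
set, and that the user running the datanode owns the new directory.
Existing blocks under /opt/dfs/data are left in place.

    # 1. Create the new storage directory on the second disk.
    mkdir -p /opt1/dfs/data

    # 2. Add it to dfs.data.dir in conf/hdfs-site.xml:
    #      <value>/opt/dfs/data,/opt1/dfs/data</value>

    # 3. Bounce the datanode so it picks up the new directory.
    $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
    $HADOOP_HOME/bin/hadoop-daemon.sh start datanode

    # 4. Confirm the configured capacity now counts both disks.
    $HADOOP_HOME/bin/hadoop dfsadmin -report

New blocks will start landing on the second disk; as far as I know, the
datanode won't move existing blocks between its disks on its own.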

On 1/21/10 9:30 AM, "Arv Mistry" <ar...@kindsight.net> wrote:

> I had a typo in my original email, which I've corrected below.
> But assuming I wanted to add another disk to my slave, should I be able
> to do that without losing my current data? Does anyone have any
> documentation or a link they could send me that describes this?
> 
> Appreciate your help,
> 
> Cheers Arv
> 
> -----Original Message-----
> From: Arv Mistry [mailto:arv@kindsight.net]
> Sent: January 21, 2010 11:25 AM
> To: common-user@hadoop.apache.org; common-dev-info@hadoop.apache.org
> Subject: HDFS data storage across multiple disks on slave
> 
> Hi,
> 
> I'm using Hadoop 0.20 and trying to understand how Hadoop stores the
> data.
> The setup I have is a single slave with two disks, 500G each. In the
> hdfs-site.xml file I specify the two disks for dfs.data.dir, i.e.
> /opt/dfs/data,/opt1/dfs/data.
> 
> Now, a couple of things. When I run a report, i.e. ./hadoop dfsadmin
> -report, it says I have a configured capacity of only 500G. Should that
> not be twice that, since there are two 500G disks?
> 
> And when I look at the data being written, it's only written to
> /opt/dfs/data. There is no directory /opt1/dfs/data. Should that not
> have been created when I formatted HDFS?
> 
> Could anyone tell me: is there an easy way to add this second disk to
> HDFS and preserve the existing data? And any ideas what I did wrong
> such that it didn't get created/used?
> 
> Any insight would be appreciated.
> 
> Cheers Arv 


RE: HDFS data storage across multiple disks on slave

Posted by Arv Mistry <ar...@kindsight.net>.
I had a typo in my original email, which I've corrected below.
But assuming I wanted to add another disk to my slave, should I be able
to do that without losing my current data? Does anyone have any
documentation or a link they could send me that describes this?

Appreciate your help,

Cheers Arv

-----Original Message-----
From: Arv Mistry [mailto:arv@kindsight.net] 
Sent: January 21, 2010 11:25 AM
To: common-user@hadoop.apache.org; common-dev-info@hadoop.apache.org
Subject: HDFS data storage across multiple disks on slave

Hi,

I'm using Hadoop 0.20 and trying to understand how Hadoop stores the
data.
The setup I have is a single slave with two disks, 500G each. In the
hdfs-site.xml file I specify the two disks for dfs.data.dir, i.e.
/opt/dfs/data,/opt1/dfs/data.

Now, a couple of things. When I run a report, i.e. ./hadoop dfsadmin
-report, it says I have a configured capacity of only 500G. Should that
not be twice that, since there are two 500G disks?

And when I look at the data being written, it's only written to
/opt/dfs/data. There is no directory /opt1/dfs/data. Should that not
have been created when I formatted HDFS?

Could anyone tell me: is there an easy way to add this second disk to
HDFS and preserve the existing data? And any ideas what I did wrong
such that it didn't get created/used?

Any insight would be appreciated.

Cheers Arv