Posted to hdfs-user@hadoop.apache.org by jeba earnest <je...@yahoo.com> on 2013/01/30 09:08:56 UTC

Maximum Storage size in a Single datanode


Hi,


Is it possible to keep 1 petabyte on a single data node?
If not, what is the maximum storage for a particular data node?
Regards,
M. Jeba

RE: Maximum Storage size in a Single datanode

Posted by Vijay Thakorlal <vi...@hotmail.com>.
Hi Jeba,

 

There are other considerations too. For example, if a single node held 1 PB
of data and were to die, this would cause a significant amount of
re-replication traffic as the NameNode arranges for new replicas to be created.

 

Vijay
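
To put a rough number on that re-replication traffic, here is a minimal
back-of-envelope sketch in Java. The 1 PB figure comes from the question; the
~10 GB/s of aggregate copy bandwidth across the surviving nodes is purely an
assumed, illustrative value, not a measurement:

    // Rough estimate of how long re-replicating a lost 1 PB datanode could take.
    // The aggregate bandwidth figure below is an assumption, not a measurement.
    public class ReReplicationEstimate {
        public static void main(String[] args) {
            double lostBytes = 1e15;                // ~1 PB stored on the dead node
            double aggregateBytesPerSec = 10e9;     // assume ~10 GB/s across surviving nodes
            double hours = lostBytes / aggregateBytesPerSec / 3600.0;
            System.out.printf("Re-replicating ~1 PB at ~10 GB/s takes ~%.0f hours%n", hours);
        }
    }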

 

From: Bertrand Dechoux [mailto:dechouxb@gmail.com] 
Sent: 30 January 2013 09:14
To: user@hadoop.apache.org; jeba earnest
Subject: Re: Maximum Storage size in a Single datanode

 

I would say the hard limit is due to the OS local file system (and your
budget).

So short answer for ext3: it doesn't seem so.
http://en.wikipedia.org/wiki/Ext3

And I am not sure the answer is the most interesting. Even if you could put
1 Peta on one node, what is usually interesting is the ratio
storage/compute.

Bertrand

On Wed, Jan 30, 2013 at 9:08 AM, jeba earnest <je...@yahoo.com> wrote:

 

Hi,



Is it possible to keep 1 Petabyte in a single data node?

If not, How much is the maximum storage for a particular data node?

 

Regards,
M. Jeba

 


Re: Maximum Storage size in a Single datanode

Posted by Bertrand Dechoux <de...@gmail.com>.
I would say the hard limit is due to the OS local file system (and your
budget).

So the short answer for ext3: it doesn't seem so.
http://en.wikipedia.org/wiki/Ext3

And I am not sure that answer is the most interesting one. Even if you could
put 1 PB on one node, what usually matters is the storage/compute ratio.

Bertrand
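
To make the storage/compute ratio point concrete, here is a minimal sketch,
assuming hypothetical hardware (12 data disks at roughly 100 MB/s sequential
read each), of how long a single node would need just to read 1 PB once:

    // How long would one node need just to scan 1 PB of local data once?
    // Disk count and per-disk throughput are illustrative assumptions.
    public class ScanTimeEstimate {
        public static void main(String[] args) {
            double totalBytes = 1e15;              // 1 PB of local data
            int disks = 12;                        // assumed spindles in the node
            double bytesPerSecPerDisk = 100e6;     // ~100 MB/s sequential per disk
            double days = totalBytes / (disks * bytesPerSecPerDisk) / 86400.0;
            System.out.printf("One full pass over the data: ~%.1f days%n", days);
        }
    }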

On Wed, Jan 30, 2013 at 9:08 AM, jeba earnest <je...@yahoo.com> wrote:

>
> Hi,
>
>
> Is it possible to keep 1 Petabyte in a single data node?
> If not, How much is the maximum storage for a particular data node?
>
> Regards,
> M. Jeba
>

Re: Maximum Storage size in a Single datanode

Posted by Michel Segel <mi...@hotmail.com>.
Can you say Centos?
:-)

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jan 30, 2013, at 4:21 AM, Jean-Marc Spaggiari <je...@spaggiari.org> wrote:

> Hi,
> 
> Also, think about the memory you will need in your DataNode to serve
> all this data... I'm not sure there is any server which can take that
> today. You need a certain amount of memory per block in the DN. With
> all this data, you will have SOOOO many blocks...
> 
> Regarding RH vs Ubuntu, I think Ubuntu is more an end user
> distribution than a server one. And I found RH a bit "not free
> enough". I have installed Debian on all my servers.
> 
> JM
> 
> 2013/1/30, Vijay Thakorlal <vi...@hotmail.com>:
>> Jeba,
>> 
>> 
>> 
>> I'm not aware of any hadoop limitations in this respect (others may be able
>> to comment on this); since blocks are just files on the OS, the datanode
>> will create subdirectories to store blocks to avoid problems with large
>> numbers of files in a single directory. So I would think the limitations
>> are
>> primarily around the type of file system you select, for ext3 it
>> theoretically supports up to 16TB (http://en.wikipedia.org/wiki/Ext3) and
>> for ext4 up to 1EB (http://en.wikipedia.org/wiki/Ext4). Although you're
>> probably already planning on deploying 64-bit servers, I believe for large
>> FS on ext4 you'd be better off with a 64-bit server.
>> 
>> 
>> 
>> As far as OS is concerned anecdotally (based on blogs, hadoop mailing lists
>> etc) I believe there are more production deployments using RHEL and/or
>> CentOS than Ubuntu.
>> 
>> 
>> 
>> It's probably not practical to have nodes with 1PB of data for the reasons
>> that others have mentioned and due to the replication traffic that will be
>> generated if the node dies. Not to mention fsck times with large file
>> systems.
>> 
>> 
>> 
>> Vijay
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> From: jeba earnest [mailto:jebaearnest@yahoo.com]
>> Sent: 30 January 2013 10:40
>> To: user@hadoop.apache.org
>> Subject: Re: Maximum Storage size in a Single datanode
>> 
>> 
>> 
>> 
>> 
>> I want to use either UBUNTU or REDHAT .
>> 
>> I just want to know how much storage space we can allocate in a single data
>> node.
>> 
>> 
>> 
>> Is there any limitations in hadoop for storage in single node?
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> Regards,
>> 
>> Jeba
>> 
>>  _____
>> 
>> From: "Pamecha, Abhishek" <ap...@ebay.com>
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest
>> <je...@yahoo.com>
>> Sent: Wednesday, 30 January 2013 2:45 PM
>> Subject: Re: Maximum Storage size in a Single datanode
>> 
>> 
>> 
>> What would be the reason you would do that?
>> 
>> 
>> 
>> You would want to leverage distributed dataset for higher availability and
>> better response times.
>> 
>> 
>> 
>> The maximum storage depends completely on the disks  capacity of your nodes
>> and what your OS supports. Typically I have heard of about 1-2 TB/node to
>> start with, but I may be wrong.
>> 
>> -abhishek
>> 
>> 
>> 
>> 
>> 
>> From: jeba earnest <je...@yahoo.com>
>> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest
>> <je...@yahoo.com>
>> Date: Wednesday, January 30, 2013 1:38 PM
>> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
>> Subject: Maximum Storage size in a Single datanode
>> 
>> 
>> 
>> 
>> 
>> Hi,
>> 
>> 
>> 
>> Is it possible to keep 1 Petabyte in a single data node?
>> 
>> If not, How much is the maximum storage for a particular data node?
>> 
>> 
>> 
>> Regards,
>> M. Jeba
> 

Re: Maximum Storage size in a Single datanode

Posted by Fatih Haltas <fa...@nyu.edu>.
I think he just wants to learn the approximate storage capacity he should
configure for each datanode; 1 PB is just a made-up amount of storage, I
guess. He probably already knows that, Hadoop aside, this is too much for
any single server, and the figure was simply a slip :)

Don't keep on at him that much :)

On Wednesday, 30 January 2013, Jean-Marc Spaggiari wrote:

> Hi,
>
> Also, think about the memory you will need in your DataNode to serve
> all this data... I'm not sure there is any server which can take that
> today. You need a certain amount of memory per block in the DN. With
> all this data, you will have SOOOO many blocks...
>
> Regarding RH vs Ubuntu, I think Ubuntu is more an end user
> distribution than a server one. And I found RH a bit "not free
> enough". I have installed Debian on all my servers.
>
> JM
>
> 2013/1/30, Vijay Thakorlal <vi...@hotmail.com>:
> > Jeba,
> >
> >
> >
> > I'm not aware of any hadoop limitations in this respect (others may be
> able
> > to comment on this); since blocks are just files on the OS, the datanode
> > will create subdirectories to store blocks to avoid problems with large
> > numbers of files in a single directory. So I would think the limitations
> > are
> > primarily around the type of file system you select, for ext3 it
> > theoretically supports up to 16TB (http://en.wikipedia.org/wiki/Ext3)
> and
> > for ext4 up to 1EB (http://en.wikipedia.org/wiki/Ext4). Although you're
> > probably already planning on deploying 64-bit servers, I believe for
> large
> > FS on ext4 you'd be better off with a 64-bit server.
> >
> >
> >
> > As far as OS is concerned anecdotally (based on blogs, hadoop mailing
> lists
> > etc) I believe there are more production deployments using RHEL and/or
> > CentOS than Ubuntu.
> >
> >
> >
> > It's probably not practical to have nodes with 1PB of data for the
> reasons
> > that others have mentioned and due to the replication traffic that will
> be
> > generated if the node dies. Not to mention fsck times with large file
> > systems.
> >
> >
> >
> > Vijay
> >
> >
> >
> >
> >
> >
> >
> > From: jeba earnest [mailto:jebaearnest@yahoo.com]
> > Sent: 30 January 2013 10:40
> > To: user@hadoop.apache.org
> > Subject: Re: Maximum Storage size in a Single datanode
> >
> >
> >
> >
> >
> > I want to use either UBUNTU or REDHAT .
> >
> > I just want to know how much storage space we can allocate in a single
> data
> > node.
> >
> >
> >
> > Is there any limitations in hadoop for storage in single node?
> >
> >
> >
> >
> >
> >
> >
> > Regards,
> >
> > Jeba
> >
> >   _____
> >
> > From: "Pamecha, Abhishek" <ap...@ebay.com>
> > To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest
> > <je...@yahoo.com>
> > Sent: Wednesday, 30 January 2013 2:45 PM
> > Subject: Re: Maximum Storage size in a Single datanode
> >
> >
> >
> > What would be the reason you would do that?
> >
> >
> >
> > You would want to leverage distributed dataset for higher availability
> and
> > better response times.
> >
> >
> >
> > The maximum storage depends completely on the disks  capacity of your
> nodes
> > and what your OS supports. Typically I have heard of about 1-2 TB/node to
> > start with, but I may be wrong.
> >
> > -abhishek
> >
> >
> >
> >
> >
> > From: jeba earnest <je...@yahoo.com>
> > Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba
> earnest
> > <je...@yahoo.com>
> > Date: Wednesday, January 30, 2013 1:38 PM
> > To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> > Subject: Maximum Storage size in a Single datanode
> >
> >
> >
> >
> >
> > Hi,
> >
> >
> >
> > Is it possible to keep 1 Petabyte in a single data node?
> >
> > If not, How much is the maximum storage for a particular data node?
> >
> >
> >
> > Regards,
> > M. Jeba
> >
> >
> >
> >
>

Re: Maximum Storage size in a Single datanode

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi,

Also, think about the memory you will need in your DataNode to serve
all this data... I'm not sure there is any server that can take that
today. You need a certain amount of memory per block in the DN. With
all this data, you will have SOOOO many blocks...

Regarding RH vs Ubuntu, I think Ubuntu is more of an end-user
distribution than a server one. And I found RH a bit "not free
enough". I have installed Debian on all my servers.

JM
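
To give a rough feel for the numbers behind this, here is a minimal sketch.
The 128 MB block size and the ~150 bytes of NameNode heap per block object
are assumptions (the latter is a commonly quoted rule of thumb, and the
estimate ignores file and directory objects):

    // Back-of-envelope block count and NameNode heap for 1 PB on one datanode.
    // Block size and bytes-per-block-object are assumed rule-of-thumb values.
    public class BlockMemoryEstimate {
        public static void main(String[] args) {
            long totalBytes = 1024L * 1024 * 1024 * 1024 * 1024;  // 1 PiB
            long blockSize  = 128L * 1024 * 1024;                 // 128 MB blocks
            long blocks     = totalBytes / blockSize;             // ~8.4 million blocks
            long heapBytes  = blocks * 150L;                      // ~150 bytes per block object
            System.out.printf("Blocks on this one node: %,d%n", blocks);
            System.out.printf("NameNode heap for those blocks alone: ~%,d MB%n",
                              heapBytes / (1024 * 1024));
        }
    }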

2013/1/30, Vijay Thakorlal <vi...@hotmail.com>:
> Jeba,
>
>
>
> I'm not aware of any hadoop limitations in this respect (others may be able
> to comment on this); since blocks are just files on the OS, the datanode
> will create subdirectories to store blocks to avoid problems with large
> numbers of files in a single directory. So I would think the limitations
> are
> primarily around the type of file system you select, for ext3 it
> theoretically supports up to 16TB (http://en.wikipedia.org/wiki/Ext3) and
> for ext4 up to 1EB (http://en.wikipedia.org/wiki/Ext4). Although you're
> probably already planning on deploying 64-bit servers, I believe for large
> FS on ext4 you'd be better off with a 64-bit server.
>
>
>
> As far as OS is concerned anecdotally (based on blogs, hadoop mailing lists
> etc) I believe there are more production deployments using RHEL and/or
> CentOS than Ubuntu.
>
>
>
> It's probably not practical to have nodes with 1PB of data for the reasons
> that others have mentioned and due to the replication traffic that will be
> generated if the node dies. Not to mention fsck times with large file
> systems.
>
>
>
> Vijay
>
>
>
>
>
>
>
> From: jeba earnest [mailto:jebaearnest@yahoo.com]
> Sent: 30 January 2013 10:40
> To: user@hadoop.apache.org
> Subject: Re: Maximum Storage size in a Single datanode
>
>
>
>
>
> I want to use either UBUNTU or REDHAT .
>
> I just want to know how much storage space we can allocate in a single data
> node.
>
>
>
> Is there any limitations in hadoop for storage in single node?
>
>
>
>
>
>
>
> Regards,
>
> Jeba
>
>   _____
>
> From: "Pamecha, Abhishek" <ap...@ebay.com>
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest
> <je...@yahoo.com>
> Sent: Wednesday, 30 January 2013 2:45 PM
> Subject: Re: Maximum Storage size in a Single datanode
>
>
>
> What would be the reason you would do that?
>
>
>
> You would want to leverage distributed dataset for higher availability and
> better response times.
>
>
>
> The maximum storage depends completely on the disks  capacity of your nodes
> and what your OS supports. Typically I have heard of about 1-2 TB/node to
> start with, but I may be wrong.
>
> -abhishek
>
>
>
>
>
> From: jeba earnest <je...@yahoo.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest
> <je...@yahoo.com>
> Date: Wednesday, January 30, 2013 1:38 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Maximum Storage size in a Single datanode
>
>
>
>
>
> Hi,
>
>
>
> Is it possible to keep 1 Petabyte in a single data node?
>
> If not, How much is the maximum storage for a particular data node?
>
>
>
> Regards,
> M. Jeba
>
>
>
>

RE: Maximum Storage size in a Single datanode

Posted by Vijay Thakorlal <vi...@hotmail.com>.
Jeba,

 

I'm not aware of any Hadoop limitations in this respect (others may be able
to comment on this); since blocks are just files on the OS, the datanode
will create subdirectories to store blocks to avoid problems with large
numbers of files in a single directory. So I would think the limitations are
primarily around the type of file system you select: ext3 theoretically
supports up to 16 TB (http://en.wikipedia.org/wiki/Ext3) and ext4 up to
1 EB (http://en.wikipedia.org/wiki/Ext4). You're probably already planning
on deploying 64-bit servers anyway, and for a large file system on ext4 you
would certainly want a 64-bit server.

 

As far as the OS is concerned, anecdotally (based on blogs, Hadoop mailing
lists, etc.) I believe there are more production deployments using RHEL
and/or CentOS than Ubuntu.

 

It's probably not practical to have nodes with 1 PB of data, for the reasons
that others have mentioned and due to the replication traffic that would be
generated if the node dies, not to mention fsck times on such large file
systems.

 

Vijay
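
Since block replicas really are just regular files on the datanode (data
files named blk_<id> plus .meta checksum files), spread across nested
subdirectories under each configured data directory, a plain directory walk
is enough to sanity-check what a node holds. A minimal sketch, using a
hypothetical data directory path; the exact subdirectory layout varies by
Hadoop version, so the walk simply ignores it:

    // Count HDFS block files under a datanode data directory and sum their sizes.
    // The path is a hypothetical example; point it at your own dfs data dir.
    import java.io.IOException;
    import java.nio.file.*;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.stream.Stream;

    public class BlockFileCount {
        public static void main(String[] args) throws IOException {
            Path dataDir = Paths.get("/data/1/dfs/dn");   // hypothetical data directory
            AtomicLong count = new AtomicLong();
            AtomicLong bytes = new AtomicLong();
            try (Stream<Path> paths = Files.walk(dataDir)) {
                paths.filter(Files::isRegularFile)
                     .filter(p -> p.getFileName().toString().startsWith("blk_"))
                     .filter(p -> !p.getFileName().toString().endsWith(".meta"))
                     .forEach(p -> {
                         count.incrementAndGet();
                         try {
                             bytes.addAndGet(Files.size(p));
                         } catch (IOException ignored) {
                             // a block file may disappear while the datanode is running
                         }
                     });
            }
            System.out.printf("Block files: %,d, total ~%,d GB%n",
                              count.get(), bytes.get() / (1024L * 1024 * 1024));
        }
    }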

 

 

 

From: jeba earnest [mailto:jebaearnest@yahoo.com] 
Sent: 30 January 2013 10:40
To: user@hadoop.apache.org
Subject: Re: Maximum Storage size in a Single datanode

 

 

I want to use either UBUNTU or REDHAT .

I just want to know how much storage space we can allocate in a single data
node.

 

Is there any limitations in hadoop for storage in single node?

 

 

 

Regards,

Jeba

  _____  

From: "Pamecha, Abhishek" <ap...@ebay.com>
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest
<je...@yahoo.com> 
Sent: Wednesday, 30 January 2013 2:45 PM
Subject: Re: Maximum Storage size in a Single datanode

 

What would be the reason you would do that? 

 

You would want to leverage distributed dataset for higher availability and
better response times.

 

The maximum storage depends completely on the disks  capacity of your nodes
and what your OS supports. Typically I have heard of about 1-2 TB/node to
start with, but I may be wrong.

-abhishek

 

 

From: jeba earnest <je...@yahoo.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest
<je...@yahoo.com>
Date: Wednesday, January 30, 2013 1:38 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Maximum Storage size in a Single datanode

 

 

Hi,



Is it possible to keep 1 Petabyte in a single data node?

If not, How much is the maximum storage for a particular data node? 

 

Regards,
M. Jeba

 


Re: Maximum Storage size in a Single datanode

Posted by Mohammad Tariq <do...@gmail.com>.
I completely agree with everyone in the thread. Perhaps you are not
concerned much about the processing part, but it is still not a good idea.
Remember the power of Hadoop lies in the principle of "divide and rule" and
you are trying to go against that.

On Wednesday, January 30, 2013, Chris Embree <ce...@gmail.com> wrote:
> You should probably think about this in a more cluster fashion.  A single
node with a PB of data is probably not a good allocation of CPU : Disk
ratio.  In addition, you need enough RAM on your NameNode to keep track of
all of your blocks.  A few nodes with a PB each would quickly drive up NN
RAM requirements.
> As others have mentioned, the local file system that HDFS sits on top of
may have limits.  We're going to use EXT4 which should handle that much,
but it's probably still not a good idea.
> If you're just thinking of storing lots of data, you might consider
GlusterFS instead.
> I highly recommend RedHat over Ubuntu.
> Hope that helps.
>
> On Wed, Jan 30, 2013 at 5:40 AM, jeba earnest <je...@yahoo.com>
wrote:
>
> I want to use either UBUNTU or REDHAT .
> I just want to know how much storage space we can allocate in a single
data node.
> Is there any limitations in hadoop for storage in single node?
>
>
>
> Regards,
> Jeba
> ________________________________
> From: "Pamecha, Abhishek" <ap...@ebay.com>
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest <
jebaearnest@yahoo.com>
> Sent: Wednesday, 30 January 2013 2:45 PM
> Subject: Re: Maximum Storage size in a Single datanode
>
> What would be the reason you would do that?
> You would want to leverage distributed dataset for higher availability
and better response times.
> The maximum storage depends completely on the disks  capacity of your
nodes and what your OS supports. Typically I have heard of about 1-2
TB/node to start with, but I may be wrong.
> -abhishek
>
> From: jeba earnest <je...@yahoo.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest
<je...@yahoo.com>
> Date: Wednesday, January 30, 2013 1:38 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Maximum Storage size in a Single datanode
>
>
> Hi,
>
>
> Is it possible to keep 1 Petabyte in a single data node?
> If not, How much is the maximum storage for a particular data node?
>
> Regards,
> M. Jeba
>

-- 
Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

Re: Maximum Storage size in a Single datanode

Posted by Chris Embree <ce...@gmail.com>.
You should probably think about this in a more cluster fashion.  A single
node with a PB of data probably doesn't give you a good CPU-to-disk ratio.
In addition, you need enough RAM on your NameNode to keep track of
all of your blocks.  A few nodes with a PB each would quickly drive up NN
RAM requirements.
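
(For a sense of scale, here is a rough back-of-envelope sketch in Python; the
128 MB block size and the ~150 bytes of NameNode heap per block object are
commonly quoted approximations rather than exact figures, and lots of small
files would make the real number much worse.)

    # Rough block-count / NameNode-heap arithmetic for 1 PB on one datanode,
    # assuming 128 MB blocks and ~150 bytes of NameNode heap per block object
    # (both figures are approximations, not hard limits).
    one_pb = 1024 ** 5                       # bytes
    block_size = 128 * 1024 ** 2             # assumed HDFS block size
    heap_per_block = 150                     # assumed bytes of NN heap per block

    blocks = one_pb // block_size            # 8,388,608 blocks
    heap_gb = blocks * heap_per_block / float(1024 ** 3)
    print("blocks: %d, extra NameNode heap: ~%.1f GB" % (blocks, heap_gb))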

As others have mentioned, the local file system that HDFS sits on top of
may have limits.  We're going to use EXT4 which should handle that much,
but it's probably still not a good idea.

If you're just thinking of storing lots of data, you might consider
GlusterFS instead.

I highly recommend RedHat over Ubuntu.

Hope that helps.

On Wed, Jan 30, 2013 at 5:40 AM, jeba earnest <je...@yahoo.com> wrote:

>
> I want to use either UBUNTU or REDHAT .
> I just want to know how much storage space we can allocate in a single
> data node.
>
> Is there any limitations in hadoop for storage in single node?
>
>
>
> Regards,
> Jeba
>   ------------------------------
> *From:* "Pamecha, Abhishek" <ap...@ebay.com>
> *To:* "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest <
> jebaearnest@yahoo.com>
> *Sent:* Wednesday, 30 January 2013 2:45 PM
> *Subject:* Re: Maximum Storage size in a Single datanode
>
>  What would be the reason you would do that?
>
>  You would want to leverage distributed dataset for higher availability
> and better response times.
>
>  The maximum storage depends completely on the disks  capacity of your
> nodes and what your OS supports. Typically I have heard of about 1-2
> TB/node to start with, but I may be wrong.
> -abhishek
>
>
>   From: jeba earnest <je...@yahoo.com>
> Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest
> <je...@yahoo.com>
> Date: Wednesday, January 30, 2013 1:38 PM
> To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
> Subject: Maximum Storage size in a Single datanode
>
>
>  Hi,
>
>
>  Is it possible to keep 1 Petabyte in a single data node?
>  If not, How much is the maximum storage for a particular data node?
>
> Regards,
> M. Jeba
>
>
>

RE: Maximum Storage size in a Single datanode

Posted by Vijay Thakorlal <vi...@hotmail.com>.
Jeba,

 

I'm not aware of any Hadoop limitations in this respect (others may be able
to comment on this); since blocks are just files on the OS, the datanode
creates subdirectories for its blocks to avoid problems with large numbers
of files in a single directory. So I would think the limitations are
primarily around the type of file system you select: ext3 theoretically
supports up to 16TB (http://en.wikipedia.org/wiki/Ext3) and ext4 up to 1EB
(http://en.wikipedia.org/wiki/Ext4). You're probably already planning on
deploying 64-bit servers anyway, but a large ext4 file system is another
reason to insist on 64-bit.
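
In practice the ceiling you hit first is just the disks you can attach: a
datanode advertises roughly the sum of the file systems listed in
dfs.data.dir (dfs.datanode.data.dir on newer releases), minus whatever you
hold back with dfs.datanode.du.reserved. A minimal sketch with made-up
numbers:

    # Hypothetical datanode: twelve 4 TB disks, one ext4 file system per disk,
    # each listed as a separate entry in dfs.data.dir, with 10 GB per volume
    # held back via dfs.datanode.du.reserved (all figures illustrative).
    disks = 12
    disk_tb = 4.0
    reserved_gb = 10

    usable_tb = disks * (disk_tb - reserved_gb / 1024.0)
    print("capacity this datanode would advertise: ~%.1f TB" % usable_tb)  # ~47.9 TB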

 

As far as the OS is concerned, anecdotally (based on blogs, Hadoop mailing
lists, etc.) I believe there are more production deployments using RHEL
and/or CentOS than Ubuntu.

 

It's probably not practical to have nodes with 1PB of data for the reasons
that others have mentioned and due to the replication traffic that will be
generated if the node dies. Not to mention fsck times with large file
systems.
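
Purely for illustration: if the surviving nodes could soak up an aggregate
10 Gb/s of re-replication traffic (a made-up figure), losing a 1 PB node
would keep the cluster busy for quite a while.

    # How long re-replicating 1 PB might take at an assumed aggregate
    # re-replication rate of 10 Gb/s (illustrative figure only).
    pb_bits = 1024 ** 5 * 8
    rate_bps = 10e9
    days = pb_bits / rate_bps / 86400.0
    print("~%.1f days of re-replication traffic" % days)   # roughly 10 days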

 

Vijay

 

 

 

From: jeba earnest [mailto:jebaearnest@yahoo.com] 
Sent: 30 January 2013 10:40
To: user@hadoop.apache.org
Subject: Re: Maximum Storage size in a Single datanode

 

 

I want to use either UBUNTU or REDHAT .

I just want to know how much storage space we can allocate in a single data
node.

 

Is there any limitations in hadoop for storage in single node?

 

 

 

Regards,

Jeba

  _____  

From: "Pamecha, Abhishek" <ap...@ebay.com>
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest
<je...@yahoo.com> 
Sent: Wednesday, 30 January 2013 2:45 PM
Subject: Re: Maximum Storage size in a Single datanode

 

What would be the reason you would do that? 

 

You would want to leverage distributed dataset for higher availability and
better response times.

 

The maximum storage depends completely on the disks  capacity of your nodes
and what your OS supports. Typically I have heard of about 1-2 TB/node to
start with, but I may be wrong.

-abhishek

 

 

From: jeba earnest <je...@yahoo.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest
<je...@yahoo.com>
Date: Wednesday, January 30, 2013 1:38 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Maximum Storage size in a Single datanode

 

 

Hi,



Is it possible to keep 1 Petabyte in a single data node?

If not, How much is the maximum storage for a particular data node? 

 

Regards,
M. Jeba

 


Re: Maximum Storage size in a Single datanode

Posted by jeba earnest <je...@yahoo.com>.

I want to use either Ubuntu or Red Hat.
I just want to know how much storage space we can allocate on a single data node.

Are there any limitations in Hadoop on storage in a single node?



 
Regards,

Jeba


________________________________
 From: "Pamecha, Abhishek" <ap...@ebay.com>
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>; jeba earnest <je...@yahoo.com> 
Sent: Wednesday, 30 January 2013 2:45 PM
Subject: Re: Maximum Storage size in a Single datanode
 

What would be the reason you would do that? 

You would want to leverage distributed dataset for higher availability and better response times.

The maximum storage depends completely on the disks  capacity of your nodes and what your OS supports. Typically I have heard of about 1-2 TB/node to start with, but I may be wrong.
-abhishek

From: jeba earnest <je...@yahoo.com>
Reply-To: "user@hadoop.apache.org" <us...@hadoop.apache.org>, jeba earnest <je...@yahoo.com>
Date: Wednesday, January 30, 2013 1:38 PM
To: "user@hadoop.apache.org" <us...@hadoop.apache.org>
Subject: Maximum Storage size in a Single datanode




Hi,



Is it possible to keep 1 Petabyte in a single data node?

If not, How much is the maximum storage for a particular data node? 
 
Regards,
M. Jeba

Re: Maximum Storage size in a Single datanode

Posted by "Pamecha, Abhishek" <ap...@ebay.com>.
What would be the reason you would do that?

You would want to leverage a distributed dataset for higher availability and better response times.

The maximum storage depends entirely on the disk capacity of your nodes and what your OS supports. Typically I have heard of about 1-2 TB/node to start with, but I may be wrong.
-abhishek
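
As an aside: once a cluster is up you can see what each datanode actually
advertises (configured capacity, space used and remaining) straight from the
command line; on the 1.x releases the command is the one below, on 2.x it is
"hdfs dfsadmin -report".

    hadoop dfsadmin -report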


From: jeba earnest <je...@yahoo.com>>
Reply-To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>, jeba earnest <je...@yahoo.com>>
Date: Wednesday, January 30, 2013 1:38 PM
To: "user@hadoop.apache.org<ma...@hadoop.apache.org>" <us...@hadoop.apache.org>>
Subject: Maximum Storage size in a Single datanode


Hi,


Is it possible to keep 1 Petabyte in a single data node?
If not, How much is the maximum storage for a particular data node?

Regards,
M. Jeba
