Posted to hdfs-user@hadoop.apache.org by Panshul Whisper <ou...@gmail.com> on 2013/01/11 04:02:40 UTC

HDFS disk space requirement

Hello,

I have a Hadoop cluster of 5 nodes with a total of 130 GB of available HDFS
space, with replication set to 5.
I have a file of 115 GB, which needs to be copied to HDFS and processed.
Do I need any more HDFS space to perform all the processing without
running into problems, or is this space sufficient?

-- 
Regards,
Ouch Whisper
010101010101

Re: HDFS disk space requirement

Posted by "Balaji Narayanan (பாலாஜி நாராயணன்)" <li...@balajin.net>.
If the replication factor is 5, you will need at least 5x the space of the
file. So this is not going to be enough.
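In round numbers, the arithmetic above can be sketched as follows (a simplification using the sizes from this thread; HDFS block and metadata overhead is ignored):

```python
# Raw HDFS space needed to store a file at a given replication factor.
# Each HDFS block is written `replication` times across the cluster,
# so the raw footprint is simply size * replication.

def raw_space_needed_gb(file_size_gb: float, replication: int) -> float:
    return file_size_gb * replication

file_size_gb = 115  # the file from this thread
capacity_gb = 130   # total available HDFS space

needed = raw_space_needed_gb(file_size_gb, replication=5)
print(f"needed: {needed} GB, available: {capacity_gb} GB")  # needed: 575 GB, available: 130 GB
print("fits" if needed <= capacity_gb else "does not fit")  # does not fit
```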

On Thursday, January 10, 2013, Panshul Whisper wrote:

> Hello,
>
> I have a hadoop cluster of 5 nodes with a total of available HDFS space
> 130 GB with replication set to 5.
> I have a file of 115 GB, which needs to be copied to the HDFS and
> processed.
> Do I need to have anymore HDFS space for performing all processing without
> running into any problems? or is this space sufficient?
>
> --
> Regards,
> Ouch Whisper
> 010101010101
>


-- 
http://balajin.net/blog
http://flic.kr/balajijegan

Re: HDFS disk space requirement

Posted by shashwat shriparv <dw...@gmail.com>.
115 GB * 5 = 575 GB is the minimum you need. Keep in mind that is a bare
minimum, and you will have other disk space needs too...



∞
Shashwat Shriparv



On Fri, Jan 11, 2013 at 11:19 AM, Alexander Pivovarov
<ap...@gmail.com>wrote:

> finish elementary school first. (plus, minus operations at least)
>
>
> On Thu, Jan 10, 2013 at 7:23 PM, Panshul Whisper <ou...@gmail.com>wrote:
>
>> Thank you for the response.
>>
>> Actually it is not a single file, I have JSON files that amount to 115
>> GB, these JSON files need to be processed and loaded into a Hbase data
>> tables on the same cluster for later processing. Not considering the disk
>> space required for the Hbase storage, If I reduce the replication to 3, how
>> much more HDFS space will I require?
>>
>> Thank you,
>>
>>
>> On Fri, Jan 11, 2013 at 4:16 AM, Ravi Mutyala <ra...@hortonworks.com>wrote:
>>
>>> If the file is a txt file, you could get a good compression ratio.
>>> Changing the replication to 3 and the file will fit. But not sure what your
>>> usecase is what you want to achieve by putting this data there. Any
>>> transformation on this data and you would need more space to save the
>>> transformed data.
>>>
>>> If you have 5 nodes and they are not virtual machines, you should
>>> consider adding more harddisks to your cluster.
>>>
>>>
>>> On Thu, Jan 10, 2013 at 9:02 PM, Panshul Whisper <ou...@gmail.com>wrote:
>>>
>>>> Hello,
>>>>
>>>> I have a hadoop cluster of 5 nodes with a total of available HDFS space
>>>> 130 GB with replication set to 5.
>>>> I have a file of 115 GB, which needs to be copied to the HDFS and
>>>> processed.
>>>> Do I need to have anymore HDFS space for performing all processing
>>>> without running into any problems? or is this space sufficient?
>>>>
>>>> --
>>>> Regards,
>>>> Ouch Whisper
>>>> 010101010101
>>>>
>>>
>>>
>>
>>
>> --
>> Regards,
>> Ouch Whisper
>> 010101010101
>>
>
>

Re: HDFS disk space requirement

Posted by Alexander Pivovarov <ap...@gmail.com>.
Finish elementary school first (plus and minus operations, at least).


On Thu, Jan 10, 2013 at 7:23 PM, Panshul Whisper <ou...@gmail.com>wrote:

> Thank you for the response.
>
> Actually it is not a single file, I have JSON files that amount to 115 GB,
> these JSON files need to be processed and loaded into a Hbase data tables
> on the same cluster for later processing. Not considering the disk space
> required for the Hbase storage, If I reduce the replication to 3, how much
> more HDFS space will I require?
>
> Thank you,
>
>
> On Fri, Jan 11, 2013 at 4:16 AM, Ravi Mutyala <ra...@hortonworks.com>wrote:
>
>> If the file is a txt file, you could get a good compression ratio.
>> Changing the replication to 3 and the file will fit. But not sure what your
>> usecase is what you want to achieve by putting this data there. Any
>> transformation on this data and you would need more space to save the
>> transformed data.
>>
>> If you have 5 nodes and they are not virtual machines, you should
>> consider adding more harddisks to your cluster.
>>
>>
>> On Thu, Jan 10, 2013 at 9:02 PM, Panshul Whisper <ou...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> I have a hadoop cluster of 5 nodes with a total of available HDFS space
>>> 130 GB with replication set to 5.
>>> I have a file of 115 GB, which needs to be copied to the HDFS and
>>> processed.
>>> Do I need to have anymore HDFS space for performing all processing
>>> without running into any problems? or is this space sufficient?
>>>
>>> --
>>> Regards,
>>> Ouch Whisper
>>> 010101010101
>>>
>>
>>
>
>
> --
> Regards,
> Ouch Whisper
> 010101010101
>


Re: HDFS disk space requirement

Posted by Panshul Whisper <ou...@gmail.com>.
Thank you for the response.

Actually it is not a single file; I have JSON files that amount to 115 GB.
These JSON files need to be processed and loaded into HBase tables on the
same cluster for later processing. Not counting the disk space required
for the HBase storage, if I reduce the replication to 3, how much more
HDFS space will I require?

Thank you,


On Fri, Jan 11, 2013 at 4:16 AM, Ravi Mutyala <ra...@hortonworks.com> wrote:

> If the file is a txt file, you could get a good compression ratio.
> Changing the replication to 3 and the file will fit. But not sure what your
> usecase is what you want to achieve by putting this data there. Any
> transformation on this data and you would need more space to save the
> transformed data.
>
> If you have 5 nodes and they are not virtual machines, you should consider
> adding more harddisks to your cluster.
>
>
> On Thu, Jan 10, 2013 at 9:02 PM, Panshul Whisper <ou...@gmail.com>wrote:
>
>> Hello,
>>
>> I have a hadoop cluster of 5 nodes with a total of available HDFS space
>> 130 GB with replication set to 5.
>> I have a file of 115 GB, which needs to be copied to the HDFS and
>> processed.
>> Do I need to have anymore HDFS space for performing all processing
>> without running into any problems? or is this space sufficient?
>>
>> --
>> Regards,
>> Ouch Whisper
>> 010101010101
>>
>
>


-- 
Regards,
Ouch Whisper
010101010101
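For the replication-3 question above, the arithmetic works out as follows (a sketch that ignores HBase storage and intermediate job output, as the question does):

```python
# How much more HDFS capacity is needed for 115 GB of input
# if the replication factor is reduced to 3?

file_size_gb = 115   # total size of the JSON files
capacity_gb = 130    # current available HDFS space
replication = 3

needed_gb = file_size_gb * replication       # raw space the data will occupy
shortfall_gb = needed_gb - capacity_gb       # additional capacity required
print(f"raw space needed: {needed_gb} GB")         # 345
print(f"additional capacity: {shortfall_gb} GB")   # 215
```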


Re: HDFS disk space requirement

Posted by Ravi Mutyala <ra...@hortonworks.com>.
If the files are text files, you could get a good compression ratio; with
compression and the replication changed to 3, the data will fit. But I am not
sure what your use case is or what you want to achieve by putting this data
there. Any transformation on this data will need more space to store the
transformed output.

If you have 5 nodes and they are not virtual machines, you should consider
adding more hard disks to your cluster.
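To illustrate the compression point, here is a rough sketch; the 4:1 ratio below is purely hypothetical (actual ratios depend on the data and the codec used):

```python
# Effect of compression plus replication 3 on raw HDFS usage.
file_size_gb = 115
capacity_gb = 130
compression_ratio = 4   # hypothetical 4:1, e.g. a codec on verbose JSON/text
replication = 3

compressed_gb = file_size_gb / compression_ratio   # ~28.75 GB on disk
raw_needed_gb = compressed_gb * replication        # ~86.25 GB with replicas
print(f"raw space needed: {raw_needed_gb:.2f} GB")                  # 86.25
print("fits" if raw_needed_gb <= capacity_gb else "does not fit")   # fits
```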


On Thu, Jan 10, 2013 at 9:02 PM, Panshul Whisper <ou...@gmail.com>wrote:

> Hello,
>
> I have a hadoop cluster of 5 nodes with a total of available HDFS space
> 130 GB with replication set to 5.
> I have a file of 115 GB, which needs to be copied to the HDFS and
> processed.
> Do I need to have anymore HDFS space for performing all processing without
> running into any problems? or is this space sufficient?
>
> --
> Regards,
> Ouch Whisper
> 010101010101
>
