Posted to mapreduce-user@hadoop.apache.org by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/03/25 16:34:52 UTC

Re: can block size for namenode be different from wdatanode block size?

Hi Mirko,

Thanks for feedback.

Since I have worked with in-memory databases, this metadata caching sounds more like an IMDB that caches data at startup from disk-resident storage.

IMDBs tend to run into issues when the cache cannot hold all the data. Is this the case with metadata as well?

Regards,

Mich
Let your email find you with BlackBerry from Vodafone

-----Original Message-----
From: Mirko Kämpf <mi...@gmail.com>
Date: Wed, 25 Mar 2015 15:20:03 
To: user@hadoop.apache.org<us...@hadoop.apache.org>
Reply-To: user@hadoop.apache.org
Subject: Re: can block size for namenode be different from datanode block size?

Hi Mich,

please see the comments in your text.



2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mi...@peridale.co.uk>:

>
> Hi,
>
> The block size for HDFS is currently set to 128MB by default. This is
> configurable.
>
Correct, an HDFS client can override the configured property and define a
different block size for HDFS blocks.

>
> My point is that I assume this  parameter in hadoop-core.xml sets the
> block size for both namenode and datanode.

Correct, the block size is an HDFS-wide setting, but in general the
HDFS client creates the blocks.
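
For illustration only (a sketch, not from this thread; property name and API as in Hadoop 2.x, and the path and sizes are made up): the cluster-wide default comes from dfs.blocksize in hdfs-site.xml, and a client may request a different block size when it creates a file, e.g.:

    // Sketch: a client requesting a non-default HDFS block size (Hadoop 2.x style API).
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Override the configured default (dfs.blocksize) for this client only.
        conf.setLong("dfs.blocksize", 64L * 1024 * 1024); // 64 MB instead of 128 MB
        FileSystem fs = FileSystem.get(conf);

        // Or pass the block size explicitly for a single file:
        long blockSize = 64L * 1024 * 1024;
        try (FSDataOutputStream out = fs.create(
                new Path("/tmp/example.dat"), true, 4096, (short) 3, blockSize)) {
          out.writeUTF("hello");
        }
      }
    }

Either way, the setting only shapes how the client splits file data into blocks; the NameNode does not store file data in blocks itself, it only records the corresponding metadata.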


> However, the storage and
> random access for metadata in namenode is different and suits smaller
> block sizes.
>
HDFS block size has no impact here. NameNode metadata is held in memory. For
reliability it is persisted to the local disks of the server.


>
> For example in Linux the OS block size is 4k, which means one HDFS block
> size of 128MB can hold 32K OS blocks. For metadata this may not be
> useful and a smaller block size would be suitable, hence my question.
>
Remember, the metadata is in memory. The fsimage file, which contains the
metadata, is loaded on startup of the NameNode.

Please don't be confused by the two types of block sizes.

Hope this helps a bit.
Cheers,
Mirko


>
> Thanks,
>
> Mich
>


Re: Total memory available to NameNode

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Ah, yes. Tom's book is a good start, and Eric Sammer's book Hadoop Operations too :) 

BR,
 AL
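
For reference, and as a sketch rather than something stated in this thread: in Hadoop 2.x the NameNode heap is typically capped through the JVM options of the NameNode process, e.g. a line like the following in hadoop-env.sh, where the 4 GB figure is purely illustrative and should be sized for your metadata:

    # hadoop-env.sh (sketch; heap value is only an example)
    export HADOOP_NAMENODE_OPTS="-Xmx4g ${HADOOP_NAMENODE_OPTS}"

HADOOP_HEAPSIZE sets a default heap for all Hadoop daemons, so the NameNode-specific option above is the more targeted knob.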


> On 26 Mar 2015, at 11:50, Mich Talebzadeh <mi...@peridale.co.uk> wrote:
> 
> Many thanks AL. I believe you meant “Hadoop the definitive guide” :)
>  
> Mich Talebzadeh
>  
> http://talebzadehmich.wordpress.com
>  
> Publications due shortly:
> Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
>  
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>  
> From: Alexander Alten-Lorenz [mailto:wget.null@gmail.com] 
> Sent: 26 March 2015 10:30
> To: user@hadoop.apache.org
> Subject: Re: Total memory available to NameNode
>  
> Hi Mich,
>  
> the book Hadoop Operations may be a good start:
> https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop%20memory%20namenode&f=false
>  
> BR,
>  AL
>  
>  
>> On 26 Mar 2015, at 11:16, Mich Talebzadeh <mich@peridale.co.uk> wrote:
>>  
>> Is there any parameter that sets the total memory that NameNode can use?
>>  
>> Thanks
>>  
>> Mich Talebzadeh
>>  
>> http://talebzadehmich.wordpress.com
>>  
>> Publications due shortly:
>> Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
>>  
>> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>>  
>> From: Mirko Kämpf [mailto:mirko.kaempf@gmail.com] 
>> Sent: 25 March 2015 16:08
>> To: user@hadoop.apache.org; mich@peridale.co.uk
>> Subject: Re: can block size for namenode be different from wdatanode block size?
>>  
>> Correct, let's say you run the NameNode with just 1GB of RAM.
>> This would be a very strong limitation for the cluster. For each file we need about 200 bytes, and for each block as well. Now we can estimate the maximum capacity depending on the HDFS block size and the average file size.
>>  
>> Cheers,
>> Mirko
>>  
>> 2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mich@peridale.co.uk>:
>> Hi Mirko,
>> 
>> Thanks for feedback.
>> 
>> Since I have worked with in-memory databases, this metadata caching sounds more like an IMDB that caches data at startup from disk-resident storage.
>> 
>> IMDBs tend to run into issues when the cache cannot hold all the data. Is this the case with metadata as well?
>> 
>> Regards,
>> 
>> Mich
>> Let your email find you with BlackBerry from Vodafone
>> From: Mirko Kämpf <mirko.kaempf@gmail.com> 
>> Date: Wed, 25 Mar 2015 15:20:03 +0000
>> To: user@hadoop.apache.org
>> ReplyTo: user@hadoop.apache.org
>> Subject: Re: can block size for namenode be different from datanode block size?
>>  
>> Hi Mich,
>>  
>> please see the comments in your text.
>> 
>>  
>>  
>> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mich@peridale.co.uk>:
>> 
>> Hi,
>> 
>> The block size for HDFS is currently set to 128MB by default. This is
>> configurable.
>> Correct, an HDFS client can override the configured property and define a different block size for HDFS blocks. 
>>> 
>>> My point is that I assume this parameter in hadoop-core.xml sets the
>>> block size for both namenode and datanode. 
>> Correct, the block size is an HDFS-wide setting, but in general the HDFS client creates the blocks.
>>   
>>> However, the storage and
>>> random access for metadata in namenode is different and suits smaller
>>> block sizes.
>> HDFS block size has no impact here. NameNode metadata is held in memory. For reliability it is persisted to the local disks of the server.
>>  
>>> 
>>> For example in Linux the OS block size is 4k, which means one HDFS block
>>> size of 128MB can hold 32K OS blocks. For metadata this may not be
>>> useful and a smaller block size would be suitable, hence my question.
>> Remember, the metadata is in memory. The fsimage file, which contains the metadata,
>> is loaded on startup of the NameNode.
>>  
>> Please don't be confused by the two types of block sizes.
>>  
>> Hope this helps a bit.
>> Cheers,
>> Mirko
>>  
>>> 
>>> Thanks,
>>> 
>>> Mich


RE: Total memory available to NameNode

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Many thanks AL. I believe you meant “Hadoop the definitive guide” :)

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.

 

From: Alexander Alten-Lorenz [mailto:wget.null@gmail.com] 
Sent: 26 March 2015 10:30
To: user@hadoop.apache.org
Subject: Re: Total memory available to NameNode

 

Hi Mich,

 

the book Hadoop Operations may be a good start:

https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop%20memory%20namenode&f=false

 

BR,

 AL

 

 

On 26 Mar 2015, at 11:16, Mich Talebzadeh <mi...@peridale.co.uk> wrote:

 

Is there any parameter that sets the total memory that NameNode can use?

 

Thanks

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.

 

From: Mirko Kämpf [mailto:mirko.kaempf@gmail.com] 
Sent: 25 March 2015 16:08
To: user@hadoop.apache.org; mich@peridale.co.uk
Subject: Re: can block size for namenode be different from wdatanode block size?

 

Correct, let's say you run the NameNode with just 1GB of RAM.
This would be a very strong limitation for the cluster. For each file we need about 200 bytes, and for each block as well. Now we can estimate the maximum capacity depending on the HDFS block size and the average file size.

 

Cheers,

Mirko
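
To make that rule of thumb concrete, here is a rough back-of-the-envelope sketch (not from the thread; the ~200 bytes per file and per block are the approximation quoted above, and the one-block-per-file assumption is only for illustration):

    // Sketch: rough NameNode capacity estimate from the ~200-bytes-per-object rule of thumb.
    public class NameNodeCapacityEstimate {
      public static void main(String[] args) {
        long heapBytes     = 1L * 1024 * 1024 * 1024;  // 1 GB NameNode heap, as in the example
        long bytesPerFile  = 200;                      // approx. metadata per file
        long bytesPerBlock = 200;                      // approx. metadata per block
        long blockSize     = 128L * 1024 * 1024;       // 128 MB HDFS block size
        long avgFileSize   = 128L * 1024 * 1024;       // assume the average file fills one block

        long blocksPerFile = Math.max(1, (avgFileSize + blockSize - 1) / blockSize);
        long bytesPerEntry = bytesPerFile + blocksPerFile * bytesPerBlock;  // ~400 bytes here
        long maxFiles      = heapBytes / bytesPerEntry;                     // ~2.7 million files
        long rawBytes      = maxFiles * blocksPerFile * blockSize;          // ~330 TB before replication

        System.out.printf("~%d files, ~%d TB of raw data (before replication)%n",
            maxFiles, rawBytes / (1024L * 1024 * 1024 * 1024));
      }
    }

Smaller blocks or many small files shrink these numbers quickly, which is why NameNode heap, block size, and average file size have to be considered together.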

 

2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mich@peridale.co.uk>:

Hi Mirko,

Thanks for feedback.

Since I have worked with in-memory databases, this metadata caching sounds more like an IMDB that caches data at startup from disk-resident storage.

IMDBs tend to run into issues when the cache cannot hold all the data. Is this the case with metadata as well?

Regards,

Mich

Let your email find you with BlackBerry from Vodafone


From: Mirko Kämpf <mirko.kaempf@gmail.com> 

Date: Wed, 25 Mar 2015 15:20:03 +0000

To: user@hadoop.apache.org

ReplyTo: user@hadoop.apache.org

Subject: Re: can block size for namenode be different from datanode block size?

 

Hi Mich,

 

please see the comments in your text.

 

 

2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mich@peridale.co.uk>:


Hi,

The block size for HDFS is currently set to 128MB by default. This is
configurable.

Correct, an HDFS client can override the configured property and define a different block size for HDFS blocks. 


My point is that I assume this parameter in hadoop-core.xml sets the
block size for both namenode and datanode. 

Correct, the block size is an HDFS-wide setting, but in general the HDFS client creates the blocks.
  

However, the storage and
random access for metadata in namenode is different and suits smaller
block sizes.

HDFS block size has no impact here. NameNode metadata is held in memory. For reliability it is persisted to the local disks of the server.
 


For example in Linux the OS block size is 4k, which means one HDFS block
size of 128MB can hold 32K OS blocks. For metadata this may not be
useful and a smaller block size would be suitable, hence my question.

Remember, the metadata is in memory. The fsimage file, which contains the metadata,
is loaded on startup of the NameNode.

 

Please don't be confused by the two types of block sizes.

 

Hope this helps a bit.

Cheers,

Mirko

 


Thanks,

Mich

 


Re: Total memory available to NameNode

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Hi Mich,

the book Hadoop Operations may be a good start:
https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop%20memory%20namenode&f=false

BR,
 AL


> On 26 Mar 2015, at 11:16, Mich Talebzadeh <mi...@peridale.co.uk> wrote:
> 
> Is there any parameter that sets the total memory that NameNode can use?
>  
> Thanks
>  
> Mich Talebzadeh
>  
> http://talebzadehmich.wordpress.com
>  
> Publications due shortly:
> Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
>  
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>  
> From: Mirko Kämpf [mailto:mirko.kaempf@gmail.com] 
> Sent: 25 March 2015 16:08
> To: user@hadoop.apache.org; mich@peridale.co.uk
> Subject: Re: can block size for namenode be different from wdatanode block size?
>  
> Correct, let's say you run the NameNode with just 1GB of RAM.
> This would be a very strong limitation for the cluster. For each file we need about 200 bytes, and for each block as well. Now we can estimate the maximum capacity depending on the HDFS block size and the average file size.
>  
> Cheers,
> Mirko
>  
> 2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mich@peridale.co.uk>:
> Hi Mirko,
> 
> Thanks for feedback.
> 
> Since I have worked with in-memory databases, this metadata caching sounds more like an IMDB that caches data at startup from disk-resident storage.
> 
> IMDBs tend to run into issues when the cache cannot hold all the data. Is this the case with metadata as well?
> 
> Regards,
> 
> Mich
> Let your email find you with BlackBerry from Vodafone
> From: Mirko Kämpf <mirko.kaempf@gmail.com> 
> Date: Wed, 25 Mar 2015 15:20:03 +0000
> To: user@hadoop.apache.org
> ReplyTo: user@hadoop.apache.org
> Subject: Re: can block size for namenode be different from datanode block size?
>  
> Hi Mich,
>  
> please see the comments in your text.
> 
>  
>  
> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mich@peridale.co.uk>:
> 
> Hi,
> 
> The block size for HDFS is currently set to 128MB by default. This is
> configurable.
> Correct, an HDFS client can override the configured property and define a different block size for HDFS blocks. 
>> 
>> My point is that I assume this parameter in hadoop-core.xml sets the
>> block size for both namenode and datanode. 
> Correct, the block size is an HDFS-wide setting, but in general the HDFS client creates the blocks.
>   
>> However, the storage and
>> random access for metadata in namenode is different and suits smaller
>> block sizes.
> HDFS block size has no impact here. NameNode metadata is held in memory. For reliability it is persisted to the local disks of the server.
>  
>> 
>> For example in Linux the OS block size is 4k, which means one HDFS block
>> size of 128MB can hold 32K OS blocks. For metadata this may not be
>> useful and a smaller block size would be suitable, hence my question.
> Remember, the metadata is in memory. The fsimage file, which contains the metadata,
> is loaded on startup of the NameNode.
>  
> Please don't be confused by the two types of block sizes.
>  
> Hope this helps a bit.
> Cheers,
> Mirko
>  
>> 
>> Thanks,
>> 
>> Mich


Re: Total memory available to NameNode

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Hi Mich,

the book Hadoop Operations may a good start:
https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop%20memory%20namenode&f=false <https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop memory namenode&f=false>

BR,
 AL


> On 26 Mar 2015, at 11:16, Mich Talebzadeh <mi...@peridale.co.uk> wrote:
> 
> Is there any parameter that sets the total memory that NameNode can use?
>  
> Thanks
>  
> Mich Talebzadeh
>  
> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>  
> Publications due shortly:
> Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
>  
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>  
> From: Mirko Kämpf [mailto:mirko.kaempf@gmail.com <ma...@gmail.com>] 
> Sent: 25 March 2015 16:08
> To: user@hadoop.apache.org <ma...@hadoop.apache.org>; mich@peridale.co.uk <ma...@peridale.co.uk>
> Subject: Re: can block size for namenode be different from wdatanode block size?
>  
> Correct, let's say you run the NameNode with just 1GB of RAM.
> This would be a very strong limitation for the cluster. For each file we need about 200 bytes and for each block as well. Now we can estimate the max. capacity depending on HDFS-Blocksize and average File size.
>  
> Cheers,
> Mirko
>  
> 2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>>:
> Hi Mirko,
> 
> Thanks for feedback.
> 
> Since i have worked with in memory databases, this metadata caching sounds more like an IMDB that caches data at start up from disk resident storage.
> 
> IMDBs tend to get issues when the cache cannot hold all data. Is this the case the case with metada as well?
> 
> Regards,
> 
> Mich
> Let your email find you with BlackBerry from Vodafone
> From: Mirko Kämpf <mirko.kaempf@gmail.com <ma...@gmail.com>> 
> Date: Wed, 25 Mar 2015 15:20:03 +0000
> To: user@hadoop.apache.org <ma...@hadoop.apache.org><user@hadoop.apache.org <ma...@hadoop.apache.org>>
> ReplyTo: user@hadoop.apache.org <ma...@hadoop.apache.org>
> Subject: Re: can block size for namenode be different from datanode block size?
>  
> Hi Mich,
>  
> please see the comments in your text.
> 
>  
>  
> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>>:
> 
> Hi,
> 
> The block size for HDFS is currently set to 128MB by defauilt. This is
> configurable.
> Correct, an HDFS client can overwrite the cfg-property and define a different block size for HDFS blocks. 
>> 
>> My point is that I assume this  parameter in hadoop-core.xml sets the
>> block size for both namenode and datanode. 
> Correct, the block-size is a "HDFS wide setting" but in general the HDFS-client makes the blocks.
>   
>> However, the storage and
>> random access for metadata in nsamenode is different and suits smaller
>> block sizes.
> HDFS blocksize has no impact here. NameNode metadata is held in memory. For reliability it is dumped to local discs of the server.
>  
>> 
>> For example in Linux the OS block size is 4k which means one HTFS blopck
>> size  of 128MB can hold 32K OS blocks. For metadata this may not be
>> useful and smaller block size will be suitable and hence my question.
> Remember, metadata is in memory. The fsimage-file, which contains the metadata 
> is loaded on startup of the NameNode.
>  
> Please be not confused by the two types of block-sizes.
>  
> Hope this helps a bit.
> Cheers,
> Mirko
>  
>> 
>> Thanks,
>> 
>> Mich


Re: Total memory available to NameNode

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Hi Mich,

the book Hadoop Operations may a good start:
https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop%20memory%20namenode&f=false <https://books.google.de/books?id=drbI_aro20oC&pg=PA308&lpg=PA308&dq=hadoop+memory+namenode&source=bl&ots=t_yltgk_i7&sig=_6LXkcSjfuwwqfz_kDGDi9ytgqU&hl=en&sa=X&ei=Nt8TVfn9AcjLPZyXgKAC&ved=0CFYQ6AEwBg#v=onepage&q=hadoop memory namenode&f=false>

BR,
 AL


> On 26 Mar 2015, at 11:16, Mich Talebzadeh <mi...@peridale.co.uk> wrote:
> 
> Is there any parameter that sets the total memory that NameNode can use?
>  
> Thanks
>  
> Mich Talebzadeh
>  
> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>  
> Publications due shortly:
> Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache
>  
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.
>  
> From: Mirko Kämpf [mailto:mirko.kaempf@gmail.com <ma...@gmail.com>] 
> Sent: 25 March 2015 16:08
> To: user@hadoop.apache.org <ma...@hadoop.apache.org>; mich@peridale.co.uk <ma...@peridale.co.uk>
> Subject: Re: can block size for namenode be different from wdatanode block size?
>  
> Correct, let's say you run the NameNode with just 1GB of RAM.
> This would be a very strong limitation for the cluster. For each file we need about 200 bytes and for each block as well. Now we can estimate the max. capacity depending on HDFS-Blocksize and average File size.
>  
> Cheers,
> Mirko
>  
> 2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>>:
> Hi Mirko,
> 
> Thanks for feedback.
> 
> Since i have worked with in memory databases, this metadata caching sounds more like an IMDB that caches data at start up from disk resident storage.
> 
> IMDBs tend to get issues when the cache cannot hold all data. Is this the case the case with metada as well?
> 
> Regards,
> 
> Mich
> Let your email find you with BlackBerry from Vodafone
> From: Mirko Kämpf <mirko.kaempf@gmail.com <ma...@gmail.com>> 
> Date: Wed, 25 Mar 2015 15:20:03 +0000
> To: user@hadoop.apache.org <ma...@hadoop.apache.org><user@hadoop.apache.org <ma...@hadoop.apache.org>>
> ReplyTo: user@hadoop.apache.org <ma...@hadoop.apache.org>
> Subject: Re: can block size for namenode be different from datanode block size?
>  
> Hi Mich,
>  
> please see the comments in your text.
> 
>  
>  
> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>>:
> 
> Hi,
> 
> The block size for HDFS is currently set to 128MB by default. This is
> configurable.
> Correct, an HDFS client can overwrite the cfg-property and define a different block size for HDFS blocks. 
>> 
>> My point is that I assume this  parameter in hadoop-core.xml sets the
>> block size for both namenode and datanode. 
> Correct, the block-size is a "HDFS wide setting" but in general the HDFS-client makes the blocks.
>   
>> However, the storage and
>> random access for metadata in namenode is different and suits smaller
>> block sizes.
> HDFS blocksize has no impact here. NameNode metadata is held in memory. For reliability it is dumped to local discs of the server.
>  
>> 
>> For example in Linux the OS block size is 4k which means one HDFS block
>> size  of 128MB can hold 32K OS blocks. For metadata this may not be
>> useful and smaller block size will be suitable and hence my question.
> Remember, metadata is in memory. The fsimage-file, which contains the metadata 
> is loaded on startup of the NameNode.
>  
> Please be not confused by the two types of block-sizes.
>  
> Hope this helps a bit.
> Cheers,
> Mirko
>  
>> 
>> Thanks,
>> 
>> Mich


Re: Can block size for namenode be different from wdatanode block size?

Posted by Harsh J <ha...@cloudera.com>.
> 2.     The block size is only relevant to DataNodes (DN). NameNode (NN)
does not use this parameter

Actually, as a configuration, it's only relevant to the client. See also
http://www.quora.com/How-do-I-check-HDFS-blocksize-default-custom
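
As a quick illustration of the client-side nature of the setting (the paths and
sizes below are only placeholders, and dfs.blocksize is the current name of the
property that older releases spell dfs.block.size):

# show the default block size the client picks up from its configuration
hdfs getconf -confKey dfs.blocksize

# upload one file with a 256MB block size instead of the configured default
hdfs dfs -D dfs.blocksize=268435456 -put ./bigfile.dat /user/mich/bigfile.dat

# confirm what was actually recorded for that file (%o = block size, %r = replication)
hdfs dfs -stat "%o %r %n" /user/mich/bigfile.dat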

Other points sound about right, except that (7) can now only be done if you have
the legacy mode of fsimage writes enabled. The new OIV tool in recent releases
only serves a REST-based web server for querying the file data.
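
For example, with the image fetched as nnimage in the summary quoted below, the
two usual options on a recent release look roughly like this (processor names and
the default port can differ slightly between versions, so treat this as a sketch):

# dump the image to XML with the new tool instead of the old flat text output
hdfs oiv -p XML -i nnimage -o nnimage.xml

# or start the default WebImageViewer and browse the image as a read-only namespace
hdfs oiv -i nnimage
hdfs dfs -ls webhdfs://127.0.0.1:5978/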

On Thu, Mar 26, 2015 at 1:47 AM, Mich Talebzadeh <mi...@peridale.co.uk>
wrote:

> Thank you all for your contribution.
>
>
>
> I have summarised the findings as below
>
>
>
> 1.     The Hadoop block size is a configurable parameter dfs.block.size
> in bytes . By default this is set to 134217728 bytes or 128MB
>
> 2.     The block size is only relevant to DataNodes (DN). NameNode (NN)
> does not use this parameter
>
> 3.     NN behaves like an in-memory database (IMDB) and uses a disk-resident
> file called the FsImage to load the metadata at startup. This is the only
> place that I see value for Solid State Disk to make this initial load faster
>
> 4.     For the remaining period until HDFS shutdown or otherwise NN will
> use the in memory cache to access metadata
>
> 5.     With regard to sizing of NN to store metadata, one can use the
> following rules of thumb (heuristics):
>
> a.     NN consumes roughly 1GB for every 1 million blocks (source Hadoop
> Operations, Eric Sammer, ISBN: 978-1-499-3205-7). So if you have 128MB
> block size, you can store  128 * 1E6 / (3 *1024) = 41,666GB of data for
> every 1GB. Number 3 comes from the fact that the block is replicated three
> times. In other words just under 42TB of data. So if you have 10GB of
> namenode cache, you can have up to 420TB of data on your datanodes
>
> 6.     You can take FsImage file from Hadoop and convert it into a text
> file as follows:
>
>
>
> *hdfs dfsadmin -fetchImage nnimage*
>
>
>
> 15/03/25 20:17:40 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
> 15/03/25 20:17:41 INFO namenode.TransferFsImage: Opening connection to
> http://rhes564:50070/imagetransfer?getimage=1&txid=latest
>
> 15/03/25 20:17:41 INFO namenode.TransferFsImage: Image Transfer timeout
> configured to 60000 milliseconds
>
> 15/03/25 20:17:41 WARN namenode.TransferFsImage: Overwriting existing file
> nnimage with file downloaded from
> http://rhes564:50070/imagetransfer?getimage=1&txid=latest
>
> 15/03/25 20:17:41 INFO namenode.TransferFsImage: Transfer took 0.03s at
> 1393.94 KB/s
>
>
>
> 7.     That creates an image file in the current directory that can be
> converted to a text file
>
> *hdfs  oiv -i nnimage -o nnimage.txt*
>
>
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading 2 strings
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading 543
> inodes.
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading inode
> references
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loaded 0 inode
> references
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading inode
> directory section
>
> 15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loaded 198
> directories
>
> 15/03/25 20:20:07 INFO offlineImageViewer.WebImageViewer: WebImageViewer
> started. Listening on /127.0.0.1:5978. Press Ctrl+C to stop the viewer.
>
>
>
> Let me know if I missed  anything or got it wrong.
>
>
>
> HTH
>
>
>
> Mich Talebzadeh
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> *Publications due shortly:*
>
> *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and
> Coherence Cache*
>
>
>
> NOTE: The information in this email is proprietary and confidential. This
> message is for the designated recipient only, if you are not the intended
> recipient, you should destroy it immediately. Any information in this
> message shall not be understood as given or endorsed by Peridale Ltd, its
> subsidiaries or their employees, unless expressly so stated. It is the
> responsibility of the recipient to ensure that this email is virus free,
> therefore neither Peridale Ltd, its subsidiaries nor their employees accept
> any responsibility.
>
>
>



-- 
Harsh J

Can block size for namenode be different from wdatanode block size?

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Thank you all for your contribution.

 

I have summarised the findings as below

 

1.     The Hadoop block size is a configurable parameter dfs.block.size in bytes. By default this is set to 134217728 bytes or 128MB

2.     The block size is only relevant to DataNodes (DN). NameNode (NN) does not use this parameter

3.     NN behaves like an in-memory database (IMDB) and uses a disk-resident file called the FsImage to load the metadata at startup. This is the only place that I see value for Solid State Disk to make this initial load faster

4.     For the remaining period until HDFS shutdown or otherwise, NN will use the in-memory cache to access metadata

5.     With regard to sizing of NN to store metadata, one can use the following rules of thumb (heuristics):

a.     NN consumes roughly 1GB for every 1 million blocks (source Hadoop Operations, Eric Sammer, ISBN: 978-1-499-3205-7). So if you have a 128MB block size, you can store 128 * 1E6 / (3 * 1024) = 41,666GB of data for every 1GB of NameNode memory. The factor of 3 comes from the fact that each block is replicated three times. In other words, just under 42TB of data. So if you have 10GB of namenode cache, you can have up to 420TB of data on your datanodes (this arithmetic is sketched as a quick calculation after the list below)

6.     You can take the FsImage file from Hadoop and convert it into a text file as follows:

 

hdfs dfsadmin -fetchImage nnimage

 

15/03/25 20:17:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

15/03/25 20:17:41 INFO namenode.TransferFsImage: Opening connection to http://rhes564:50070/imagetransfer?getimage=1&txid=latest

15/03/25 20:17:41 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds

15/03/25 20:17:41 WARN namenode.TransferFsImage: Overwriting existing file nnimage with file downloaded from http://rhes564:50070/imagetransfer?getimage=1&txid=latest

15/03/25 20:17:41 INFO namenode.TransferFsImage: Transfer took 0.03s at 1393.94 KB/s

 

7.     That creates an image file in the current directory that can be converted to a text file

hdfs  oiv -i nnimage -o nnimage.txt

 

15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading 2 strings

15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading 543 inodes.

15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading inode references

15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loaded 0 inode references

15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loading inode directory section

15/03/25 20:20:07 INFO offlineImageViewer.FSImageHandler: Loaded 198 directories

15/03/25 20:20:07 INFO offlineImageViewer.WebImageViewer: WebImageViewer started. Listening on /127.0.0.1:5978. Press Ctrl+C to stop the viewer.
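
As a rough cross-check of the sizing heuristic in 5a, the same arithmetic can be
run as a tiny shell calculation (every figure here is the thread's own rule of
thumb, not a measured value, so treat the result as an order-of-magnitude guide):

# user data addressable per the "1GB of NameNode heap per 1 million blocks" heuristic
HEAP_GB=10          # heap given to the NameNode
BLOCK_MB=128        # HDFS block size in MB
REPLICATION=3       # default replication factor
echo "$(( HEAP_GB * BLOCK_MB * 1000000 / (REPLICATION * 1024) )) GB of user data, roughly"
# with the values above this prints 416666, i.e. the ~420TB per 10GB of heap quoted in 5a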

 

Let me know if I missed  anything or got it wrong.

 

HTH

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.

 


Re: can block size for namenode be different from wdatanode block size?

Posted by Mirko Kämpf <mi...@gmail.com>.
This 200 bytes is just a "mental helper", not a precise measure. And it does
NOT take replication into account.
Each replica block has again another item of approx. 200 bytes in the NN
memory.
MK
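
Worked through for the 1GB file in the question quoted below, one plausible reading
of that rule of thumb looks like this (whether the first replica is already covered
by the per-block figure is exactly the detail the "mental helper" glosses over):

# 1GB file, 128MB blocks, replication 3, ~200 bytes per namespace object
FILE_OBJECTS=1            # one inode for the file itself
BLOCKS=8                  # 1024MB / 128MB
REPLICAS_PER_BLOCK=3
BYTES_PER_OBJECT=200
echo "$(( (FILE_OBJECTS + BLOCKS + BLOCKS * REPLICAS_PER_BLOCK) * BYTES_PER_OBJECT )) bytes"
# prints 6600, i.e. a few KB of NameNode memory rather than the ~1800 bytes guessed below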


2015-03-25 17:16 GMT+00:00 Mich Talebzadeh <mi...@peridale.co.uk>:

> Great. Does that 200 bytes for each block include overhead for three
> replicas? So with 128MB block a 1GB file will be 8 blocks with 200 + 8x200
> around 1800 bytes memory in namenode?
>
> Thx
> Let your email find you with BlackBerry from Vodafone
> ------------------------------
> *From: * Mirko Kämpf <mi...@gmail.com>
> *Date: *Wed, 25 Mar 2015 16:08:02 +0000
> *To: *user@hadoop.apache.org<us...@hadoop.apache.org>; <mich@peridale.co.uk
> >
> *ReplyTo: * user@hadoop.apache.org
> *Subject: *Re: can block size for namenode be different from wdatanode
> block size?
>
> Correct, let's say you run the NameNode with just 1GB of RAM.
> This would be a very strong limitation for the cluster. For each file we
> need about 200 bytes and for each block as well. Now we can estimate the
> max. capacity depending on HDFS-Blocksize and average File size.
>
> Cheers,
> Mirko
>
> 2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mi...@peridale.co.uk>:
>
>> Hi Mirko,
>>
>> Thanks for feedback.
>>
>> Since I have worked with in-memory databases, this metadata caching
>> sounds more like an IMDB that caches data at start-up from disk-resident
>> storage.
>>
>> IMDBs tend to get issues when the cache cannot hold all data. Is this the
>> case with metadata as well?
>>
>> Regards,
>>
>> Mich
>> Let your email find you with BlackBerry from Vodafone
>> ------------------------------
>> *From: * Mirko Kämpf <mi...@gmail.com>
>> *Date: *Wed, 25 Mar 2015 15:20:03 +0000
>> *To: *user@hadoop.apache.org<us...@hadoop.apache.org>
>> *ReplyTo: * user@hadoop.apache.org
>> *Subject: *Re: can block size for namenode be different from datanode
>> block size?
>>
>> Hi Mich,
>>
>> please see the comments in your text.
>>
>>
>>
>> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mi...@peridale.co.uk>:
>>
>>>
>>> Hi,
>>>
>>> The block size for HDFS is currently set to 128MB by default. This is
>>> configurable.
>>>
>> Correct, an HDFS client can overwrite the cfg-property and define a
>> different block size for HDFS blocks.
>>
>>>
>>> My point is that I assume this  parameter in hadoop-core.xml sets the
>>> block size for both namenode and datanode.
>>
>> Correct, the block-size is a "HDFS wide setting" but in general the
>> HDFS-client makes the blocks.
>>
>>
>>> However, the storage and
>>> random access for metadata in namenode is different and suits smaller
>>> block sizes.
>>>
>> HDFS blocksize has no impact here. NameNode metadata is held in memory.
>> For reliability it is dumped to local discs of the server.
>>
>>
>>>
>>> For example in Linux the OS block size is 4k which means one HDFS block
>>> size  of 128MB can hold 32K OS blocks. For metadata this may not be
>>> useful and smaller block size will be suitable and hence my question.
>>>
>> Remember, metadata is in memory. The fsimage-file, which contains the
>> metadata
>> is loaded on startup of the NameNode.
>>
>> Please be not confused by the two types of block-sizes.
>>
>> Hope this helps a bit.
>> Cheers,
>> Mirko
>>
>>
>>>
>>> Thanks,
>>>
>>> Mich
>>>
>>
>>
>

Re: can block size for namenode be different from wdatanode block size?

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Great. Does that 200 bytes for each block include the overhead for three replicas? So with a 128MB block size, a 1GB file will be 8 blocks, i.e. 200 + 8x200, around 1800 bytes of memory in the namenode?

Thx
Let your email find you with BlackBerry from Vodafone
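
For anyone redoing this estimate against a real file, the block and replica counts
it rests on can be read straight from fsck (the path below is only a placeholder):

# show the blocks and replica locations HDFS actually recorded for one file
hdfs fsck /user/mich/file_1GB.dat -files -blocks -locations
# the summary reports "Total blocks (validated)" and the average block replication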

-----Original Message-----
From: Mirko Kämpf <mi...@gmail.com>
Date: Wed, 25 Mar 2015 16:08:02 
To: user@hadoop.apache.org<us...@hadoop.apache.org>; <mi...@peridale.co.uk>
Reply-To: user@hadoop.apache.org
Subject: Re: can block size for namenode be different from wdatanode block size?

Correct, let's say you run the NameNode with just 1GB of RAM.
This would be a very strong limitation for the cluster. For each file we
need about 200 bytes and for each block as well. Now we can estimate the
max. capacity depending on HDFS-Blocksize and average File size.

Cheers,
Mirko

2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mi...@peridale.co.uk>:

> Hi Mirko,
>
> Thanks for feedback.
>
> Since I have worked with in-memory databases, this metadata caching sounds
> more like an IMDB that caches data at start-up from disk-resident storage.
>
> IMDBs tend to get issues when the cache cannot hold all data. Is this the
> case with metadata as well?
>
> Regards,
>
> Mich
> Let your email find you with BlackBerry from Vodafone
> ------------------------------
> *From: * Mirko Kämpf <mi...@gmail.com>
> *Date: *Wed, 25 Mar 2015 15:20:03 +0000
> *To: *user@hadoop.apache.org<us...@hadoop.apache.org>
> *ReplyTo: * user@hadoop.apache.org
> *Subject: *Re: can block size for namenode be different from datanode
> block size?
>
> Hi Mich,
>
> please see the comments in your text.
>
>
>
> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mi...@peridale.co.uk>:
>
>>
>> Hi,
>>
>> The block size for HDFS is currently set to 128MB by default. This is
>> configurable.
>>
> Correct, an HDFS client can overwrite the cfg-property and define a
> different block size for HDFS blocks.
>
>>
>> My point is that I assume this  parameter in hadoop-core.xml sets the
>> block size for both namenode and datanode.
>
> Correct, the block size is an "HDFS-wide setting", but in general the
> HDFS client creates the blocks.
>
>
>> However, the storage and
>> random access patterns for metadata in the namenode are different and
>> suit smaller block sizes.
>>
> HDFS blocksize has no impact here. NameNode metadata is held in memory.
> For reliability it is dumped to local discs of the server.
>
>
>>
>> For example, in Linux the OS block size is 4k, which means one HDFS block
>> of 128MB can hold 32K OS blocks. For metadata this may not be useful,
>> and a smaller block size would be more suitable, hence my question.
>>
> Remember, metadata is in memory. The fsimage-file, which contains the
> metadata
> is loaded on startup of the NameNode.
>
> Please don't be confused by the two types of block sizes.
>
> Hope this helps a bit.
> Cheers,
> Mirko
>
>
>>
>> Thanks,
>>
>> Mich
>>
>
>


Total memory available to NameNode

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Is there any parameter that sets the total memory that the NameNode can use?

 

Thanks

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

 

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Ltd, its subsidiaries nor their employees accept any responsibility.

 

From: Mirko Kämpf [mailto:mirko.kaempf@gmail.com] 
Sent: 25 March 2015 16:08
To: user@hadoop.apache.org; mich@peridale.co.uk
Subject: Re: can block size for namenode be different from wdatanode block size?

 

Correct, let's say you run the NameNode with just 1GB of RAM.
This would be a severe limitation for the cluster. For each file we need about 200 bytes, and for each block as well. Now we can estimate the maximum capacity depending on the HDFS block size and the average file size.

 

Cheers,

Mirko

 

2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mi...@peridale.co.uk>:

Hi Mirko,

Thanks for feedback.

Since I have worked with in-memory databases, this metadata caching sounds more like an IMDB that caches data at startup from disk-resident storage.

IMDBs tend to run into issues when the cache cannot hold all data. Is this the case with metadata as well?

Regards,

Mich

Let your email find you with BlackBerry from Vodafone

  _____  

From: Mirko Kämpf <mi...@gmail.com> 

Date: Wed, 25 Mar 2015 15:20:03 +0000

To: user@hadoop.apache.org<us...@hadoop.apache.org>

ReplyTo: user@hadoop.apache.org 

Subject: Re: can block size for namenode be different from datanode block size?

 

Hi Mich,

 

please see the comments in your text.

 

 

2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mi...@peridale.co.uk>:


Hi,

The block size for HDFS is currently set to 128MB by default. This is
configurable.

Correct, an HDFS client can overwrite the cfg-property and define a different block size for HDFS blocks. 


My point is that I assume this  parameter in hadoop-core.xml sets the
block size for both namenode and datanode. 

Correct, the block size is an "HDFS-wide setting", but in general the HDFS client creates the blocks.
  

However, the storage and
random access patterns for metadata in the namenode are different and
suit smaller block sizes.

HDFS blocksize has no impact here. NameNode metadata is held in memory. For reliability it is dumped to local discs of the server.
 


For example, in Linux the OS block size is 4k, which means one HDFS block
of 128MB can hold 32K OS blocks. For metadata this may not be useful,
and a smaller block size would be more suitable, hence my question.

Remember, metadata is in memory. The fsimage-file, which contains the metadata 
is loaded on startup of the NameNode.

 

Please don't be confused by the two types of block sizes.

 

Hope this helps a bit.

Cheers,

Mirko

 


Thanks,

Mich

 

 



Re: can block size for namenode be different from wdatanode block size?

Posted by Mirko Kämpf <mi...@gmail.com>.
Correct, let's say you run the NameNode with just 1GB of RAM.
This would be a severe limitation for the cluster. For each file we
need about 200 bytes, and for each block as well. Now we can estimate the
maximum capacity depending on the HDFS block size and the average file size.

Cheers,
Mirko
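To make the estimate concrete, here is a rough sketch of the capacity calculation described above (the 1GB heap and the ~200 bytes per object are the figures from this thread; the 256MB average file size is an assumed, illustrative value):

public class NameNodeCapacityEstimate {
    public static void main(String[] args) {
        long heapBytes      = 1L * 1024 * 1024 * 1024;  // 1 GB NameNode heap, as in the example
        long bytesPerObject = 200;                      // ~200 bytes per file and per block
        long blockSize      = 128L * 1024 * 1024;       // HDFS block size
        long avgFileSize    = 256L * 1024 * 1024;       // assumed average file size

        long blocksPerFile = (avgFileSize + blockSize - 1) / blockSize;   // 2 blocks per file
        long bytesPerFile  = bytesPerObject * (1 + blocksPerFile);        // inode + block entries
        long maxFiles      = heapBytes / bytesPerFile;
        long addressableTB = maxFiles * avgFileSize / (1024L * 1024 * 1024 * 1024);

        System.out.println("~" + maxFiles + " files, ~" + addressableTB
                + " TB of data addressable with this heap");
    }
}

Halving the average file size (more, smaller files) roughly halves the amount of data the same heap can address, which is why many small files put more pressure on the NameNode than a few large ones.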

2015-03-25 15:34 GMT+00:00 Mich Talebzadeh <mi...@peridale.co.uk>:

> Hi Mirko,
>
> Thanks for feedback.
>
> Since I have worked with in-memory databases, this metadata caching sounds
> more like an IMDB that caches data at startup from disk-resident storage.
>
> IMDBs tend to run into issues when the cache cannot hold all data. Is this
> the case with metadata as well?
>
> Regards,
>
> Mich
> Let your email find you with BlackBerry from Vodafone
> ------------------------------
> *From: * Mirko Kämpf <mi...@gmail.com>
> *Date: *Wed, 25 Mar 2015 15:20:03 +0000
> *To: *user@hadoop.apache.org<us...@hadoop.apache.org>
> *ReplyTo: * user@hadoop.apache.org
> *Subject: *Re: can block size for namenode be different from datanode
> block size?
>
> Hi Mich,
>
> please see the comments in your text.
>
>
>
> 2015-03-25 15:11 GMT+00:00 Dr Mich Talebzadeh <mi...@peridale.co.uk>:
>
>>
>> Hi,
>>
>> The block size for HDFS is currently set to 128MB by default. This is
>> configurable.
>>
> Correct, an HDFS client can overwrite the cfg-property and define a
> different block size for HDFS blocks.
>
>>
>> My point is that I assume this  parameter in hadoop-core.xml sets the
>> block size for both namenode and datanode.
>
> Correct, the block size is an "HDFS-wide setting", but in general the
> HDFS client creates the blocks.
>
>
>> However, the storage and
>> random access patterns for metadata in the namenode are different and
>> suit smaller block sizes.
>>
> HDFS blocksize has no impact here. NameNode metadata is held in memory.
> For reliability it is dumped to local discs of the server.
>
>
>>
>> For example, in Linux the OS block size is 4k, which means one HDFS block
>> of 128MB can hold 32K OS blocks. For metadata this may not be useful,
>> and a smaller block size would be more suitable, hence my question.
>>
> Remember, metadata is in memory. The fsimage-file, which contains the
> metadata
> is loaded on startup of the NameNode.
>
> Please don't be confused by the two types of block sizes.
>
> Hope this helps a bit.
> Cheers,
> Mirko
>
>
>>
>> Thanks,
>>
>> Mich
>>
>
>
