Posted to user@hadoop.apache.org by Himawan Mahardianto <ma...@ugm.ac.id> on 2015/05/12 05:08:12 UTC

Question about Block size configuration

Hi guys, I have a couple of questions about HDFS block size:

What will happen if I change my HDFS block size from the default 64 MB per
block down to 2 MB per block?

I want to decrease the block size because I want to store image files
(jpeg, png, etc.) of about 4 MB each. What is your opinion or suggestion?

What will happen if I keep the default block size and store a 4 MB image
file: will Hadoop use a full 64 MB block, or will it create a 4 MB block
instead?

How much RAM is used to store the metadata for each block if my block size
is 64 MB, or if it is 4 MB?

Does anyone have experience with this? Any suggestions are welcome.
Thank you
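
For anyone trying this, here is a minimal sketch of where the block size is
configured, assuming Hadoop 2.x property names (older releases use the
deprecated dfs.block.size); the value is in bytes and must be a multiple of
the checksum chunk size (512 bytes by default). This only affects files
written after the change; existing files keep the block size they were
written with.

  <!-- hdfs-site.xml: cluster-wide default block size for new files -->
  <property>
    <name>dfs.blocksize</name>
    <value>2097152</value>  <!-- 2 MB, as in the question above -->
  </property>

The block size can also be overridden per write without touching the
cluster default (the file name and target path are just placeholders):

  hdfs dfs -D dfs.blocksize=2097152 -put photo.jpg /images/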

Re: Question about Block size configuration

Posted by Drake민영근 <dr...@nexr.com>.
Hi

I think the metadata size per block is not greatly different. The problem
is the number of blocks: if the block size is less than 64 MB, more blocks
are generated for the same amount of data (with 32 MB blocks, twice as many
blocks).

And yes, all of that metadata is in the NameNode's heap memory.

Thanks.


Drake 민영근 Ph.D
kt NexR

On Tue, May 12, 2015 at 3:31 PM, Himawan Mahardianto <ma...@ugm.ac.id>
wrote:

> Thank you for the explanation. How many bytes will each block's metadata
> consume in RAM if the block size is 64 MB or smaller? I heard that all of
> this metadata is stored in RAM, right?
>
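
To put rough numbers on the heap question (a back-of-the-envelope sketch
using the commonly cited rule of thumb of about 150 bytes of NameNode heap
per file/directory/block object; the real per-object size varies by version
and is not an exact figure):

  100,000 image files of ~4 MB each:

    block size 64 MB -> 1 block per file  -> 100,000 blocks
    block size  2 MB -> 2 blocks per file -> 200,000 blocks

  NameNode heap at ~150 bytes per object (file inodes + blocks):

    64 MB blocks: (100,000 + 100,000) x 150 B ~= 30 MB
     2 MB blocks: (100,000 + 200,000) x 150 B ~= 45 MB

The heap cost scales with the number of objects, so it is the sheer count
of small files and blocks that hurts at large scale, not the metadata size
of any single block.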

Re: Question about Block size configuration

Posted by Himawan Mahardianto <ma...@ugm.ac.id>.
Thank you for the explanation. How many bytes will each block's metadata
consume in RAM if the block size is 64 MB or smaller? I heard that all of
this metadata is stored in RAM, right?

Re: Question about Block size configuration

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
If you set the block size to less than 64 MB, you will run into NameNode issues when a client reads a larger file: the client has to ask the NameNode for the location of every block, so imagine what happens when you want to read a 1 TB file.
The optimal block size is 128 MB. Keep in mind that every block will be replicated (typically 3 times), and since Hadoop is made to store large files in a JBOD (just a bunch of disks) configuration, a block size of less than 64 MB would also overwhelm the physical disks.

BR,
 Alex


> On 12 May 2015, at 07:47, Krishna Kishore Bonagiri <wr...@gmail.com> wrote:
> 
> The default HDFS block size of 64 MB is the maximum size of a block of data written to HDFS. So if you write 4 MB files, each will still occupy only one block of 4 MB, not more than that. If your file is larger than 64 MB, it gets split into multiple blocks.
> 
> If you set the HDFS block size to 2 MB, then your 4 MB file will be split into two blocks.
> 
> On Tue, May 12, 2015 at 8:38 AM, Himawan Mahardianto <ma...@ugm.ac.id> wrote:
> Hi guys, I have a couple of questions about HDFS block size:
> 
> What will happen if I change my HDFS block size from the default 64 MB per block down to 2 MB per block?
> 
> I want to decrease the block size because I want to store image files (jpeg, png, etc.) of about 4 MB each. What is your opinion or suggestion?
> 
> What will happen if I keep the default block size and store a 4 MB image file: will Hadoop use a full 64 MB block, or will it create a 4 MB block instead?
> 
> How much RAM is used to store the metadata for each block if my block size is 64 MB, or if it is 4 MB?
> 
> Does anyone have experience with this? Any suggestions are welcome.
> Thank you
> 
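
To make the read-path point concrete, simple arithmetic on the number of
block locations a client has to fetch from the NameNode for a single 1 TB
file (1 TB taken as 1,048,576 MB):

    block size   2 MB -> 1,048,576 / 2   = 524,288 blocks
    block size  64 MB -> 1,048,576 / 64  =  16,384 blocks
    block size 128 MB -> 1,048,576 / 128 =   8,192 blocks

Each of those blocks is also an object in the NameNode heap and, with
replication factor 3, three block files on the DataNodes' local disks,
which is the JBOD concern above.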


Re: Question about Block size configuration

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
The default HDFS block size of 64 MB is the maximum size of a block of data
written to HDFS. So if you write 4 MB files, each will still occupy only one
block of 4 MB, not more than that. If your file is larger than 64 MB, it
gets split into multiple blocks.

If you set the HDFS block size to 2 MB, then your 4 MB file will be split
into two blocks.

On Tue, May 12, 2015 at 8:38 AM, Himawan Mahardianto <ma...@ugm.ac.id>
wrote:

> Hi guys, I have a couple of questions about HDFS block size:
>
> What will happen if I change my HDFS block size from the default 64 MB per
> block down to 2 MB per block?
>
> I want to decrease the block size because I want to store image files
> (jpeg, png, etc.) of about 4 MB each. What is your opinion or suggestion?
>
> What will happen if I keep the default block size and store a 4 MB image
> file: will Hadoop use a full 64 MB block, or will it create a 4 MB block
> instead?
>
> How much RAM is used to store the metadata for each block if my block size
> is 64 MB, or if it is 4 MB?
>
> Does anyone have experience with this? Any suggestions are welcome.
> Thank you
>
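
If you want to verify this on a running cluster, hdfs fsck shows how many
blocks a file actually occupies (the path below is just an example):

  hdfs fsck /images/photo.jpg -files -blocks

For a 4 MB file written with the default block size, this should report a
single block and an average block size of about 4 MB rather than 64 MB.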
