Posted to common-dev@hadoop.apache.org by "Ellis H. Wilson III" <el...@cse.psu.edu> on 2012/08/11 16:55:54 UTC

fs.local.block.size vs file.blocksize

Hi guys and gals,

I originally posted a version of this question on the user list a few 
days ago to no response, so I thought perhaps it delved a bit too far 
into the nitty-gritty to warrant one.  My apologies for cross-listing.

Can someone please briefly summarize the difference between these two 
parameters?  I do not see deprecation warnings for fs.local.block.size 
when I run with it set.  Furthermore, and I'm unsure if this is related, 
I see two copies of what is effectively RawLocalFileSystem.java (the 
other is local/RawLocalFs.java).  It appears that the one in local/ is 
for the old abstract FileSystem class, whereas RawLocalFileSystem.java 
uses the new abstract class.  Perhaps this is the root cause of the two 
parameters?  Or does file.blocksize simply control the abstract class or 
some such thing?
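
For reference, here is a minimal sketch of the kind of check I have in 
mind (the class name is just my own, and which of the two keys the local 
filesystem actually honours is exactly what I'm unsure about):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class LocalBlockSizeCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Try each key; which one RawLocalFileSystem honours is the question.
    conf.setLong("fs.local.block.size", 64L * 1024 * 1024);
    conf.setLong("file.blocksize", 64L * 1024 * 1024);

    // Ask the file:// filesystem what default block size it reports.
    FileSystem local = FileSystem.get(URI.create("file:///"), conf);
    System.out.println("file:// default block size: "
        + local.getDefaultBlockSize());
  }
}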

The practical answers I really need to get a handle on are the following:
1. Is the default for the file:// filesystem boosted to a 64MB blocksize 
in Hadoop 2.0?  It was only 32MB in Hadoop 1.0, but it's not 100% clear 
to me that it is now a full 64MB.  The core-site.xml docs online suggest 
it's been boosted.
2. If I alter the blocksize of file://, is it correct to presume that it 
will also impact the shuffle block size, since that data goes through the 
local filesystem?

Thanks!

ellis

Re: fs.local.block.size vs file.blocksize

Posted by Eli Collins <el...@cloudera.com>.
Hi Ellis,

fs.local.block.size is the default FileSystem block size.  Note, however,
that most file systems (like HDFS; see DistributedFileSystem) override
this; e.g., when using HDFS the default block size is configured with
dfs.blocksize, which defaults to 64MB.

Note that in v1 the default block size for HDFS was 64MB as well
(configured via dfs.block.size, which dfs.blocksize replaces).
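
To make the override concrete, something along these lines would show the
difference (rough sketch only; the class name and the hdfs:// URI are
placeholders for your own setup, and the values assume a stock 2.0 config):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class DefaultBlockSizes {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // file:// falls back to the generic FileSystem default, while
    // DistributedFileSystem overrides it with dfs.blocksize.
    FileSystem local = FileSystem.get(URI.create("file:///"), conf);
    FileSystem hdfs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);

    System.out.println("file:// default block size: " + local.getDefaultBlockSize());
    System.out.println("hdfs:// default block size: " + hdfs.getDefaultBlockSize());
  }
}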

Thanks,
Eli
