Posted to mapreduce-issues@hadoop.apache.org by Mike Liddell <Mi...@microsoft.com> on 2013/10/15 01:24:22 UTC

How does dfs.block.size get passed into CombineFileInputFormat?

Background: I'm trying to track down the details of how Hive creates multi-file splits.  I'm under the impression that MapReduce's CombineFileInputFormat does the main work of combining files and, specifically, that if no overrides are set, the target split size defaults to dfs.block.size.
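
To make that assumption concrete, here is a minimal sketch of the fallback I have in mind. It is only a model of the behaviour I'm assuming, not code taken from Hadoop or Hive: the class SplitSizeSketch and the method targetSplitSize are hypothetical, a plain Map stands in for the job configuration, and the choice of mapred.max.split.size as the override key and of 64 MB as the last-resort default are my guesses.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitSizeSketch {
    // Assumed last-resort default (64 MB); an illustration value, not read from Hadoop.
    static final long ASSUMED_DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024;

    // Hypothetical model of the assumed rule: use an explicit max split size
    // if one is set to a positive value, otherwise fall back to dfs.block.size.
    public static long targetSplitSize(Map<String, String> conf) {
        String override = conf.get("mapred.max.split.size"); // assumed override key
        if (override != null && Long.parseLong(override) > 0) {
            return Long.parseLong(override);
        }
        String blockSize = conf.get("dfs.block.size");
        return blockSize != null ? Long.parseLong(blockSize) : ASSUMED_DEFAULT_BLOCK_SIZE;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("dfs.block.size", "134217728");
        System.out.println(targetSplitSize(conf)); // prints 134217728

        conf.put("mapred.max.split.size", "256000000");
        System.out.println(targetSplitSize(conf)); // prints 256000000
    }
}
```

If the real code follows something like this, there should be a place where dfs.block.size (or a per-file block size) is read and handed to the split logic, and that is the step I can't locate.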

However, I cannot see how the value of dfs.block.size finds its way into CombineFileInputFormat.  I'm probably missing something obvious, but I'd appreciate someone pointing it out!

thanks,
Mike.