Posted to common-user@hadoop.apache.org by Ondřej Klimpera <kl...@fit.cvut.cz> on 2012/06/13 10:13:52 UTC

How does Hadoop split text input?

Hello,

I'd like to ask how Hadoop splits text input when its size is 
smaller than the HDFS block size.
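
For context, the split-size arithmetic used by the old-API (mapred) 
FileInputFormat can be sketched roughly as below. This is a simplified 
illustration, not the actual Hadoop source: the class name SplitSketch and 
the method splits are hypothetical, block locations and compression are 
ignored, and an empty file here yields no split at all. The formula 
max(minSize, min(goalSize, blockSize)) and the 1.1 slop factor are, as far 
as I know, what FileInputFormat applies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of how a single file is cut into splits.
public class SplitSketch {
    // The last chunk may be up to 10% larger than splitSize before it is cut again.
    static final double SPLIT_SLOP = 1.1;

    // Split size is capped by the block size and floored by the minimum split size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Returns {offset, length} pairs for one file.
    static List<long[]> splits(long fileSize, int requestedSplits,
                               long minSize, long blockSize) {
        // goalSize: the file size divided by the requested number of splits (a hint).
        long goalSize = fileSize / Math.max(1, requestedSplits);
        // Guard against a zero split size, which would make the loop below spin forever.
        long splitSize = computeSplitSize(goalSize, Math.max(1, minSize), blockSize);

        List<long[]> out = new ArrayList<>();
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            out.add(new long[] {fileSize - remaining, splitSize});
            remaining -= splitSize;
        }
        if (remaining != 0) {
            // Whatever is left becomes the final split.
            out.add(new long[] {fileSize - remaining, remaining});
        }
        return out;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 10 MB file, 4 requested splits, 64 MB block size (illustrative values).
        for (long[] s : splits(10 * mb, 4, 1, 64 * mb)) {
            System.out.println("offset=" + s[0] + " length=" + s[1]);
        }
    }
}
```

Under this model, a file smaller than one block stays a single split unless 
the requested number of splits makes goalSize smaller than the file. The 
split boundaries are byte offsets, so they can fall in the middle of a line; 
the record reader is what maps them back to whole lines.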

I'm testing an application that produces large outputs from a small input.

When I use the NInputSplits input format and set the number of splits in 
mapred-conf.xml, some results are lost while the output is being written.

When the app runs with the default TextInput format, everything works fine.

Do you have any idea where the problem might be?

Thanks for your answer.