Posted to common-user@hadoop.apache.org by Dmitry Pushkarev <um...@stanford.edu> on 2009/01/18 05:59:51 UTC

streaming question.

Dear Hadoop users,

When I run streaming on one large file that is split across many map
tasks, can I be sure that a split won't fall in the middle of a line?

(I.e., if the split needs to be larger than 64 MB to reach the end of a
line, will it be enlarged?)

 

Thanks.

---

Dmitry Pushkarev

+1-650-644-8988

 


Re: streaming question.

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
You can also have a look at NLineInputFormat:
http://hadoop.apache.org/core/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/NLineInputFormat.html

Thanks
Amareshwari
Abdul Qadeer wrote:
> Dmitry,
>
> If you are talking about text data, then a split can begin or end
> anywhere, including in the middle of a line.  But LineRecordReader takes
> care of this, and your mapper code will always get complete lines.
>
> Abdul Qadeer


Re: streaming question.

Posted by Abdul Qadeer <qa...@gmail.com>.
Dmitry,

If you are talking about text data, then a split can begin or end
anywhere, including in the middle of a line.  But LineRecordReader takes
care of this, and your mapper code will always get complete lines.

Abdul Qadeer
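
To illustrate the behaviour described above, here is a minimal Python sketch
of the usual convention for line-oriented splits: a reader whose split does
not start at byte 0 discards everything up to the first newline (that partial
line belongs to the previous split), and every reader is allowed to read past
the end of its split to finish the last line it started. This is a simulation
under those assumptions, not Hadoop's actual code; the names read_split and
make_splits are hypothetical. It also suggests why the split size does not
need to grow: the split boundaries stay fixed and only the reader crosses
them.

```python
from typing import List, Tuple


def make_splits(total: int, size: int) -> List[Tuple[int, int]]:
    """Cut a file of `total` bytes into fixed-size (start, length) splits."""
    return [(s, min(size, total - s)) for s in range(0, total, size)]


def read_split(data: bytes, start: int, length: int) -> List[bytes]:
    """Return the complete lines that the reader of one split would emit."""
    end = start + length
    pos = start
    if start != 0:
        # Back up one byte and discard through the next newline. If the
        # split begins exactly at a line start, only the previous line's
        # newline is discarded; otherwise the partial line is skipped,
        # because the previous split's reader has already emitted it.
        pos = start - 1
        newline = data.find(b"\n", pos)
        if newline == -1:
            return []
        pos = newline + 1

    lines = []
    # Start a new line only while still inside the split, but read each
    # started line to its end even if that runs past `end`.
    while pos < end and pos < len(data):
        newline = data.find(b"\n", pos)
        if newline == -1:
            lines.append(data[pos:])
            pos = len(data)
        else:
            lines.append(data[pos:newline])
            pos = newline + 1
    return lines


if __name__ == "__main__":
    data = b"alpha\nbravo\ncharlie\ndelta\n"
    for start, length in make_splits(len(data), 8):
        print((start, length), read_split(data, start, length))
```

With 8-byte splits over this sample input, the second split starts in the
middle of "bravo" and ends in the middle of "charlie", yet its reader emits
only the whole line "charlie". Every line is emitted exactly once across all
splits.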
