Posted to common-user@hadoop.apache.org by Roldano Cattoni <ca...@fbk.eu> on 2009/03/13 07:14:16 UTC

how to preserve original line order?

The task should be simple: I want to convert all the words of a
(large) file to uppercase.

I tried the following:
 - streaming mode
 - the mapper is a Perl script that converts each line to uppercase
   (number of mappers > 1)
 - no reducer (number of reducers set to zero)
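
For reference, a mapper of this kind might look like the minimal sketch
below. It is written in Python rather than Perl purely for illustration;
the actual script is not shown in this thread, and the function name
`uppercase_lines` is hypothetical.

```python
import sys


def uppercase_lines(lines):
    """Yield each input line converted to uppercase, one output per input."""
    for line in lines:
        yield line.upper()


if __name__ == "__main__":
    # A streaming mapper reads its records from stdin and writes to stdout.
    for out in uppercase_lines(sys.stdin):
        sys.stdout.write(out)
```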

It works fine, except that the line order is not preserved.

How can I preserve the original line order?

I would appreciate any suggestions.

  Roldano


Re: how to preserve original line order?

Posted by Miles Osborne <mi...@inf.ed.ac.uk>.
Associate an identifier (e.g. the line number) with each line, and
afterwards re-sort the data by that identifier.
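
A rough sketch of that idea, in Python for illustration only. It assumes
the lines are numbered before the job runs and that a tab separates the
identifier from the text; the function names `number_lines`, `map_line`
and `restore_order` are hypothetical, and the shuffle merely simulates
mapper output arriving out of order.

```python
import random


def number_lines(lines):
    """Prefix each line with its zero-based line number and a tab."""
    return [f"{i}\t{line}" for i, line in enumerate(lines)]


def map_line(record):
    """Mapper step: uppercase the text and leave the identifier untouched."""
    key, _, text = record.partition("\t")
    return f"{key}\t{text.upper()}"


def restore_order(records):
    """Sort records numerically by identifier, then strip the identifier."""
    keyed = (r.partition("\t") for r in records)
    ordered = sorted(keyed, key=lambda parts: int(parts[0]))
    return [text for _, _, text in ordered]


if __name__ == "__main__":
    original = ["first line", "second line", "third line"]
    mapped = [map_line(r) for r in number_lines(original)]
    random.shuffle(mapped)  # simulate output arriving in arbitrary order
    for line in restore_order(mapped):
        print(line)
```

The sort compares identifiers as integers rather than as strings, since
a plain string sort would place "10" before "9".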

Miles
