Posted to common-user@hadoop.apache.org by Leiz <lz...@gmail.com> on 2009/06/26 17:57:25 UTC

How to read a text file in Map function until reaching a specific line

For example, I have a text file with 1000 lines.
I only want to read the first 500 lines of the file.
How can I do this in the Map function?

Thanks




Re: How to read a text file in Map function until reaching a specific line

Posted by Tarandeep Singh <ta...@gmail.com>.
TextInputFormat gives the byte offset in the file as the key and the entire
line as the value, not the line number, so filtering on the key won't work
for you.
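
To make the point about keys concrete, here is a minimal standalone sketch
(plain Java, no Hadoop classes) that computes the byte offset at which each
line starts, which is the kind of value TextInputFormat hands to the mapper as
the key. The class and method names are hypothetical and exist only for
illustration; the sketch assumes UTF-8 text with "\n" line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: models the keys a mapper would see.
class OffsetKeySketch {
    // Returns the starting byte offset of every line in the given text.
    static List<Long> lineStartOffsets(String text) {
        List<Long> offsets = new ArrayList<>();
        long offset = 0;
        for (String line : text.split("\n")) {
            offsets.add(offset);
            // Advance by the encoded length of the line plus one byte for '\n'.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return offsets;
    }
}
```

Because the keys grow with the length of each line, a test such as
"key <= 500" would keep the lines that start within the first 500 bytes, not
the first 500 lines.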

You can modify NLineInputFormat to achieve what you want. NLineInputFormat
gives each mapper N lines (in your case N=500).

Since you are interested only in the first 500 lines of each file, the record
reader for the modified NLineInputFormat would be implemented like this:

get the input split
check the start position of the split
if start position == 0
  read the first 500 lines
else
  the split comes from the middle of the file, so do not read anything;
  the mapper that reads from the beginning of the file already handles
  the first 500 lines. Just indicate that there is no more input.
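
The steps above could be sketched roughly as follows. This is a standalone
model in plain Java, not an actual Hadoop RecordReader: the class name, the
method name, and the splitStart and maxLines parameters are assumptions made
for illustration. In a real job the start position would come from the input
split, and the reader would hand lines to the mapper one at a time instead of
collecting them in a list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the record-reader logic; not a Hadoop class.
class FirstLinesSketch {
    // Returns at most maxLines lines if the split starts at byte 0 of the
    // file; returns an empty list ("no more input") for any other split.
    static List<String> readSplit(Reader source, long splitStart, int maxLines) {
        List<String> lines = new ArrayList<>();
        if (splitStart != 0) {
            // Split from the middle of the file: another mapper owns the head.
            return lines;
        }
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            // Stop at the line limit or at end of input, whichever comes first.
            while (lines.size() < maxLines && (line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

For the file in the question, a call with splitStart 0 and maxLines 500 would
yield the first 500 lines, and a call for any later split would yield nothing.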

-Tarandeep

On Fri, Jun 26, 2009 at 10:35 AM, Ramakishore Yelamanchilli <
kyelaman@cisco.com> wrote:

> I think the map function gets the line number as the key. You can ignore
> the other lines after the key value 500.
>
> Thanks

RE: How to read a text file in Map function until reaching a specific line

Posted by Ramakishore Yelamanchilli <ky...@cisco.com>.
I think the map function gets the line number as the key. You can ignore
the other lines after the key value 500.

Thanks
