Posted to common-user@hadoop.apache.org by Andy Liu <an...@gmail.com> on 2009/01/27 17:08:13 UTC

Number of records in a MapFile

Is there a way to programmatically get the number of records in a MapFile
without doing a complete scan?

Re: Number of records in a MapFile

Posted by Rasit OZDAS <ra...@gmail.com>.
Do you mean without scanning the whole file record by record?
I know little about Hadoop's implementation, but as a programmer I would
presume that it's not possible without a complete scan.

But I can suggest a workaround:
- Compute the number of records yourself before putting the file into HDFS.
- Append the computed number to the filename.
- Modify InputReader so that the reader appends that number to the key of every
map.
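
The naming part of this workaround could be sketched roughly as below. This is only an illustration under assumptions: the class name RecordCountName and the ".count-N" suffix are hypothetical conventions invented here, not anything Hadoop provides, and the sketch uses plain strings rather than HDFS paths. Since a MapFile is stored as a directory holding a data file and an index file, the suffix would presumably go on the directory name.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the ".count-N" suffix is an assumed naming
// convention for this sketch, not a Hadoop feature.
final class RecordCountName {
    private static final Pattern SUFFIX = Pattern.compile("^(.*)\\.count-(\\d+)$");

    private RecordCountName() {}

    // Append a record count (computed before upload) to a file name.
    static String withCount(String name, long count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        return name + ".count-" + count;
    }

    // Recover the count from a name, or return empty if no valid suffix exists.
    static OptionalLong countOf(String name) {
        Matcher m = SUFFIX.matcher(name);
        if (!m.matches()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(m.group(2)));
        } catch (NumberFormatException e) {
            // Digit run too long to fit in a long: treat as "no count".
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        String stored = withCount("events.map", 12345L);
        System.out.println(stored);
        System.out.println(countOf(stored).getAsLong());
    }
}
```

A reader could then call countOf on the name instead of scanning the records. The count is only as trustworthy as whatever wrote the name, so it would go stale if the file were modified afterwards.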

Hope this helps,
Rasit

2009/1/27 Andy Liu <an...@gmail.com>

> Is there a way to programmatically get the number of records in a MapFile
> without doing a complete scan?
>



-- 
M. Raşit ÖZDAŞ