Posted to common-user@hadoop.apache.org by Bwolen Yang <wb...@gmail.com> on 2007/06/05 00:00:17 UTC

map interface to outputs of map/reduce

Hi,

Given that map/reduce produces a partitioned set of sorted output
files, I was wondering if a map implementation exists for doing
lookups on, or iterating through subranges of, these files.

This would be similar to the Java SortedMap, except
  - the map is read-only,
  - it works with data on disk (instead of in memory),
  - for lookups, it should know how to trade off seeks (with binary
search) against sequential disk reads

thanks

bwolen

Re: map interface to outputs of map/reduce

Posted by Bwolen Yang <wb...@gmail.com>.

> http://lucene.apache.org/hadoop/api/org/apache/hadoop/io/MapFile.html
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/MapFileOutputFormat.html

cool.  thank you very much.

bwolen

Re: map interface to outputs of map/reduce

Posted by Doug Cutting <cu...@apache.org>.

Bwolen Yang wrote:
> Given that map/reduce produces a partitioned set of sorted output
> files, I was wondering if a map implementation exists for doing
> lookups on, or iterating through subranges of, these files.
> 
> This would be similar to the Java SortedMap, except
>  - the map is read-only,
>  - it works with data on disk (instead of in memory),
>  - for lookups, it should know how to trade off seeks (with binary
> search) against sequential disk reads

http://lucene.apache.org/hadoop/api/org/apache/hadoop/io/MapFile.html
http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/MapFileOutputFormat.html

Doug
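
For readers who want to see the seek-versus-read trade-off from the
question in code, here is a minimal, self-contained Java sketch. It is
not the Hadoop MapFile API and makes no claim about how MapFile is
implemented internally. The class name SparseIndexedFile, its methods,
and the record layout are all hypothetical, chosen only for this
illustration. The idea: keep every Nth key and its byte offset in
memory, binary search that small index, do one seek, then read a short
run of records sequentially.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

/** Hypothetical read-only, disk-backed sorted map (illustration only, not Hadoop's MapFile). */
final class SparseIndexedFile implements AutoCloseable {
    private final RandomAccessFile data;                        // sorted key/value records on disk
    private final List<String> indexKeys = new ArrayList<>();   // every Nth key, kept in memory
    private final List<Long> indexOffsets = new ArrayList<>();  // byte offset of each indexed record

    private SparseIndexedFile(RandomAccessFile data) {
        this.data = data;
    }

    /** Writes the sorted entries to a temp file, indexing every `interval`-th key. */
    static SparseIndexedFile build(SortedMap<String, String> entries, int interval) {
        try {
            Path path = Files.createTempFile("sorted", ".dat");
            path.toFile().deleteOnExit();
            RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
            SparseIndexedFile file = new SparseIndexedFile(raf);
            int n = 0;
            for (Map.Entry<String, String> e : entries.entrySet()) {
                if (n++ % interval == 0) {
                    file.indexKeys.add(e.getKey());
                    file.indexOffsets.add(raf.getFilePointer());
                }
                raf.writeUTF(e.getKey());
                raf.writeUTF(e.getValue());
            }
            return file;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Returns the value stored for `key`, or null if the key is absent. */
    String get(String key) {
        // Binary search the in-memory index for the last indexed key <= key.
        int lo = 0, hi = indexKeys.size() - 1, slot = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                slot = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (slot < 0) {
            return null; // key sorts before the first record in the file
        }
        try {
            data.seek(indexOffsets.get(slot)); // a single seek ...
            while (data.getFilePointer() < data.length()) { // ... then a sequential scan
                String k = data.readUTF();
                String v = data.readUTF();
                int c = k.compareTo(key);
                if (c == 0) return v;
                if (c > 0) return null; // records are sorted, so the key is not present
            }
            return null;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    @Override
    public void close() {
        try {
            data.close();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

With an interval of N, a lookup costs one seek plus at most N record
reads, so a larger interval shrinks the in-memory index at the price of
a longer sequential scan. Iterating over a subrange could reuse the
same seek step and keep reading until the upper bound is passed.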