Posted to common-dev@hadoop.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2008/03/07 15:09:46 UTC

[jira] Commented: (HADOOP-2834) Iterator for MapFileOutputFormat

    [ https://issues.apache.org/jira/browse/HADOOP-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576204#action_12576204 ] 

Enis Soztutar commented on HADOOP-2834:
---------------------------------------

We do not currently offer an Iterator for MapFiles; callers use MapFile.Reader#next() instead. Wouldn't it be better if we either (a) add an Iterator to MapFile.Reader and apply this patch, or (b) change this patch to define a MapFileOutputFormat.Reader instead of Iterators, so that reading from MapFile and MapFileOutputFormat is consistent?
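
To make option (a) concrete, here is a rough sketch of wrapping a next(key, value)-style reader in a java.util.Iterator by reading one record ahead. It does not use Hadoop classes: NextStyleReader, ReaderIterator and ReaderIteratorDemo are hypothetical names invented for this illustration, StringBuilder merely stands in for a mutable key/value type, and the lambdas assume Java 8 or later. It is not what the attached patch does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map.Entry;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical stand-in for a reader exposing a next(key, value)-style call. */
interface NextStyleReader<K, V> {
    /** Fills key and value with the next record; returns false at end of data. */
    boolean next(K key, V value) throws IOException;
}

/** Adapts a next()-style reader to java.util.Iterator by buffering one record. */
final class ReaderIterator<K, V> implements Iterator<Entry<K, V>> {
    private final NextStyleReader<K, V> reader;
    private final Supplier<K> newKey;
    private final Supplier<V> newValue;
    private Entry<K, V> lookahead; // null when no record is buffered
    private boolean exhausted;

    ReaderIterator(NextStyleReader<K, V> reader, Supplier<K> newKey, Supplier<V> newValue) {
        this.reader = reader;
        this.newKey = newKey;
        this.newValue = newValue;
    }

    @Override
    public boolean hasNext() {
        if (lookahead != null) return true;
        if (exhausted) return false;
        // Allocate fresh objects per record so entries already handed out are not overwritten.
        K key = newKey.get();
        V value = newValue.get();
        try {
            if (reader.next(key, value)) {
                lookahead = new SimpleImmutableEntry<>(key, value);
                return true;
            }
        } catch (IOException e) {
            // Iterator methods cannot throw checked exceptions, so wrap it.
            throw new UncheckedIOException(e);
        }
        exhausted = true;
        return false;
    }

    @Override
    public Entry<K, V> next() {
        if (!hasNext()) throw new NoSuchElementException();
        Entry<K, V> result = lookahead;
        lookahead = null;
        return result;
    }
}

final class ReaderIteratorDemo {
    /** In-memory reader over parallel key/value lists, for demonstration only. */
    static NextStyleReader<StringBuilder, StringBuilder> listReader(List<String> keys, List<String> values) {
        int[] pos = {0};
        return (key, value) -> {
            if (pos[0] >= keys.size()) return false;
            key.setLength(0);
            key.append(keys.get(pos[0]));
            value.setLength(0);
            value.append(values.get(pos[0]));
            pos[0]++;
            return true;
        };
    }

    /** Collects "key=value" strings from the iterator. */
    static List<String> drain(Iterator<Entry<StringBuilder, StringBuilder>> it) {
        List<String> out = new ArrayList<>();
        while (it.hasNext()) {
            Entry<StringBuilder, StringBuilder> e = it.next();
            out.add(e.getKey() + "=" + e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        NextStyleReader<StringBuilder, StringBuilder> reader =
            listReader(Arrays.asList("a", "b"), Arrays.asList("1", "2"));
        System.out.println(drain(new ReaderIterator<>(reader, StringBuilder::new, StringBuilder::new)));
    }
}
```

The main trade-off in this sketch is that Iterator cannot declare IOException, so I/O failures surface as unchecked exceptions; a dedicated Reader as in option (b) would not have that problem.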

One more minor issue: I think we should change the generics to:
{code}
private static final class IteratorEntry<K extends WritableComparable, V extends Writable> implements Entry<K, V> {
...
}

private static final class MapFileOutputFormatIterator<K extends WritableComparable, V extends Writable> implements Iterator<Entry<K, V>> {
...
}

public static <K extends WritableComparable, V extends Writable> Iterator<Entry<K, V>>
      getIterator(Path dir, Configuration conf) throws IOException {
...
}
{code}

so that we can use:
{code}
Iterator<Entry<Text, Text>> x = MapFileOutputFormat.getIterator(path, conf);
{code}
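
As a self-contained illustration of why that call site compiles without explicit type arguments, here is a small sketch. It does not depend on Hadoop: Writable, WritableComparable and Text below are simplified local stand-ins (the stand-in WritableComparable omits Comparable for brevity), and getIterator takes an in-memory list rather than a Path and Configuration. All of that is assumption for the sake of the example, not the patch itself.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map.Entry;

// Simplified local stand-ins, NOT the Hadoop types of the same name.
interface Writable {}
interface WritableComparable extends Writable {}

final class Text implements WritableComparable {
    private final String s;
    Text(String s) { this.s = s; }
    @Override public String toString() { return s; }
}

final class GenericIteratorSketch {
    /**
     * K and V appear only in the return type, so the compiler infers them
     * from the assignment target at the call site. The casts are unchecked:
     * nothing here verifies that the stored records really match K and V.
     */
    @SuppressWarnings("unchecked")
    static <K extends WritableComparable, V extends Writable> Iterator<Entry<K, V>>
            getIterator(List<Object[]> stored) {
        List<Entry<K, V>> out = new ArrayList<>();
        for (Object[] rec : stored) {
            out.add(new SimpleImmutableEntry<>((K) rec[0], (V) rec[1]));
        }
        return out.iterator();
    }

    public static void main(String[] args) {
        // Untyped records, standing in for data read back from disk.
        List<Object[]> stored = new ArrayList<>();
        stored.add(new Object[] {new Text("k1"), new Text("v1")});

        // No explicit type arguments needed: K = Text and V = Text are inferred.
        Iterator<Entry<Text, Text>> x = getIterator(stored);
        while (x.hasNext()) {
            Entry<Text, Text> e = x.next();
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because of erasure, the caller's choice of K and V is a promise rather than a check, so a mismatch with the actual key/value classes would only show up as a ClassCastException when an entry is used.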


> Iterator for MapFileOutputFormat
> --------------------------------
>
>                 Key: HADOOP-2834
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2834
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Andrzej Bialecki 
>             Fix For: 0.17.0
>
>         Attachments: map-file-v2.patch, map-file-v3.patch
>
>
> MapFileOutputFormat produces output data that is sorted locally in each part-NNNNN file - however, there is no easy way to iterate over keys from all parts in a globally ascending order.
