Posted to mapreduce-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2010/12/01 22:15:12 UTC

[jira] Commented: (MAPREDUCE-2028) streaming should support MultiFileInputFormat

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965849#action_12965849 ] 

Allen Wittenauer commented on MAPREDUCE-2028:
---------------------------------------------

Actually, what should probably happen is that MultiFileWordCount's "MyInputFormat" and "MultiFileLineRecordReader" should be promoted out of examples and officially into the mapred(uce) APIs.

The following appears to do exactly what we streaming users want/need:

$HADOOP_HOME/bin/hadoop  \
        jar \
        `ls $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar` \
        -libjars `ls $HADOOP_HOME/hadoop-*-examples.jar` \
        -inputformat org.apache.hadoop.examples.MultiFileWordCount\$MyInputFormat \
        -inputreader org.apache.hadoop.examples.MultiFileWordCount\$MultiFileLineRecordReader \
        ....
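
A note on the \$ in the class names above: the nested classes are addressed as Outer$Inner, and an unescaped $ would make the shell try to expand a variable. The sketch below only illustrates that quoting behaviour in a POSIX shell; the variable names (escaped, quoted, unescaped) are made up for illustration, and nothing here invokes Hadoop.

```shell
#!/bin/sh
# Illustration only: the variable names are hypothetical; the class name
# is the one used in the command above.

# Make sure no variable named MyInputFormat is set, so the result is predictable.
unset MyInputFormat

# Backslash-escaped, as in the command above: the $ is passed through literally.
escaped=org.apache.hadoop.examples.MultiFileWordCount\$MyInputFormat

# Single quotes have the same effect.
quoted='org.apache.hadoop.examples.MultiFileWordCount$MyInputFormat'

# In double quotes (or with no quoting) the shell expands $MyInputFormat,
# which is unset here, so the nested class name silently disappears.
unescaped="org.apache.hadoop.examples.MultiFileWordCount$MyInputFormat"

printf 'escaped:   %s\n' "$escaped"
printf 'quoted:    %s\n' "$quoted"
printf 'unescaped: %s\n' "$unescaped"
```

Either the backslash or single quotes should work for the -inputformat and -inputreader arguments; the backslash form is what the command above uses.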


> streaming should support MultiFileInputFormat
> ---------------------------------------------
>
>                 Key: MAPREDUCE-2028
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2028
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.2
>            Reporter: Allen Wittenauer
>             Fix For: 0.21.1, 0.22.0
>
>
> There should be a way to call MultiFileInputFormat from streaming without having to write Java code...
