Posted to common-dev@hadoop.apache.org by Steve Gao <st...@yahoo.com> on 2008/10/23 19:48:11 UTC

[Help needed] Is there a way to know the input filename at Hadoop Streaming?

Sorry for the email. Thanks for any help or hint.

    I am using Hadoop Streaming, and the input consists of multiple files.
    Is there a way to get the name of the current input file in the mapper?

    For example:
    $HADOOP_HOME/bin/hadoop \
    jar $HADOOP_HOME/hadoop-streaming.jar \
        -input file1 \
        -input file2 \
        -output myOutputDir \
        -mapper mapper \
        -reducer reducer

    In the mapper:
    while (<STDIN>) {
      # How can I tell whether the current line came from file1 or file2?
    }