Posted to common-issues@hadoop.apache.org by "Jonathan Poon (JIRA)" <ji...@apache.org> on 2011/08/11 02:33:27 UTC
[jira] [Created] (HADOOP-7535) Hadoop Streaming map_input_file
Hadoop Streaming map_input_file
-------------------------------
Key: HADOOP-7535
URL: https://issues.apache.org/jira/browse/HADOOP-7535
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 0.20.203.0
Environment: Debian Squeeze
Reporter: Jonathan Poon
I'm trying to use the map_input_file environment variable in a mapper I've written to determine which file the stdin stream is coming from. I used the following command to print the environment variables and see what value map_input_file was given:
hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-streaming-0.20.203.0.jar -input A -input S -input F -output B -mapper "bash -c \"export\""
I get the following output for map_input_file:
declare -x map_input_file="hdfs://localhost:54310/user/poonj/A/A.txt"
declare -x map_input_file="hdfs://localhost:54310/user/poonj/A/A.txt"
declare -x map_input_file="hdfs://localhost:54310/user/poonj/F/F.txt"
declare -x map_input_file="hdfs://localhost:54310/user/poonj/S/S.txt"
My assumption is that the variable is only used once and should store the file currently being processed.
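For illustration, here is a minimal sketch of a streaming mapper that reads the variable and tags each input line with it. Python is an assumption, since the report does not say what language the mapper is written in, and the helper names current_input_file and tag_lines are hypothetical. The fallback to mapreduce_map_input_file reflects the property name used by later Hadoop releases and is not something this report covers.

```python
#!/usr/bin/env python
import os
import sys


def current_input_file(environ=None):
    """Return the input file path exposed to the streaming task, or None."""
    if environ is None:
        environ = os.environ
    # Hadoop Streaming exports job configuration keys as environment
    # variables with dots replaced by underscores, so map.input.file
    # shows up as map_input_file.
    # Assumption: later releases use mapreduce_map_input_file instead.
    for name in ("map_input_file", "mapreduce_map_input_file"):
        value = environ.get(name)
        if value:
            return value
    return None


def tag_lines(lines, path):
    """Yield each input line prefixed with the file it came from."""
    for line in lines:
        yield "%s\t%s" % (path, line.rstrip("\n"))


def main():
    # Fall back to a placeholder when the variable is not set,
    # for example when the script is run outside Hadoop.
    path = current_input_file() or "unknown"
    for tagged in tag_lines(sys.stdin, path):
        sys.stdout.write(tagged + "\n")


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as mapper.py, the script could replace the bash mapper in the command above. The value is read once at start-up, which only makes sense if each map task processes a single input file.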