Posted to mapreduce-user@hadoop.apache.org by 李耀宗 <le...@yahoo.com.INVALID> on 2016/03/16 03:01:07 UTC

undocumented hadoop streaming properties stream.map.input.ignoreKey

Hello,

I am using Hadoop Streaming, and I found that if I use -inputformat to specify another InputFormat (e.g. org.apache.hadoop.mapred.lib.CombineTextInputFormat) instead of the default org.apache.hadoop.mapred.TextInputFormat, an extra key is passed to the mapper program along with each input line.
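To illustrate the symptom, here is a rough Python model of the line a mapper reads on stdin for one record. It is only a sketch, not the actual streaming code: the function name, the sample record, and the assumption that key and value are joined by a tab (the usual streaming field separator) are mine.

```python
def mapper_stdin_line(key, value, include_key, separator="\t"):
    """Model of the text a streaming mapper reads on stdin for one record.

    Assumption: when the key is passed through, it is written before the
    value, joined by the field separator (tab here).
    """
    if include_key:
        return f"{key}{separator}{value}\n"
    return f"{value}\n"


# Hypothetical record: byte offset 0 as the key, one line of text as the value.
print(repr(mapper_stdin_line(0, "hello world", include_key=False)))  # 'hello world\n'
print(repr(mapper_stdin_line(0, "hello world", include_key=True)))   # '0\thello world\n'
```

The second form is what my mapper receives once I switch to CombineTextInputFormat.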


After digging through the Hadoop Streaming source code, I found an undocumented job property, stream.map.input.ignoreKey. If -inputformat is unset (or set to org.apache.hadoop.mapred.TextInputFormat), this property defaults to true; otherwise it defaults to false. So if I want to change -inputformat, I have to set the property to true manually (-D stream.map.input.ignoreKey=true) when issuing the hadoop streaming command.
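The defaulting rule and the workaround can be sketched in Python as follows. This is a model under my own assumptions, not the streaming implementation: the helper names, the jar name hadoop-streaming.jar, and the paths in the usage lines are placeholders, and the fully qualified name of TextInputFormat is written as I understand it for the old mapred API.

```python
# Assumed fully qualified name of the default input format (old mapred API).
TEXT_INPUT_FORMAT = "org.apache.hadoop.mapred.TextInputFormat"


def default_ignore_key(input_format=None):
    """Model of the default: true only for the default TextInputFormat."""
    return input_format is None or input_format == TEXT_INPUT_FORMAT


def streaming_args(input_path, output_path, mapper,
                   input_format=None, ignore_key=None):
    """Build a hadoop streaming command line as a list of arguments.

    The jar name is a placeholder; the real path depends on the installation.
    """
    args = ["hadoop", "jar", "hadoop-streaming.jar"]
    # Generic options such as -D go before the streaming-specific options.
    if ignore_key is not None:
        flag = "true" if ignore_key else "false"
        args += ["-D", f"stream.map.input.ignoreKey={flag}"]
    if input_format is not None:
        args += ["-inputformat", input_format]
    args += ["-input", input_path, "-output", output_path, "-mapper", mapper]
    return args


# Hypothetical job: switch the input format but keep the key out of mapper input.
combine = "org.apache.hadoop.mapred.lib.CombineTextInputFormat"
print(default_ignore_key(combine))
print(" ".join(streaming_args("/in", "/out", "cat",
                              input_format=combine, ignore_key=True)))
```

Setting the property explicitly makes the behaviour independent of which -inputformat is chosen.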

Actually, this property used to be documented, but it has somehow disappeared from recent documentation. Is the property deprecated, or was it simply left out of the documentation?
