Posted to mapreduce-issues@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2010/07/05 07:27:50 UTC
[jira] Resolved: (MAPREDUCE-612) streaming should default to KeyValueTextInputFormat with IdentityMapper
[ https://issues.apache.org/jira/browse/MAPREDUCE-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sriramadasu resolved MAPREDUCE-612.
-----------------------------------------------
Resolution: Duplicate
Fixed by MAPREDUCE-1888
> streaming should default to KeyValueTextInputFormat with IdentityMapper
> -----------------------------------------------------------------------
>
> Key: MAPREDUCE-612
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-612
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Joydeep Sen Sarma
> Priority: Minor
>
> In 15.3, streaming defaults to TextInputFormat when no -inputformat option is given.
> This works well when the PipeMapper is used. In many cases, though, people want an IdentityMapper, and the job fails with it:
> a) the map output key type becomes LongWritable, but Hadoop has already defaulted to expecting Text;
> b) the map output key is the line's byte offset in the file, which is not what the user intuitively expects (almost no one wants the line's position as the map key).
> If we could simply default to KeyValueTextInputFormat with the IdentityMapper, that would resolve both of these problems. This would change the default behavior, though, so I am a little leery.
> Using '-mapper cat' is the common workaround, but it seems like a needless waste of resources.
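
To make the difference between the two input formats concrete, here is a minimal sketch in plain Python. It only models the record semantics described above and does not use any Hadoop API; the function names and the sample data are hypothetical. It assumes TextInputFormat keys each line by its byte offset in the file, and KeyValueTextInputFormat splits each line at the first tab, using the whole line as the key when no tab is present.

```python
def text_input_records(data: bytes):
    """Model of TextInputFormat: key is the byte offset of the line start
    (a LongWritable in Hadoop), value is the line text."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)  # advance past the line and its terminator


def key_value_records(data: bytes, sep: str = "\t"):
    """Model of KeyValueTextInputFormat: split each line at the first
    separator; a line without one becomes the key with an empty value."""
    for raw in data.splitlines():
        key, _, value = raw.decode("utf-8").partition(sep)
        yield key, value


# Hypothetical sample input: two tab-separated lines and one without a tab.
sample = b"apple\t1\nbanana\t2\nno-tab-line\n"

by_offset = list(text_input_records(sample))
by_key = list(key_value_records(sample))

for record in by_offset:
    print("TextInputFormat        ", record)
for record in by_key:
    print("KeyValueTextInputFormat", record)
```

With an identity mapper, the first model passes numeric offsets through as map output keys, while the second passes through text keys taken from the data itself, which is the behavior the reporter asks for.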