Posted to common-dev@hadoop.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2007/01/23 17:19:49 UTC
[jira] Created: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
MapFileOutputFormat and SequenceFileOutputFormat use incorrect key/value classes in map/reduce tasks
----------------------------------------------------------------------------------------------------
Key: HADOOP-920
URL: https://issues.apache.org/jira/browse/HADOOP-920
Project: Hadoop
Issue Type: Bug
Components: mapred
Affects Versions: 0.11.0
Reporter: Andrzej Bialecki
Fix For: 0.11.0
Let's assume a job uses different key/value classes for the output of map tasks and for the final output of reduce tasks.
When executing map tasks, the classes returned from JobConf.getMapOutputKeyClass() / getMapOutputValueClass() should be used, and when executing reduce tasks, the classes returned from JobConf.getOutputKeyClass() / getOutputValueClass() should be used.
Currently both map and reduce tasks will use getMapOutputKeyClass / getMapOutputValueClass when using MapFileOutputFormat, or they will always use getOutputKeyClass / getOutputValueClass when using SequenceFileOutputFormat. This causes exceptions, because Mapper / Reducer implementations will output different key/value classes than the writer expects.
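The failure mode described above can be illustrated with a minimal, self-contained sketch. It does not use Hadoop: CheckedWriter is a hypothetical stand-in for a file writer that is created with fixed key/value classes and rejects anything else, and String / Integer / Long stand in for Writable types. The assumed job emits (String, Integer) from its map tasks and (String, Long) from its reduce tasks.

```java
import java.io.IOException;

class KeyValueClassMismatch {

    // Hypothetical stand-in (not a Hadoop class) for a writer that is
    // constructed with fixed key/value classes and checks every record.
    static class CheckedWriter {
        private final Class<?> keyClass;
        private final Class<?> valueClass;

        CheckedWriter(Class<?> keyClass, Class<?> valueClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
        }

        void append(Object key, Object value) throws IOException {
            if (key.getClass() != keyClass) {
                throw new IOException("wrong key class: " + key.getClass().getName());
            }
            if (value.getClass() != valueClass) {
                throw new IOException("wrong value class: " + value.getClass().getName());
            }
            // A real writer would serialize the record here.
        }
    }

    // Returns true if the writer accepts the record, false if it rejects it.
    static boolean accepts(CheckedWriter writer, Object key, Object value) {
        try {
            writer.append(key, value);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed job: map emits (String, Integer), reduce emits (String, Long).
        Class<?> mapOutputValueClass = Integer.class;
        Class<?> outputValueClass = Long.class;

        // Writer for the final output, built from the map output classes (the bug).
        CheckedWriter wrong = new CheckedWriter(String.class, mapOutputValueClass);
        // Writer for the final output, built from the final output classes.
        CheckedWriter right = new CheckedWriter(String.class, outputValueClass);

        // The reducer emits a Long value in both cases.
        System.out.println(accepts(wrong, "word", 3L)); // prints false
        System.out.println(accepts(right, "word", 3L)); // prints true
    }
}
```

The only difference between the two writers is which pair of configured classes they were built from, which is why the problem shows up only in jobs whose map output classes differ from their final output classes.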
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466773 ]
Hadoop QA commented on HADOOP-920:
----------------------------------
+1, because http://issues.apache.org/jira/secure/attachment/12349457/key-value-class.patch applied and successfully tested against trunk revision r498829.
[jira] Resolved: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki resolved HADOOP-920.
--------------------------------------
Resolution: Fixed
Assignee: Andrzej Bialecki
Fixed by reverting an accidental change introduced by HADOOP-115.
[jira] Updated: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki updated HADOOP-920:
-------------------------------------
Attachment: key-value-class.patch
Proposed fix, which uses different methods depending on whether we are in a map or a reduce task.
[jira] Updated: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki updated HADOOP-920:
-------------------------------------
Attachment: (was: key-value-class.patch)
[jira] Updated: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki updated HADOOP-920:
-------------------------------------
Attachment: key-value-class.patch
[jira] Updated: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated HADOOP-920:
--------------------------------
Status: Open (was: Patch Available)
OutputFormats are only used when reducing, to generate the final output. They're not used when creating intermediate output. So the bug here is that MapFileOutputFormat calls job.getMapOutput{Key,Value}Class()--those methods should only be called by the MapReduce kernel when generating intermediate output and should not be called by an OutputFormat implementation. This bug was introduced by HADOOP-115.
http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/MapFileOutputFormat.java?p2=%2Flucene%2Fhadoop%2Ftrunk%2Fsrc%2Fjava%2Forg%2Fapache%2Fhadoop%2Fmapred%2FMapFileOutputFormat.java&p1=%2Flucene%2Fhadoop%2Ftrunk%2Fsrc%2Fjava%2Forg%2Fapache%2Fhadoop%2Fmapred%2FMapFileOutputFormat.java&r1=407355&r2=407354&view=diff&pathrev=407355
The proper fix, I think, is to undo that change to this file.
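The division of responsibility described in this comment can be sketched as follows. This is an illustration only, not the contents of key-value-class.patch or of the HADOOP-115 change: MiniJobConf is a hypothetical stand-in for the four JobConf getters named in this thread, and the fallback from the map output classes to the final output classes when the former are unset is an assumption of the sketch.

```java
class OutputClassSelection {

    // Hypothetical stand-in (not a Hadoop class) for the four JobConf
    // getters discussed in this thread.
    static class MiniJobConf {
        private final Class<?> mapOutputKeyClass;   // null means "not set"
        private final Class<?> mapOutputValueClass; // null means "not set"
        private final Class<?> outputKeyClass;
        private final Class<?> outputValueClass;

        MiniJobConf(Class<?> mapOutputKeyClass, Class<?> mapOutputValueClass,
                    Class<?> outputKeyClass, Class<?> outputValueClass) {
            this.mapOutputKeyClass = mapOutputKeyClass;
            this.mapOutputValueClass = mapOutputValueClass;
            this.outputKeyClass = outputKeyClass;
            this.outputValueClass = outputValueClass;
        }

        Class<?> getOutputKeyClass() { return outputKeyClass; }
        Class<?> getOutputValueClass() { return outputValueClass; }

        // Assumption of this sketch: unset map output classes fall back
        // to the final output classes.
        Class<?> getMapOutputKeyClass() {
            return mapOutputKeyClass != null ? mapOutputKeyClass : getOutputKeyClass();
        }
        Class<?> getMapOutputValueClass() {
            return mapOutputValueClass != null ? mapOutputValueClass : getOutputValueClass();
        }
    }

    // Framework side: classes for intermediate (map) output.
    static Class<?>[] intermediateClasses(MiniJobConf job) {
        return new Class<?>[] { job.getMapOutputKeyClass(), job.getMapOutputValueClass() };
    }

    // OutputFormat side: classes for the final output. Only the
    // getOutput*Class getters are consulted here.
    static Class<?>[] finalOutputClasses(MiniJobConf job) {
        return new Class<?>[] { job.getOutputKeyClass(), job.getOutputValueClass() };
    }

    public static void main(String[] args) {
        // Assumed job: map emits (String, Integer), reduce emits (String, Long).
        MiniJobConf job = new MiniJobConf(String.class, Integer.class, String.class, Long.class);

        System.out.println(intermediateClasses(job)[1].getSimpleName()); // prints Integer
        System.out.println(finalOutputClasses(job)[1].getSimpleName());  // prints Long
    }
}
```

With this split, an output format never needs to know whether a map output class was configured, so it cannot pick the intermediate classes by mistake.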
[jira] Updated: (HADOOP-920) MapFileOutputFormat and
SequenceFileOutputFormat use incorrect key/value classes in map/reduce
tasks
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki updated HADOOP-920:
-------------------------------------
Status: Patch Available (was: Open)