Posted to mapreduce-dev@hadoop.apache.org by "Ed Kohlwey (JIRA)" <ji...@apache.org> on 2009/11/19 19:55:39 UTC
[jira] Created: (MAPREDUCE-1223) CompositeInputFormat doesn't
consider all tuples when run in a local task tracker
CompositeInputFormat doesn't consider all tuples when run in a local task tracker
---------------------------------------------------------------------------------
Key: MAPREDUCE-1223
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1223
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.1
Environment: Yahoo distribution for Hadoop 0.20.1.3041192001 and Cloudera Distribution for Hadoop 0.20.1+133
Reporter: Ed Kohlwey
The CrossJoin class does not emit all tuples representing the cross product of values for a given key. The issue only occurs when using the local task tracker, and not when running the job on a cluster.
Example
{noformat}
table 1
k1 -> a
table 2
k1 -> c
k1 -> d
{noformat}
The expected output is
{noformat}
table 1 inner join table 2
k1->ac
k1->ad
{noformat}
Instead one gets
{noformat}
table 1 inner join table 2
k1->ac
{noformat}
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (MAPREDUCE-1223) CompositeInputFormat doesn't
consider all tuples when run in a local task tracker
Posted by "Ed Kohlwey (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ed Kohlwey resolved MAPREDUCE-1223.
-----------------------------------
Resolution: Invalid
After some additional testing, I'm marking this as invalid. The problem appears to have been that one of the inputs was not sorted.
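To illustrate why an unsorted input can produce exactly the symptom reported, here is a minimal sketch of a single-pass merge join in plain Java. It is a simplified model written for this note, not Hadoop's join code: the class MergeJoinSketch, the Row type, and the stray k0 row in the unsorted input are all hypothetical, since the issue does not say what the unsorted data actually looked like. The sketch assumes, as merge joins generally do, that each input is read once in key order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeJoinSketch {
    /** One key/value record (hypothetical stand-in for a line of an input table). */
    static final class Row {
        final String key;
        final String value;
        Row(String key, String value) { this.key = key; this.value = value; }
    }

    /**
     * Inner join using one forward pass over each input.
     * The result is only complete if BOTH lists are sorted by key.
     */
    static List<String> innerJoin(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i).key.compareTo(right.get(j).key);
            if (cmp < 0) {
                i++;                       // left key is smaller: advance left
            } else if (cmp > 0) {
                j++;                       // right key is smaller: advance right
            } else {
                String key = left.get(i).key;
                // Find the run of consecutive rows sharing this key on each side.
                int iEnd = i, jEnd = j;
                while (iEnd < left.size() && left.get(iEnd).key.equals(key)) iEnd++;
                while (jEnd < right.size() && right.get(jEnd).key.equals(key)) jEnd++;
                // Emit the cross product of the two runs.
                for (int a = i; a < iEnd; a++) {
                    for (int b = j; b < jEnd; b++) {
                        out.add(key + "->" + left.get(a).value + right.get(b).value);
                    }
                }
                i = iEnd;
                j = jEnd;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> table1 = Arrays.asList(new Row("k1", "a"));
        List<Row> sorted = Arrays.asList(new Row("k1", "c"), new Row("k1", "d"));
        // Hypothetical unsorted input: a k0 row splits the run of k1 rows.
        List<Row> unsorted = Arrays.asList(
                new Row("k1", "c"), new Row("k0", "x"), new Row("k1", "d"));

        System.out.println(innerJoin(table1, sorted));    // [k1->ac, k1->ad]
        System.out.println(innerJoin(table1, unsorted));  // [k1->ac]
    }
}
```

With the sorted input, both k1 rows of table 2 form one contiguous run and the full cross product is emitted. With the unsorted input, the run ends at the first row with a different key, so k1->ad is never produced, which matches the output in the original report.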