Posted to mapreduce-issues@hadoop.apache.org by "Michael White (JIRA)" <ji...@apache.org> on 2011/06/15 20:36:48 UTC
[jira] [Created] (MAPREDUCE-2597) Map-side joins always empty when an input has NullWritable value
Map-side joins always empty when an input has NullWritable value
----------------------------------------------------------------
Key: MAPREDUCE-2597
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2597
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.2
Environment: Linux cluster
Reporter: Michael White
It is not uncommon for one input to a map-side join to be a sorted list of keys with no associated values, e.g. when it serves as an exact-match filter. In these cases, you would typically use a value class of NullWritable. However, when performing a map-side join in Hadoop 0.20.2, we have found that if any input has a value class of NullWritable, the Mapper is never called. I found this with a 3-way map-side join, and my colleague tells me he ran into the same issue. I have not specifically tested a 2-way join, so the bug may be specific to n-way joins for n>2 (though I suspect not).
The current workaround is to use some other value type (e.g. IntWritable) and stuff an arbitrary value into it.
For a join, the value class should have no bearing on the set of keys that are considered matching.
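To make the expected behavior concrete, here is a minimal sketch in plain Java of an n-way inner join over sorted key lists. It is not Hadoop's join implementation and does not use the Hadoop API. The class name KeyOnlyJoinSketch, the method innerJoinKeys, and the sample data are all hypothetical, chosen only for illustration. The point is that matching looks at keys alone, so an input that carries no values (the role NullWritable plays) should take part like any other input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Hadoop code.
class KeyOnlyJoinSketch {

    /**
     * Inner-joins n key lists, each sorted ascending with no duplicate keys,
     * and returns the keys present in every list. Values are never consulted,
     * so a key-only input behaves exactly like an input that has values.
     */
    static List<String> innerJoinKeys(List<List<String>> sortedInputs) {
        List<String> matches = new ArrayList<>();
        int n = sortedInputs.size();
        if (n == 0) {
            return matches;
        }
        int[] pos = new int[n]; // current read position in each input
        while (true) {
            // Find the largest key among the current heads of all inputs.
            String max = null;
            for (int i = 0; i < n; i++) {
                List<String> in = sortedInputs.get(i);
                if (pos[i] >= in.size()) {
                    return matches; // one input is exhausted: no more matches
                }
                String head = in.get(pos[i]);
                if (max == null || head.compareTo(max) > 0) {
                    max = head;
                }
            }
            // Advance every input past keys smaller than that largest head.
            boolean allEqual = true;
            for (int i = 0; i < n; i++) {
                List<String> in = sortedInputs.get(i);
                while (pos[i] < in.size() && in.get(pos[i]).compareTo(max) < 0) {
                    pos[i]++;
                }
                if (pos[i] >= in.size()) {
                    return matches;
                }
                if (!in.get(pos[i]).equals(max)) {
                    allEqual = false;
                }
            }
            // If every head now equals the same key, it is a join match.
            if (allEqual) {
                matches.add(max);
                for (int i = 0; i < n; i++) {
                    pos[i]++;
                }
            }
        }
    }

    public static void main(String[] args) {
        // Two inputs that would normally carry values, reduced to their keys.
        List<String> users = Arrays.asList("alice", "bob", "carol", "dave");
        List<String> orders = Arrays.asList("bob", "carol", "erin");
        // Third input is a key-only exact-match filter (no values at all).
        List<String> filter = Arrays.asList("bob", "carol", "dave");

        List<String> joined = innerJoinKeys(Arrays.asList(users, orders, filter));
        System.out.println(joined); // prints [bob, carol]
    }
}
```

In this sketch the key-only filter narrows the result without emptying it, which is the behavior the report expects from a 3-way map-side join with a NullWritable-valued input.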