Posted to mapreduce-issues@hadoop.apache.org by "Michael White (JIRA)" <ji...@apache.org> on 2011/06/15 20:36:48 UTC

[jira] [Created] (MAPREDUCE-2597) Map-side joins always empty when an input has NullWritable value

Map-side joins always empty when an input has NullWritable value
----------------------------------------------------------------

                 Key: MAPREDUCE-2597
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2597
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 0.20.2
         Environment: Linux cluster
            Reporter: Michael White


A map-side join often takes as one input a sorted list of keys with no associated value, e.g. as an exact-match filter.  The natural value class for such an input is NullWritable.  However, when performing a map-side join in Hadoop 0.20.2, we have found that if any input has a value class of NullWritable, the Mapper is never called and the join output is empty.  I found this with a 3-way map-side join, and a colleague tells me he ran into the same issue.  I have not tested a 2-way join, so the bug may be specific to n-way joins with n>2, though I suspect not.

The current workaround is to use some other value type (e.g. IntWritable) and stuff an arbitrary value into it.

For a join, the value class should have no bearing on the set of keys that are considered matching.
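
To make the expected behavior concrete, here is a minimal sketch of an n-way inner join over sorted key lists, written in plain Java. It is not Hadoop's implementation and uses no Hadoop classes; the class name SortedKeyJoin and the method innerJoinKeys are hypothetical, and the sketch assumes each input is sorted ascending with unique keys. The point is only that matching is decided by keys alone, so a key-only input (the analogue of a NullWritable value) should still take part in the join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the expected join semantics; not Hadoop code.
class SortedKeyJoin {

    /**
     * Returns the keys present in every input.
     * Assumes each input list is sorted ascending and has no duplicate keys.
     * Values are deliberately absent: only keys decide what matches.
     */
    static List<String> innerJoinKeys(List<List<String>> inputs) {
        List<String> out = new ArrayList<>();
        if (inputs.isEmpty()) {
            return out;
        }
        int[] pos = new int[inputs.size()]; // current cursor into each input
        while (true) {
            // Find the largest key among the current heads.
            String max = null;
            for (int i = 0; i < inputs.size(); i++) {
                if (pos[i] >= inputs.get(i).size()) {
                    return out; // one input is exhausted, no more matches
                }
                String head = inputs.get(i).get(pos[i]);
                if (max == null || head.compareTo(max) > 0) {
                    max = head;
                }
            }
            // Advance every input past keys smaller than that largest head.
            boolean allEqual = true;
            for (int i = 0; i < inputs.size(); i++) {
                List<String> in = inputs.get(i);
                while (pos[i] < in.size() && in.get(pos[i]).compareTo(max) < 0) {
                    pos[i]++;
                }
                if (pos[i] >= in.size()) {
                    return out;
                }
                if (!in.get(pos[i]).equals(max)) {
                    allEqual = false;
                }
            }
            // If every input now sits on the same key, it is a match.
            if (allEqual) {
                out.add(max);
                for (int i = 0; i < pos.length; i++) {
                    pos[i]++;
                }
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> inputs = Arrays.asList(
                Arrays.asList("a", "b", "c", "d"), // data input 1 (keys only shown)
                Arrays.asList("b", "d", "e"),      // data input 2 (keys only shown)
                Arrays.asList("b", "c", "d"));     // key-only filter input
        System.out.println(innerJoinKeys(inputs)); // prints [b, d]
    }
}
```

In this sketch the third list plays the role of the exact-match filter with no values, and the join still yields the keys common to all three inputs rather than an empty result.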
