Posted to dev@hive.apache.org by "Romain Thibaux (JIRA)" <ji...@apache.org> on 2010/11/30 09:16:11 UTC

[jira] Created: (HIVE-1814) Mapjoin fails on multiple partitions

Mapjoin fails on multiple partitions
------------------------------------

                 Key: HIVE-1814
                 URL: https://issues.apache.org/jira/browse/HIVE-1814
             Project: Hive
          Issue Type: Bug
            Reporter: Romain Thibaux


This query, which restricts both tables to a single partition, works:

set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
SELECT /*+ MAPJOIN(b) */ a.field_a, b.field_b
FROM table_a a
JOIN table_b b
ON a.ds = '2010-08-30' AND b.ds = '2010-08-30' AND a.user = b.user;


This query, which spans multiple partitions, fails with a NullPointerException:

set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
SELECT /*+ MAPJOIN(b) */ a.field_a, b.field_b
FROM table_a a
JOIN table_b b
ON a.ds >= '2010-08-30' AND b.ds <= '2010-09-30' AND b.ds >= '2010-08-30' AND b.ds <= '2010-09-30' AND a.ds = b.ds AND a.user = b.user;


java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:622)
        at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:121)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:118)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)


