Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/03/06 20:36:38 UTC
[jira] [Created] (DRILL-2398) IS NOT DISTINCT FROM predicate returns incorrect result when used as a join filter

Victoria Markman created DRILL-2398:
---------------------------------------

             Summary: IS NOT DISTINCT FROM predicate returns incorrect result when used as a join filter
                 Key: DRILL-2398
                 URL: https://issues.apache.org/jira/browse/DRILL-2398
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
            Reporter: Victoria Markman
            Assignee: Jinfeng Ni
            Priority: Critical


count(*) should return 0, but the query below returns an empty result (no row at all) instead:
{code}
0: jdbc:drill:schema=dfs> select
. . . . . . . . . . . . >         count(*)
. . . . . . . . . . . . > from
. . . . . . . . . . . . >         j1 INNER JOIN j2 ON
. . . . . . . . . . . . >         ( j1.c_double = j2.c_double)
. . . . . . . . . . . . > where
. . . . . . . . . . . . >         j1.c_bigint IS NOT DISTINCT FROM j2.c_bigint
. . . . . . . . . . . . > ;
+------------+
|   EXPR$0   |
+------------+
+------------+
{code}
These are the c_bigint values of the joined rows. No pair is equal, so no row satisfies the predicate:
{code}
0: jdbc:drill:schema=dfs> select j1.c_bigint, j2.c_bigint, count(*) from j1 INNER JOIN j2 ON (j1.c_double = j2.c_double) group by j1.c_bigint, j2.c_bigint;
+------------+------------+------------+
|  c_bigint  | c_bigint1  |   EXPR$1   |
+------------+------------+------------+
| 460194667  | -498749284 | 1          |
| 464547172  | -498828740 | 1          |
| 467451850  | -498966611 | 2          |
| 471050029  | -499154096 | 3          |
| 472873799  | -499233550 | 3          |
| 475698977  | -499395929 | 2          |
| 478986584  | -499564607 | 1          |
| 488139464  | -499763274 | 3          |
| 498214699  | -499871720 | 2          |
+------------+------------+------------+
9 rows selected (0.339 seconds)
{code}
The IS DISTINCT FROM predicate returns the correct result:
{code}
select
        count(*)
from
        j1 INNER JOIN j2 ON
        ( j1.c_double = j2.c_double)
where
        j1.c_bigint IS DISTINCT FROM j2.c_bigint
{code}

Explain plan for query that returns incorrect result:
{code}
00-01      StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-02        Project($f0=[0])
00-03          SelectionVectorRemover
00-04            Filter(condition=[CAST(CASE(IS NULL($1), IS NULL($3), IS NULL($3), IS NULL($1), =($1, $3))):BOOLEAN NOT NULL])
00-05              HashJoin(condition=[=($0, $2)], joinType=[inner])
00-07                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/joins/j1]], selectionRoot=/joins/j1, numFiles=1, columns=[`c_double`, `c_bigint`]]])
00-06                Project(c_double0=[$0], c_bigint0=[$1])
00-08                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/joins/j2]], selectionRoot=/joins/j2, numFiles=1, columns=[`c_double`, `c_bigint`]]])
{code}
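
For reference, the CASE expression in the Filter at 00-04 is the usual null-safe equality expansion of IS NOT DISTINCT FROM. The sketch below is not Drill code; it is a minimal Python model of that expansion (the function name is made up for illustration), applied to the grouped values listed earlier in this report to show which count is expected:

```python
from typing import Optional

def is_not_distinct_from(a: Optional[int], b: Optional[int]) -> bool:
    """Null-safe equality, mirroring the CASE in the plan:
    CASE WHEN a IS NULL THEN b IS NULL
         WHEN b IS NULL THEN a IS NULL
         ELSE a = b END
    """
    if a is None:
        return b is None
    if b is None:
        return a is None  # a is known to be non-NULL here, so this is False
    return a == b

# (j1.c_bigint, j2.c_bigint, row count), copied from the grouped output above.
pairs = [
    (460194667, -498749284, 1),
    (464547172, -498828740, 1),
    (467451850, -498966611, 2),
    (471050029, -499154096, 3),
    (472873799, -499233550, 3),
    (475698977, -499395929, 2),
    (478986584, -499564607, 1),
    (488139464, -499763274, 3),
    (498214699, -499871720, 2),
]

# Rows that pass the IS NOT DISTINCT FROM filter.
expected_count = sum(n for a, b, n in pairs if is_not_distinct_from(a, b))
print(expected_count)
```

Under this model no joined row passes the filter, so the aggregate should produce a single row containing 0 rather than an empty result.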



