You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Ruben Q L (Jira)" <ji...@apache.org> on 2023/06/28 20:14:00 UTC

[jira] [Commented] (CALCITE-5732) EnumerableHashJoin and EnumerableMergeJoin on composite key return rows matching condition 'null = null'

    [ https://issues.apache.org/jira/browse/CALCITE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17738285#comment-17738285 ] 

Ruben Q L commented on CALCITE-5732:
------------------------------------

[~viggoc] since this is not a regression (it seems indeed an old bug), there is no PR proposed to fix it, and we are preparing the release process of 1.35, I'm removing the "fixVersion=1.35" of the ticket. But this is definitely a true problem that we need to take care of.

> EnumerableHashJoin and EnumerableMergeJoin on composite key return rows matching condition 'null = null'
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-5732
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5732
>             Project: Calcite
>          Issue Type: Bug
>          Components: linq4j
>            Reporter: Viggo Chen
>            Priority: Major
>             Fix For: 1.35.0
>
>
> In `EnumerableJoinTest#testMergeJoinWithCompositeKeyAndNullValues`, the query is like 
> {code:java}
> select 
>   emps.empid
> from 
>   emps a join emps b
> on a.deptno = b.deptno
> and a.commission = b.commission;{code}
> and the data is like 
> {code:java}
>   INSERT INTO "emps" VALUES (100, 10, 'Bill', 10000, 1000);
>   INSERT INTO "emps" VALUES (200, 20, 'Eric', 8000, 500);
>   INSERT INTO "emps" VALUES (150, 10, 'Sebastian', 7000, null);
>   INSERT INTO "emps" VALUES (110, 10, 'Theodore', 11500, 250); {code}
> And row with empid = 150 is in expected result. Is this the expected result of join with null condition.
> Whats more hash join result with condition a.deptno = b.deptno and a.commission = b.commission is same as merge join. And if there is just one condition a.commission = b.commission, the result do not include empid = 150.
>  
> Here is a unit test for it
> {code:java}
> @Test void testHashJoinWithCompositeKeyAndNullValues() {
>   // Both join side 'commission' a limited to null, so a.commission = b.commission should always be false.
>   // So all columns in right table b are expected to be null, this sql should result in 0 rows.
>   final String sql = "select * from\n"
>       + " (select empid, salary, commission from emps where commission is null) as a\n"
>       + " left join\n"
>       + " (select empid, salary, commission from emps where commission is null) as b\n"
>       + " on a.salary = b.salary and a.commission = b.commission\n"
>       + " where b.empid is not null";
>   CalciteAssert.that()
>       .with(CalciteConnectionProperty.LEX, Lex.JAVA)
>       .with(CalciteConnectionProperty.FORCE_DECORRELATE, false)
>       .withSchema("s", new ReflectiveSchema(new HrSchemaBig()))
>       .query(sql)
>       .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
>         planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
>         planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
>       })
>       .explainContains("EnumerableHashJoin")
>       .returnsCount(0)
>   ;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)