Posted to dev@calcite.apache.org by "Ruben Q L (Jira)" <ji...@apache.org> on 2020/03/06 17:00:05 UTC
[jira] [Created] (CALCITE-3846) EnumerableMergeJoin: wrong comparison of composite key with null values
Ruben Q L created CALCITE-3846:
----------------------------------
Summary: EnumerableMergeJoin: wrong comparison of composite key with null values
Key: CALCITE-3846
URL: https://issues.apache.org/jira/browse/CALCITE-3846
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.22.0
Reporter: Ruben Q L
The problem can be reproduced with the following test in EnumerablesTest.java:
{code}
@Test public void testMergeJoinWithCompositeKeyAndNull() {
  assertThat(
      EnumerableDefaults.mergeJoin(
          Linq4j.asEnumerable(
              Arrays.asList(
                  new Emp(10, "A"),
                  new Emp(10, "B"),
                  new Emp(10, "C"),
                  new Emp(10, "D"),
                  new Emp(40, "X"),
                  new Emp(50, "A"))),
          Linq4j.asEnumerable(
              Arrays.asList(
                  new Dept(10, "C"),
                  new Dept(10, null),
                  new Dept(30, "A"),
                  new Dept(40, "X"))),
          e -> (Comparable) FlatLists.of(e.deptno, e.name),
          d -> (Comparable) FlatLists.of(d.deptno, d.name),
          (v0, v1) -> v0 + ", " + v1, false, false).toList().toString(),
      equalTo("[Emp(10, C), Dept(10, C),"
          + " Emp(40, X), Dept(40, X)]"));
}
{code}
The test fails with the following exception:
{code}
java.lang.IllegalStateException: mergeJoin assumes input sorted in ascending order, however [10, C] is greater than [10, null]
{code}
The problem is that the EnumerableMergeJoin implementation (i.e. EnumerableDefaults#mergeJoin) expects its inputs to be sorted in ascending order, nulls last (see EnumerableMergeJoinRule). In the case of a composite key, EnumerableMergeJoin represents keys as JavaRowFormat.LIST, which is a comparable list whose comparison is implemented via FlatLists.ComparableListImpl#compare. This method compares both lists item by item, but it considers a null item to be less than a non-null item. This is a de facto nulls-first collation, which contradicts the prerequisite of the mergeJoin algorithm. In the test above, the right input contains [10, C] followed by [10, null]: this is correctly sorted under nulls last, but the list comparison reports [10, C] as greater than [10, null], so mergeJoin concludes that its input is not sorted.
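The mismatch between the two collations can be illustrated with a minimal, standalone sketch. It does not use Calcite's classes: the class CompositeKeyNullOrdering and its methods are hypothetical names introduced only for illustration, and the item-by-item comparison is an assumption that mirrors the behavior described above, not the actual FlatLists.ComparableListImpl code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Calcite code.
public class CompositeKeyNullOrdering {

  /** Compares two key items, placing nulls first or last as requested. */
  static int compareItems(Object x, Object y, boolean nullsFirst) {
    if (x == null && y == null) {
      return 0;
    }
    if (x == null) {
      return nullsFirst ? -1 : 1;
    }
    if (y == null) {
      return nullsFirst ? 1 : -1;
    }
    @SuppressWarnings("unchecked")
    Comparable<Object> cx = (Comparable<Object>) x;
    return cx.compareTo(y);
  }

  /** Compares two composite keys item by item; the first differing item decides. */
  static int compareKeys(List<?> a, List<?> b, boolean nullsFirst) {
    int n = Math.min(a.size(), b.size());
    for (int i = 0; i < n; i++) {
      int c = compareItems(a.get(i), b.get(i), nullsFirst);
      if (c != 0) {
        return c;
      }
    }
    return Integer.compare(a.size(), b.size());
  }

  public static void main(String[] args) {
    List<Object> first = Arrays.asList(10, "C");
    List<Object> second = Arrays.asList(10, null);

    // Nulls first: [10, C] compares greater than [10, null],
    // so the pair looks out of order to a merge join.
    System.out.println("nulls first: " + compareKeys(first, second, true));

    // Nulls last: [10, C] compares less than [10, null],
    // which is the ordering mergeJoin expects.
    System.out.println("nulls last:  " + compareKeys(first, second, false));
  }
}
```

With the nulls-first flag the comparison of [10, C] against [10, null] is positive, matching the exception message; with nulls last it is negative, so the same pair counts as sorted in ascending order.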