You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Vladimir Sitnikov (JIRA)" <ji...@apache.org> on 2018/08/23 20:59:00 UTC
[jira] [Commented] (CALCITE-2271) Join of two views with window
aggregates produces incorrect results or NPE
[ https://issues.apache.org/jira/browse/CALCITE-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590799#comment-16590799 ]
Vladimir Sitnikov commented on CALCITE-2271:
--------------------------------------------
I have reviewed the patch, and it looks like:
1) The bug is there.
2) It is caused by org.apache.calcite.adapter.enumerable.EnumerableJoin#implement + org.apache.calcite.linq4j.tree.BlockBuilder#append(java.lang.String, org.apache.calcite.linq4j.tree.BlockStatement)
In other words, EnumerableJoin#implement implements left side of the join, then right side of the join, then calls {{join}}.
The problem is "left" and "right" part of the join might *unintentionally* share variables/expressions.
Alexey suggests to make variable names more unique, so it passes through, however I think we'd better augment {{BlockBuilder#append}} logic.
Below you can find the code for the test case in the description (just in case),
{{BlockBuilder#append}} combines both blocks in a single one, and it might cause variable names to collide.
I suggest to update {{#append(BasicBlock)}} so it uses the following (==I suggest {{BlockBuilder}} to keep nested braces to avoid namespace clash between two sides of the join):
{code:java}
...
var leftOutput;
{ // code for left block
leftOutput = ...;
}
var rightOutput;
{ // code for the right block
rightOutput = ...
}
{code}
[~julianhyde], is it ok with you?
{code:java}
// LeftPartOfTheJoin.java
{
final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable = org.apache.calcite.linq4j.Linq4j.asEnumerable(new Integer[] {
0});
final org.apache.calcite.linq4j.AbstractEnumerable child = new org.apache.calcite.linq4j.AbstractEnumerable(){
public org.apache.calcite.linq4j.Enumerator<java.util.Map> enumerator() {
return new org.apache.calcite.linq4j.Enumerator<java.util.Map>(){
public final org.apache.calcite.linq4j.Enumerator<int> inputEnumerator = _inputEnumerable.enumerator();
public void reset() {
inputEnumerator.reset();
}
public boolean moveNext() {
return inputEnumerator.moveNext();
}
public void close() {
inputEnumerator.close();
}
public Object current() {
final java.util.LinkedHashMap _map = new java.util.LinkedHashMap();
_map.put($L4J$C$Integer_valueOf_1_, $L4J$C$Integer_valueOf_1_);
_map.put($L4J$C$Integer_valueOf_2_, $L4J$C$Integer_valueOf_2_);
return _map;
}
static final Integer $L4J$C$Integer_valueOf_1_ = Integer.valueOf(1);
static final Integer $L4J$C$Integer_valueOf_2_ = Integer.valueOf(2);
};
}
};
int prevStart0;
int prevEnd0;
final java.util.Comparator comparator = new java.util.Comparator(){
public int compare(org.apache.calcite.runtime.FlatLists.ComparableList v0, org.apache.calcite.runtime.FlatLists.ComparableList v1) {
final int c;
c = org.apache.calcite.runtime.Utilities.compare(org.apache.calcite.runtime.SqlFunctions.toInt(v0.get(0)), org.apache.calcite.runtime.SqlFunctions.toInt(v1.get(0)));
if (c != 0) {
return c;
}
return 0;
}
public int compare(Object o0, Object o1) {
return this.compare((org.apache.calcite.runtime.FlatLists.ComparableList) o0, (org.apache.calcite.runtime.FlatLists.ComparableList) o1);
}
};
final java.util.List tempList = (java.util.List) child.selectMany(org.apache.calcite.runtime.SqlFunctions.flatProduct(new int[] {
2}, false, new org.apache.calcite.runtime.SqlFunctions.FlatProductInputType[] {
org.apache.calcite.runtime.SqlFunctions.FlatProductInputType.MAP})).into(new java.util.ArrayList());
final java.util.Iterator<V[]> iterator = org.apache.calcite.runtime.SortedMultiMap.singletonArrayIterator(comparator, tempList);
final java.util.ArrayList _list = new java.util.ArrayList(
tempList.size());
final org.apache.calcite.linq4j.function.Function1 _keySelector0 = new org.apache.calcite.linq4j.function.Function1() {
public int apply(org.apache.calcite.runtime.FlatLists.ComparableList v) {
return org.apache.calcite.runtime.SqlFunctions.toInt(v.get(0));
}
public Object apply(Object v) {
return apply(
(org.apache.calcite.runtime.FlatLists.ComparableList) v);
}
}
;
final java.util.Comparator<T> _keyComparator0 = org.apache.calcite.linq4j.function.Functions.nullsComparator(false, false);
long DENSE__RANKa0s0w0;
Long DENSE__RANKa0w0 = (Long) null;
DENSE__RANKa0s0w0 = 0L;
while (iterator.hasNext()) {
final Object[] _rows = (Object[]) iterator.next();
prevStart0 = -1;
prevEnd0 = 2147483647;
for (int i = 0; i < _rows.length; ++i) {
final org.apache.calcite.runtime.FlatLists.ComparableList row = (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[i];
final int end = org.apache.calcite.runtime.BinarySearch.upperBound(_rows, (Integer) row.get(0), i, _rows.length - 1, _keySelector0, _keyComparator0);
if (end != prevEnd0) {
int actualStart;
if (end < prevEnd0) {
actualStart = 0;
DENSE__RANKa0s0w0 = 0L;
} else {
actualStart = prevEnd0 + 1;
}
prevEnd0 = end;
for (int j = actualStart; j <= end; ++j) {
if (j > 0) {
if (comparator.compare((org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j - 1], (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j]) < 0) {
DENSE__RANKa0s0w0 = DENSE__RANKa0s0w0 + 1;
}
}
}
DENSE__RANKa0w0 = Long.valueOf(DENSE__RANKa0s0w0 + 1);
}
_list.add(new Object[] {
(Object) row.get(0),
(Object) row.get(1),
DENSE__RANKa0w0});
}
}
tempList.clear();
final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable0 = org.apache.calcite.linq4j.Linq4j.asEnumerable(_list);
return new org.apache.calcite.linq4j.AbstractEnumerable(){
public org.apache.calcite.linq4j.Enumerator<Object[]> enumerator() {
return new org.apache.calcite.linq4j.Enumerator<Object[]>(){
public final org.apache.calcite.linq4j.Enumerator<Object[]> inputEnumerator = _inputEnumerable0.enumerator();
public void reset() {
inputEnumerator.reset();
}
public boolean moveNext() {
return inputEnumerator.moveNext();
}
public void close() {
inputEnumerator.close();
}
public Object current() {
final Object[] current = (Object[]) inputEnumerator.current();
return new Object[] {
current[2],
current[0],
org.apache.calcite.runtime.SqlFunctions.toInt(current[0]) + 1};
}
};
}
};
}
{code}
{code:java}
// RightPartOfTheJoin.java
{
final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable = org.apache.calcite.linq4j.Linq4j.asEnumerable(new Integer[] {
0});
final org.apache.calcite.linq4j.AbstractEnumerable child = new org.apache.calcite.linq4j.AbstractEnumerable(){
public org.apache.calcite.linq4j.Enumerator<java.util.Map> enumerator() {
return new org.apache.calcite.linq4j.Enumerator<java.util.Map>(){
public final org.apache.calcite.linq4j.Enumerator<int> inputEnumerator = _inputEnumerable.enumerator();
public void reset() {
inputEnumerator.reset();
}
public boolean moveNext() {
return inputEnumerator.moveNext();
}
public void close() {
inputEnumerator.close();
}
public Object current() {
final java.util.LinkedHashMap _map = new java.util.LinkedHashMap();
_map.put($L4J$C$Integer_valueOf_2_, $L4J$C$Integer_valueOf_2_);
return _map;
}
static final Integer $L4J$C$Integer_valueOf_2_ = Integer.valueOf(2);
};
}
};
int prevStart1;
int prevEnd1;
final java.util.Comparator comparator = new java.util.Comparator(){
public int compare(org.apache.calcite.runtime.FlatLists.ComparableList v0, org.apache.calcite.runtime.FlatLists.ComparableList v1) {
final int c;
c = org.apache.calcite.runtime.Utilities.compare(org.apache.calcite.runtime.SqlFunctions.toInt(v0.get(0)), org.apache.calcite.runtime.SqlFunctions.toInt(v1.get(0)));
if (c != 0) {
return c;
}
return 0;
}
public int compare(Object o0, Object o1) {
return this.compare((org.apache.calcite.runtime.FlatLists.ComparableList) o0, (org.apache.calcite.runtime.FlatLists.ComparableList) o1);
}
};
final java.util.List tempList = (java.util.List) child.selectMany(org.apache.calcite.runtime.SqlFunctions.flatProduct(new int[] {
2}, false, new org.apache.calcite.runtime.SqlFunctions.FlatProductInputType[] {
org.apache.calcite.runtime.SqlFunctions.FlatProductInputType.MAP})).into(new java.util.ArrayList());
final java.util.Iterator<V[]> iterator = org.apache.calcite.runtime.SortedMultiMap.singletonArrayIterator(comparator, tempList);
final java.util.ArrayList _list = new java.util.ArrayList(
tempList.size());
final org.apache.calcite.linq4j.function.Function1 _keySelector0 = new org.apache.calcite.linq4j.function.Function1() {
public int apply(org.apache.calcite.runtime.FlatLists.ComparableList v) {
return org.apache.calcite.runtime.SqlFunctions.toInt(v.get(0));
}
public Object apply(Object v) {
return apply(
(org.apache.calcite.runtime.FlatLists.ComparableList) v);
}
}
;
final java.util.Comparator<T> _keyComparator0 = org.apache.calcite.linq4j.function.Functions.nullsComparator(false, false);
long DENSE__RANKa0s0w0;
Long DENSE__RANKa0w0 = (Long) null;
DENSE__RANKa0s0w0 = 0L;
while (iterator.hasNext()) {
final Object[] _rows = (Object[]) iterator.next();
prevStart1 = -1;
prevEnd1 = 2147483647;
for (int i = 0; i < _rows.length; ++i) {
final org.apache.calcite.runtime.FlatLists.ComparableList row = (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[i];
final int end = org.apache.calcite.runtime.BinarySearch.upperBound(_rows, (Integer) row.get(0), i, _rows.length - 1, _keySelector0, _keyComparator0);
if (end != prevEnd1) {
int actualStart;
if (end < prevEnd1) {
actualStart = 0;
DENSE__RANKa0s0w0 = 0L;
} else {
actualStart = prevEnd1 + 1;
}
prevEnd1 = end;
for (int j = actualStart; j <= end; ++j) {
if (j > 0) {
if (comparator.compare((org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j - 1], (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j]) < 0) {
DENSE__RANKa0s0w0 = DENSE__RANKa0s0w0 + 1;
}
}
}
DENSE__RANKa0w0 = Long.valueOf(DENSE__RANKa0s0w0 + 1);
}
_list.add(new Object[] {
(Object) row.get(0),
(Object) row.get(1),
DENSE__RANKa0w0});
}
}
tempList.clear();
final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable0 = org.apache.calcite.linq4j.Linq4j.asEnumerable(_list);
return new org.apache.calcite.linq4j.AbstractEnumerable(){
public org.apache.calcite.linq4j.Enumerator<Object[]> enumerator() {
return new org.apache.calcite.linq4j.Enumerator<Object[]>(){
public final org.apache.calcite.linq4j.Enumerator<Object[]> inputEnumerator = _inputEnumerable0.enumerator();
public void reset() {
inputEnumerator.reset();
}
public boolean moveNext() {
return inputEnumerator.moveNext();
}
public void close() {
inputEnumerator.close();
}
public Object current() {
final Object[] current = (Object[]) inputEnumerator.current();
return new Object[] {
current[2],
current[0]};
}
};
}
};
}
{code}
> Join of two views with window aggregates produces incorrect results or NPE
> --------------------------------------------------------------------------
>
> Key: CALCITE-2271
> URL: https://issues.apache.org/jira/browse/CALCITE-2271
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.16.0
> Reporter: Alexey Makhmutov
> Assignee: Julian Hyde
> Priority: Major
>
> This is a continuation of CALCITE-2081. The problem wasn't really solved in scope of that ticket:
> # It produces incorrect results, even in the unit-test which was included into the fix (function last_value(empid) shouldn't produce NULL in such query).
> # NPE is still present for more complex queries with joined window functions (e.g. if window functions with the same name are used in both part of the join).
> Here is the example of the query, which produces NPE in 1.16.0:
> {code:sql}
> select
> t1.l, t1.key, t2.key
> from
> (
> select
> dense_rank() over (order by key) l,
> key
> from
> unnest(map[1,1,2,2]) k
> ) t1
> join
> (
> select
> dense_rank() over(order by key) l,
> key
> from
> unnest(map[2,2]) k
> ) t2 on (t1.l = t2.l and t1.key + 1 = t2.key){code}
> The problem is still the same - windows produces several declarations with non-unique names, which are then incorrectly merged in BlockBuilder.append method (declaration name substitution conflicts with optimization logic).
> Two variables were fixed in scope of CALCITE-2081 - 'prevStart' and 'prevEnd', but other variables are still non-unique: result and state variables (aXwY and aXsYwZ which cause NPE during initialization) and 'list' variable (causing wrong results as both aggregates are storing results into the same variable).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)