You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Vladimir Sitnikov (JIRA)" <ji...@apache.org> on 2018/08/23 20:59:00 UTC

[jira] [Commented] (CALCITE-2271) Join of two views with window aggregates produces incorrect results or NPE

    [ https://issues.apache.org/jira/browse/CALCITE-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590799#comment-16590799 ] 

Vladimir Sitnikov commented on CALCITE-2271:
--------------------------------------------

I have reviewed the patch, and it looks like:
 1) The bug is there.
 2) It is caused by org.apache.calcite.adapter.enumerable.EnumerableJoin#implement + org.apache.calcite.linq4j.tree.BlockBuilder#append(java.lang.String, org.apache.calcite.linq4j.tree.BlockStatement)

In other words, EnumerableJoin#implement implements left side of the join, then right side of the join, then calls {{join}}.

The problem is "left" and "right" part of the join might *unintentionally* share variables/expressions.

Alexey suggests to make variable names more unique, so it passes through, however I think we'd better augment {{BlockBuilder#append}} logic.

Below you can find the code for the test case in the description (just in case),
{{BlockBuilder#append}} combines both blocks in a single one, and it might cause variable names to collide.

I suggest to update {{#append(BasicBlock)}} so it uses the following (==I suggest {{BlockBuilder}} to keep nested braces to avoid namespace clash between two sides of the join):
{code:java}
...
 var leftOutput;
 { // code for left block
   leftOutput = ...;
 }
 var rightOutput;
 { // code for the right block
   rightOutput = ...
 }
{code}
[~julianhyde], is it ok with you?


{code:java}
// LeftPartOfTheJoin.java
{
  final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable = org.apache.calcite.linq4j.Linq4j.asEnumerable(new Integer[] {
    0});
  final org.apache.calcite.linq4j.AbstractEnumerable child = new org.apache.calcite.linq4j.AbstractEnumerable(){
    public org.apache.calcite.linq4j.Enumerator<java.util.Map> enumerator() {
      return new org.apache.calcite.linq4j.Enumerator<java.util.Map>(){
          public final org.apache.calcite.linq4j.Enumerator<int> inputEnumerator = _inputEnumerable.enumerator();
          public void reset() {
            inputEnumerator.reset();
          }

          public boolean moveNext() {
            return inputEnumerator.moveNext();
          }

          public void close() {
            inputEnumerator.close();
          }

          public Object current() {
            final java.util.LinkedHashMap _map = new java.util.LinkedHashMap();
            _map.put($L4J$C$Integer_valueOf_1_, $L4J$C$Integer_valueOf_1_);
            _map.put($L4J$C$Integer_valueOf_2_, $L4J$C$Integer_valueOf_2_);
            return _map;
          }

          static final Integer $L4J$C$Integer_valueOf_1_ = Integer.valueOf(1);
          static final Integer $L4J$C$Integer_valueOf_2_ = Integer.valueOf(2);
        };
    }

  };
  int prevStart0;
  int prevEnd0;
  final java.util.Comparator comparator = new java.util.Comparator(){
    public int compare(org.apache.calcite.runtime.FlatLists.ComparableList v0, org.apache.calcite.runtime.FlatLists.ComparableList v1) {
      final int c;
      c = org.apache.calcite.runtime.Utilities.compare(org.apache.calcite.runtime.SqlFunctions.toInt(v0.get(0)), org.apache.calcite.runtime.SqlFunctions.toInt(v1.get(0)));
      if (c != 0) {
        return c;
      }
      return 0;
    }

    public int compare(Object o0, Object o1) {
      return this.compare((org.apache.calcite.runtime.FlatLists.ComparableList) o0, (org.apache.calcite.runtime.FlatLists.ComparableList) o1);
    }

  };
  final java.util.List tempList = (java.util.List) child.selectMany(org.apache.calcite.runtime.SqlFunctions.flatProduct(new int[] {
    2}, false, new org.apache.calcite.runtime.SqlFunctions.FlatProductInputType[] {
    org.apache.calcite.runtime.SqlFunctions.FlatProductInputType.MAP})).into(new java.util.ArrayList());
  final java.util.Iterator<V[]> iterator = org.apache.calcite.runtime.SortedMultiMap.singletonArrayIterator(comparator, tempList);
  final java.util.ArrayList _list = new java.util.ArrayList(
    tempList.size());
  final org.apache.calcite.linq4j.function.Function1 _keySelector0 = new org.apache.calcite.linq4j.function.Function1() {
    public int apply(org.apache.calcite.runtime.FlatLists.ComparableList v) {
      return org.apache.calcite.runtime.SqlFunctions.toInt(v.get(0));
    }
    public Object apply(Object v) {
      return apply(
        (org.apache.calcite.runtime.FlatLists.ComparableList) v);
    }
  }
  ;
  final java.util.Comparator<T> _keyComparator0 = org.apache.calcite.linq4j.function.Functions.nullsComparator(false, false);
  long DENSE__RANKa0s0w0;
  Long DENSE__RANKa0w0 = (Long) null;
  DENSE__RANKa0s0w0 = 0L;
  while (iterator.hasNext()) {
    final Object[] _rows = (Object[]) iterator.next();
    prevStart0 = -1;
    prevEnd0 = 2147483647;
    for (int i = 0; i < _rows.length; ++i) {
      final org.apache.calcite.runtime.FlatLists.ComparableList row = (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[i];
      final int end = org.apache.calcite.runtime.BinarySearch.upperBound(_rows, (Integer) row.get(0), i, _rows.length - 1, _keySelector0, _keyComparator0);
      if (end != prevEnd0) {
        int actualStart;
        if (end < prevEnd0) {
          actualStart = 0;
          DENSE__RANKa0s0w0 = 0L;
        } else {
          actualStart = prevEnd0 + 1;
        }
        prevEnd0 = end;
        for (int j = actualStart; j <= end; ++j) {
          if (j > 0) {
            if (comparator.compare((org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j - 1], (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j]) < 0) {
              DENSE__RANKa0s0w0 = DENSE__RANKa0s0w0 + 1;
            }
          }
        }
        DENSE__RANKa0w0 = Long.valueOf(DENSE__RANKa0s0w0 + 1);
      }
      _list.add(new Object[] {
        (Object) row.get(0),
        (Object) row.get(1),
        DENSE__RANKa0w0});
    }
  }
  tempList.clear();
  final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable0 = org.apache.calcite.linq4j.Linq4j.asEnumerable(_list);
  return new org.apache.calcite.linq4j.AbstractEnumerable(){
      public org.apache.calcite.linq4j.Enumerator<Object[]> enumerator() {
        return new org.apache.calcite.linq4j.Enumerator<Object[]>(){
            public final org.apache.calcite.linq4j.Enumerator<Object[]> inputEnumerator = _inputEnumerable0.enumerator();
            public void reset() {
              inputEnumerator.reset();
            }

            public boolean moveNext() {
              return inputEnumerator.moveNext();
            }

            public void close() {
              inputEnumerator.close();
            }

            public Object current() {
              final Object[] current = (Object[]) inputEnumerator.current();
              return new Object[] {
                  current[2],
                  current[0],
                  org.apache.calcite.runtime.SqlFunctions.toInt(current[0]) + 1};
            }

          };
      }

    };
}
{code}
{code:java}
// RightPartOfTheJoin.java
{
  final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable = org.apache.calcite.linq4j.Linq4j.asEnumerable(new Integer[] {
    0});
  final org.apache.calcite.linq4j.AbstractEnumerable child = new org.apache.calcite.linq4j.AbstractEnumerable(){
    public org.apache.calcite.linq4j.Enumerator<java.util.Map> enumerator() {
      return new org.apache.calcite.linq4j.Enumerator<java.util.Map>(){
          public final org.apache.calcite.linq4j.Enumerator<int> inputEnumerator = _inputEnumerable.enumerator();
          public void reset() {
            inputEnumerator.reset();
          }

          public boolean moveNext() {
            return inputEnumerator.moveNext();
          }

          public void close() {
            inputEnumerator.close();
          }

          public Object current() {
            final java.util.LinkedHashMap _map = new java.util.LinkedHashMap();
            _map.put($L4J$C$Integer_valueOf_2_, $L4J$C$Integer_valueOf_2_);
            return _map;
          }

          static final Integer $L4J$C$Integer_valueOf_2_ = Integer.valueOf(2);
        };
    }

  };
  int prevStart1;
  int prevEnd1;
  final java.util.Comparator comparator = new java.util.Comparator(){
    public int compare(org.apache.calcite.runtime.FlatLists.ComparableList v0, org.apache.calcite.runtime.FlatLists.ComparableList v1) {
      final int c;
      c = org.apache.calcite.runtime.Utilities.compare(org.apache.calcite.runtime.SqlFunctions.toInt(v0.get(0)), org.apache.calcite.runtime.SqlFunctions.toInt(v1.get(0)));
      if (c != 0) {
        return c;
      }
      return 0;
    }

    public int compare(Object o0, Object o1) {
      return this.compare((org.apache.calcite.runtime.FlatLists.ComparableList) o0, (org.apache.calcite.runtime.FlatLists.ComparableList) o1);
    }

  };
  final java.util.List tempList = (java.util.List) child.selectMany(org.apache.calcite.runtime.SqlFunctions.flatProduct(new int[] {
    2}, false, new org.apache.calcite.runtime.SqlFunctions.FlatProductInputType[] {
    org.apache.calcite.runtime.SqlFunctions.FlatProductInputType.MAP})).into(new java.util.ArrayList());
  final java.util.Iterator<V[]> iterator = org.apache.calcite.runtime.SortedMultiMap.singletonArrayIterator(comparator, tempList);
  final java.util.ArrayList _list = new java.util.ArrayList(
    tempList.size());
  final org.apache.calcite.linq4j.function.Function1 _keySelector0 = new org.apache.calcite.linq4j.function.Function1() {
    public int apply(org.apache.calcite.runtime.FlatLists.ComparableList v) {
      return org.apache.calcite.runtime.SqlFunctions.toInt(v.get(0));
    }
    public Object apply(Object v) {
      return apply(
        (org.apache.calcite.runtime.FlatLists.ComparableList) v);
    }
  }
  ;
  final java.util.Comparator<T> _keyComparator0 = org.apache.calcite.linq4j.function.Functions.nullsComparator(false, false);
  long DENSE__RANKa0s0w0;
  Long DENSE__RANKa0w0 = (Long) null;
  DENSE__RANKa0s0w0 = 0L;
  while (iterator.hasNext()) {
    final Object[] _rows = (Object[]) iterator.next();
    prevStart1 = -1;
    prevEnd1 = 2147483647;
    for (int i = 0; i < _rows.length; ++i) {
      final org.apache.calcite.runtime.FlatLists.ComparableList row = (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[i];
      final int end = org.apache.calcite.runtime.BinarySearch.upperBound(_rows, (Integer) row.get(0), i, _rows.length - 1, _keySelector0, _keyComparator0);
      if (end != prevEnd1) {
        int actualStart;
        if (end < prevEnd1) {
          actualStart = 0;
          DENSE__RANKa0s0w0 = 0L;
        } else {
          actualStart = prevEnd1 + 1;
        }
        prevEnd1 = end;
        for (int j = actualStart; j <= end; ++j) {
          if (j > 0) {
            if (comparator.compare((org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j - 1], (org.apache.calcite.runtime.FlatLists.ComparableList) _rows[j]) < 0) {
              DENSE__RANKa0s0w0 = DENSE__RANKa0s0w0 + 1;
            }
          }
        }
        DENSE__RANKa0w0 = Long.valueOf(DENSE__RANKa0s0w0 + 1);
      }
      _list.add(new Object[] {
        (Object) row.get(0),
        (Object) row.get(1),
        DENSE__RANKa0w0});
    }
  }
  tempList.clear();
  final org.apache.calcite.linq4j.Enumerable<T> _inputEnumerable0 = org.apache.calcite.linq4j.Linq4j.asEnumerable(_list);
  return new org.apache.calcite.linq4j.AbstractEnumerable(){
      public org.apache.calcite.linq4j.Enumerator<Object[]> enumerator() {
        return new org.apache.calcite.linq4j.Enumerator<Object[]>(){
            public final org.apache.calcite.linq4j.Enumerator<Object[]> inputEnumerator = _inputEnumerable0.enumerator();
            public void reset() {
              inputEnumerator.reset();
            }

            public boolean moveNext() {
              return inputEnumerator.moveNext();
            }

            public void close() {
              inputEnumerator.close();
            }

            public Object current() {
              final Object[] current = (Object[]) inputEnumerator.current();
              return new Object[] {
                  current[2],
                  current[0]};
            }

          };
      }

    };
}
{code}


> Join of two views with window aggregates produces incorrect results or NPE
> --------------------------------------------------------------------------
>
>                 Key: CALCITE-2271
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2271
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.16.0
>            Reporter: Alexey Makhmutov
>            Assignee: Julian Hyde
>            Priority: Major
>
> This is a continuation of CALCITE-2081. The problem wasn't really solved in scope of that ticket:
>  # It produces incorrect results, even in the unit-test which was included into the fix (function last_value(empid) shouldn't produce NULL in such query).
>  # NPE is still present for more complex queries with joined window functions (e.g. if window functions with the same name are used in both part of the join).
> Here is the example of the query, which produces NPE in 1.16.0:
> {code:sql}
> select 
>  t1.l, t1.key, t2.key
> from 
>  (
>   select 
>    dense_rank() over (order by key) l, 
>    key 
>   from 
>    unnest(map[1,1,2,2]) k
>  ) t1 
>  join 
>  (
>   select 
>    dense_rank() over(order by key) l, 
>    key 
>   from 
>    unnest(map[2,2]) k
>  ) t2 on (t1.l = t2.l and t1.key + 1 = t2.key){code}
> The problem is still the same - windows produces several declarations with non-unique names, which are then incorrectly merged in BlockBuilder.append method (declaration name substitution conflicts with optimization logic).
> Two variables were fixed in scope of CALCITE-2081 - 'prevStart' and 'prevEnd', but other variables are still non-unique: result and state variables (aXwY and aXsYwZ which cause NPE during initialization) and 'list' variable (causing wrong results as both aggregates are storing results into the same variable).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)