You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2020/02/06 13:49:06 UTC

[GitHub] [calcite] rubenada opened a new pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

rubenada opened a new pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788
 
 
   Jira ticket: [CALCITE-3285](https://issues.apache.org/jira/browse/CALCITE-3285).
   Added an extra predicate for non-equi condition in EnumerableDefaults#MergeJoinEnumerator (it will be null in case of equi-joins).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#issuecomment-601194162
 
 
   Thanks, it looks good to me.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393636498
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
+      super(ImmutableList.of(outer, inner));
+      this.resultSelector = resultSelector;
+    }
+
+    @Override public TResult current() {
+      final List<Object> list = Arrays.asList(elements.clone());
 
 Review comment:
   Transforming `elements` array into list. Just applying the same approach as `Linq4j#CartesianProductListEnumerator`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393637291
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
 
 Review comment:
   Not possible if we want to extend `CartesianProductEnumerator`:
   ```
   /**
    * Enumerator over the cartesian product of enumerators.
    * @param <T> Input element type
    * @param <E> Element type
    */
   public abstract class CartesianProductEnumerator<T, E> implements Enumerator<E> {...}
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393632630
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
+      super(ImmutableList.of(outer, inner));
+      this.resultSelector = resultSelector;
+    }
+
+    @Override public TResult current() {
+      final List<Object> list = Arrays.asList(elements.clone());
 
 Review comment:
   What is the purpose of `Arrays.asList` and `.clone()` here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393605611
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
+  public static <TSource, TInner, TKey extends Comparable<TKey>, TResult> Enumerable<TResult>
+      mergeJoin(final Enumerable<TSource> outer,
+      final Enumerable<TInner> inner,
+      final Function1<TSource, TKey> outerKeySelector,
+      final Function1<TInner, TKey> innerKeySelector,
+      final Predicate2<TSource, TInner> extraPredicate,
 
 Review comment:
   documentation updated

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada merged pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada merged pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393605447
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   Refactored to re-use nested loop join

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#issuecomment-600554958
 
 
   Thanks for your feedback @vlsi !
   Do you think the PR is in a good shape in order to be merged?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392881695
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
 
 Review comment:
   Ok, I'll change it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393652306
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
 
 Review comment:
   Modified: cast in constructor

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393633611
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
 
 Review comment:
   Should it be typed like `Enumerator<TOuter>`?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393068880
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   Let me rephrase it differently:
   
   1) You calculate `lefts` and `rights` using merge join algorithm on `equal` predicates
   2) Then you have `lefts` and `rights`, and there's nothing from `merge` join left that you can do about it. The rest could be exactly the same as in `hash semi-join`.
   
   `for-for-if` logic performs exactly the same actions as the existing `semiJoin` and `antoJoin` methods. Why duplicate code (bugs)?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393051599
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   Is it different from the existing `org.apache.calcite.linq4j.EnumerableDefaults#semiJoin` and `#antiJoin` methods?
   
   Why do you reimplement the same thing?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#issuecomment-601195948
 
 
   Thanks @vlsi , your suggestions have improved the patch very much.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393058613
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   `org.apache.calcite.linq4j.EnumerableDefaults#semiJoin` and `#antiJoin` methods implement hashSemiJoin and hashAntiJoin, they are used by EnumerableHashJoin, the idea is to provide another alternative implementation of semiJoin/antiJoin via MergeJoin algorithm (as they exist for NestedLoopJoin, CorrelateJoin or CorrelateBatchJoin)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393093157
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   > I don't think hashJoin code is the appropriate module here, since there is no need to create the hash structure to find matching keys
   
   Well, my point is the code seems to duplicate the existing code, so it looks like a bug.
   Of course, the only predicates left are `non-equality`, so there's nothing to hash. 
   
   However, the key point here is to avoid code duplication.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392875012
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
+  public static <TSource, TInner, TKey extends Comparable<TKey>, TResult> Enumerable<TResult>
+      mergeJoin(final Enumerable<TSource> outer,
+      final Enumerable<TInner> inner,
+      final Function1<TSource, TKey> outerKeySelector,
+      final Function1<TInner, TKey> innerKeySelector,
+      final Predicate2<TSource, TInner> extraPredicate,
 
 Review comment:
   You're right, I'll document that we use a predicate rather that a filter on top of join results in order to support other join types in the (near) future.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393639468
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
+      super(ImmutableList.of(outer, inner));
+      this.resultSelector = resultSelector;
+    }
+
+    @Override public TResult current() {
+      final List<Object> list = Arrays.asList(elements.clone());
 
 Review comment:
   Note: `list` never escapes this method, so there's no need to copy it.
   There's no need to create `list` as well.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392866373
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   How is this different from a filter on top of `Linq4j.product`?
   
   If it is not different, then please remove manual for-for.
   If it is different please add a clarification comment.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392871743
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   At the current state (only INNER join supported), we could have a filter on top of the product. However, we have other join types in the backlog that can / will be supported by EnumerableMergeJoin (SEMI-join, see [CALCITE-3833](https://issues.apache.org/jira/browse/CALCITE-3833); and ANTI-join, see  [CALCITE-3834](https://issues.apache.org/jira/browse/CALCITE-3834)). For those join types (which do not project the right-hand-side of the join), we cannot use the strategy of applying a filter on top of the product, so I am just preparing the ground here with an homogeneous strategy that can work on other join types.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393085877
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   Ok, I think I understand your point. We should re-use existing code instead of the `for-for-if`. However, I don't think hashJoin code is the appropriate module here, since there is no need to create the hash structure to find matching keys (because we already have them with the merge join algorithm). Instead, we should use the `nestedLoopJoin(lefts, rights, extraPredicateWithNonEquiClauses)` which performs precisely this `for-for-if` process.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393605516
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
 
 Review comment:
   added apiguardian

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392914875
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   The next step will be implementing SEMI join type with Enumerable Merge Join. In order to do so, we would just a need a `break` inside the `if` (to avoid 'left' duplicates, which SEMI join must not have). So this algorithm works for INNER join and can be easily adapted for other types, like SEMI.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393638910
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
 
 Review comment:
   Well, can you cast when calling the `super` constructor?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393652622
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
+      super(ImmutableList.of(outer, inner));
+      this.resultSelector = resultSelector;
+    }
+
+    @Override public TResult current() {
+      final List<Object> list = Arrays.asList(elements.clone());
 
 Review comment:
   modified: list no longer generated

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392909998
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3911,9 +3936,33 @@ private boolean advance() {
         }
         rights.add(right);
       }
-      cartesians = Linq4j.product(
-          ImmutableList.of(Linq4j.enumerator(lefts),
-              Linq4j.enumerator(rights)));
+
+      if (extraPredicate == null) {
+        cartesians = Linq4j.product(
+            ImmutableList.of(Linq4j.enumerator(lefts),
+                Linq4j.enumerator(rights)));
+      } else {
+        // we must verify the non equi-join predicate
+        final List<List<Object>> results = new ArrayList<>();
+        for (TSource currentLeft : lefts) {
+          for (TInner currentRight : rights) {
+            if (extraPredicate.apply(currentLeft, currentRight)) {
+              final List<Object> result = new ArrayList<>(2);
+              result.add(currentLeft);
+              result.add(currentRight);
+              results.add(result);
+            }
+          }
+        }
+        if (results.isEmpty()) {
+          if (done) {
+            return false;
+          }
+          return advance();
+        } else {
+          cartesians = Linq4j.enumerator(results);
 
 Review comment:
   Implementation-wise it looks the same.
   I see no reason why do you use `for-for-if` here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r394265327
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
+  public static <TSource, TInner, TKey extends Comparable<TKey>, TResult> Enumerable<TResult>
+      mergeJoin(final Enumerable<TSource> outer,
+      final Enumerable<TInner> inner,
+      final Function1<TSource, TKey> outerKeySelector,
+      final Function1<TInner, TKey> innerKeySelector,
+      final Predicate2<TSource, TInner> extraPredicate,
 
 Review comment:
   Thanks. However, can you please add `@param extraPredicate` that would clarify the meaning of `extraPredicate` in a couple of words?
   
   The documentation does not need to clarfy all the implementation details, however, the naming of `Predicate2<TSource, TInner> extraPredicate` provides absolutely no clue one meaning of the predicate.
   
   Well, I do see it is a predicate. I can judge that by class name. As a user of that API I want to know what is expected to be passed there. I want to know if `nulls` are allowed or not and so on.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393632630
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
+      super(ImmutableList.of(outer, inner));
+      this.resultSelector = resultSelector;
+    }
+
+    @Override public TResult current() {
+      final List<Object> list = Arrays.asList(elements.clone());
 
 Review comment:
   What is the purpose of `.clone()` here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393638910
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
 
 Review comment:
   Well, can you cast when calling the constructor?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392866941
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
+  public static <TSource, TInner, TKey extends Comparable<TKey>, TResult> Enumerable<TResult>
+      mergeJoin(final Enumerable<TSource> outer,
+      final Enumerable<TInner> inner,
+      final Function1<TSource, TKey> outerKeySelector,
+      final Function1<TInner, TKey> innerKeySelector,
+      final Predicate2<TSource, TInner> extraPredicate,
 
 Review comment:
   Please add the relevant documentation that clarifies why this `extraPredicate` is needed and how it is different from a predicate on top of join results.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392869470
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
 
 Review comment:
   I would suggest `@API(...)`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r392912236
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
 
 Review comment:
   Unfortunately, when I try to import `org.apiguardian.api.API;` and use  `@API(since = "1.23", status = API.Status.EXPERIMENTAL)` in EnumerableDefaults (which belongs to linq4j), I get a compilation error. Is it possible to use that library there? Do I need to change gradle config files to do so?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r394282142
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
+  public static <TSource, TInner, TKey extends Comparable<TKey>, TResult> Enumerable<TResult>
+      mergeJoin(final Enumerable<TSource> outer,
+      final Enumerable<TInner> inner,
+      final Function1<TSource, TKey> outerKeySelector,
+      final Function1<TInner, TKey> innerKeySelector,
+      final Predicate2<TSource, TInner> extraPredicate,
 
 Review comment:
   javadoc updated: added `@param extraPredicate`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on issue #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#issuecomment-601171344
 
 
   @vlsi I have carried out all your suggested modifications, do you think we could merge the patch?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
rubenada commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393640721
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -3951,6 +3982,24 @@ public void close() {
     }
   }
 
+  private static class CartesianProductJoinEnumerator<TResult, TOuter, TInner>
+      extends CartesianProductEnumerator<Object, TResult> {
+    private final Function2<TOuter, TInner, TResult> resultSelector;
+
+    CartesianProductJoinEnumerator(Function2<TOuter, TInner, TResult> resultSelector,
+                                   Enumerator<Object> outer, Enumerator<Object> inner) {
+      super(ImmutableList.of(outer, inner));
+      this.resultSelector = resultSelector;
+    }
+
+    @Override public TResult current() {
+      final List<Object> list = Arrays.asList(elements.clone());
 
 Review comment:
   you're right, I'll change it

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions

Posted by GitBox <gi...@apache.org>.
vlsi commented on a change in pull request #1788: [CALCITE-3285] EnumerableMergeJoin should support non-equi join conditions
URL: https://github.com/apache/calcite/pull/1788#discussion_r393065470
 
 

 ##########
 File path: linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java
 ##########
 @@ -1971,6 +1972,26 @@ private void closeInner() {
       final Function2<TSource, TInner, TResult> resultSelector,
       boolean generateNullsOnLeft,
       boolean generateNullsOnRight) {
+    return mergeJoin(outer, inner, outerKeySelector, innerKeySelector, null,
+        resultSelector, generateNullsOnLeft, generateNullsOnRight);
+  }
+
+  /**
+   * Joins two inputs that are sorted on the key, with an extra predicate for non equi-join
+   * conditions (in case of equi-join, it will be null).
+   * Inputs must sorted in ascending order, nulls last.
+   * NOTE: The current API is experimental and subject to change without notice.
+   */
+  @Experimental
 
 Review comment:
   It comes from `apiguardian`, so you need to add the relevant dependency to `linq4j`: https://github.com/apache/calcite/blob/d234626227954eefffe49f42abec65c649ffe3a6/core/build.gradle.kts#L44

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services