You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@calcite.apache.org by GitBox <gi...@apache.org> on 2020/01/15 06:34:17 UTC

[GitHub] [calcite] amaliujia opened a new pull request #1761: [CALCITE-3737] HOP Table-valued Function

amaliujia opened a new pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761
 
 
   see:  https://issues.apache.org/jira/browse/CALCITE-3737

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r366745144
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlWindowTableFunction.java
 ##########
 @@ -44,28 +43,21 @@ public SqlWindowTableFunction(String name) {
         SqlFunctionCategory.SYSTEM);
   }
 
-  @Override public SqlOperandCountRange getOperandCountRange() {
-    return SqlOperandCountRanges.of(3);
-  }
-
-  @Override public boolean checkOperandTypes(SqlCallBinding callBinding,
 
 Review comment:
   Why we remove these methods? Each window table function has its own implementation?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367085261
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +800,86 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies hoping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hoping(Enumerator<Object[]> inputEnumerator,
+                                            int indexOfWatermarkedColumn,
+                                            long emitFrequency,
+                                            long intervalSize) {
+    return new AbstractEnumerable<Object[]>() {
+      @Override public Enumerator<Object[]> enumerator() {
+        return new HopEnumerator(inputEnumerator,
+            indexOfWatermarkedColumn, emitFrequency, intervalSize);
+      }
+    };
+  }
+
+  private static class HopEnumerator implements Enumerator<Object[]> {
+    private final Enumerator<Object[]> inputEnumerator;
+    private final int indexOfWatermarkedColumn;
+    private final long emitFrequency;
+    private final long intervalSize;
+    private LinkedList<Object[]> list;
+
+    HopEnumerator(Enumerator<Object[]> inputEnumerator,
+                  int indexOfWatermarkedColumn, long emitFrequency, long intervalSize) {
+      this.inputEnumerator = inputEnumerator;
+      this.indexOfWatermarkedColumn = indexOfWatermarkedColumn;
+      this.emitFrequency = emitFrequency;
+      this.intervalSize = intervalSize;
+      list = new LinkedList<>();
+    }
+
+    public Object[] current() {
+      if (list.size() > 0) {
+        return takeOne();
+      } else {
+        Object[] current = inputEnumerator.current();
+        List<Pair> windows = hopWindows(SqlFunctions.toLong(current[indexOfWatermarkedColumn]),
+            emitFrequency, intervalSize);
+        for (Pair window : windows) {
+          Object[] curWithWindow = new Object[current.length + 2];
+          System.arraycopy(current, 0, curWithWindow, 0, current.length);
+          curWithWindow[current.length] = window.left;
+          curWithWindow[current.length + 1] = window.right;
+          list.offer(curWithWindow);
+        }
+        return takeOne();
+      }
+
 
 Review comment:
   extra line

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r394104378
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java
 ##########
 @@ -2273,8 +2275,14 @@ public void unparse(
   /** DESCRIPTOR(column_name, ...). */
   public static final SqlOperator DESCRIPTOR = new SqlDescriptorOperator();
 
-  /** TUMBLE as a table-value function. */
-  public static final SqlFunction TUMBLE_TVF = new SqlWindowTableFunction(SqlKind.TUMBLE.name());
+  /** TUMBLE as a table function. */
+  public static final SqlFunction TUMBLE_TVF = new SqlTumbleTableFunction();
 
 Review comment:
   Just of curious, is the "TVF" a commonly-used abbreviation?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r394107792
 
 

 ##########
 File path: site/_docs/reference.md
 ##########
 @@ -1871,14 +1872,44 @@ is named as "fixed windowing".
 
 | Operator syntax      | Description
 |:-------------------- |:-----------
-| TUMBLE(table, DESCRIPTOR(column_name), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*. Tumbling is applied on table in which there is a watermarked column specified by descriptor.
+| TUMBLE(table, DESCRIPTOR(datetime), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*.
 
 Here is an example:
 `SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '1' MINUTE))`,
 will apply tumbling with 1 minute window size on rows from table orders. rowtime is the
 watermarked column of table orders that tells data completeness.
 
+#### HOP
+In streaming queries, HOP assigns windows that cover rows within the interval of *size*, shifting every *slide*, 
+and optionally aligned at *time* based on a timestamp column. Windows assigned could have overlapping so hopping 
+sometime is named as "sliding windowing".  
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| HOP(table, DESCRIPTOR(datetime), slide, size, [, time ]) | Indicates a hopping window for *datetime*, covering rows within the interval of *size*, shifting every *slide*, and optionally aligned at *time*.
+
+Here is an example:
+`SELECT * FROM TABLE(HOP(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '2' MINUTE, INTERVAL '5' MINUTE))`,
+will apply hopping with 5-minute interval size on rows from table orders, shifting every 2 minutes. rowtime is the
+watermarked column of table orders that tells data completeness.
+
+#### SESSION
+In streaming queries, SESSION assigns windows that cover rows based on *datetime*. Within a session window, distances 
+of rows are less than *interval*. Session window is applied per *key*.
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| session(table, DESCRIPTOR(datetime), DESCRIPTOR(key), interval [, time ]) | Indicates a session window of *interval* for *datetime*, optionally aligned at *time*. Session window is applied per *key*.
+
+Here is an example:
+`SELECT * FROM TABLE(SESSION(TABLE orders, DESCRIPTOR(rowtime), DESCRIPTOR(product), INTERVAL '20' MINUTE))`,
+will apply session with 20-minute inactive gap on rows from table orders. rowtime is the
+watermarked column of table orders that tells data completeness. Session is applied per product.
+
 ### Grouped window functions
 
 Review comment:
   Some comments:
   1) Is it necessary to deprecate the manner of using them as grouped window function?
   2) Update "Table-valued functions." to "Table functions".  
   3) Could you add doc the [streaming](https://calcite.apache.org/docs/stream.html) page? I think we can do this work in this PR or after finishing the umbrella JIRA.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia edited a comment on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia edited a comment on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-613091499
 
 
   @DonnyZone squashed commits to two commits.
   
   
   Also re-checked indents to make sure they are not mistakenly configured.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#issuecomment-580394229
 
 
   @swathi858 can you send an email to dev@calcite.apache.org for your questions. That would be an better place to answer your question.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367084188
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlWindowTableFunction.java
 ##########
 @@ -44,28 +43,21 @@ public SqlWindowTableFunction(String name) {
         SqlFunctionCategory.SYSTEM);
   }
 
-  @Override public SqlOperandCountRange getOperandCountRange() {
-    return SqlOperandCountRanges.of(3);
-  }
-
-  @Override public boolean checkOperandTypes(SqlCallBinding callBinding,
 
 Review comment:
   Yes. Each windowing table-valued function shall check their own operands. The check of DESCRIPTOR names are also left for each function (which makes sense as each function has its own interpretation of names).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-610647433
 
 
   @DonnyZone 
   
   just solved the merge conflict and re-triggered tests. Would you be able to take a look?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-610714704
 
 
   @amaliujia Sorry for the late reply. I will take a look ASAP.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410588871
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java
 ##########
 @@ -3143,11 +3147,62 @@ private Expression normalize(SqlTypeName typeName, Expression e) {
       return Expressions.call(
           BuiltInMethod.TUMBLING.method,
           inputEnumerable,
-          EnumUtils.windowSelector(
+          EnumUtils.tumblingWindowSelector(
               inputPhysType,
               outputPhysType,
               translatedOperands.get(0),
               translatedOperands.get(1)));
     }
   }
+
+  /** Implements hopping. */
+  private static class HopImplementor implements TableFunctionCallImplementor {
+    @Override public Expression implement(RexToLixTranslator translator,
+        Expression inputEnumerable, RexCall call, PhysType inputPhysType, PhysType outputPhysType) {
+      Expression intervalExpression = translator.translate(call.getOperands().get(2));
+      Expression intervalExpression2 = translator.translate(call.getOperands().get(3));
+      RexCall descriptor = (RexCall) call.getOperands().get(1);
 
 Review comment:
   Give these `intervalExpression ` more meaningful name. e.g. `windowSize` and `slidingInterval` ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367737325
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +800,86 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies hoping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hoping(Enumerator<Object[]> inputEnumerator,
+                                            int indexOfWatermarkedColumn,
 
 Review comment:
   809L-811L The indentation format seems a bit problematic.
   There are several more functions with indentation format below.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-601554290
 
 
   Thanks @DonnyZone for your review! I will address your comments this weekend!
   
   Recently I am pretty busy in workdays.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367737325
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +800,86 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies hoping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hoping(Enumerator<Object[]> inputEnumerator,
+                                            int indexOfWatermarkedColumn,
 
 Review comment:
   809L-811L The indentation format seems a bit problematic.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r397429284
 
 

 ##########
 File path: site/_docs/reference.md
 ##########
 @@ -1871,14 +1872,44 @@ is named as "fixed windowing".
 
 | Operator syntax      | Description
 |:-------------------- |:-----------
-| TUMBLE(table, DESCRIPTOR(column_name), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*. Tumbling is applied on table in which there is a watermarked column specified by descriptor.
+| TUMBLE(table, DESCRIPTOR(datetime), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*.
 
 Here is an example:
 `SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '1' MINUTE))`,
 will apply tumbling with 1 minute window size on rows from table orders. rowtime is the
 watermarked column of table orders that tells data completeness.
 
+#### HOP
+In streaming queries, HOP assigns windows that cover rows within the interval of *size*, shifting every *slide*, 
+and optionally aligned at *time* based on a timestamp column. Windows assigned could have overlapping so hopping 
+sometime is named as "sliding windowing".  
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| HOP(table, DESCRIPTOR(datetime), slide, size, [, time ]) | Indicates a hopping window for *datetime*, covering rows within the interval of *size*, shifting every *slide*, and optionally aligned at *time*.
+
+Here is an example:
+`SELECT * FROM TABLE(HOP(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '2' MINUTE, INTERVAL '5' MINUTE))`,
+will apply hopping with 5-minute interval size on rows from table orders, shifting every 2 minutes. rowtime is the
+watermarked column of table orders that tells data completeness.
+
+#### SESSION
+In streaming queries, SESSION assigns windows that cover rows based on *datetime*. Within a session window, distances 
+of rows are less than *interval*. Session window is applied per *key*.
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| session(table, DESCRIPTOR(datetime), DESCRIPTOR(key), interval [, time ]) | Indicates a session window of *interval* for *datetime*, optionally aligned at *time*. Session window is applied per *key*.
+
+Here is an example:
+`SELECT * FROM TABLE(SESSION(TABLE orders, DESCRIPTOR(rowtime), DESCRIPTOR(product), INTERVAL '20' MINUTE))`,
+will apply session with 20-minute inactive gap on rows from table orders. rowtime is the
+watermarked column of table orders that tells data completeness. Session is applied per product.
+
 ### Grouped window functions
 
 Review comment:
   I have addressed 2.
   
   Regarding 3, I would like to do it after this PR. I would prefer to have a discussion about deprecating grouped window functions in mailing list first, which will give a much better idea how `streaming` page be updated (e.g. keep both style or just keep the new one)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] HOP Table-valued Function, SESSION Table-valued Function

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] HOP Table-valued Function, SESSION Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#issuecomment-597484302
 
 
   We'd better change the title of this PR accordingly. It is still "table-value". Or can we simplify it (e.g., "Implement HOP and SESSION table function for ...", and etc.)?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-613091499
 
 
   @DonnyZone squashed commits to two commits.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-612780024
 
 
   This PR has two main commits (CALCITE-3737 and CALCITE-3780) and other minor commits.
   For merge convenience, could you please squash other minor ones into its main commit?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-613196967
 
 
   Thank you both! Sounds great to make it released in 1.23.0!
   
   
   And I checked the failed tests. I don't think they are relevant to this PR. E.g. Tried one failed test and it passes locally.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410588322
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java
 ##########
 @@ -966,8 +970,8 @@ public MatchImplementor get(final SqlMatchFunction function) {
     }
   }
 
-  public TableValuedFunctionCallImplementor get(final SqlWindowTableFunction operator) {
-    final Supplier<? extends TableValuedFunctionCallImplementor> supplier =
+  public TableFunctionCallImplementor get(final SqlWindowTableFunction operator) {
+    final Supplier<? extends TableFunctionCallImplementor> supplier =
 
 Review comment:
   Also remove the values keyword from the commit message.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r396185311
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/util/BuiltInMethod.java
 ##########
 @@ -586,7 +587,11 @@
       "resultSelector", Function2.class),
   AGG_LAMBDA_FACTORY_ACC_SINGLE_GROUP_RESULT_SELECTOR(AggregateLambdaFactory.class,
       "singleGroupResultSelector", Function1.class),
-  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class);
+  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class),
+  HOPPING(EnumUtils.class, "hopping", Enumerator.class, int.class, long.class,
 
 Review comment:
   Current now, I personally suggest to put them in `EnumUtils`. It is a little strange in Linq4j. But if we will add more streaming functions later, I think we can create a separate class for them. It's more clear. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] swathi858 commented on issue #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
swathi858 commented on issue #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#issuecomment-580107971
 
 
   Can we create another table function like tumble or hop??

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r396146666
 
 

 ##########
 File path: site/_docs/reference.md
 ##########
 @@ -1871,14 +1872,44 @@ is named as "fixed windowing".
 
 | Operator syntax      | Description
 |:-------------------- |:-----------
-| TUMBLE(table, DESCRIPTOR(column_name), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*. Tumbling is applied on table in which there is a watermarked column specified by descriptor.
+| TUMBLE(table, DESCRIPTOR(datetime), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*.
 
 Here is an example:
 `SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '1' MINUTE))`,
 will apply tumbling with 1 minute window size on rows from table orders. rowtime is the
 watermarked column of table orders that tells data completeness.
 
+#### HOP
+In streaming queries, HOP assigns windows that cover rows within the interval of *size*, shifting every *slide*, 
+and optionally aligned at *time* based on a timestamp column. Windows assigned could have overlapping so hopping 
+sometime is named as "sliding windowing".  
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| HOP(table, DESCRIPTOR(datetime), slide, size, [, time ]) | Indicates a hopping window for *datetime*, covering rows within the interval of *size*, shifting every *slide*, and optionally aligned at *time*.
+
+Here is an example:
+`SELECT * FROM TABLE(HOP(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '2' MINUTE, INTERVAL '5' MINUTE))`,
+will apply hopping with 5-minute interval size on rows from table orders, shifting every 2 minutes. rowtime is the
+watermarked column of table orders that tells data completeness.
+
+#### SESSION
+In streaming queries, SESSION assigns windows that cover rows based on *datetime*. Within a session window, distances 
+of rows are less than *interval*. Session window is applied per *key*.
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| session(table, DESCRIPTOR(datetime), DESCRIPTOR(key), interval [, time ]) | Indicates a session window of *interval* for *datetime*, optionally aligned at *time*. Session window is applied per *key*.
+
+Here is an example:
+`SELECT * FROM TABLE(SESSION(TABLE orders, DESCRIPTOR(rowtime), DESCRIPTOR(product), INTERVAL '20' MINUTE))`,
+will apply session with 20-minute inactive gap on rows from table orders. rowtime is the
+watermarked column of table orders that tells data completeness. Session is applied per product.
+
 ### Grouped window functions
 
 Review comment:
   Re: 1. Yes. First of all, table function style streaming can cover what grouped window function provides. Then grouped window function has drawbacks: 
   1) magic *_start, * _end function to get window_start, window_end 
   2) enforce to use a GROUP BY. E.g. if users only want to do windowing streaming join without aggregation, they shouldn't be bothered to use a GROUP BY. 
   3) error prone syntax. E.g. why users cannot do GROUP BY TUMBLE(), HOP()? The syntax is correct for SQL.
   So I think (and probably Julian also agreed) grouped window function can be deprecated. 
   
   I will send an email to discuss the deprecation with Calcite community. So grouped window functions won't disappear silently. 
   
   
   Re 2 and 3: nice suggestions. I will add a commit for the change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410590161
 
 

 ##########
 File path: site/_docs/reference.md
 ##########
 @@ -1875,14 +1876,44 @@ is named as "fixed windowing".
 
 | Operator syntax      | Description
 |:-------------------- |:-----------
-| TUMBLE(table, DESCRIPTOR(column_name), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*. Tumbling is applied on table in which there is a watermarked column specified by descriptor.
+| TUMBLE(table, DESCRIPTOR(datetime), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*.
 
 
 Review comment:
   Do we really support `[, time ]` ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-597827856
 
 
   Sounds good! Renamed the title to "Implement HOP and SESSION table functions", which should be easy to understand now.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r396146040
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/util/BuiltInMethod.java
 ##########
 @@ -586,7 +587,11 @@
       "resultSelector", Function2.class),
   AGG_LAMBDA_FACTORY_ACC_SINGLE_GROUP_RESULT_SELECTOR(AggregateLambdaFactory.class,
       "singleGroupResultSelector", Function1.class),
-  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class);
+  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class),
+  HOPPING(EnumUtils.class, "hopping", Enumerator.class, int.class, long.class,
 
 Review comment:
   I see. No special reason. Would you suggest I put it in the same class? If so, which class is better?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r407346923
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlSessionTableFunction.java
 ##########
 @@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.sql;
+
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.sql.type.SqlOperandCountRanges;
+import org.apache.calcite.sql.type.SqlTypeName;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.sql.validate.SqlValidator;
+
+/**
+ * SqlSessionTableFunction implements an operator for per-key sessionization. It allows
+ * four parameters:
+ * 1. a table.
+ * 2. a descriptor to provide a watermarked column name from the input table.
+ * 3. a descriptor to provide a column as key, on which sessionization will be applied.
+ * 4. an interval parameter to specify a inactive activity gap to break sessions.
+ */
+public class SqlSessionTableFunction extends SqlWindowTableFunction {
+  public SqlSessionTableFunction() {
+    super(SqlKind.SESSION.name());
+  }
+
+  @Override public SqlOperandCountRange getOperandCountRange() {
+    return SqlOperandCountRanges.of(4);
+  }
+
+  @Override public boolean checkOperandTypes(SqlCallBinding callBinding,
+                                             boolean throwOnFailure) {
 
 Review comment:
   nitpicking: indent

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410587273
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +803,244 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies sessionization to elements from the input
+   * enumerator based on a specified key. Elements are windowed into sessions separated by
+   * periods with no input for at least the duration specified by gap parameter.
+   */
+  public static Enumerable<Object[]> sessionize(Enumerator<Object[]> inputEnumerator,
+      int indexOfWatermarkedColumn, int indexOfKeyColumn, long gap) {
+    return new AbstractEnumerable<Object[]>() {
+      @Override public Enumerator<Object[]> enumerator() {
+        return new SessionizationEnumerator(inputEnumerator,
+            indexOfWatermarkedColumn, indexOfKeyColumn, gap);
+      }
+    };
+  }
+
+  private static class SessionizationEnumerator implements Enumerator<Object[]> {
+    private final Enumerator<Object[]> inputEnumerator;
+    private final int indexOfWatermarkedColumn;
+    private final int indexOfKeyColumn;
+    private final long gap;
+    private LinkedList<Object[]> list;
+    private boolean initialized;
+
+    SessionizationEnumerator(Enumerator<Object[]> inputEnumerator,
+        int indexOfWatermarkedColumn, int indexOfKeyColumn, long gap) {
+      this.inputEnumerator = inputEnumerator;
+      this.indexOfWatermarkedColumn = indexOfWatermarkedColumn;
+      this.indexOfKeyColumn = indexOfKeyColumn;
+      this.gap = gap;
+      list = new LinkedList<>();
+      initialized = false;
+    }
+
+    @Override public Object[] current() {
+      if (!initialized) {
+        initialize();
+        initialized = true;
+      }
+      return list.pollFirst();
+    }
+
+    @Override public boolean moveNext() {
+      return initialized ? list.size() > 0 : inputEnumerator.moveNext();
+    }
+
+    @Override public void reset() {
+      list.clear();
+      inputEnumerator.reset();
+      initialized = false;
+    }
+
+    @Override public void close() {
+      list.clear();
+      inputEnumerator.close();
+      initialized = false;
+    }
+
+    private void initialize() {
+      List<Object[]> elements = new ArrayList<>();
+      // initialize() will be called when inputEnumerator.moveNext() is true,
+      // thus firstly should take the current element.
+      elements.add(inputEnumerator.current());
+      // sessionization needs to see all data.
+      while (inputEnumerator.moveNext()) {
 
 Review comment:
   Is this true ? Shouldn't we rely on the watermark and gap to decide if a session is expired ? What about the late arrive data ?
   
   This implementation should not be blocking.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r397428051
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/util/BuiltInMethod.java
 ##########
 @@ -586,7 +587,11 @@
       "resultSelector", Function2.class),
   AGG_LAMBDA_FACTORY_ACC_SINGLE_GROUP_RESULT_SELECTOR(AggregateLambdaFactory.class,
       "singleGroupResultSelector", Function1.class),
-  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class);
+  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class),
+  HOPPING(EnumUtils.class, "hopping", Enumerator.class, int.class, long.class,
 
 Review comment:
   Done!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r366742073
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlWindowTableFunction.java
 ##########
 @@ -20,17 +20,16 @@
 import org.apache.calcite.rel.type.RelDataTypeField;
 import org.apache.calcite.rel.type.RelDataTypeFieldImpl;
 import org.apache.calcite.rel.type.RelRecordType;
-import org.apache.calcite.sql.type.SqlOperandCountRanges;
 import org.apache.calcite.sql.type.SqlReturnTypeInference;
 import org.apache.calcite.sql.type.SqlTypeName;
-import org.apache.calcite.sql.type.SqlTypeUtil;
 import org.apache.calcite.sql.validate.SqlValidator;
 
 import java.util.ArrayList;
 import java.util.List;
 
 import static org.apache.calcite.util.Static.RESOURCE;
 
+
 
 Review comment:
   extra line?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410586353
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +803,244 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies sessionization to elements from the input
+   * enumerator based on a specified key. Elements are windowed into sessions separated by
+   * periods with no input for at least the duration specified by gap parameter.
+   */
+  public static Enumerable<Object[]> sessionize(Enumerator<Object[]> inputEnumerator,
+      int indexOfWatermarkedColumn, int indexOfKeyColumn, long gap) {
+    return new AbstractEnumerable<Object[]>() {
+      @Override public Enumerator<Object[]> enumerator() {
+        return new SessionizationEnumerator(inputEnumerator,
+            indexOfWatermarkedColumn, indexOfKeyColumn, gap);
+      }
+    };
+  }
+
+  private static class SessionizationEnumerator implements Enumerator<Object[]> {
+    private final Enumerator<Object[]> inputEnumerator;
+    private final int indexOfWatermarkedColumn;
+    private final int indexOfKeyColumn;
+    private final long gap;
+    private LinkedList<Object[]> list;
+    private boolean initialized;
+
+    SessionizationEnumerator(Enumerator<Object[]> inputEnumerator,
+        int indexOfWatermarkedColumn, int indexOfKeyColumn, long gap) {
+      this.inputEnumerator = inputEnumerator;
 
 Review comment:
   Give doc for these parameters. Same for other codes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367085790
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlWindowTableFunction.java
 ##########
 @@ -20,17 +20,16 @@
 import org.apache.calcite.rel.type.RelDataTypeField;
 import org.apache.calcite.rel.type.RelDataTypeFieldImpl;
 import org.apache.calcite.rel.type.RelRecordType;
-import org.apache.calcite.sql.type.SqlOperandCountRanges;
 import org.apache.calcite.sql.type.SqlReturnTypeInference;
 import org.apache.calcite.sql.type.SqlTypeName;
-import org.apache.calcite.sql.type.SqlTypeUtil;
 import org.apache.calcite.sql.validate.SqlValidator;
 
 import java.util.ArrayList;
 import java.util.List;
 
 import static org.apache.calcite.util.Static.RESOURCE;
 
+
 
 Review comment:
   thanks! I will hold the change a bit until receive enough comments (and will address all style like comments at a time). 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-613188354
 
 
   Thanks! There are some failed tests. Could you please make an investigation?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] 34venu commented on issue #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
34venu commented on issue #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#issuecomment-581316271
 
 
   how to register table functions in apache calcite?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-613191382
 
 
   I would try to give a final review this week, let's make this into 1.23 ~

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367736415
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +800,86 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies hoping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hoping(Enumerator<Object[]> inputEnumerator,
+                                            int indexOfWatermarkedColumn,
+                                            long emitFrequency,
+                                            long intervalSize) {
+    return new AbstractEnumerable<Object[]>() {
+      @Override public Enumerator<Object[]> enumerator() {
+        return new HopEnumerator(inputEnumerator,
+            indexOfWatermarkedColumn, emitFrequency, intervalSize);
+      }
+    };
+  }
+
+  private static class HopEnumerator implements Enumerator<Object[]> {
+    private final Enumerator<Object[]> inputEnumerator;
+    private final int indexOfWatermarkedColumn;
+    private final long emitFrequency;
+    private final long intervalSize;
+    private LinkedList<Object[]> list;
+
+    HopEnumerator(Enumerator<Object[]> inputEnumerator,
+                  int indexOfWatermarkedColumn, long emitFrequency, long intervalSize) {
+      this.inputEnumerator = inputEnumerator;
+      this.indexOfWatermarkedColumn = indexOfWatermarkedColumn;
+      this.emitFrequency = emitFrequency;
+      this.intervalSize = intervalSize;
+      list = new LinkedList<>();
+    }
+
+    public Object[] current() {
+      if (list.size() > 0) {
+        return takeOne();
+      } else {
+        Object[] current = inputEnumerator.current();
+        List<Pair> windows = hopWindows(SqlFunctions.toLong(current[indexOfWatermarkedColumn]),
+            emitFrequency, intervalSize);
+        for (Pair window : windows) {
+          Object[] curWithWindow = new Object[current.length + 2];
+          System.arraycopy(current, 0, curWithWindow, 0, current.length);
+          curWithWindow[current.length] = window.left;
+          curWithWindow[current.length + 1] = window.right;
+          list.offer(curWithWindow);
+        }
+        return takeOne();
+      }
+
+    }
+
+    public boolean moveNext() {
+      if (list.size() > 0) {
+        return true;
+      }
+      return inputEnumerator.moveNext();
+    }
+
+    public void reset() {
+      inputEnumerator.reset();
+      list.clear();
+    }
+
+    public void close() {
+    }
+
+    private Object[] takeOne() {
+      return list.pollFirst();
+    }
+  }
+
+  private static List<Pair> hopWindows(long tsMillis, long periodMillis, long sizeMillis) {
+    ArrayList<Pair> ret = new ArrayList<>((int) (sizeMillis / periodMillis));
 
 Review comment:
   Here we can use `Math.toIntExact`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r394104653
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java
 ##########
 @@ -2312,7 +2320,7 @@ public void unparse(
 
   /** The {@code HOP} group function. */
   public static final SqlGroupedWindowFunction HOP =
-      new SqlGroupedWindowFunction(SqlKind.HOP.name(), SqlKind.HOP, null,
+      new SqlGroupedWindowFunction("$HOP", SqlKind.HOP, null,
 
 Review comment:
   Why we need to change the function name here?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r396184710
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java
 ##########
 @@ -2312,7 +2320,7 @@ public void unparse(
 
   /** The {@code HOP} group function. */
   public static final SqlGroupedWindowFunction HOP =
-      new SqlGroupedWindowFunction(SqlKind.HOP.name(), SqlKind.HOP, null,
+      new SqlGroupedWindowFunction("$HOP", SqlKind.HOP, null,
 
 Review comment:
   I see, thanks for explanation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410589580
 
 

 ##########
 File path: core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java
 ##########
 @@ -1778,15 +1778,16 @@ public final Sql sql(String sql) {
 
   // In generated plan, the first parameter of TUMBLE function will always be the last field
   // of it's input. There isn't a way to give the first operand a proper type.
-  @Test void testTableValuedFunctionTumble() {
+
+  @Test public void testTableFunctionTumble() {
     final String sql = "select *\n"
         + "from table(tumble(table Shipments, descriptor(rowtime), INTERVAL '1' MINUTE))";
     sql(sql).ok();
   }
 
   // In generated plan, the first parameter of TUMBLE function will always be the last field
   // of it's input. There isn't a way to give the first operand a proper type.
-  @Test void testTableValuedFunctionTumbleWithSubQueryParam() {
+  @Test public void testTableFunctionTumbleWithSubQueryParam() {
     final String sql = "select *\n"
         + "from table(tumble((select * from Shipments), descriptor(rowtime), INTERVAL '1' MINUTE))";
 
 Review comment:
   Also add tests for `SESSION` and `HOP`.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367735467
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +800,86 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies hoping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hoping(Enumerator<Object[]> inputEnumerator,
+                                            int indexOfWatermarkedColumn,
+                                            long emitFrequency,
 
 Review comment:
   809L-811L The indentation format seems a bit problematic.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-603492894
 
 
   Thinking failed tests are not related to this PR: 
   
   that is about `':elasticsearch:test'` or Pig related test.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r396145921
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java
 ##########
 @@ -2312,7 +2320,7 @@ public void unparse(
 
   /** The {@code HOP} group function. */
   public static final SqlGroupedWindowFunction HOP =
-      new SqlGroupedWindowFunction(SqlKind.HOP.name(), SqlKind.HOP, null,
+      new SqlGroupedWindowFunction("$HOP", SqlKind.HOP, null,
 
 Review comment:
   Here is the context about this change: https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java#L2304

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r397429284
 
 

 ##########
 File path: site/_docs/reference.md
 ##########
 @@ -1871,14 +1872,44 @@ is named as "fixed windowing".
 
 | Operator syntax      | Description
 |:-------------------- |:-----------
-| TUMBLE(table, DESCRIPTOR(column_name), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*. Tumbling is applied on table in which there is a watermarked column specified by descriptor.
+| TUMBLE(table, DESCRIPTOR(datetime), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*.
 
 Here is an example:
 `SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '1' MINUTE))`,
 will apply tumbling with 1 minute window size on rows from table orders. rowtime is the
 watermarked column of table orders that tells data completeness.
 
+#### HOP
+In streaming queries, HOP assigns windows that cover rows within the interval of *size*, shifting every *slide*, 
+and optionally aligned at *time* based on a timestamp column. Windows assigned could have overlapping so hopping 
+sometime is named as "sliding windowing".  
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| HOP(table, DESCRIPTOR(datetime), slide, size, [, time ]) | Indicates a hopping window for *datetime*, covering rows within the interval of *size*, shifting every *slide*, and optionally aligned at *time*.
+
+Here is an example:
+`SELECT * FROM TABLE(HOP(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '2' MINUTE, INTERVAL '5' MINUTE))`,
+will apply hopping with 5-minute interval size on rows from table orders, shifting every 2 minutes. rowtime is the
+watermarked column of table orders that tells data completeness.
+
+#### SESSION
+In streaming queries, SESSION assigns windows that cover rows based on *datetime*. Within a session window, distances 
+of rows are less than *interval*. Session window is applied per *key*.
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| session(table, DESCRIPTOR(datetime), DESCRIPTOR(key), interval [, time ]) | Indicates a session window of *interval* for *datetime*, optionally aligned at *time*. Session window is applied per *key*.
+
+Here is an example:
+`SELECT * FROM TABLE(SESSION(TABLE orders, DESCRIPTOR(rowtime), DESCRIPTOR(product), INTERVAL '20' MINUTE))`,
+will apply session with 20-minute inactive gap on rows from table orders. rowtime is the
+watermarked column of table orders that tells data completeness. Session is applied per product.
+
 ### Grouped window functions
 
 Review comment:
   I have addressed 2.
   
   Regarding 3, I would do it later. I would prefer to have a discussion about deprecating grouped window functions in mailing list first, which will give a much better idea how `streaming` page be updated (e.g. keep both style or just keep the new one)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410588000
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +803,244 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies sessionization to elements from the input
+   * enumerator based on a specified key. Elements are windowed into sessions separated by
+   * periods with no input for at least the duration specified by gap parameter.
+   */
+  public static Enumerable<Object[]> sessionize(Enumerator<Object[]> inputEnumerator,
+      int indexOfWatermarkedColumn, int indexOfKeyColumn, long gap) {
+    return new AbstractEnumerable<Object[]>() {
+      @Override public Enumerator<Object[]> enumerator() {
+        return new SessionizationEnumerator(inputEnumerator,
+            indexOfWatermarkedColumn, indexOfKeyColumn, gap);
+      }
+    };
+  }
+
+  private static class SessionizationEnumerator implements Enumerator<Object[]> {
+    private final Enumerator<Object[]> inputEnumerator;
+    private final int indexOfWatermarkedColumn;
+    private final int indexOfKeyColumn;
+    private final long gap;
+    private LinkedList<Object[]> list;
+    private boolean initialized;
+
+    SessionizationEnumerator(Enumerator<Object[]> inputEnumerator,
+        int indexOfWatermarkedColumn, int indexOfKeyColumn, long gap) {
+      this.inputEnumerator = inputEnumerator;
+      this.indexOfWatermarkedColumn = indexOfWatermarkedColumn;
+      this.indexOfKeyColumn = indexOfKeyColumn;
+      this.gap = gap;
+      list = new LinkedList<>();
+      initialized = false;
+    }
+
+    @Override public Object[] current() {
+      if (!initialized) {
+        initialize();
+        initialized = true;
+      }
+      return list.pollFirst();
+    }
+
+    @Override public boolean moveNext() {
+      return initialized ? list.size() > 0 : inputEnumerator.moveNext();
+    }
+
+    @Override public void reset() {
+      list.clear();
+      inputEnumerator.reset();
+      initialized = false;
+    }
+
+    @Override public void close() {
+      list.clear();
+      inputEnumerator.close();
+      initialized = false;
+    }
+
+    private void initialize() {
+      List<Object[]> elements = new ArrayList<>();
+      // initialize() will be called when inputEnumerator.moveNext() is true,
+      // thus firstly should take the current element.
+      elements.add(inputEnumerator.current());
+      // sessionization needs to see all data.
+      while (inputEnumerator.moveNext()) {
+        elements.add(inputEnumerator.current());
+      }
+
+      Map<Object, SortedMultiMap<Pair<Long, Long>, Object[]>> sessionKeyMap = new HashMap<>();
+      for (Object[] element : elements) {
+        sessionKeyMap.putIfAbsent(element[indexOfKeyColumn], new SortedMultiMap<>());
+        Pair initWindow = computeInitWindow(
+            SqlFunctions.toLong(element[indexOfWatermarkedColumn]), gap);
+        sessionKeyMap.get(element[indexOfKeyColumn]).putMulti(initWindow, element);
+      }
+
+      // merge per key session windows if there is any overlap between windows.
+      for (Map.Entry<Object, SortedMultiMap<Pair<Long, Long>, Object[]>> perKeyEntry
+          : sessionKeyMap.entrySet()) {
+        Map<Pair<Long, Long>, List<Object[]>> finalWindowElementsMap = new HashMap<>();
+        Pair<Long, Long> currentWindow = null;
+        List<Object[]> tempElementList = new ArrayList<>();
+        for (Map.Entry<Pair<Long, Long>, List<Object[]>> sessionEntry
+            : perKeyEntry.getValue().entrySet()) {
+          // check the next window can be merged.
+          if (currentWindow == null || !isOverlapped(currentWindow, sessionEntry.getKey())) {
+            // cannot merge window as there is no overlap
+            if (currentWindow != null) {
+              finalWindowElementsMap.put(currentWindow, new ArrayList<>(tempElementList));
+            }
+
+            currentWindow = sessionEntry.getKey();
+            tempElementList.clear();
+            tempElementList.addAll(sessionEntry.getValue());
+          } else {
+            // merge windows.
+            currentWindow = mergeWindows(currentWindow, sessionEntry.getKey());
+            // merge elements in windows.
+            tempElementList.addAll(sessionEntry.getValue());
+          }
+        }
+
+        if (!tempElementList.isEmpty()) {
+          finalWindowElementsMap.put(currentWindow, new ArrayList<>(tempElementList));
+        }
+
+        // construct final results from finalWindowElementsMap.
+        for (Map.Entry<Pair<Long, Long>, List<Object[]>> finalWindowElementsEntry
+            : finalWindowElementsMap.entrySet()) {
+          for (Object[] element : finalWindowElementsEntry.getValue()) {
+            Object[] curWithWindow = new Object[element.length + 2];
+            System.arraycopy(element, 0, curWithWindow, 0, element.length);
+            curWithWindow[element.length] = finalWindowElementsEntry.getKey().left;
+            curWithWindow[element.length + 1] = finalWindowElementsEntry.getKey().right;
+            list.offer(curWithWindow);
+          }
+        }
+      }
+    }
+
+    private boolean isOverlapped(Pair<Long, Long> a, Pair<Long, Long> b) {
+      return !(b.left >= a.right);
+    }
+
+    private Pair<Long, Long> mergeWindows(Pair<Long, Long> a, Pair<Long, Long> b) {
+      return new Pair<>(a.left <= b.left ? a.left : b.left, a.right >= b.right ? a.right : b.right);
+    }
+
+    private Pair<Long, Long> computeInitWindow(long ts, long gap) {
+      return new Pair<>(ts, ts + gap);
+    }
+  }
+
+  /**
+   * Create enumerable implementation that applies hopping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hopping(Enumerator<Object[]> inputEnumerator,
+      int indexOfWatermarkedColumn, long emitFrequency, long intervalSize) {
+    return new AbstractEnumerable<Object[]>() {
+      @Override public Enumerator<Object[]> enumerator() {
+        return new HopEnumerator(inputEnumerator,
+            indexOfWatermarkedColumn, emitFrequency, intervalSize);
+      }
+    };
+  }
+
+  private static class HopEnumerator implements Enumerator<Object[]> {
+    private final Enumerator<Object[]> inputEnumerator;
+    private final int indexOfWatermarkedColumn;
+    private final long emitFrequency;
+    private final long intervalSize;
+    private LinkedList<Object[]> list;
+
+    HopEnumerator(Enumerator<Object[]> inputEnumerator,
+        int indexOfWatermarkedColumn, long emitFrequency, long intervalSize) {
+      this.inputEnumerator = inputEnumerator;
+      this.indexOfWatermarkedColumn = indexOfWatermarkedColumn;
+      this.emitFrequency = emitFrequency;
 
 Review comment:
   `indexOfWatermarkedColumn ` -> `wmIndex` ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r369795725
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlWindowTableFunction.java
 ##########
 @@ -20,17 +20,16 @@
 import org.apache.calcite.rel.type.RelDataTypeField;
 import org.apache.calcite.rel.type.RelDataTypeFieldImpl;
 import org.apache.calcite.rel.type.RelRecordType;
-import org.apache.calcite.sql.type.SqlOperandCountRanges;
 import org.apache.calcite.sql.type.SqlReturnTypeInference;
 import org.apache.calcite.sql.type.SqlTypeName;
-import org.apache.calcite.sql.type.SqlTypeUtil;
 import org.apache.calcite.sql.validate.SqlValidator;
 
 import java.util.ArrayList;
 import java.util.List;
 
 import static org.apache.calcite.util.Static.RESOURCE;
 
+
 
 Review comment:
   fixed!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r394105094
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/util/BuiltInMethod.java
 ##########
 @@ -586,7 +587,11 @@
       "resultSelector", Function2.class),
   AGG_LAMBDA_FACTORY_ACC_SINGLE_GROUP_RESULT_SELECTOR(AggregateLambdaFactory.class,
       "singleGroupResultSelector", Function1.class),
-  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class);
+  TUMBLING(EnumerableDefaults.class, "tumbling", Enumerable.class, Function1.class),
+  HOPPING(EnumUtils.class, "hopping", Enumerator.class, int.class, long.class,
 
 Review comment:
   Any special reasons to implement `tumbling` and `hopping`&`sessionize` in different places?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-603479077
 
 
    @DonnyZone have addressed comment and pushed the change.
   
   I should be able to address future comments soon as my daily work has passed the intensive period.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367151092
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -744,9 +748,9 @@ static Expression generatePredicate(
     return Expressions.lambda(Predicate2.class, builder.toBlock(), left_, right_);
   }
 
-  /** Generates a window selector which appends attribute of the window based on
+  /** Generates a window selector which appends attribute of the window bases on
 
 Review comment:
   will fix `bases` as a typo.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone commented on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-603597679
 
 
   Hi @amaliujia, thanks for your work! I will take another look in these couple of days.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r396146666
 
 

 ##########
 File path: site/_docs/reference.md
 ##########
 @@ -1871,14 +1872,44 @@ is named as "fixed windowing".
 
 | Operator syntax      | Description
 |:-------------------- |:-----------
-| TUMBLE(table, DESCRIPTOR(column_name), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*. Tumbling is applied on table in which there is a watermarked column specified by descriptor.
+| TUMBLE(table, DESCRIPTOR(datetime), interval [, time ]) | Indicates a tumbling window of *interval* for *datetime*, optionally aligned at *time*.
 
 Here is an example:
 `SELECT * FROM TABLE(TUMBLE(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '1' MINUTE))`,
 will apply tumbling with 1 minute window size on rows from table orders. rowtime is the
 watermarked column of table orders that tells data completeness.
 
+#### HOP
+In streaming queries, HOP assigns windows that cover rows within the interval of *size*, shifting every *slide*, 
+and optionally aligned at *time* based on a timestamp column. Windows assigned could have overlapping so hopping 
+sometime is named as "sliding windowing".  
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| HOP(table, DESCRIPTOR(datetime), slide, size, [, time ]) | Indicates a hopping window for *datetime*, covering rows within the interval of *size*, shifting every *slide*, and optionally aligned at *time*.
+
+Here is an example:
+`SELECT * FROM TABLE(HOP(TABLE orders, DESCRIPTOR(rowtime), INTERVAL '2' MINUTE, INTERVAL '5' MINUTE))`,
+will apply hopping with 5-minute interval size on rows from table orders, shifting every 2 minutes. rowtime is the
+watermarked column of table orders that tells data completeness.
+
+#### SESSION
+In streaming queries, SESSION assigns windows that cover rows based on *datetime*. Within a session window, distances 
+of rows are less than *interval*. Session window is applied per *key*.
+
+
+| Operator syntax      | Description
+|:-------------------- |:-----------
+| session(table, DESCRIPTOR(datetime), DESCRIPTOR(key), interval [, time ]) | Indicates a session window of *interval* for *datetime*, optionally aligned at *time*. Session window is applied per *key*.
+
+Here is an example:
+`SELECT * FROM TABLE(SESSION(TABLE orders, DESCRIPTOR(rowtime), DESCRIPTOR(product), INTERVAL '20' MINUTE))`,
+will apply session with 20-minute inactive gap on rows from table orders. rowtime is the
+watermarked column of table orders that tells data completeness. Session is applied per product.
+
 ### Grouped window functions
 
 Review comment:
   Re: 1. Yes. First of all, Table function style streaming can cover what grouped window function provides. Then grouped window function has drawbacks: 1) magic *_start, * _end function to get window_start, window_end 2) enforce to us a GROUP BY (if users only want to do windowing streaming join without aggregation, they shouldn't be bothered to use a GROUP BY). 3) error prone syntax (why users cannot do GROUP BY TUMBLE(), HOP()? The syntax is correct for SQL).
   So I think (and probably Julian also agreed) grouped window function can be deprecated. 
   
   I will send an email to discuss the deprecation with community. So grouped window functions won't disappear silently. 
   
   
   Re 2 and 3: nice suggestions. I will add a commit for the change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] DonnyZone edited a comment on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
DonnyZone edited a comment on issue #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#issuecomment-612780024
 
 
   This PR has two main commits (CALCITE-3737 and CALCITE-3780) and other minor commits.
   For merge convenience, could you please squash other minor ones into its main commit accordingly?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410586500
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +803,244 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies sessionization to elements from the input
+   * enumerator based on a specified key. Elements are windowed into sessions separated by
 
 Review comment:
   `Create` -> `Creates`, same for other codes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia edited a comment on issue #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
amaliujia edited a comment on issue #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#issuecomment-580394229
 
 
   @swathi858 can you send an email to dev@calcite.apache.org for your questions? That would be an better place to answer your question.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
amaliujia commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r407712676
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/SqlSessionTableFunction.java
 ##########
 @@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.sql;
+
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.sql.type.SqlOperandCountRanges;
+import org.apache.calcite.sql.type.SqlTypeName;
+import org.apache.calcite.sql.type.SqlTypeUtil;
+import org.apache.calcite.sql.validate.SqlValidator;
+
+/**
+ * SqlSessionTableFunction implements an operator for per-key sessionization. It allows
+ * four parameters:
+ * 1. a table.
+ * 2. a descriptor to provide a watermarked column name from the input table.
+ * 3. a descriptor to provide a column as key, on which sessionization will be applied.
+ * 4. an interval parameter to specify a inactive activity gap to break sessions.
+ */
+public class SqlSessionTableFunction extends SqlWindowTableFunction {
+  public SqlSessionTableFunction() {
+    super(SqlKind.SESSION.name());
+  }
+
+  @Override public SqlOperandCountRange getOperandCountRange() {
+    return SqlOperandCountRanges.of(4);
+  }
+
+  @Override public boolean checkOperandTypes(SqlCallBinding callBinding,
+                                             boolean throwOnFailure) {
 
 Review comment:
   oops. Fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function

Posted by GitBox <gi...@apache.org>.
XuQianJin-Stars commented on a change in pull request #1761: [CALCITE-3737] HOP Table-valued Function 
URL: https://github.com/apache/calcite/pull/1761#discussion_r367735467
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumUtils.java
 ##########
 @@ -796,4 +800,86 @@ static Expression windowSelector(
         outputPhysType.record(expressions),
         parameter);
   }
+
+  /**
+   * Create enumerable implementation that applies hoping on each element from the input
+   * enumerator and produces at least one element for each input element.
+   */
+  public static Enumerable<Object[]> hoping(Enumerator<Object[]> inputEnumerator,
+                                            int indexOfWatermarkedColumn,
+                                            long emitFrequency,
 
 Review comment:
   809L-811L The indentation format seems a bit problematic.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [calcite] danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions

Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #1761: [CALCITE-3737][CALCITE-3780] Implement HOP and SESSION table functions
URL: https://github.com/apache/calcite/pull/1761#discussion_r410589373
 
 

 ##########
 File path: core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java
 ##########
 @@ -2296,8 +2298,14 @@ public void unparse(
   /** DESCRIPTOR(column_name, ...). */
   public static final SqlOperator DESCRIPTOR = new SqlDescriptorOperator();
 
-  /** TUMBLE as a table-value function. */
-  public static final SqlFunction TUMBLE_TVF = new SqlWindowTableFunction(SqlKind.TUMBLE.name());
+  /** TUMBLE as a table function. */
+  public static final SqlFunction TUMBLE_TVF = new SqlTumbleTableFunction();
+
+  /** HOP as a table function. */
+  public static final SqlFunction HOP_TVF = new SqlHopTableFunction();
+
+  /** SESSION as a table function. */
+  public static final SqlFunction SESSION_TVF = new SqlSessionTableFunction();
 
 
 Review comment:
   What does the `TVF` mean ? How about just name it `HOP_FUN`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services