You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by HeartSaVioR <gi...@git.apache.org> on 2016/09/26 00:33:46 UTC

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

GitHub user HeartSaVioR opened a pull request:

    https://github.com/apache/storm/pull/1714

    STORM-2125 Use Calcite's implementation of Rex Compiler

    Note: This patch is on top of STORM-2089.
    
    This can cover STORM-2111(#1704), STORM-2113(#1707), STORM-2114, STORM-2116(#1709).
    
    * Upgrade Calcite to latest 1.9.0
    * borrow JaninoRexCompiler and modify to return Expression instead of Scalar (RexNodeToBlockStatementCompiler)
      * since we need to pass code block to Bolts, not executable method implementation on classloader
    * modify codebase to use RexNodeToBlockStatementCompiler
    * add some feature validation unit tests
    * remove ExprCompiler and its test
    
    Btw, Calcite Rex Compiler assigns the result of ITEM() to Object, so comparing result to other constant just fails. CAST() works but if any of the result of ITEM() is null or out of bound (array), it throws Exception at runtime. So only using them to select field would be safe.
    
    I sent a mail to Calcite dev to ask this strange behavior, but I think it's not a big matter so even though it can't be fixed, I would like to document it as limitation and drop that feature.
    
    @arunmahadevan Please review and comment, and also please comment what do you think about above behavior. Thanks in advance.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/storm STORM-2125

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1714.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1714
    
----
commit 4b69c077a1f477e06049aa97bd127fc1b2786c21
Author: Jungtaek Lim <ka...@gmail.com>
Date:   2016-09-13T02:40:13Z

    STORM-2089 Replace Consumer of ISqlTridentDataSource with SqlTridentConsumer
    
    * SqlTridentConsumer contains StateFactory and StateUpdater which is needed to store tuples to State via batch
    * Apply the change to storm-sql-kafka
    * move out JsonScheme and JsonSerializer to runtime
      * it will be used from other external sql modules
    * add javadoc to ISqlTridentDataSource

commit 2fa4c95ff1ecb3cab9c09c25ef08a4faa21905e2
Author: Jungtaek Lim <ka...@gmail.com>
Date:   2016-09-22T06:38:44Z

    STORM-2125 Use Calcite's implementation of Rex Compiler
    
    * Upgrade Calcite to latest 1.9.0
    * borrow JaninoRexCompiler and modify to return Expression instead of Scalar (RexNodeToBlockStatementCompiler)
      * since we need to pass code block to Bolts, not executable method implementation on classloader
    * modify codebase to use RexNodeToBlockStatementCompiler
    * add some feature validation unit tests
    * remove ExprCompiler and its test

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80478720
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/TestCompilerUtils.java ---
    @@ -153,6 +158,29 @@ public static CalciteState sqlOverNestedTable(String sql)
                     .field("MAPFIELD", SqlTypeName.ANY)
                     .field("NESTEDMAPFIELD", SqlTypeName.ANY)
                     .field("ARRAYFIELD", SqlTypeName.ANY)
    +
    +                // TODO: replace field definitions with belows while Calcite can handle properly
    --- End diff --
    
    For me, type safely is more important than convenience. That's why we use static typing language, and also trying to restrict more by generic and so. IMO, we shouldn't rely Calcite to handle operators with 'ANY' type properly. If it can handle operators properly, it's more than essential feature.
    
    As I commented earlier we can't keep the current syntax until we modify Calcite, or back to ExprCompiler which I really don't want to.
    For now I'll try to CAST to 'ANY' explicitly and see it makes nested multi-level access possible.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Addressed documenting covered feature from other PR: #1725


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80442515
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/TestStormSql.java ---
    @@ -174,26 +173,42 @@ public void testExternalDataSource() throws Exception {
       public void testExternalDataSourceNested() throws Exception {
         List<String> stmt = new ArrayList<>();
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
    -    stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    stmt.add("SELECT STREAM ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
                          "FROM FOO " +
    -                     "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD[1] = 200");
    +                     "WHERE CAST(MAPFIELD['b'] AS INTEGER) = 2 AND CAST(ARRAYFIELD[1] AS INTEGER) = 200");
         StormSql sql = StormSql.construct();
         List<Values> values = new ArrayList<>();
         ChannelHandler h = new TestUtils.CollectDataChannelHandler(values);
         sql.execute(stmt, h);
         System.out.println(values);
         Map<String, Integer> map = ImmutableMap.of("b", 2, "c", 4);
         Map<String, Map<String, Integer>> nestedMap = ImmutableMap.of("a", map);
    -    Assert.assertEquals(new Values(2, map, nestedMap, Arrays.asList(100, 200, 300)), values.get(0));
    +    Assert.assertEquals(new Values(2, 4, nestedMap, Arrays.asList(100, 200, 300)), values.get(0));
       }
     
    -  @Test
    +  @Test(expected = RuntimeException.class)
       public void testExternalNestedInvalidAccess() throws Exception {
         List<String> stmt = new ArrayList<>();
    +    // this triggers java.lang.RuntimeException: Cannot convert null to int
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
         stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    -                     "FROM FOO " +
    -                     "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD['a'] = 200");
    --- End diff --
    
    MAPFIELD['a'] doesn't exist so this is a test for invalid access, map type.
    Btw, I'll keep earlier test case but remove nested access from where since it just makes test failing before testing invalid array access.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    There's an another issue: when to initialize DataContext.
    
    Calcite utilizes the variables in DataContext. and there're datetime related variables in DataContext.
    
    The thing is, SQL standard says that functions such as CURRENT_TIMESTAMP return the same value throughout the query. Since we're querying in stream, when to initialize DataContext defines the boundary to show same datetime across rows. With joining stream, the issue is more complicated.
    
    For now I just initialize DataContext at the query compilation phase and pass it to components, so that the value of current datetime is static (don't change) across workers in topology. If we would like to make it dynamic, we need to discuss.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80442067
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/TestStormSql.java ---
    @@ -174,26 +173,42 @@ public void testExternalDataSource() throws Exception {
       public void testExternalDataSourceNested() throws Exception {
         List<String> stmt = new ArrayList<>();
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
    -    stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    stmt.add("SELECT STREAM ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
                          "FROM FOO " +
    -                     "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD[1] = 200");
    +                     "WHERE CAST(MAPFIELD['b'] AS INTEGER) = 2 AND CAST(ARRAYFIELD[1] AS INTEGER) = 200");
    --- End diff --
    
    1. 
    Yes CAST is required because the type of looked up value is Object. While we're saying based on only type on runtime, we might need to consider the type on schema.
    
    2. 
    Implicit cast is not possible because we can't determine the value type on compile time, especially we use ANY to represent its type.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80443829
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/backends/standalone/TestPlanCompiler.java ---
    @@ -71,9 +71,9 @@ public void testLogicalExpr() throws Exception {
     
       @Test
       public void testNested() throws Exception {
    -    String sql = "SELECT ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    String sql = "SELECT ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
                 "FROM FOO " +
    -            "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD[1] = 200";
    --- End diff --
    
    Like I said before (PR desc and comment), with Calcite Janino compiler it is not possible. (even with explicit type information)
    Let's wait for the response from Calcite dev@ list.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @arunmahadevan 
    I'm not sure we can have our own expression compiler which supports features competitive to Calcite. It reminds me why this change is necessary.
    
    We have been leaving expression compiler which has lots of unimplemented things and also bugs. It was pushed more than 6 months before and only few of contributors are taking care of. We have lots of milestones (not a single task) to do even without Storm SQL, like Unified API, extending Window feature (session, etc.), Beam integration, Java porting, Metrics, and so on. Even I have been focusing to work on Storm SQL, I give up keeping expression compiler because of having less eyes to watch, and error-prone way of building code block (string concatenation). We could save amount of time if we replace this faster, and it will save another amount of time.
    
    Btw, normally we're often worry about dependency if dependency's activity is low or release interval is too long. But Calcite's release interval for minor release is even faster than Storm's bugfix release.
    Calcite released 1.7.0 from Mar 22, 1.8.0 from Jun 13, and 1.9.0 from Sep 22, and 1.10.0 today. I have already fixed two bugs in unit tests and will be shipped to 1.10.1 or 1.11.0.
    (Yes in this case we need to wait on that.)
    
    I'm not on the fence of making it pluggable, but just wondering we have comparable plugs on that, and we can make an effort for that. Would you want to be a sponsor on current expression compiler?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80439168
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/TestStormSql.java ---
    @@ -174,26 +173,42 @@ public void testExternalDataSource() throws Exception {
       public void testExternalDataSourceNested() throws Exception {
         List<String> stmt = new ArrayList<>();
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
    -    stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    stmt.add("SELECT STREAM ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
    --- End diff --
    
    MAPFIELD['a'] doesn't exist. NESTEDMAPFIELD['a'] exists.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @HeartSaVioR I suggested to keep an option to plug our code generator. If you think calcites code generator (linq4j) addresses all the current cases then lets go with that. We can revisit having our own code generator within storm-sql once we have more contributors/focussed effort around sql.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    FYI: 
    The vote for 1.10.0 RC1 is just started so we can expect Calcite 1.10.0 to next week.
    Submitted patches for recently found bugs to Calcite: CALCITE-1418 and CALCITE-1419.
    Above issues are marked for 1.11.0, but it's not a thing to rush since at least we are documenting this for `known bug` from STORM-2135.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @HeartSaVioR with this change we are going to be heavily dependent on the calcite's code generator to translate expressions correct?. Just thinking if we should have an option to make the code generation pluggable with an option to plug in our code generator as well. The calcite's code generator can be the default.
    
    Just thinking if in future we want to do some optimizations around the generated code or have our own implementation for a built in function or want to fix some bugs in the generated code without waiting for a calcite release etc, it may not be possible with the current patch. Let me know what you think.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80444322
  
    --- Diff: pom.xml ---
    @@ -264,7 +264,7 @@
             <junit.version>4.11</junit.version>
             <metrics-clojure.version>2.5.1</metrics-clojure.version>
             <hdrhistogram.version>2.1.7</hdrhistogram.version>
    -        <calcite.version>1.4.0-incubating</calcite.version>
    +        <calcite.version>1.9.0</calcite.version>
             <janino.version>2.7.8</janino.version>
    --- End diff --
    
    We will follow Janino from the Calcite dependency. I'll check it again and delete if it's not used anymore.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @arunmahadevan Other than that I addressed your review comments. Thanks for the thoughtful review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @arunmahadevan Yes ideally we really want to have it. I just said that ExprCompiler seems not having a value to keep and maintain. Agreed with you. Thanks for reviewing!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1714


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80444655
  
    --- Diff: external/sql/storm-sql-runtime/src/jvm/org/apache/calcite/interpreter/StormDataContext.java ---
    @@ -0,0 +1,79 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.calcite.interpreter;
    --- End diff --
    
    I applied a trick because of accessor issue, and this seems to be improperly moved. I'll check again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80416617
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/backends/standalone/TestPlanCompiler.java ---
    @@ -71,9 +71,9 @@ public void testLogicalExpr() throws Exception {
     
       @Test
       public void testNested() throws Exception {
    -    String sql = "SELECT ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    String sql = "SELECT ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
                 "FROM FOO " +
    -            "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD[1] = 200";
    --- End diff --
    
    Lets retain support for nested arrays/map.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80593580
  
    --- Diff: external/sql/storm-sql-runtime/src/jvm/org/apache/calcite/interpreter/StormDataContext.java ---
    @@ -0,0 +1,79 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.calcite.interpreter;
    --- End diff --
    
    Only StormContext needs to be placed under org.apache.calcite.interpreter. Will move this to org.apache.storm.sql.calcite.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80414692
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/TestStormSql.java ---
    @@ -174,26 +173,42 @@ public void testExternalDataSource() throws Exception {
       public void testExternalDataSourceNested() throws Exception {
         List<String> stmt = new ArrayList<>();
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
    -    stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    stmt.add("SELECT STREAM ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
    --- End diff --
    
    Shouldn't this be MAPFIELD['a'] ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Calcite 1.10.0 RC1 vote passed and artifact is available to Maven.
    Applied HeartSaVioR@b2bcf9b with changing Calcite version from 1.10.0-SNAPSHOT to 1.10.0.
    Also rebased and squashed the commits.
    
    @arunmahadevan I guess this would be the last review for this patch. Please take a look again. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Just found and fixed bugs regarding aggregate calls and timezone issue in current_date.
    Now aggregate also needs new Trident map API - STORM-2072 (#1663) to reduce the steps.
    I'll address this after STORM-2072 is merged to master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @HeartSaVioR  +1 on this change. If the minor version of the calcite with your fixes can get released soon, it makes sense to wait for that so we can keep the current semantics for querying nested arrays/maps.
    
    Can you also file a follow up JIRA to document the functions that are currently available within storm-sql and may be some examples for users to get started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    I'm trying to address our concern of nested map / array since Calcite community (Julian) says it sounds like a bug.
    
    Addressing it from CALCITE-1386 : https://github.com/apache/calcite/pull/283.
    With applying CALCITE-1386, we could make tests back to origin. 
    (It still need to wrap CAST when detailed type information is not available - I mean ANY)
    
    Below is the commit which works with CALCITE-1386 patch.
    https://github.com/HeartSaVioR/storm/commit/b2bcf9baebb8d9bd878c3d8a74c0d366ded7d399
    
    @arunmahadevan Please take a look into above commit. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Can you also document the list of functions that are now available since we are directly using calicite to generate the java code? Does it support all the standard SQL functions?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @arunmahadevan 
    
    Let me summary regarding nested map/array support.
    
    * accessing nested element in nested map/array works without modifying Calcite.
      * it should be defined as ANY. Specifying MAP's value type to MAP doesn't seem to work. I don't understand this behavior but it works anyway.
    * We need to wrap value with CAST if we need to use value in expression. 
      * For example, `MAPFIELD['a'] = 2` doesn't work but `CAST(MAPFIELD['a'] AS INTEGER) = 2` works.
      * Using value in projection (fields of SELECT) doesn't need to be wrapped with CAST.
    * NOTE: this is NOT a safe operation, and errors are shown in runtime.
      * If value is null in any chance, comparing it to other will **throw RuntimeException** in runtime.
        * value itself could be null.
        * value could be null when we access array with wrong index or map with non-exist key. 
      * If we access array with index out of bound, it will **throw ArrayIndexOutOfBoundException** in runtime.
    
    So there's a way to support current feature with adding some inconveniences (CAST), but it is no longer safe operation. Since user can't handle errors manually, this can make worker continuously crashed.
    
    I asked Calcite dev. group about this and didn't get answer yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80416478
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/TestCompilerUtils.java ---
    @@ -153,6 +158,29 @@ public static CalciteState sqlOverNestedTable(String sql)
                     .field("MAPFIELD", SqlTypeName.ANY)
                     .field("NESTEDMAPFIELD", SqlTypeName.ANY)
                     .field("ARRAYFIELD", SqlTypeName.ANY)
    +
    +                // TODO: replace field definitions with belows while Calcite can handle properly
    --- End diff --
    
    Defining multi-level nest maps with types at each level is tedious for the user. The current approach is to define the column as ANY (like Object) and do unchecked type cast in the generated code. While this may produce class cast runtime exception if the value type doesn't match, its more convenient syntax when the value types are going to match. Like I mentioned in earlier comment, we could try to keep the current syntax for CAST with our own logic to support multi-level maps / arrays until we find a more intuitive way to support it via calcite.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80594627
  
    --- Diff: pom.xml ---
    @@ -264,7 +264,7 @@
             <junit.version>4.11</junit.version>
             <metrics-clojure.version>2.5.1</metrics-clojure.version>
             <hdrhistogram.version>2.1.7</hdrhistogram.version>
    -        <calcite.version>1.4.0-incubating</calcite.version>
    +        <calcite.version>1.9.0</calcite.version>
             <janino.version>2.7.8</janino.version>
    --- End diff --
    
    OK we can delete this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Filed [STORM-2135](https://issues.apache.org/jira/browse/STORM-2135) and [STORM-2136](https://issues.apache.org/jira/browse/STORM-2136). I'd like to get some helps on these issues, but don't mind to work on these by myself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80415245
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/TestStormSql.java ---
    @@ -174,26 +173,42 @@ public void testExternalDataSource() throws Exception {
       public void testExternalDataSourceNested() throws Exception {
         List<String> stmt = new ArrayList<>();
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
    -    stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    stmt.add("SELECT STREAM ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
                          "FROM FOO " +
    -                     "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD[1] = 200");
    +                     "WHERE CAST(MAPFIELD['b'] AS INTEGER) = 2 AND CAST(ARRAYFIELD[1] AS INTEGER) = 200");
         StormSql sql = StormSql.construct();
         List<Values> values = new ArrayList<>();
         ChannelHandler h = new TestUtils.CollectDataChannelHandler(values);
         sql.execute(stmt, h);
         System.out.println(values);
         Map<String, Integer> map = ImmutableMap.of("b", 2, "c", 4);
         Map<String, Map<String, Integer>> nestedMap = ImmutableMap.of("a", map);
    -    Assert.assertEquals(new Values(2, map, nestedMap, Arrays.asList(100, 200, 300)), values.get(0));
    +    Assert.assertEquals(new Values(2, 4, nestedMap, Arrays.asList(100, 200, 300)), values.get(0));
       }
     
    -  @Test
    +  @Test(expected = RuntimeException.class)
       public void testExternalNestedInvalidAccess() throws Exception {
         List<String> stmt = new ArrayList<>();
    +    // this triggers java.lang.RuntimeException: Cannot convert null to int
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
         stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    -                     "FROM FOO " +
    -                     "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD['a'] = 200");
    --- End diff --
    
    This is checking invalid array access (array field being accessed by a string index instead of integer). Can you keep earlier test case and add a new one for the null check?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    @arunmahadevan 
    Regarding Calcite release, Calcite community is waiting for one issue to get resolved before starting RC. I guess RC will be available in this week or so. 
    I'll watch Calcite community and do some change (Calcite version) and let you know again. 
    (I didn't think about this but I need to modify this PR slightly because of Calcite version. ;) )
    
    Regarding documenting available features, actually I didn't test all of SQL reference on Calcite. :) We can add test cases to see what features Storm SQL is able to provide, and adopt SQL reference page with some modifications. I'll file a follow up JIRA.
    
    Regarding examples, yes, we don't have any useful documents or examples so it should be next work of Storm SQL. I'll file a follow up JIRA separately.
    
    Thanks for reviewing!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Just a head up: my patch for Calcite (https://github.com/apache/calcite/pull/283) passes the review and in progress of merging. 
    (handling some comment fixes.)
    So I'm expecting that we can get runtime safety with next version of Calcite.
    
    Do we want to hold on waiting next version of Calcite? Or just ship this as it is for now, and I will submit another pull request following up the change of Calcite?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80591597
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/TestCompilerUtils.java ---
    @@ -153,6 +158,29 @@ public static CalciteState sqlOverNestedTable(String sql)
                     .field("MAPFIELD", SqlTypeName.ANY)
                     .field("NESTEDMAPFIELD", SqlTypeName.ANY)
                     .field("ARRAYFIELD", SqlTypeName.ANY)
    +
    +                // TODO: replace field definitions with belows while Calcite can handle properly
    --- End diff --
    
    From what I'm experimenting, nested map access seems supported, but its type should be ANY instead of MAP. I'll remove this TODO for now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80425461
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/backends/trident/TestPlanCompiler.java ---
    @@ -255,6 +265,198 @@ public void testUdaf() throws Exception {
         Assert.assertArrayEquals(new Values[] { new Values(0, 5L, 15L, 15L) }, getCollectedValues().toArray());
       }
     
    +  @Test
    +  public void testLike() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 2;
    +
    +    // 'abcd', 'abcde' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME LIKE '%c_%'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(3), new Values(4) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testSimilar() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 2;
    +
    +    // 'abc' and 'abcd' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME SIMILAR TO '[a-zA-Z]+[cd]{1}'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(2), new Values(3) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNotLike() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 3;
    +
    +    // 'a', 'ab', 'abc' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME NOT LIKE '%c_%'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(0), new Values(1), new Values(2) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNotSimilar() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 3;
    +
    +    // 'a', 'ab', 'abcde' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME NOT SIMILAR TO '[a-zA-Z]+[cd]{1}'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(0), new Values(1), new Values(4) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testIn() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 4;
    +    String sql = "SELECT ID FROM FOO WHERE NAME IN ('a', 'abc', 'ab', 'abcde')";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[]{new Values(0), new Values(1), new Values(2), new Values(4)}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNotIn() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 2;
    +    String sql = "SELECT ID FROM FOO WHERE NAME NOT IN ('a', 'abc', 'abcde')";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[]{new Values(1), new Values(3)}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testCaseStatement() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 5;
    +    String sql = "SELECT CASE WHEN NAME IN ('a', 'abc', 'abcde') THEN UPPER('a') " +
    +            "WHEN UPPER(NAME) = 'AB' THEN 'b' ELSE {fn CONCAT(NAME, '#')} END FROM FOO";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[]{new Values("A"), new Values("b"), new Values("A"), new Values("abcd#"), new Values("A")}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNested() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 1;
    +    String sql = "SELECT ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
    +            "FROM FOO " +
    +            "WHERE CAST(MAPFIELD['b'] AS INTEGER) = 2 AND CAST(ARRAYFIELD[1] AS INTEGER) = 200";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverNestedTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentNestedDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Map<String, Integer> map = ImmutableMap.of("b", 2, "c", 4);
    +    Map<String, Map<String, Integer>> nestedMap = ImmutableMap.of("a", map);
    +    Assert.assertArrayEquals(new Values[]{new Values(2, 4, nestedMap, Arrays.asList(100, 200, 300))}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testDateKeywords() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 1;
    +    String sql = "SELECT " +
    +            "LOCALTIME, CURRENT_TIME, LOCALTIMESTAMP, CURRENT_TIMESTAMP, CURRENT_DATE " +
    --- End diff --
    
    Is there some way to output formatted current_date like '2016-09-26' instead of an integer? May be we should also document what are the different date time and other functions supported by storm-sql.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80444129
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/compiler/backends/trident/TestPlanCompiler.java ---
    @@ -255,6 +265,198 @@ public void testUdaf() throws Exception {
         Assert.assertArrayEquals(new Values[] { new Values(0, 5L, 15L, 15L) }, getCollectedValues().toArray());
       }
     
    +  @Test
    +  public void testLike() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 2;
    +
    +    // 'abcd', 'abcde' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME LIKE '%c_%'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(3), new Values(4) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testSimilar() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 2;
    +
    +    // 'abc' and 'abcd' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME SIMILAR TO '[a-zA-Z]+[cd]{1}'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(2), new Values(3) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNotLike() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 3;
    +
    +    // 'a', 'ab', 'abc' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME NOT LIKE '%c_%'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(0), new Values(1), new Values(2) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNotSimilar() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 3;
    +
    +    // 'a', 'ab', 'abcde' matched
    +    String sql = "SELECT ID FROM FOO WHERE NAME NOT SIMILAR TO '[a-zA-Z]+[cd]{1}'";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(),
    +            f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[] { new Values(0), new Values(1), new Values(4) }, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testIn() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 4;
    +    String sql = "SELECT ID FROM FOO WHERE NAME IN ('a', 'abc', 'ab', 'abcde')";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[]{new Values(0), new Values(1), new Values(2), new Values(4)}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNotIn() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 2;
    +    String sql = "SELECT ID FROM FOO WHERE NAME NOT IN ('a', 'abc', 'abcde')";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[]{new Values(1), new Values(3)}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testCaseStatement() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 5;
    +    String sql = "SELECT CASE WHEN NAME IN ('a', 'abc', 'abcde') THEN UPPER('a') " +
    +            "WHEN UPPER(NAME) = 'AB' THEN 'b' ELSE {fn CONCAT(NAME, '#')} END FROM FOO";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverDummyTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Assert.assertArrayEquals(new Values[]{new Values("A"), new Values("b"), new Values("A"), new Values("abcd#"), new Values("A")}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testNested() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 1;
    +    String sql = "SELECT ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
    +            "FROM FOO " +
    +            "WHERE CAST(MAPFIELD['b'] AS INTEGER) = 2 AND CAST(ARRAYFIELD[1] AS INTEGER) = 200";
    +    TestCompilerUtils.CalciteState state = TestCompilerUtils.sqlOverNestedTable(sql);
    +
    +    final Map<String, ISqlTridentDataSource> data = new HashMap<>();
    +    data.put("FOO", new TestUtils.MockSqlTridentNestedDataSource());
    +    PlanCompiler compiler = new PlanCompiler(data, typeFactory, dataContext);
    +    final AbstractTridentProcessor proc = compiler.compileForTest(state.tree());
    +    final TridentTopology topo = proc.build(data);
    +    Fields f = proc.outputStream().getOutputFields();
    +    proc.outputStream().partitionPersist(new TestUtils.MockStateFactory(), f, new TestUtils.MockStateUpdater(), new Fields());
    +    runTridentTopology(EXPECTED_VALUE_SIZE, proc, topo);
    +
    +    Map<String, Integer> map = ImmutableMap.of("b", 2, "c", 4);
    +    Map<String, Map<String, Integer>> nestedMap = ImmutableMap.of("a", map);
    +    Assert.assertArrayEquals(new Values[]{new Values(2, 4, nestedMap, Arrays.asList(100, 200, 300))}, getCollectedValues().toArray());
    +  }
    +
    +  @Test
    +  public void testDateKeywords() throws Exception {
    +    int EXPECTED_VALUE_SIZE = 1;
    +    String sql = "SELECT " +
    +            "LOCALTIME, CURRENT_TIME, LOCALTIMESTAMP, CURRENT_TIMESTAMP, CURRENT_DATE " +
    --- End diff --
    
    There should be other function available, and we can provide some if it doesn't. Agreed on documenting supporting functions / features / behaviors.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80418615
  
    --- Diff: external/sql/storm-sql-runtime/src/jvm/org/apache/calcite/interpreter/StormDataContext.java ---
    @@ -0,0 +1,79 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.calcite.interpreter;
    --- End diff --
    
    shouldnt this be under org.apache.storm.sql ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80425683
  
    --- Diff: pom.xml ---
    @@ -264,7 +264,7 @@
             <junit.version>4.11</junit.version>
             <metrics-clojure.version>2.5.1</metrics-clojure.version>
             <hdrhistogram.version>2.1.7</hdrhistogram.version>
    -        <calcite.version>1.4.0-incubating</calcite.version>
    +        <calcite.version>1.9.0</calcite.version>
             <janino.version>2.7.8</janino.version>
    --- End diff --
    
    If we dont use janino anymore we should remove this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Another head up: my patch got merged to Calcite master, and Calcite dev@ is discussing about releasing new minor version soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    Added full of scalar function tests which Calcite provides. Calcite 1.9.0 has some bugs so not all functions are working well, but it can be fixed from Calcite side.
    (I left comments to test codes about bugs I found.)
    Once we track and document the bugs to our doc. page, I think it would be OK.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1714: STORM-2125 Use Calcite's implementation of Rex Com...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1714#discussion_r80414653
  
    --- Diff: external/sql/storm-sql-core/src/test/org/apache/storm/sql/TestStormSql.java ---
    @@ -174,26 +173,42 @@ public void testExternalDataSource() throws Exception {
       public void testExternalDataSourceNested() throws Exception {
         List<String> stmt = new ArrayList<>();
         stmt.add("CREATE EXTERNAL TABLE FOO (ID INT, MAPFIELD ANY, NESTEDMAPFIELD ANY, ARRAYFIELD ANY) LOCATION 'mocknested:///foo'");
    -    stmt.add("SELECT STREAM ID, MAPFIELD, NESTEDMAPFIELD, ARRAYFIELD " +
    +    stmt.add("SELECT STREAM ID, MAPFIELD['c'], NESTEDMAPFIELD, ARRAYFIELD " +
                          "FROM FOO " +
    -                     "WHERE NESTEDMAPFIELD['a']['b'] = 2 AND ARRAYFIELD[1] = 200");
    +                     "WHERE CAST(MAPFIELD['b'] AS INTEGER) = 2 AND CAST(ARRAYFIELD[1] AS INTEGER) = 200");
    --- End diff --
    
    1. Is the CAST required since the linq4j does type cast the result in the generated code ?
    2. Is it not possible to support Multi-dimensional array/map lookups with the new implementation? Can we have storm-sql generate code for CAST and delegate the rest to calcite and keep the earlier syntax ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1714: STORM-2125 Use Calcite's implementation of Rex Compiler

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1714
  
    I don't want to get blocked by non-SQL standard support unless we have clear use case or clear way to implement it. If my understanding is right, both Spark and Flink seems not support this as well, and also Samza SQL too.
    
    When this gets merged into master, I expect Storm SQL will support most of SQL reference page on Calcite. We can add tests for checking features availability and document supported features, and it could take some time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---