You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2016/07/29 05:00:47 UTC

[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/14397

    [SPARK-16771][SQL] WITH clause should not fall into infinite loop.

    ## What changes were proposed in this pull request?
    
    This PR changes the table resolving rule to use database tables before CTE tables in order to prevent infinite loops on CTE table name resolution.
    
    **Reported Error Scenarios**
    ```scala
    scala> spark.range(10).createOrReplaceTempView("t")
    scala> sql("WITH t AS (SELECT 1 FROM t) SELECT * FROM t")
    java.lang.StackOverflowError
    ...
    ```
    
    ```scala
    scala> spark.range(10).createOrReplaceTempView("t1")
    scala> spark.range(10).createOrReplaceTempView("t2")
    scala> sql("WITH t1 AS (SELECT 1 FROM t2), t2 AS (SELECT 1 FROM t1) SELECT * FROM t1, t2")
    java.lang.StackOverflowError
    ...
    ```
    
    
    ## How was this patch tested?
    
    Pass the Jenkins tests with new two testcases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-16771-TREENODE

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14397.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14397
    
----
commit 5bca528d971cad317f748cb3f2d050ce0b8f99a7
Author: Dongjoon Hyun <do...@apache.org>
Date:   2016-07-29T04:41:20Z

    [SPARK-16771][SQL] WITH clause should not fall into infinite loop.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74153582
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    +        var resolvedRelations = Seq(relations.head._1 -> ResolveRelations(relations.head._2))
    +        relations.tail.foreach { case (name, relation) =>
    +          resolvedRelations = resolvedRelations ++
    +            Seq(name -> ResolveRelations(substituteCTE(relation, resolvedRelations)))
    +        }
    +        substituteCTE(child, resolvedRelations)
    --- End diff --
    
    Here, the CTEs of `WITH` clause are resolved sequentially before applying to main SQL body.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63526/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Thank you so much! Then, see you later. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #62995 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62995/consoleFull)** for PR 14397 at commit [`5bca528`](https://github.com/apache/spark/commit/5bca528d971cad317f748cb3f2d050ce0b8f99a7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63578 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63578/consoleFull)** for PR 14397 at commit [`178813e`](https://github.com/apache/spark/commit/178813ebf6e7d5f58ebab7784e07bfd5b8c5d883).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74211258
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    +        var resolvedRelations = Seq(relations.head._1 -> ResolveRelations(relations.head._2))
    +        relations.tail.foreach { case (name, relation) =>
    +          resolvedRelations = resolvedRelations ++
    +            Seq(name -> ResolveRelations(substituteCTE(relation, resolvedRelations)))
    +        }
    +        substituteCTE(child, resolvedRelations)
           case other => other
         }
     
    -    def substituteCTE(plan: LogicalPlan, cteRelations: Map[String, LogicalPlan]): LogicalPlan = {
    -      plan transform {
    -        // In hive, if there is same table name in database and CTE definition,
    -        // hive will use the table in database, not the CTE one.
    -        // Taking into account the reasonableness and the implementation complexity,
    -        // here use the CTE definition first, check table name only and ignore database name
    -        // see https://github.com/apache/spark/pull/4929#discussion_r27186638 for more info
    +    def substituteCTE(plan: LogicalPlan, cteRelations: Seq[(String, LogicalPlan)]): LogicalPlan = {
    +      plan transformDown {
             case u : UnresolvedRelation =>
    -          val substituted = cteRelations.get(u.tableIdentifier.table).map { relation =>
    -            val withAlias = u.alias.map(SubqueryAlias(_, relation))
    -            withAlias.getOrElse(relation)
    +          val substituted = cteRelations.find(_._1 == u.tableIdentifier.table).map(_._2).map {
    --- End diff --
    
    Is this use case sensitive resolution? In this case it would be useful if cteRelations *is* a map.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63578/consoleFull)** for PR 14397 at commit [`178813e`](https://github.com/apache/spark/commit/178813ebf6e7d5f58ebab7784e07bfd5b8c5d883).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    LGTM pending Jenkins.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63577/consoleFull)** for PR 14397 at commit [`624bb3d`](https://github.com/apache/spark/commit/624bb3d9f6ffe558c1897501c06c76f938e15602).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74210759
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    +        var resolvedRelations = Seq(relations.head._1 -> ResolveRelations(relations.head._2))
    --- End diff --
    
    Entire Block: Either use a truly imperative approach or use foldLeft(...). This is a bit of both.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Thank you, @hvanhovell !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    LGTM (again) - merging to master. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Thank you for review, @hvanhovell .
    
    For the recursive CTE queries, traditional DBMS supports optional `RECURSIVE` keyword which computing **closures**, i.e., it iterates the subquery until the result converge. For example, PostgreSQL, MSSQL, SQLite supports `WITH RECURSIVE` syntax officially.
    
    For the overlap of `ResolveRelations` rule, this rule calls `lookupRelation` to check the name conflict between the base relations and CTEs. As a side effect, it resolves the relations, too. Here is the reasons.
      - To check the conflict, this is the only simplest way.
      - For the performance, this patch only call one `lookupRelation` for `UnresolvedRelation`.
        - `ResolveRelations` rule do not try resolve again and will not complain about that.
      - The remaining general and exceptional rules like file-based datasource should be in `ResolveRelations`. We can add more datasources there, not here.
    
    For Hive, the recursive queries and `RECURSIVE` syntax are not supported.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74373762
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2938,4 +2938,26 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
             data.selectExpr("`part.col1`", "`col.1`"))
         }
       }
    +
    +  test("SPARK-16771: WITH clause should not fall into infinite loop") {
    --- End diff --
    
    In this PR, `exception`s are important and should be checked.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Hi, @rxin . 
    Now the testcases are moved into `sql-tests` with `SQLQueryTestSuite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63529/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63526/consoleFull)** for PR 14397 at commit [`d896afc`](https://github.com/apache/spark/commit/d896afcf94994c4383aff8b4370ac189d00469fb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #62995 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/62995/consoleFull)** for PR 14397 at commit [`5bca528`](https://github.com/apache/spark/commit/5bca528d971cad317f748cb3f2d050ce0b8f99a7).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63577/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63371/consoleFull)** for PR 14397 at commit [`b10ee10`](https://github.com/apache/spark/commit/b10ee106abe89649de1ff0c0b6ca3891f9ef3afb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class With(child: LogicalPlan, cteRelations: Seq[(String, SubqueryAlias)]) extends UnaryNode `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74211377
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    Never mind you have already done that :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    @dongjoon-hyun 
    #### New behavior versus existing systems
    I was not talking about recursive CTE's (which can be very useful in some cases). We are changing the behavior of the Analyzer, and this might surprise users. I would like to know if there is a common approach to this among other systems; so we can justify the change in behavior.
    
    #### Resolve Relations
    The change you propose is almost subsuming the `ResolveRelations` rule. I think it is correct to to do this. However I am not a big fan of duplicating the behavior in two rules. Do you think there is way to get around this? Could we for instance centralize the logic in `ResolveRelations`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74226355
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    If WITH without relations, I thought we can skip all since there is nothing to be replaced.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Oh, now `Exception` is supported at https://github.com/apache/spark/commit/0db373aaf87991207a7a8a09853b6fa602f0f45b. Thanks, @rxin . I will try again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63595/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63637 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63637/consoleFull)** for PR 14397 at commit [`f1f4a83`](https://github.com/apache/spark/commit/f1f4a833276c5b78309546beca9dd7a5d86590f8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Hi, @hvanhovell .
    Could you review this again? After the last time you've seen, only the testcase is moved into a new testsuite.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74227100
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    Oh, I see!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/62995/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63578/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    The above approach also can remove the duplicated scope issue between `CTESubstitution` and `ResolveRelation`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r72912387
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -131,17 +131,17 @@ class Analyzer(
     
         def substituteCTE(plan: LogicalPlan, cteRelations: Map[String, LogicalPlan]): LogicalPlan = {
           plan transform {
    -        // In hive, if there is same table name in database and CTE definition,
    -        // hive will use the table in database, not the CTE one.
    -        // Taking into account the reasonableness and the implementation complexity,
    -        // here use the CTE definition first, check table name only and ignore database name
    -        // see https://github.com/apache/spark/pull/4929#discussion_r27186638 for more info
             case u : UnresolvedRelation =>
    -          val substituted = cteRelations.get(u.tableIdentifier.table).map { relation =>
    -            val withAlias = u.alias.map(SubqueryAlias(_, relation))
    -            withAlias.getOrElse(relation)
    +          try {
    +            catalog.lookupRelation(u.tableIdentifier, u.alias)
    --- End diff --
    
    This integrates some of the functionality of the `ResolveRelations` rule, but not all (for instance file based datasources). Shouldn't we be consistent and (perhaps) integrate these rules?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63526/consoleFull)** for PR 14397 at commit [`d896afc`](https://github.com/apache/spark/commit/d896afcf94994c4383aff8b4370ac189d00469fb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63577/consoleFull)** for PR 14397 at commit [`624bb3d`](https://github.com/apache/spark/commit/624bb3d9f6ffe558c1897501c06c76f938e15602).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63595 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63595/consoleFull)** for PR 14397 at commit [`7f74ec7`](https://github.com/apache/spark/commit/7f74ec7745534b40da073c399a6439c9b03a5086).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74211188
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -392,11 +392,9 @@ case class InsertIntoTable(
      * This operator will be removed during analysis and the relations will be substituted into child.
      *
      * @param child The final query of this CTE.
    - * @param cteRelations Queries that this CTE defined,
    - *                     key is the alias of the CTE definition,
    - *                     value is the CTE definition.
    + * @param cteRelations A sequence of pair (alias, the CTE definition) that this CTE defined
      */
    -case class With(child: LogicalPlan, cteRelations: Map[String, SubqueryAlias]) extends UnaryNode {
    --- End diff --
    
    Here~


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Oh, sorry for misunderstanding, @hvanhovell . I think we can integrate `ResolveRelation` and this one. I'll be back after some testing in this evening. Also for the common approach, I will bring some  references, too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63637/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74504505
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/with.sql ---
    @@ -0,0 +1,14 @@
    +create temporary view t as select * from values 0, 1, 2 as t(id);
    --- End diff --
    
    Here, t1 and t2 is base tables.
    The followings are used CTEs and to check if base table or previous CTE is used correctly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63637 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63637/consoleFull)** for PR 14397 at commit [`f1f4a83`](https://github.com/apache/spark/commit/f1f4a833276c5b78309546beca9dd7a5d86590f8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63529/consoleFull)** for PR 14397 at commit [`f3a2cd4`](https://github.com/apache/spark/commit/f3a2cd43d205735d75f29ff13ebe6acebc30fa98).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/63371/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Hi, @hvanhovell .
    Sorry for late update. I updated the PR description and code.
    Could you review this PR again?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74372952
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2938,4 +2938,26 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
             data.selectExpr("`part.col1`", "`col.1`"))
         }
       }
    +
    +  test("SPARK-16771: WITH clause should not fall into infinite loop") {
    --- End diff --
    
    can you create a test in SQLQueryTestSuite instead?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74225717
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    This does not replace `With` in all cases. Could you remove the `relations.nonEmpty` guard?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63595 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63595/consoleFull)** for PR 14397 at commit [`7f74ec7`](https://github.com/apache/spark/commit/7f74ec7745534b40da073c399a6439c9b03a5086).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    @dongjoon-hyun this is looking very promising. I left two small comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74504744
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/with.sql ---
    @@ -0,0 +1,14 @@
    +create temporary view t as select * from values 0, 1, 2 as t(id);
    --- End diff --
    
    Ah, I see. You mean `with.sql` into `CTE.sql`.
    I see. No problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74225867
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,22 +125,20 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    +        substituteCTE(child, relations.foldLeft(Seq.empty[(String, LogicalPlan)])((resolved, r) =>
    +          resolved ++ Seq(r._1 -> ResolveRelations(substituteCTE(r._2, resolved)))))
    --- End diff --
    
    `resolved :+ r._1 -> ResolveRelations(substituteCTE(r._2, resolved))`?
    
    You could also deconstruct the `r` tuple for easier reading.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Hi, @hvanhovell . It seems not clearly documented, so I did some comparisons.
    
    First of all, I found that I did overlooked the behavior of the Hive queries. Hive also uses CTE names first in the consecutive CTE queries like other DBMS. It's natural.
    ```
    with t as (select 10), s as (select * from t) select * from s;
    ```
    
    For the self recursion, the approaches are different. Hive/Oracle raises exceptions, PostgreSQL uses the base tables.
    ```
    with t as (select 10 from t), s as (select * from t) select * from s;
    ```
    
    The cross referencing case raises exceptions.
    ```
    with t as (select 10 from s), s as (select * from t) select * from s;
    ```
    
    To sum up, the general approach seems to
    - use **forward-declared(not self)** base table or CTE name
    - prevent the execution of cyclic queries by raising exceptions.
    
    The root cause of previous Spark behavior is using `self` or `backward-declared CTE` for table name resolutions. I will try to revise this PR according to the above rules. Please comment about the above rule.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74373681
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2938,4 +2938,26 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
             data.selectExpr("`part.col1`", "`col.1`"))
         }
       }
    +
    +  test("SPARK-16771: WITH clause should not fall into infinite loop") {
    --- End diff --
    
    Ur, @rxin .
    `SQLQueryTestSuite` seems not to support exceptions cases yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74373465
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -2938,4 +2938,26 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
             data.selectExpr("`part.col1`", "`col.1`"))
         }
       }
    +
    +  test("SPARK-16771: WITH clause should not fall into infinite loop") {
    --- End diff --
    
    Sure. I'll move this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Thank you, @rxin . It's updated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    @dongjoon-hyun I think this has merit. I do have one question, what do other databases do? Like postgresql, mysql, sqlserver and others?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    @dongjoon-hyun I'll take a look in the morning (CET)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63529/consoleFull)** for PR 14397 at commit [`f3a2cd4`](https://github.com/apache/spark/commit/f3a2cd43d205735d75f29ff13ebe6acebc30fa98).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14397


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74211736
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -392,11 +392,9 @@ case class InsertIntoTable(
      * This operator will be removed during analysis and the relations will be substituted into child.
      *
      * @param child The final query of this CTE.
    - * @param cteRelations Queries that this CTE defined,
    - *                     key is the alias of the CTE definition,
    - *                     value is the CTE definition.
    + * @param cteRelations A sequence of pair (alias, the CTE definition) that this CTE defined
    --- End diff --
    
    Yep. I'll add that clearly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Hi, @rxin .
    Could you review this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Thank you, @hvanhovell !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74503932
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/with.sql ---
    @@ -0,0 +1,14 @@
    +create temporary view t as select * from values 0, 1, 2 as t(id);
    --- End diff --
    
    maybe CTE instead of "with" ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74226825
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    Yeah you are right about that. But the current code does not remove the `With` node if there are no relations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74504911
  
    --- Diff: sql/core/src/test/resources/sql-tests/inputs/with.sql ---
    @@ -0,0 +1,14 @@
    +create temporary view t as select * from values 0, 1, 2 as t(id);
    --- End diff --
    
    I will use `cte.sql` instead of `with.sql`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74210475
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    `relations` is a map. The iteration sequence of a map does not need the be the ordered in which elements were added (Use a map larger than 4 to see this in action, e.g.: `Seq.tabulate(5)(i => ('a' + i).toChar.toString -> i).toMap`).
    
    So I think we need to change the data type in With to accommodate this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Could you review this PR again, @hvanhovell ?
    Last time, I did a huge mistake, but now I think I fixed correct according to your advice.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    **[Test build #63371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/63371/consoleFull)** for PR 14397 at commit [`b10ee10`](https://github.com/apache/spark/commit/b10ee106abe89649de1ff0c0b6ca3891f9ef3afb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74211551
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -392,11 +392,9 @@ case class InsertIntoTable(
      * This operator will be removed during analysis and the relations will be substituted into child.
      *
      * @param child The final query of this CTE.
    - * @param cteRelations Queries that this CTE defined,
    - *                     key is the alias of the CTE definition,
    - *                     value is the CTE definition.
    + * @param cteRelations A sequence of pair (alias, the CTE definition) that this CTE defined
    --- End diff --
    
    Perhaps document that the CTE should be ordered in the way that they are defined.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14397: [SPARK-16771][SQL] WITH clause should not fall in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14397#discussion_r74211126
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -125,21 +125,23 @@ class Analyzer(
       object CTESubstitution extends Rule[LogicalPlan] {
         // TODO allow subquery to define CTE
         def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators  {
    -      case With(child, relations) => substituteCTE(child, relations)
    +      case With(child, relations) if relations.nonEmpty =>
    --- End diff --
    
    So, I changed it to `Seq`. Now, `relations` of `With` is a sequence.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14397: [SPARK-16771][SQL] WITH clause should not fall into infi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/14397
  
    Rebased just to resolve conflicts.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org