Posted to reviews@spark.apache.org by ioana-delaney <gi...@git.apache.org> on 2016/05/31 18:59:51 UTC

[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

GitHub user ioana-delaney opened a pull request:

    https://github.com/apache/spark/pull/13418

    [SPARK-15677][SQL] Query with scalar sub-query in the SELECT list throws UnsupportedOperationException

    ## What changes were proposed in this pull request?
    Queries with a scalar sub-query in the SELECT list, run against a local, in-memory relation, throw an
    UnsupportedOperationException.
    
    Problem repro:
    ```scala
    scala> Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t1")
    scala> Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t2")
    scala> sql("select (select min(c1) from t2) from t1").show()
    
    java.lang.UnsupportedOperationException: Cannot evaluate expression: scalar-subquery#62 []
      at org.apache.spark.sql.catalyst.expressions.Unevaluable$class.eval(Expression.scala:215)
      at org.apache.spark.sql.catalyst.expressions.ScalarSubquery.eval(subquery.scala:62)
      at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:142)
      at org.apache.spark.sql.catalyst.expressions.InterpretedProjection.apply(Projection.scala:45)
      at org.apache.spark.sql.catalyst.expressions.InterpretedProjection.apply(Projection.scala:29)
      at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
      at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
      at scala.collection.immutable.List.foreach(List.scala:381)
      at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
      at scala.collection.immutable.List.map(List.scala:285)
      at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$37.applyOrElse(Optimizer.scala:1473)
    ```
    The problem is specific to local, in-memory relations. It is caused by the rule ConvertToLocalRelation, which attempts to push down
    a scalar-subquery expression to the local tables.
    
    The fix prevents the rule from applying when the Project references scalar subqueries.
    
    ## How was this patch tested?
    Added regression tests to SubquerySuite.scala
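
The failure mode and the guard described above can be illustrated with a small, self-contained sketch. It does not use Spark's classes: `Expr`, `Col`, `SubqueryPlaceholder`, `projectEagerly` and `projectIfSafe` are hypothetical names chosen for illustration, and the model assumes integer-only rows.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
object LocalRelationSketch {
  sealed trait Expr { def eval(row: Seq[Int]): Int }

  // Reads the column at a fixed position of the input row.
  final case class Col(ordinal: Int) extends Expr {
    def eval(row: Seq[Int]): Int = row(ordinal)
  }

  // Stands in for a scalar subquery: it has no row-level evaluation.
  case object SubqueryPlaceholder extends Expr {
    def eval(row: Seq[Int]): Int =
      throw new UnsupportedOperationException("Cannot evaluate expression: scalar-subquery")
  }

  // Unguarded rewrite: eagerly evaluates every projected expression for every row.
  def projectEagerly(exprs: Seq[Expr], rows: Seq[Seq[Int]]): Seq[Seq[Int]] =
    rows.map(row => exprs.map(_.eval(row)))

  // Guarded rewrite: returns None (leave the plan untouched) if a placeholder is projected.
  def projectIfSafe(exprs: Seq[Expr], rows: Seq[Seq[Int]]): Option[Seq[Seq[Int]]] =
    if (exprs.contains(SubqueryPlaceholder)) None
    else Some(projectEagerly(exprs, rows))
}
```

In this sketch, `projectEagerly` throws as soon as it reaches the placeholder, while `projectIfSafe` simply declines to rewrite, which is the behavior the guard in the rule is meant to achieve.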
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ioana-delaney/spark scalarSubV2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13418.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13418
    
----
commit faded1df9de00185e02adcffe09a473fc46cbf14
Author: Ioana Delaney <io...@gmail.com>
Date:   2016-05-31T18:53:48Z

    [SPARK-15677] Query with scalar sub-query in the SELECT list throws UnsupportedOperationException.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65594703
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
    @@ -121,6 +123,16 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
             " where key = (select max(key) from subqueryData) - 1)"),
           Array(Row("two"))
         )
    +
    +    checkAnswer(
    --- End diff --
    
    I think it's better to create a new test case for it




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13418
  
    **[Test build #3033 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3033/consoleFull)** for PR 13418 at commit [`faded1d`](https://github.com/apache/spark/commit/faded1df9de00185e02adcffe09a473fc46cbf14).




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65609083
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
    @@ -123,6 +123,31 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
         )
       }
     
    +  test("SPARK-15677: Scalar sub-query in Select list against a DataFrame generated query") {
    --- End diff --
    
    Maybe we should mention that this bug only exists for local relations?




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    whoops triggered a build on this one... sorry about that




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13418
  
    **[Test build #3033 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3033/consoleFull)** for PR 13418 at commit [`faded1d`](https://github.com/apache/spark/commit/faded1df9de00185e02adcffe09a473fc46cbf14).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    @cloud-fan Thank you.




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65609442
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
    @@ -123,6 +123,31 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
         )
       }
     
    +  test("SPARK-15677: Scalar sub-query in Select list against a DataFrame generated query") {
    +    Seq((1, 1), (2, 2)).toDF("c1", "c2").createOrReplaceTempView("t1")
    --- End diff --
    
    Please use `withTempTable` (the name is wrong for historical reasons; it should be `withTempView`), which will drop the view after the test for you.
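
The cleanup behavior mentioned here is an instance of the loan pattern. A minimal sketch follows, assuming a hypothetical in-memory registry of view names (`views`) in place of a real session catalog; it is not Spark's test helper, only an illustration of why the view is gone after the test body finishes, even if the body throws.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a catalog of temporary views; not Spark's API.
object TempViewSketch {
  val views: mutable.Set[String] = mutable.Set.empty

  // Loan pattern: run the body, then drop the named views even if the body throws.
  def withTempView(names: String*)(body: => Unit): Unit = {
    try {
      body
    } finally {
      names.foreach(n => views.remove(n))
    }
  }
}
```

A test body would register its views inside the block passed to `withTempView`, and the `finally` clause removes them afterwards so later tests do not see leftover state.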




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    LGTM pending jenkins




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65273025
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
      */
     object ConvertToLocalRelation extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case Project(projectList, LocalRelation(output, data)) =>
    +    case p @ Project(projectList, LocalRelation(output, data))
    +        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
    --- End diff --
    
    Would it be more general to check for unevaluable expressions?




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    @cloud-fan @gatorsmile @davies @rxin @hvanhovell Thank you all. This was my first PR!




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65438111
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
    @@ -84,6 +84,13 @@ object ScalarSubquery {
           case _ => false
         }.isDefined
       }
    +
    +  def hasScalarSubquery(e: Expression): Boolean = {
    +    e.find {
    --- End diff --
    
    @rxin Thank you for the review. I aligned my code to the existing implementation. But I can replace the method call with your suggestion.




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    **[Test build #3045 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3045/consoleFull)** for PR 13418 at commit [`77166bc`](https://github.com/apache/spark/commit/77166bc5a6989382d628ba293fd9751c2ca3e20a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    @gatorsmile @davies @rxin @cloud-fan I've incorporated the comments. Please advise. Thank you.




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    thanks, merging to master and 2.0!




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    **[Test build #3045 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3045/consoleFull)** for PR 13418 at commit [`77166bc`](https://github.com/apache/spark/commit/77166bc5a6989382d628ba293fd9751c2ca3e20a).




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65441184
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
      */
     object ConvertToLocalRelation extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case Project(projectList, LocalRelation(output, data)) =>
    +    case p @ Project(projectList, LocalRelation(output, data))
    +        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
    --- End diff --
    
    I think AttributeReference is the only exception: it will be replaced with a BoundReference when a Projection is created, so we could special-case it.
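
The replacement of named references by positional ones can be sketched with a toy expression tree. The types below (`Attr`, `Bound`, `Add`) and the `bind` and `eval` helpers are hypothetical names for illustration and are not Spark's classes; the sketch only assumes that a named column must be resolved to an ordinal before a row can be read.

```scala
// Toy model only: hypothetical stand-ins, not Spark's classes.
object BindingSketch {
  sealed trait Expr
  // Named column: cannot be evaluated until it is bound to a position.
  final case class Attr(name: String) extends Expr
  // Positional column: can be read directly from a row.
  final case class Bound(ordinal: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Replace every named column with its position in the input schema.
  def bind(e: Expr, schema: Seq[String]): Expr = e match {
    case Attr(name) =>
      val i = schema.indexOf(name)
      require(i >= 0, s"unknown column $name")
      Bound(i)
    case Add(l, r) => Add(bind(l, schema), bind(r, schema))
    case b: Bound => b
  }

  // Evaluate a bound expression against one row; unbound names are rejected.
  def eval(e: Expr, row: Seq[Int]): Int = e match {
    case Bound(i) => row(i)
    case Add(l, r) => eval(l, row) + eval(r, row)
    case Attr(name) => throw new UnsupportedOperationException(s"unbound column $name")
  }
}
```

Because binding happens before evaluation in this model, a named reference never reaches `eval`, which is why it can safely be excluded from a check for nodes that cannot be evaluated.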




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    LGTM except one test comment.




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    @cloud-fan I moved the unit tests to a new test case. Thank you.




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13418
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65294821
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
      */
     object ConvertToLocalRelation extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case Project(projectList, LocalRelation(output, data)) =>
    +    case p @ Project(projectList, LocalRelation(output, data))
    +        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
    --- End diff --
    
    @davies Thank you for the comment. I am looking into generalizing the condition.




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65455758
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
      */
     object ConvertToLocalRelation extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case Project(projectList, LocalRelation(output, data)) =>
    +    case p @ Project(projectList, LocalRelation(output, data))
    +        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
    --- End diff --
    
    +1 to catch `Unevaluable` and special case `AttributeReference`




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65437967
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1468,7 +1468,8 @@ object DecimalAggregates extends Rule[LogicalPlan] {
      */
     object ConvertToLocalRelation extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case Project(projectList, LocalRelation(output, data)) =>
    +    case p @ Project(projectList, LocalRelation(output, data))
    +        if !p.expressions.exists(ScalarSubquery.hasScalarSubquery) =>
    --- End diff --
    
    @davies Sorry for the delay in replying. I am new to the Spark code. I've looked at Unevaluable expressions. My finding is that checking for Unevaluable expressions would be too general, since a lot of expressions mix in this trait. For example, AttributeReference is one of them. If we explicitly check for Unevaluable expressions, a simple query of the form "select c1 from t1" would regress. Let me know if I misunderstood your requirement. Thanks.
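
The difference between the overly general check and one that special-cases column references can be shown on a toy tree. Everything here is a hypothetical stand-in: `NoDirectEval` plays the role of a marker trait for nodes without row-level evaluation, and `naiveCheck` and `refinedCheck` are illustrative names, not functions from the patch.

```scala
// Toy model only: hypothetical stand-ins, not Spark's classes.
object UnevaluableCheckSketch {
  sealed trait Expr { def children: Seq[Expr] = Nil }
  // Marker for nodes that have no direct row-level evaluation.
  sealed trait NoDirectEval extends Expr
  final case class Attr(name: String) extends NoDirectEval
  case object SubqueryPlaceholder extends NoDirectEval
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr {
    override def children: Seq[Expr] = Seq(left, right)
  }

  // True if the predicate holds for this node or any of its descendants.
  def exists(e: Expr)(p: Expr => Boolean): Boolean =
    p(e) || e.children.exists(c => exists(c)(p))

  // Too general: also flags plain column references.
  def naiveCheck(e: Expr): Boolean = exists(e)(_.isInstanceOf[NoDirectEval])

  // Special-cases column references, which are bound to positions before evaluation.
  def refinedCheck(e: Expr): Boolean =
    exists(e)(n => n.isInstanceOf[NoDirectEval] && !n.isInstanceOf[Attr])
}
```

With the naive check, a projection of a single column such as `Attr("c1")` would be rejected, which corresponds to the "select c1 from t1" regression; the refined check accepts it while still rejecting a tree that contains the subquery placeholder.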




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    Thank you @cloud-fan. I mentioned local relations in the test case description and moved the test cases under withTempTable.




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65302021
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala ---
    @@ -84,6 +84,13 @@ object ScalarSubquery {
           case _ => false
         }.isDefined
       }
    +
    +  def hasScalarSubquery(e: Expression): Boolean = {
    +    e.find {
    --- End diff --
    
    ```
    e.find(_.isInstanceOf[ScalarSubquery]).isDefined
    ```
    
    ?




[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13418#discussion_r65484867
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1468,10 +1468,15 @@ object DecimalAggregates extends Rule[LogicalPlan] {
      */
     object ConvertToLocalRelation extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    -    case Project(projectList, LocalRelation(output, data)) =>
    +    case p @ Project(projectList, LocalRelation(output, data))
    +        if !p.expressions.exists(hasUnevaluableExpr) =>
    --- End diff --
    
    `p.expressions` is just the `projectList`




[GitHub] spark issue #13418: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by ioana-delaney <gi...@git.apache.org>.
Github user ioana-delaney commented on the issue:

    https://github.com/apache/spark/pull/13418
  
    @cloud-fan I replaced p.expressions with projectList. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13418: [SPARK-15677][SQL] Query with scalar sub-query in...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13418




[GitHub] spark pull request: [SPARK-15677][SQL] Query with scalar sub-query in the SE...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/13418
  
    LGTM CC @hvanhovell @davies @cloud-fan 


