You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by davies <gi...@git.apache.org> on 2015/09/14 22:52:19 UTC

[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

GitHub user davies opened a pull request:

    https://github.com/apache/spark/pull/8755

    [SPARK-10593] [SQL] fix resolve output of Generate

    The output of Generate should not be resolved as Reference. 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davies/spark view

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8755.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8755
    
----
commit dd11d2cd224e61bf033e5b35ebe1577d720dbcc6
Author: Davies Liu <da...@databricks.com>
Date:   2015-09-14T20:50:46Z

    fix resolve output of Generate

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142363772
  
    LGTM. Merging to branch 1.5 and master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140489665
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142106834
  
      [Test build #42772 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42772/consoleFull) for   PR 8755 at commit [`6b02a41`](https://github.com/apache/spark/commit/6b02a416a7b0246048e9c67468383586afdde7f6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8755#discussion_r39899010
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala ---
    @@ -68,13 +68,47 @@ case class Generate(
         generator.resolved &&
           childrenResolved &&
           generator.elementTypes.length == generatorOutput.length &&
    -      !generatorOutput.exists(!_.resolved)
    +      generatorOutput.forall(_.resolved)
       }
     
       // we don't want the gOutput to be taken as part of the expressions
       // as that will cause exceptions like unresolved attributes etc.
       override def expressions: Seq[Expression] = generator :: Nil
     
    +  /**
    +   * Runs [[transformUp]] with `rule` on all expressions present in this query operator.
    +   * @param rule the rule to be applied to every expression in this operator.
    +   *
    +   * Note: This is used by ResolveReferences, we need to override it to not resolve
    +   * `generatorOutput`.
    +   */
    +  override def transformExpressionsUp(rule: PartialFunction[Expression, Expression]): this.type = {
    +    var changed = false
    +
    +    @inline def transformExpressionUp(e: Expression): Expression = {
    +      val newE = e.transformUp(rule)
    +      if (newE.fastEquals(e)) {
    +        e
    +      } else {
    +        changed = true
    +        newE
    +      }
    +    }
    +
    +    def recursiveTransform(arg: Any): AnyRef = arg match {
    --- End diff --
    
     The reason I switched to in way is to isolate the change, because `expressions.contains(e)` could be expensive for wider tables (there are many columns). Should we improve the previous one (using lazy val and Set)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142156730
  
      [Test build #1782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1782/console) for   PR 8755 at commit [`6b02a41`](https://github.com/apache/spark/commit/6b02a416a7b0246048e9c67468383586afdde7f6).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142141752
  
      [Test build #1782 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1782/consoleFull) for   PR 8755 at commit [`6b02a41`](https://github.com/apache/spark/commit/6b02a416a7b0246048e9c67468383586afdde7f6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/8755


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8755#discussion_r39544824
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala ---
    @@ -115,8 +114,8 @@ abstract class QueryPlan[PlanType <: TreeNode[PlanType]] extends TreeNode[PlanTy
         }
     
         def recursiveTransform(arg: Any): AnyRef = arg match {
    -      case e: Expression => transformExpressionUp(e)
    -      case Some(e: Expression) => Some(transformExpressionUp(e))
    +      case e: Expression if expressions.contains(e) => transformExpressionUp(e)
    +      case Some(e: Expression) if expressions.contains(e) => Some(transformExpressionUp(e))
    --- End diff --
    
    Good question, ResolveReferences only uses transformExpressionsUp, also Generate is the only special case that need it. So it's enough for now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140203880
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142139529
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42772/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140607363
  
      [Test build #1764 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1764/console) for   PR 8755 at commit [`887474e`](https://github.com/apache/spark/commit/887474e6908ea5f31108065d8c16f6ce5e88782d).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140587295
  
      [Test build #1764 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1764/consoleFull) for   PR 8755 at commit [`887474e`](https://github.com/apache/spark/commit/887474e6908ea5f31108065d8c16f6ce5e88782d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-141567958
  
    `generator.elementTypes` is used to stored the data types. We can create new attributes based on the name and data types. Can we fix it by changing `ResolveReferences` (having a case to handle generate)?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140497250
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140506048
  
      [Test build #1757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1757/console) for   PR 8755 at commit [`887474e`](https://github.com/apache/spark/commit/887474e6908ea5f31108065d8c16f6ce5e88782d).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140205541
  
      [Test build #42442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42442/consoleFull) for   PR 8755 at commit [`dd11d2c`](https://github.com/apache/spark/commit/dd11d2cd224e61bf033e5b35ebe1577d720dbcc6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8755#discussion_r40119174
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -378,6 +378,22 @@ class Analyzer(
             val newOrdering = resolveSortOrders(ordering, child, throws = false)
             Sort(newOrdering, global, child)
     
    +      // A special case for Generate, because the output of Generate should not be resolved by
    +      // ResolveReferences.
    --- End diff --
    
    This comment looks a little bit weird because this case is a part of ResolveReferences. Also, it will be good to say that `ResolveGenerate` will create `AttributeReference`s  for `output`. I will fix it when merging it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142139479
  
      [Test build #42772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42772/console) for   PR 8755 at commit [`6b02a41`](https://github.com/apache/spark/commit/6b02a416a7b0246048e9c67468383586afdde7f6).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142105451
  
    @yhuai @cloud-fan I had changed to use a special case for Generate in ResolveReferrence, please take another look, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142106167
  
     Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142345267
  
    This is a simple fix and it works. However, it still seems hacky to me and I'm thinking about introducing an `UnresolvedGenerate`. That will require more code changes and I'm not sure if we should do it. Maybe we should merge this one first.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140497252
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42495/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8755#discussion_r39545697
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala ---
    @@ -115,8 +114,8 @@ abstract class QueryPlan[PlanType <: TreeNode[PlanType]] extends TreeNode[PlanTy
         }
     
         def recursiveTransform(arg: Any): AnyRef = arg match {
    -      case e: Expression => transformExpressionUp(e)
    -      case Some(e: Expression) => Some(transformExpressionUp(e))
    +      case e: Expression if expressions.contains(e) => transformExpressionUp(e)
    +      case Some(e: Expression) if expressions.contains(e) => Some(transformExpressionUp(e))
    --- End diff --
    
    To be safe, I will move this change into Generate.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140229882
  
      [Test build #42442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42442/console) for   PR 8755 at commit [`dd11d2c`](https://github.com/apache/spark/commit/dd11d2cd224e61bf033e5b35ebe1577d720dbcc6).
     * This patch **passes all tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140497114
  
      [Test build #42495 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42495/console) for   PR 8755 at commit [`887474e`](https://github.com/apache/spark/commit/887474e6908ea5f31108065d8c16f6ce5e88782d).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8755#discussion_r39464731
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala ---
    @@ -115,8 +114,8 @@ abstract class QueryPlan[PlanType <: TreeNode[PlanType]] extends TreeNode[PlanTy
         }
     
         def recursiveTransform(arg: Any): AnyRef = arg match {
    -      case e: Expression => transformExpressionUp(e)
    -      case Some(e: Expression) => Some(transformExpressionUp(e))
    +      case e: Expression if expressions.contains(e) => transformExpressionUp(e)
    +      case Some(e: Expression) if expressions.contains(e) => Some(transformExpressionUp(e))
    --- End diff --
    
    Should we do this for `transformExpressionsUp`, `transformAllExpressions` too?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142139528
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140489684
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-141563745
  
    @cloud-fan I tried to change the output to be `Seq[String]` but failed, because after resolve Generate, we still need to a Generate with Attribute.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140491302
  
      [Test build #42495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42495/consoleFull) for   PR 8755 at commit [`887474e`](https://github.com/apache/spark/commit/887474e6908ea5f31108065d8c16f6ce5e88782d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140229972
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42442/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140243693
  
    Hi @davies , I think the idea of only transform expressions that defined in `expressions` is not good enough. The key problem I see is we are using `Attribute` to represent the output name and type. However, the type info is already in `generator.elementTypes` and we only need names, i.e. use `Seq[String]` instead of `Seq[Arrtibute]` for `generatorOutput`.
    
    What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142349045
  
    I think this seems fine for now.  Generate has always been a weird case since its one of the few places new attributes just appear in non-leaf nodes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8755#discussion_r39550885
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala ---
    @@ -68,13 +68,47 @@ case class Generate(
         generator.resolved &&
           childrenResolved &&
           generator.elementTypes.length == generatorOutput.length &&
    -      !generatorOutput.exists(!_.resolved)
    +      generatorOutput.forall(_.resolved)
       }
     
       // we don't want the gOutput to be taken as part of the expressions
       // as that will cause exceptions like unresolved attributes etc.
       override def expressions: Seq[Expression] = generator :: Nil
     
    +  /**
    +   * Runs [[transformUp]] with `rule` on all expressions present in this query operator.
    +   * @param rule the rule to be applied to every expression in this operator.
    +   *
    +   * Note: This is used by ResolveReferences, we need to override it to not resolve
    +   * `generatorOutput`.
    +   */
    +  override def transformExpressionsUp(rule: PartialFunction[Expression, Expression]): this.type = {
    +    var changed = false
    +
    +    @inline def transformExpressionUp(e: Expression): Expression = {
    +      val newE = e.transformUp(rule)
    +      if (newE.fastEquals(e)) {
    +        e
    +      } else {
    +        changed = true
    +        newE
    +      }
    +    }
    +
    +    def recursiveTransform(arg: Any): AnyRef = arg match {
    --- End diff --
    
    Duplicating this code is kind of scary to me.  I think I like the other fix better.  Though ideally, I think that we would just fix the Analyzer to understand that generate is special.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140499831
  
      [Test build #1757 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1757/consoleFull) for   PR 8755 at commit [`887474e`](https://github.com/apache/spark/commit/887474e6908ea5f31108065d8c16f6ce5e88782d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140203892
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-142106199
  
    Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-10593] [SQL] fix resolve output of Gene...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/8755#issuecomment-140229970
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org