Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2016/02/15 19:07:07 UTC

[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/11208

    [SPARK-13320] [SQL] Star Expansion for Dataframe/Dataset Functions

    This PR resolves two issues:
    
    First, it expands `*` inside struct arguments of aggregate functions when using the DataFrame/Dataset APIs. For example:
    ```scala
    structDf.groupBy($"a").agg(min(struct($"record.*")))
    ```
    
    Second, it improves the error message for invalid star usage. For example:
    ```scala
    pagecounts4PartitionsDS
      .map(line => (line._1, line._3))
      .toDF()
      .groupBy($"_1")
      .agg(sum("*") as "sumOccurances")
    ```
    Before the fix, this invalid usage produced a confusing error message:
    ```
    org.apache.spark.sql.AnalysisException: cannot resolve '_1' given input columns _1, _2;
    ```
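    
    As an illustration only (not the code in this PR), the intended behavior can be sketched with a toy expression tree in plain Scala. The names `Expr`, `Col`, `Star`, `Call`, `starFriendly`, and `expand`, as well as the error text, are assumptions made for this sketch and are not Catalyst APIs. The sketch also ignores special cases such as `count(*)`:
    
    ```scala
    // Toy model only: hypothetical stand-ins, not Catalyst classes.
    object StarExpansionSketch {
      sealed trait Expr
      final case class Col(name: String) extends Expr
      // qualifier is Some("record") for `record.*` and None for a bare `*`.
      final case class Star(qualifier: Option[String]) extends Expr
      final case class Call(fn: String, args: List[Expr]) extends Expr
    
      // Functions that may take an expanded star as arguments (assumption of this sketch).
      private val starFriendly = Set("struct", "array")
    
      private def cols(names: List[String]): List[Expr] = names.map(n => Col(n))
    
      // Look up the columns a star stands for; schema maps an alias to its column names.
      private def columnsFor(
          qualifier: Option[String],
          schema: Map[String, List[String]]): Either[String, List[Expr]] = qualifier match {
        case None => Right(cols(schema.values.flatten.toList))
        case Some(alias) => schema.get(alias).map(cols).toRight(s"cannot resolve '$alias.*'")
      }
    
      // Expand stars nested anywhere in `e`. A star yields many expressions,
      // everything else yields exactly one. Invalid usage returns a Left.
      def expand(e: Expr, schema: Map[String, List[String]]): Either[String, List[Expr]] =
        e match {
          case Star(q) => columnsFor(q, schema)
          case c: Col => Right(List(c))
          case Call(fn, args) if !starFriendly(fn) && args.exists(_.isInstanceOf[Star]) =>
            // Report the star itself instead of a misleading "cannot resolve" error.
            Left(s"Invalid usage of '*' in function '$fn'")
          case Call(fn, args) =>
            // Expand each argument in order, stopping at the first error.
            args.foldLeft[Either[String, List[Expr]]](Right(Nil)) { (acc, arg) =>
              for (done <- acc; next <- expand(arg, schema)) yield done ++ next
            }.map(newArgs => List(Call(fn, newArgs)))
        }
    
      def main(argv: Array[String]): Unit = {
        val schema = Map("record" -> List("a", "b"))
        // Models min(struct($"record.*")).
        println(expand(Call("min", List(Call("struct", List(Star(Some("record")))))), schema))
        // Models sum("*").
        println(expand(Call("sum", List(Star(None))), schema))
      }
    }
    ```
    
    In this sketch the first call expands `record.*` into the columns `a` and `b` inside `struct`, while the second call returns an error that names the star usage rather than an unrelated column.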
    
    cc: @rxin @nongli @cloud-fan 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark sumDataSetResolution

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11208.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11208
    
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r54048383
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    --- End diff --
    
    Why doesn't `expandStarExpressions(a.aggregateExpression, a.child)` work?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-194169593
  
    **[Test build #52727 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52727/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-188141707
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-198759454
  
    retest this please




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56783882
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
    @@ -737,6 +737,14 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
           .queryExecution.analyzed
       }
     
    +  test("Star Expansion - script transform") {
    +    val data = (1 to 100000).map { i => (i, i, i) }
    +    data.toDF("d1", "d2", "d3").registerTempTable("script_trans")
    +    assert(100000 ===
    +      sql("SELECT TRANSFORM (*) USING 'cat' FROM script_trans")
    +        .queryExecution.toRdd.count())
    --- End diff --
    
    Is this equivalent to `sql("...").count()`?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195569043
  
    **[Test build #52952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52952/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-189566461
  
    retest this please.




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184420424
  
    cc @cloud-fan




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53722149
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -42,6 +43,7 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val f = udf((a: String) => a)
         val df = sparkContext.parallelize(Seq((1, 1))).toDF("a", "b")
         df.select(array($"a").as("s")).select(f(expr("s[0]"))).collect()
    +    df.select(array($"*").as("s")).select(f(expr("s[0]"))).collect()
    --- End diff --
    
    not needed?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56783810
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,43 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +
    +    // CreateStruct and CreateArray in project list (alias)
    +    assert(structDf.select(struct($"record.*").as("a")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*").as("a")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +  }
    +
    +  test("Star Expansion - explode should fail with a meaningful message if it takes a star") {
    +    val df = Seq(("1", "1,2"), ("2", "4"), ("3", "7,8,9")).toDF("prefix", "csv")
    +    val e = intercept[AnalysisException] {
    +      df.explode($"*") { case Row(prefix: String, csv: String) =>
    +        csv.split(",").map(v => Tuple1(prefix + ":" + v)).toSeq
    +      }.queryExecution.assertAnalyzed()
    +    }
    +    assert(e.getMessage.contains("Invalid usage of '*' in explode/json_tuple/UDTF"))
    +
    +    df.explode('prefix, 'csv) { case Row(prefix: String, csv: String) =>
    --- End diff --
    
    Put it in `checkAnswer` to make sure the execution result is correct.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186124520
  
    **[Test build #51536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51536/consoleFull)** for PR 11208 at commit [`2c72edf`](https://github.com/apache/spark/commit/2c72edf662b037b0dba845f81e95dadfc35bf648).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187464975
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199150318
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53655/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199150315
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184361963
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199179856
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53661/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195569065
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195777465
  
    retest this please




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187558491
  
    **[Test build #51729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51729/consoleFull)** for PR 11208 at commit [`6b2d609`](https://github.com/apache/spark/commit/6b2d60996831fd216b4821e62ed9bea5a3892ab5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53427972
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
    @@ -528,6 +528,14 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
         assert(e.getMessage.contains("cannot resolve 'c' given input columns: [a, b]"), e.getMessage)
       }
     
    +  test("verify star in functions fail with a good error") {
    --- End diff --
    
    This is copied from the example in the original JIRA. Let me move it. Thanks!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53899877
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -350,28 +351,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Cannot explode *, explode can only be applied on a specific column.")
    --- End diff --
    
    `explode/json_tuple/UDTF` LGTM




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-191232077
  
    **[Test build #52318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52318/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53425152
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala ---
    @@ -528,6 +528,14 @@ class DatasetSuite extends QueryTest with SharedSQLContext {
         assert(e.getMessage.contains("cannot resolve 'c' given input columns: [a, b]"), e.getMessage)
       }
     
    +  test("verify star in functions fail with a good error") {
    --- End diff --
    
    Why put this test case in `DatasetSuite`?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53734588
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1938,6 +1938,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Array(1, 1))
    --- End diff --
    
    Done. : )




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186105681
  
    **[Test build #51536 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51536/consoleFull)** for PR 11208 at commit [`2c72edf`](https://github.com/apache/spark/commit/2c72edf662b037b0dba845f81e95dadfc35bf648).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-191268934
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184467471
  
    Actually, we already handle stars in `CreateArray` and `CreateStruct`: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L440-L458. So what you are fixing is the nested `CreateStruct` case; I think we should cover `CreateArray` as well.
    
    One concern: sometimes we check for stars under `UnresolvedAlias`, and sometimes under `Alias`. It would be good if you could work out which cases apply and make sure none is missing.
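
To make the `UnresolvedAlias` versus `Alias` concern concrete, here is a minimal sketch on a toy expression model. Every type and function in it (`Expr`, `Attr`, `expandInside`, `expandProjectList`) is a hypothetical stand-in invented for illustration, not the Catalyst API; it only assumes that a top-level star expands into several project-list items while a star wrapped in either alias form expands inside a single item.

```scala
object AliasCoverageSketch {
  // Hypothetical, simplified stand-ins for Catalyst nodes (not the real classes).
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case object Star extends Expr
  case class CreateArray(children: Seq[Expr]) extends Expr
  case class Alias(child: Expr, name: String) extends Expr
  case class UnresolvedAlias(child: Expr) extends Expr

  // Expand a star that appears among the arguments of a CreateArray.
  def expandInside(e: Expr, output: Seq[Attr]): Expr = e match {
    case CreateArray(cs) =>
      CreateArray(cs.flatMap {
        case Star => output
        case o => o :: Nil
      })
    case other => other
  }

  // A bare star becomes many items; both alias wrappers stay a single item.
  def expandProjectList(items: Seq[Expr], output: Seq[Attr]): Seq[Expr] =
    items.flatMap {
      case Star => output
      case UnresolvedAlias(c) => UnresolvedAlias(expandInside(c, output)) :: Nil
      case Alias(c, n) => Alias(expandInside(c, output), n) :: Nil
      case o => o :: Nil
    }
}
```

Dropping either alias case from `expandProjectList` leaves that form of star unexpanded in this toy model, which is the kind of missing case the comment asks to rule out.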




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199149411
  
    **[Test build #53661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53661/consoleFull)** for PR 11208 at commit [`0fce075`](https://github.com/apache/spark/commit/0fce0752fb74b4eb49931c36fdd6e43fc2ec04f2).




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184361964
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51321/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199179614
  
    **[Test build #53661 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53661/consoleFull)** for PR 11208 at commit [`0fce075`](https://github.com/apache/spark/commit/0fce0752fb74b4eb49931c36fdd6e43fc2ec04f2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56854915
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,50 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    --- End diff --
    
    True, let me move them to DataFrameSuite. Thanks!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-198775207
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56782647
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Invalid usage of '*' in explode/json_tuple/UDTF")
    +    }
    +
    +    /**
    +     * Returns true if `exprs` contains a [[Star]].
    +     */
    +    def containsStar(exprs: Seq[Expression]): Boolean =
    +      exprs.exists(_.collect { case _: Star => true }.nonEmpty)
    +
         /**
    -     * Foreach expression, expands the matching attribute.*'s in `child`'s input for the subtree
    -     * rooted at each expression.
    +     * Expands the matching attribute.*'s in `child`'s output.
          */
    -    def expandStarExpressions(exprs: Seq[Expression], child: LogicalPlan): Seq[Expression] = {
    -      exprs.flatMap {
    -        case s: Star => s.expand(child, resolver)
    -        case e =>
    -          e.transformDown {
    -            case f1: UnresolvedFunction if containsStar(f1.children) =>
    -              f1.copy(children = f1.children.flatMap {
    -                case s: Star => s.expand(child, resolver)
    -                case o => o :: Nil
    -              })
    -          } :: Nil
    +    def expandStarExpression(expr: Expression, child: LogicalPlan): Expression = {
    +      expr.transformUp {
    +        case f1: UnresolvedFunction if containsStar(f1.children) =>
    +          f1.copy(children = f1.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateStruct if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateArray if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        // count(*) has been replaced by count(1)
    +        case o if containsStar(o.children) =>
    --- End diff --
    
    Tried it, but `copy` cannot be used here. When the static type is `Expression` (an abstract type), we cannot call `copy` to change the `children`. In addition, `withNewChildren` requires the same number of children. Do you have any idea how to fix it? Thanks!
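
The constraint described here can be illustrated with a toy model. This is only a sketch under stated assumptions: `Expr`, `Attr`, `expandChildren`, and `expand` are hypothetical names, and the two container classes merely mimic the shape of the Catalyst nodes. Expanding a star changes the number of children, and `copy` is generated only for concrete case classes, so each class needs its own case.

```scala
object StarAritySketch {
  // Hypothetical, simplified stand-ins for Catalyst expression nodes.
  sealed trait Expr { def children: Seq[Expr] }
  case class Attr(name: String) extends Expr { val children: Seq[Expr] = Nil }
  case object Star extends Expr { val children: Seq[Expr] = Nil }
  case class CreateStruct(children: Seq[Expr]) extends Expr
  case class CreateArray(children: Seq[Expr]) extends Expr

  // Replace each Star with the child's output columns; the arity may grow.
  def expandChildren(children: Seq[Expr], output: Seq[Attr]): Seq[Expr] =
    children.flatMap {
      case Star => output
      case other => other :: Nil
    }

  // `copy` is defined on the case classes, not on the abstract Expr,
  // so the match has to name each concrete class separately.
  def expand(e: Expr, output: Seq[Attr]): Expr = e match {
    case c: CreateStruct => c.copy(children = expandChildren(c.children, output))
    case c: CreateArray => c.copy(children = expandChildren(c.children, output))
    case other => other
  }
}
```

In this model a generic `case o: Expr => o.copy(...)` would not compile, which mirrors the reason the diff repeats the same body for `UnresolvedFunction`, `CreateStruct`, and `CreateArray`.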




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r52959214
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -362,12 +362,26 @@ class Analyzer(
           exprs.flatMap {
             case s: Star => s.expand(child, resolver)
             case e =>
    -          e.transformDown {
    +          e.transformUp {
    +            // ResolveFunctions can handle the case when the number of variables is not valid
                 case f1: UnresolvedFunction if containsStar(f1.children) =>
                   f1.copy(children = f1.children.flatMap {
                     case s: Star => s.expand(child, resolver)
                     case o => o :: Nil
                   })
    +            case c: CreateStruct if containsStar(c.children) =>
    +              c.copy(children = c.children.flatMap {
    +                case s: Star => s.expand(child, resolver)
    +                case o => o :: Nil
    +              })
    +            case c: CreateStructUnsafe if containsStar(c.children) =>
    +              c.copy(children = c.children.flatMap {
    +                case s: Star => s.expand(child, resolver)
    +                case o => o :: Nil
    +              })
    +            // count(*) has been replaced by count(1)
    +            case f2: ExpectsInputTypes if containsStar(f2.children) =>
    --- End diff --
    
    There are 283 expression types if we exclude the 10 `namedExpressions`. Should we cover all of these types? Can you help me go over the long list to see whether we should include all of them?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195792810
  
    **[Test build #53008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53008/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r52928831
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -362,12 +362,26 @@ class Analyzer(
           exprs.flatMap {
             case s: Star => s.expand(child, resolver)
             case e =>
    -          e.transformDown {
    +          e.transformUp {
    +            // ResolveFunctions can handle the case when the number of variables is not valid
                 case f1: UnresolvedFunction if containsStar(f1.children) =>
                   f1.copy(children = f1.children.flatMap {
                     case s: Star => s.expand(child, resolver)
                     case o => o :: Nil
                   })
    +            case c: CreateStruct if containsStar(c.children) =>
    --- End diff --
    
    Not sure whether there are other functions that can accept `star` as an input parameter. If so, I think we should create a `trait` for all these case classes so that we can remove the duplicate code. Any better ideas? Thanks! : )
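
One way to picture the `trait` idea is the sketch below. It is an illustration on a toy model, not a proposal for the real Catalyst classes: the trait `AcceptsStar`, its method `withExpandedChildren`, and the simplified `Func` and `CreateStruct` nodes are all hypothetical names assumed for this example.

```scala
object StarTraitSketch {
  // Hypothetical, simplified stand-ins for Catalyst expression nodes.
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case object Star extends Expr

  // Hypothetical trait for nodes whose argument list may contain a star.
  trait AcceptsStar extends Expr {
    def children: Seq[Expr]
    // Each node rebuilds itself, so the caller never needs `copy`.
    def withExpandedChildren(newChildren: Seq[Expr]): Expr
  }

  case class CreateStruct(children: Seq[Expr]) extends AcceptsStar {
    def withExpandedChildren(newChildren: Seq[Expr]): Expr = copy(children = newChildren)
  }
  case class Func(name: String, children: Seq[Expr]) extends AcceptsStar {
    def withExpandedChildren(newChildren: Seq[Expr]): Expr = copy(children = newChildren)
  }

  // A single case now covers every node that mixes in the trait.
  def expand(e: Expr, output: Seq[Attr]): Expr = e match {
    case s: AcceptsStar if s.children.contains(Star) =>
      s.withExpandedChildren(s.children.flatMap {
        case Star => output
        case o => o :: Nil
      })
    case other => other
  }
}
```

The trade-off in this sketch is that the duplication moves from the expansion rule into a one-line method on each node, so adding a new star-accepting node no longer touches the rule itself.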


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199419884
  
    **[Test build #53685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53685/consoleFull)** for PR 11208 at commit [`50abeec`](https://github.com/apache/spark/commit/50abeec3ea7fd4da83ac89ed90fc478d493d3dba).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186124744
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-197634280
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56781817
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +
    +    // CreateStruct and CreateArray in project list (alias)
    --- End diff --
    
    Sure, let me do it. 




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53722134
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameComplexTypeSuite.scala ---
    @@ -30,6 +30,7 @@ class DataFrameComplexTypeSuite extends QueryTest with SharedSQLContext {
         val f = udf((a: String) => a)
         val df = sparkContext.parallelize(Seq((1, 1))).toDF("a", "b")
         df.select(struct($"a").as("s")).select(f($"s.a")).collect()
    +    df.select(struct($"*").as("s")).select(f($"s.a")).collect()
    --- End diff --
    
    not needed?




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r52958310
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -362,12 +362,26 @@ class Analyzer(
           exprs.flatMap {
             case s: Star => s.expand(child, resolver)
             case e =>
    -          e.transformDown {
    +          e.transformUp {
    +            // ResolveFunctions can handle the case when the number of variables is not valid
                 case f1: UnresolvedFunction if containsStar(f1.children) =>
                   f1.copy(children = f1.children.flatMap {
                     case s: Star => s.expand(child, resolver)
                     case o => o :: Nil
                   })
    +            case c: CreateStruct if containsStar(c.children) =>
    +              c.copy(children = c.children.flatMap {
    +                case s: Star => s.expand(child, resolver)
    +                case o => o :: Nil
    +              })
    +            case c: CreateStructUnsafe if containsStar(c.children) =>
    --- End diff --
    
    `CreateStructUnsafe` only appears after unsafe projection, so I think we don't need to handle it in `Analyzer`




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187520700
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53889923
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -350,28 +351,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Cannot explode *, explode can only be applied on a specific column.")
    --- End diff --
    
    Just realized the error message is not clear enough: `Generate` is not always "explode".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-197635547
  
    cc @yhuai 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-191230890
  
    retest this please




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56785743
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,43 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +
    +    // CreateStruct and CreateArray in project list (alias)
    +    assert(structDf.select(struct($"record.*").as("a")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*").as("a")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +  }
    +
    +  test("Star Expansion - explode should fail with a meaningful message if it takes a star") {
    +    val df = Seq(("1", "1,2"), ("2", "4"), ("3", "7,8,9")).toDF("prefix", "csv")
    +    val e = intercept[AnalysisException] {
    +      df.explode($"*") { case Row(prefix: String, csv: String) =>
    +        csv.split(",").map(v => Tuple1(prefix + ":" + v)).toSeq
    +      }.queryExecution.assertAnalyzed()
    +    }
    +    assert(e.getMessage.contains("Invalid usage of '*' in explode/json_tuple/UDTF"))
    +
    +    df.explode('prefix, 'csv) { case Row(prefix: String, csv: String) =>
    --- End diff --
    
    Sure, will do


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-194138373
  
    **[Test build #52727 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52727/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187559358
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51729/
    Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199128149
  
    **[Test build #53655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53655/consoleFull)** for PR 11208 at commit [`ba3fe7c`](https://github.com/apache/spark/commit/ba3fe7ce3d42e93bfde7ca4e3f893d84cfa82604).


[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184361528
  
    **[Test build #51321 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51321/consoleFull)** for PR 11208 at commit [`3b2b448`](https://github.com/apache/spark/commit/3b2b448c640eae5b50deb69346409581e8448af3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184335988
  
    **[Test build #51321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51321/consoleFull)** for PR 11208 at commit [`3b2b448`](https://github.com/apache/spark/commit/3b2b448c640eae5b50deb69346409581e8448af3).


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-198775210
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53620/
    Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199555171
  
    thanks! merging to master!


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186045222
  
    **[Test build #51512 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51512/consoleFull)** for PR 11208 at commit [`ac71f39`](https://github.com/apache/spark/commit/ac71f3913148f97c25eac6957aacf64015532583).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187464976
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51701/
    Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187440006
  
    retest this please


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56781821
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Invalid usage of '*' in explode/json_tuple/UDTF")
    +    }
    +
    +    /**
    +     * Returns true if `exprs` contains a [[Star]].
    +     */
    +    def containsStar(exprs: Seq[Expression]): Boolean =
    +      exprs.exists(_.collect { case _: Star => true }.nonEmpty)
    +
         /**
    -     * Foreach expression, expands the matching attribute.*'s in `child`'s input for the subtree
    -     * rooted at each expression.
    +     * Expands the matching attribute.*'s in `child`'s output.
          */
    -    def expandStarExpressions(exprs: Seq[Expression], child: LogicalPlan): Seq[Expression] = {
    -      exprs.flatMap {
    -        case s: Star => s.expand(child, resolver)
    -        case e =>
    -          e.transformDown {
    -            case f1: UnresolvedFunction if containsStar(f1.children) =>
    -              f1.copy(children = f1.children.flatMap {
    -                case s: Star => s.expand(child, resolver)
    -                case o => o :: Nil
    -              })
    -          } :: Nil
    +    def expandStarExpression(expr: Expression, child: LogicalPlan): Expression = {
    +      expr.transformUp {
    +        case f1: UnresolvedFunction if containsStar(f1.children) =>
    +          f1.copy(children = f1.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateStruct if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateArray if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        // count(*) has been replaced by count(1)
    +        case o if containsStar(o.children) =>
    --- End diff --
    
    That is a great idea! : )
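
The bottom-up expansion in the quoted `expandStarExpression` can be illustrated with a toy model. This is only a sketch under stated assumptions: `Expr`, `Attr`, `Star`, `Func`, and `StarExpansionSketch` are hypothetical stand-ins invented for illustration, not Spark's Catalyst classes, and `output` (pairs of qualifier and column name) stands in for the child plan's output.

```scala
// Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Star(qualifier: Option[String]) extends Expr
case class Func(name: String, children: Seq[Expr]) extends Expr

object StarExpansionSketch {
  // Returns true if any expression in `exprs` contains a Star in its subtree.
  def containsStar(exprs: Seq[Expr]): Boolean = exprs.exists {
    case _: Star           => true
    case Func(_, children) => containsStar(children)
    case _                 => false
  }

  // Expands a Star into the matching attributes of `output`.
  // A Star without a qualifier matches every column.
  def expand(star: Star, output: Seq[(String, String)]): Seq[Expr] =
    output.collect {
      case (q, n) if star.qualifier.forall(_ == q) => Attr(s"$q.$n")
    }

  // Bottom-up rewrite: children are rewritten first, then any Star that is a
  // direct child of a function node is spliced into that node's argument list.
  def expandStarExpression(expr: Expr, output: Seq[(String, String)]): Expr =
    expr match {
      case Func(name, children) =>
        val rewritten = children.map(expandStarExpression(_, output))
        Func(name, rewritten.flatMap {
          case s: Star => expand(s, output)
          case o       => o :: Nil
        })
      case other => other
    }

  // Mirrors the shape of min(struct($"record.*")) from the PR description.
  def example: Expr =
    expandStarExpression(
      Func("min", Seq(Func("struct", Seq(Star(Some("record")))))),
      Seq("record" -> "a", "record" -> "b"))
}
```

In this toy model, `StarExpansionSketch.example` evaluates to `Func("min", Seq(Func("struct", Seq(Attr("record.a"), Attr("record.b")))))`: the star nested inside `struct` is replaced by the two qualified columns, and the outer `min` is left otherwise unchanged.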


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56785749
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    --- End diff --
    
    Yeah, a good catch!


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-188141242
  
    **[Test build #51854 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51854/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-197634281
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53376/
    Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195792867
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187441241
  
    **[Test build #51701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51701/consoleFull)** for PR 11208 at commit [`2c72edf`](https://github.com/apache/spark/commit/2c72edf662b037b0dba845f81e95dadfc35bf648).


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-198794464
  
    cc @yhuai 


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-192036455
  
    cc @yhuai : )


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-191268151
  
    **[Test build #52318 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52318/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187528976
  
    **[Test build #51729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51729/consoleFull)** for PR 11208 at commit [`6b2d609`](https://github.com/apache/spark/commit/6b2d60996831fd216b4821e62ed9bea5a3892ab5).


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199090151
  
    Sorry for putting it here for such a long time, overall LGTM, will merge it after you address the new comments, thanks!


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53891726
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -350,28 +351,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Cannot explode *, explode can only be applied on a specific column.")
    --- End diff --
    
    True. I moved this from another rule. I will check the coverage of test cases. Thanks!


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-188113972
  
    **[Test build #51854 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51854/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-188141711
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51854/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199420306
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53685/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199149855
  
    **[Test build #53655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53655/consoleFull)** for PR 11208 at commit [`ba3fe7c`](https://github.com/apache/spark/commit/ba3fe7ce3d42e93bfde7ca4e3f893d84cfa82604).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53425125
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1834,6 +1834,8 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           """.stripMargin).select($"r.*"),
           Row(3, 2) :: Nil)
     
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    --- End diff --
    
    We should write a new test case for `*` in `CreateStruct` and `CreateArray` rather than adding assertions to an existing one.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187520703
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51727/
    Test FAILed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53427991
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1834,6 +1834,8 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
           """.stripMargin).select($"r.*"),
           Row(3, 2) :: Nil)
     
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    --- End diff --
    
    Sure, will do. Thanks!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199420301
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199126672
  
    @cloud-fan Thank you for your detailed reviews! I know all of you are very busy. Let me know if anything needs a change. Thanks again!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56785740
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
    @@ -737,6 +737,14 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
           .queryExecution.analyzed
       }
     
    +  test("Star Expansion - script transform") {
    +    val data = (1 to 100000).map { i => (i, i, i) }
    +    data.toDF("d1", "d2", "d3").registerTempTable("script_trans")
    +    assert(100000 ===
    +      sql("SELECT TRANSFORM (*) USING 'cat' FROM script_trans")
    +        .queryExecution.toRdd.count())
    --- End diff --
    
    Yeah, the results are the same. Let me simplify it. Thanks!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-189578179
  
    **[Test build #52097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52097/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56779046
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Invalid usage of '*' in explode/json_tuple/UDTF")
    +    }
    +
    +    /**
    +     * Returns true if `exprs` contains a [[Star]].
    +     */
    +    def containsStar(exprs: Seq[Expression]): Boolean =
    +      exprs.exists(_.collect { case _: Star => true }.nonEmpty)
    +
         /**
    -     * Foreach expression, expands the matching attribute.*'s in `child`'s input for the subtree
    -     * rooted at each expression.
    +     * Expands the matching attribute.*'s in `child`'s output.
          */
    -    def expandStarExpressions(exprs: Seq[Expression], child: LogicalPlan): Seq[Expression] = {
    -      exprs.flatMap {
    -        case s: Star => s.expand(child, resolver)
    -        case e =>
    -          e.transformDown {
    -            case f1: UnresolvedFunction if containsStar(f1.children) =>
    -              f1.copy(children = f1.children.flatMap {
    -                case s: Star => s.expand(child, resolver)
    -                case o => o :: Nil
    -              })
    -          } :: Nil
    +    def expandStarExpression(expr: Expression, child: LogicalPlan): Expression = {
    +      expr.transformUp {
    +        case f1: UnresolvedFunction if containsStar(f1.children) =>
    +          f1.copy(children = f1.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateStruct if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateArray if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        // count(*) has been replaced by count(1)
    +        case o if containsStar(o.children) =>
    --- End diff --
    
    We can have a method:
    ```
    private def mayContainsStar(expr: Expression): Boolean = expr.isInstanceOf[UnresolvedFunction] || expr.isInstanceOf[CreateStruct]...
    ```
    
    then we can simplify this to:
    ```
    expr.transformUp {
      case e if mayContainsStar(e) =>
        e.copy(children = ...)
    }
    ```
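
The suggestion above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `Expr`, `Attr`, `Star`, `Func`, `MakeStruct`, `transformUp` and `expandStars` below are hypothetical stand-ins written for this example, not Catalyst classes or methods, and the predicate reuses the reviewer's `mayContainsStar` name only to mirror the comment.

```scala
// Toy stand-ins for Catalyst expressions; none of these are Spark classes.
sealed trait Expr {
  def children: Seq[Expr]
  def withNewChildren(newChildren: Seq[Expr]): Expr
}
case class Attr(name: String) extends Expr {
  val children: Seq[Expr] = Nil
  def withNewChildren(newChildren: Seq[Expr]): Expr = this
}
case object Star extends Expr {
  val children: Seq[Expr] = Nil
  def withNewChildren(newChildren: Seq[Expr]): Expr = this
}
case class Func(name: String, children: Seq[Expr]) extends Expr {
  def withNewChildren(newChildren: Seq[Expr]): Expr = copy(children = newChildren)
}
case class MakeStruct(children: Seq[Expr]) extends Expr {
  def withNewChildren(newChildren: Seq[Expr]): Expr = copy(children = newChildren)
}

// One predicate naming every node type whose argument list may hold a star.
def mayContainsStar(e: Expr): Boolean = e match {
  case _: Func | _: MakeStruct => true
  case _ => false
}

// Bottom-up rewrite: rebuild the children first, then apply the rule to the node.
def transformUp(e: Expr)(rule: PartialFunction[Expr, Expr]): Expr = {
  val rebuilt = e.withNewChildren(e.children.map(c => transformUp(c)(rule)))
  rule.applyOrElse(rebuilt, (x: Expr) => x)
}

// A single case replaces the per-type branches: splice `columns` in for each star.
def expandStars(e: Expr, columns: Seq[Expr]): Expr = transformUp(e) {
  case n if mayContainsStar(n) && n.children.contains(Star) =>
    n.withNewChildren(n.children.flatMap {
      case Star => columns
      case other => Seq(other)
    })
}

val cols = Seq(Attr("a"), Attr("b"))
val expanded = expandStars(Func("min", Seq(MakeStruct(Seq(Star)))), cols)
```

In this sketch the rebuild goes through `withNewChildren` rather than `copy`, because a generic expression type has no `copy` method that accepts children; whether the same holds for the real rule depends on the Catalyst types involved.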




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r52958379
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -362,12 +362,26 @@ class Analyzer(
           exprs.flatMap {
             case s: Star => s.expand(child, resolver)
             case e =>
    -          e.transformDown {
    +          e.transformUp {
    +            // ResolveFunctions can handle the case when the number of variables is not valid
                 case f1: UnresolvedFunction if containsStar(f1.children) =>
                   f1.copy(children = f1.children.flatMap {
                     case s: Star => s.expand(child, resolver)
                     case o => o :: Nil
                   })
    +            case c: CreateStruct if containsStar(c.children) =>
    +              c.copy(children = c.children.flatMap {
    +                case s: Star => s.expand(child, resolver)
    +                case o => o :: Nil
    +              })
    +            case c: CreateStructUnsafe if containsStar(c.children) =>
    +              c.copy(children = c.children.flatMap {
    +                case s: Star => s.expand(child, resolver)
    +                case o => o :: Nil
    +              })
    +            // count(*) has been replaced by count(1)
    +            case f2: ExpectsInputTypes if containsStar(f2.children) =>
    --- End diff --
    
    Why do we fail only for `ExpectsInputTypes`?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187464748
  
    **[Test build #51701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51701/consoleFull)** for PR 11208 at commit [`2c72edf`](https://github.com/apache/spark/commit/2c72edf662b037b0dba845f81e95dadfc35bf648).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186124747
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51536/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186045431
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53722350
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1938,6 +1938,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Array(1, 1))
    --- End diff --
    
    nit: use `Seq(1, 1)` to match the type
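
The nit is easier to see with plain Scala equality. The snippet below is a small standalone sketch that uses only the standard library (no Spark or ScalaTest); the value names are made up for illustration.

```scala
// The value read back from the row is typed as a Seq.
val fromRow: Seq[Int] = Seq(1, 1)

// Seq uses structural equality, so comparing against another Seq works with plain ==.
val seqEqual = fromRow == Seq(1, 1)

// Arrays are JVM arrays: == is reference equality, so two fresh arrays differ.
val arrayEqual = Array(1, 1) == Array(1, 1)

// Comparing array contents needs an explicit element-wise helper.
val sameElems = Array(1, 1).sameElements(Array(1, 1))
```

Keeping the expected value a `Seq` means the assertion does not depend on how a particular test framework chooses to compare arrays.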


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-194136402
  
    retest this please




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-197605174
  
    retest this please




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-201124959
  
    I will fix this in #11828 




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199179852
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56783697
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    --- End diff --
    
    We will lose the qualifier here. How about `a.withNewChildren(expandStarExpression(a.child, p.child) :: Nil)`?
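
The concern can be illustrated with a toy node. This is a sketch under assumptions, not Catalyst's `Alias`: `ToyAlias`, its `qualifier` field and `withNewChild` are hypothetical names invented here. It shows how rebuilding a node through its constructor keeps only the fields passed explicitly, while a rebuild helper on the node itself carries the rest over.

```scala
// Hypothetical stand-in for an alias node. The second parameter list mimics
// metadata that a hand-written constructor call resets to its default.
case class ToyAlias(child: String, name: String)(val qualifier: Option[String] = None) {
  // Rebuilds the node around a new child while carrying the metadata over.
  def withNewChild(newChild: String): ToyAlias = ToyAlias(newChild, name)(qualifier)
}

val original = ToyAlias("struct(*)", "s")(Some("t"))

// Rebuilding by hand and forgetting the qualifier silently drops it.
val rebuilt = ToyAlias("struct(a, b)", original.name)()

// Going through the node's own helper preserves it.
val preserved = original.withNewChild("struct(a, b)")
```

Here `rebuilt.qualifier` is `None` while `preserved.qualifier` is still `Some("t")`, which is the kind of loss the comment warns about.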




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53890012
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -350,28 +351,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Cannot explode *, explode can only be applied on a specific column.")
    --- End diff --
    
    Do we have a test for this error message?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184470657
  
    Uh, I see. The code you posted above is for `Project`. The error message in the original JIRA is for `star` used in `Aggregate`. 
    
    Yeah, we need a clean and complete fix for resolving star. Let me check if I can move these into `expandStarExpressions`. 




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-189583957
  
    cc @yhuai 




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-199374933
  
    **[Test build #53685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53685/consoleFull)** for PR 11208 at commit [`50abeec`](https://github.com/apache/spark/commit/50abeec3ea7fd4da83ac89ed90fc478d493d3dba).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53900103
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -350,28 +351,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Cannot explode *, explode can only be applied on a specific column.")
    --- End diff --
    
    Thanks! Let me change it now.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195777551
  
    **[Test build #53008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53008/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187559355
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56779097
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +
    +    // CreateStruct and CreateArray in project list (alias)
    --- End diff --
    
    How about we add another case: `Generate` and `ScriptTransformation`?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-189567010
  
    **[Test build #52097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52097/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195792869
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53008/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195567000
  
    **[Test build #52952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52952/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195565781
  
    cc @yhuai @cloud-fan 




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184466379
  
    `Star Expansion` only works when the star is inside an `UnresolvedFunction`. 
    
    So far, Spark SQL does not handle star expansion when we use `star` in the DataFrame or Dataset functions. That is the reason I chose this title. Let me change it.
    
    Actually, I am not sure if `CreateStruct` and `Count` are the only two functions that can accept `star`. Could you help me confirm it? Thanks!
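    
    For illustration only, the expansion being discussed can be sketched outside Catalyst. The `Expr`, `Star`, `Attr`, `Func` and `Struct` types and the `StarExpansionSketch` object below are hypothetical stand-ins invented for this sketch, not Spark classes; `output` plays the role of the child plan's output columns.
    
    ```scala
    // Toy expression tree (assumption: simplified stand-ins, not Catalyst classes).
    sealed trait Expr
    case object Star extends Expr
    case class Attr(name: String) extends Expr
    case class Func(name: String, args: List[Expr]) extends Expr
    case class Struct(fields: List[Expr]) extends Expr

    object StarExpansionSketch {
      // Replace each Star in an argument list with one Attr per output column;
      // recurse into every other argument so nested stars are expanded as well.
      private def expandArgs(args: List[Expr], output: List[String]): List[Expr] =
        args.flatMap {
          case Star  => output.map(n => Attr(n))
          case other => expand(other, output) :: Nil
        }

      // Expand stars that appear as arguments of a function or a struct.
      // A bare Star that is not inside such a node is returned unchanged.
      def expand(e: Expr, output: List[String]): Expr = e match {
        case Func(name, args) => Func(name, expandArgs(args, output))
        case Struct(fields)   => Struct(expandArgs(fields, output))
        case other            => other
      }
    }
    ```
    
    With `output = List("a", "b")`, calling `StarExpansionSketch.expand` on `Func("min", List(Struct(List(Star))))` rewrites the inner star into `Attr("a")` and `Attr("b")`, which mirrors the `min(struct($"record.*"))` case in spirit only.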




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-194170195
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r52958896
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -362,12 +362,26 @@ class Analyzer(
           exprs.flatMap {
             case s: Star => s.expand(child, resolver)
             case e =>
    -          e.transformDown {
    +          e.transformUp {
    +            // ResolveFunctions can handle the case when the number of variables is not valid
                 case f1: UnresolvedFunction if containsStar(f1.children) =>
                   f1.copy(children = f1.children.flatMap {
                     case s: Star => s.expand(child, resolver)
                     case o => o :: Nil
                   })
    +            case c: CreateStruct if containsStar(c.children) =>
    +              c.copy(children = c.children.flatMap {
    +                case s: Star => s.expand(child, resolver)
    +                case o => o :: Nil
    +              })
    +            case c: CreateStructUnsafe if containsStar(c.children) =>
    --- End diff --
    
    I saw it is used in two places in Analyzer. Will remove them. Thanks!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56789051
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,50 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    --- End diff --
    
    Why do we put these tests in `SQLQuerySuite`? It looks like they are mostly testing DF APIs.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-194170203
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52727/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/11208




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186071838
  
    Overall LGTM except for some comments about tests, thanks for working on it!




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-189578225
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52097/
    Test PASSed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-197634039
  
    **[Test build #53376 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53376/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195569069
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52952/
    Test FAILed.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-198759811
  
    **[Test build #53620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53620/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-195565806
  
    retest this please




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-198775144
  
    **[Test build #53620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53620/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-201124385
  
    @gatorsmile This PR can't be easily reverted, so could you send a PR to fix it?




[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56781815
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Invalid usage of '*' in explode/json_tuple/UDTF")
    +    }
    +
    +    /**
    +     * Returns true if `exprs` contains a [[Star]].
    +     */
    +    def containsStar(exprs: Seq[Expression]): Boolean =
    +      exprs.exists(_.collect { case _: Star => true }.nonEmpty)
    +
         /**
    -     * Foreach expression, expands the matching attribute.*'s in `child`'s input for the subtree
    -     * rooted at each expression.
    +     * Expands the matching attribute.*'s in `child`'s output.
          */
    -    def expandStarExpressions(exprs: Seq[Expression], child: LogicalPlan): Seq[Expression] = {
    -      exprs.flatMap {
    -        case s: Star => s.expand(child, resolver)
    -        case e =>
    -          e.transformDown {
    -            case f1: UnresolvedFunction if containsStar(f1.children) =>
    -              f1.copy(children = f1.children.flatMap {
    -                case s: Star => s.expand(child, resolver)
    -                case o => o :: Nil
    -              })
    -          } :: Nil
    +    def expandStarExpression(expr: Expression, child: LogicalPlan): Expression = {
    +      expr.transformUp {
    +        case f1: UnresolvedFunction if containsStar(f1.children) =>
    +          f1.copy(children = f1.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateStruct if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateArray if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        // count(*) has been replaced by count(1)
    +        case o if containsStar(o.children) =>
    --- End diff --
    
    That is a great idea! : )


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186045432
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51512/
    Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Star Expansion for Datafra...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-184464136
  
    The PR title looks confusing: `Star Expansion` is already done. What this PR does is fix the missing handling of `CreateStruct` when expanding stars and add a better error message. @gatorsmile could you update the title to make that clearer?
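
A minimal sketch of the behavior discussed above, assuming a toy expression tree rather than Catalyst's real classes: `Expr`, `Attr`, `Func`, and the `output` column list are hypothetical stand-ins, and qualified stars such as `record.*` are not modeled. It only illustrates why a star nested inside a struct constructor must be expanded in the same way as a star passed directly to a function.

```scala
object StarExpansionSketch {
  // Toy expression tree; illustrative stand-ins, not Catalyst classes.
  sealed trait Expr
  case object Star extends Expr
  final case class Attr(name: String) extends Expr
  final case class Func(name: String, args: List[Expr]) extends Expr
  final case class CreateStruct(fields: List[Expr]) extends Expr

  // True if the expression or any of its descendants is a Star.
  def containsStar(e: Expr): Boolean = e match {
    case Star => true
    case Attr(_) => false
    case Func(_, args) => args.exists(containsStar)
    case CreateStruct(fields) => fields.exists(containsStar)
  }

  // Replace each Star among a node's children with one Attr per output column,
  // and recurse into every other child.
  private def expandChildren(children: List[Expr], output: List[String]): List[Expr] =
    children.flatMap {
      case Star => output.map(n => Attr(n))
      case other => expand(other, output) :: Nil
    }

  // Expand stars under function calls and struct constructors alike.
  def expand(e: Expr, output: List[String]): Expr = e match {
    case Func(n, args) => Func(n, expandChildren(args, output))
    case CreateStruct(fields) => CreateStruct(expandChildren(fields, output))
    case other => other
  }
}
```

In this sketch, dropping the `CreateStruct` case from `expand` leaves the star unexpanded in an expression shaped like `min(struct(*))`, which is the kind of gap the comment describes.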


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r53896443
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -350,28 +351,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Cannot explode *, explode can only be applied on a specific column.")
    --- End diff --
    
    We already have a test case: https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala#L181-L182
    
    How about changing the message to `Invalid usage of '*' in explode/json_tuple/UDTF`? Thanks!
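
As a rough illustration of the check being discussed, the sketch below rejects a star among a generator's arguments and raises the proposed message. `AnalysisError`, `Column`, and `checkGeneratorArgs` are hypothetical names used only for this example; the real rule calls `failAnalysis` inside the analyzer.

```scala
object GeneratorStarCheck {
  // Hypothetical stand-in for the exception raised by the analyzer.
  final class AnalysisError(msg: String) extends Exception(msg)

  // Toy argument types; not Catalyst classes.
  sealed trait Expr
  case object Star extends Expr
  final case class Column(name: String) extends Expr

  // Return the arguments unchanged, or fail if any of them is a star.
  // The message text is the one proposed in the review comment.
  def checkGeneratorArgs(args: Seq[Expr]): Seq[Expr] = {
    if (args.contains(Star)) {
      throw new AnalysisError("Invalid usage of '*' in explode/json_tuple/UDTF")
    }
    args
  }
}
```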


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186022368
  
    @cloud-fan The latest commit separates star resolution from reference resolution, since `ResolveReferences` has become pretty long. Could you help me check whether the new changes cover all the cases that can accept a star? Thank you! : )


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-197606557
  
    **[Test build #53376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53376/consoleFull)** for PR 11208 at commit [`e060dea`](https://github.com/apache/spark/commit/e060deaaf09d122966f090bf3b86895636418664).


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-186024647
  
    **[Test build #51512 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51512/consoleFull)** for PR 11208 at commit [`ac71f39`](https://github.com/apache/spark/commit/ac71f3913148f97c25eac6957aacf64015532583).


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56781820
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
    @@ -1934,6 +1934,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         }
       }
     
    +  test("Star Expansion - CreateStruct and CreateArray") {
    +    val structDf = testData2.select("a", "b").as("record")
    +    // CreateStruct and CreateArray in aggregateExpressions
    +    assert(structDf.groupBy($"a").agg(min(struct($"record.*"))).first() == Row(3, Row(3, 1)))
    +    assert(structDf.groupBy($"a").agg(min(array($"record.*"))).first() == Row(3, Seq(3, 1)))
    +
    +    // CreateStruct and CreateArray in project list (unresolved alias)
    +    assert(structDf.select(struct($"record.*")).first() == Row(Row(1, 1)))
    +    assert(structDf.select(array($"record.*")).first().getAs[Seq[Int]](0) === Seq(1, 1))
    +
    +    // CreateStruct and CreateArray in project list (alias)
    --- End diff --
    
    Sure, let me do it. 


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11208#discussion_r56783741
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -369,28 +370,83 @@ class Analyzer(
       }
     
       /**
    -   * Replaces [[UnresolvedAttribute]]s with concrete [[AttributeReference]]s from
    -   * a logical plan node's children.
    +   * Expand [[UnresolvedStar]] or [[ResolvedStar]] to the matching attributes in child's output.
        */
    -  object ResolveReferences extends Rule[LogicalPlan] {
    +  object ResolveStar extends Rule[LogicalPlan] {
    +
    +    def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    +      case p: LogicalPlan if !p.childrenResolved => p
    +
    +      // If the projection list contains Stars, expand it.
    +      case p: Project if containsStar(p.projectList) =>
    +        val expanded = p.projectList.flatMap {
    +          case s: Star => s.expand(p.child, resolver)
    +          case ua @ UnresolvedAlias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            UnresolvedAlias(child = expandStarExpression(ua.child, p.child)) :: Nil
    +          case a @ Alias(_: UnresolvedFunction | _: CreateArray | _: CreateStruct, _) =>
    +            Alias(child = expandStarExpression(a.child, p.child), a.name)(
    +              isGenerated = a.isGenerated) :: Nil
    +          case o => o :: Nil
    +        }
    +        Project(projectList = expanded, p.child)
    +      // If the aggregate function argument contains Stars, expand it.
    +      case a: Aggregate if containsStar(a.aggregateExpressions) =>
    +        val expanded = a.aggregateExpressions.flatMap {
    +          case s: Star => s.expand(a.child, resolver)
    +          case o if containsStar(o :: Nil) => expandStarExpression(o, a.child) :: Nil
    +          case o => o :: Nil
    +        }.map(_.asInstanceOf[NamedExpression])
    +        a.copy(aggregateExpressions = expanded)
    +      // If the script transformation input contains Stars, expand it.
    +      case t: ScriptTransformation if containsStar(t.input) =>
    +        t.copy(
    +          input = t.input.flatMap {
    +            case s: Star => s.expand(t.child, resolver)
    +            case o => o :: Nil
    +          }
    +        )
    +      case g: Generate if containsStar(g.generator.children) =>
    +        failAnalysis("Invalid usage of '*' in explode/json_tuple/UDTF")
    +    }
    +
    +    /**
    +     * Returns true if `exprs` contains a [[Star]].
    +     */
    +    def containsStar(exprs: Seq[Expression]): Boolean =
    +      exprs.exists(_.collect { case _: Star => true }.nonEmpty)
    +
         /**
    -     * Foreach expression, expands the matching attribute.*'s in `child`'s input for the subtree
    -     * rooted at each expression.
    +     * Expands the matching attribute.*'s in `child`'s output.
          */
    -    def expandStarExpressions(exprs: Seq[Expression], child: LogicalPlan): Seq[Expression] = {
    -      exprs.flatMap {
    -        case s: Star => s.expand(child, resolver)
    -        case e =>
    -          e.transformDown {
    -            case f1: UnresolvedFunction if containsStar(f1.children) =>
    -              f1.copy(children = f1.children.flatMap {
    -                case s: Star => s.expand(child, resolver)
    -                case o => o :: Nil
    -              })
    -          } :: Nil
    +    def expandStarExpression(expr: Expression, child: LogicalPlan): Expression = {
    +      expr.transformUp {
    +        case f1: UnresolvedFunction if containsStar(f1.children) =>
    +          f1.copy(children = f1.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateStruct if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        case c: CreateArray if containsStar(c.children) =>
    +          c.copy(children = c.children.flatMap {
    +            case s: Star => s.expand(child, resolver)
    +            case o => o :: Nil
    +          })
    +        // count(*) has been replaced by count(1)
    +        case o if containsStar(o.children) =>
    --- End diff --
    
    Oh, I see. I don't have a better idea, so let's just keep it this way.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-189578223
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-187526529
  
    retest this please


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-191268943
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52318/
    Test PASSed.


[GitHub] spark pull request: [SPARK-13320] [SQL] Support Star in CreateStru...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/11208#issuecomment-201122555
  
    @gatorsmile @cloud-fan This PR reverts the change in https://github.com/apache/spark/pull/3674; unfortunately, the unit test in AnalysisSuite breaks once we enforce the max-iteration check in tests, see https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54090/testReport/org.apache.spark.sql.catalyst.analysis/AnalysisSuite/union_project__/
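
For context, a max-iteration check of the kind mentioned above can be sketched as follows. This is a simplified illustration, not Spark's rule executor: `runToFixedPoint` and its parameters are hypothetical names. The idea is that a rule is reapplied until the plan stops changing, and a rule that keeps rewriting its own output is reported as an error instead of being silently cut off.

```scala
object FixedPointSketch {
  // Apply `rule` repeatedly until its output equals its input.
  // Throws if the plan is still changing after `maxIterations` passes.
  def runToFixedPoint[A](start: A, maxIterations: Int)(rule: A => A): A = {
    var current = start
    var iteration = 0
    var changed = true
    while (changed) {
      if (iteration >= maxIterations) {
        throw new IllegalStateException(s"Max iterations ($maxIterations) reached")
      }
      val next = rule(current)
      changed = next != current
      current = next
      iteration += 1
    }
    current
  }
}
```

With the check enforced, a rule that never converges fails the test run, whereas without it the same rule would simply stop after the limit and the problem could go unnoticed.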

