You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2016/06/29 10:18:47 UTC

[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/13971

    [SPARK-16289][SQL] Implement posexplode table generating function

    ## What changes were proposed in this pull request?
    
    This PR implements `posexplode` table generating function. Currently, master branch raises the following exception for `map` argument. It's different from Hive.
    
    **Before**
    ```scala
    scala> sql("select posexplode(map('a', 1, 'b', 2))").show
    org.apache.spark.sql.AnalysisException: No handler for Hive UDF ... posexplode() takes an array as a parameter; line 1 pos 7
    ```
    
    **After**
    ```scala
    scala> sql("select posexplode(map('a', 1, 'b', 2))").show
    +---+---+-----+
    |pos|key|value|
    +---+---+-----+
    |  0|  a|    1|
    |  1|  b|    2|
    +---+---+-----+
    ```
    
    For `array` argument, `after` is the same with `before`.
    ```
    scala> sql("select posexplode(array(1, 2, 3))").show
    +---+---+
    |pos|col|
    +---+---+
    |  0|  1|
    |  1|  2|
    |  2|  3|
    +---+---+
    ```
    
    ## How was this patch tested?
    
    Pass the Jenkins tests with newly added testcases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-16289

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13971.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13971
    
----
commit 584eb9ebb15846d11c0c8d139de17183e26402a8
Author: Dongjoon Hyun <do...@apache.org>
Date:   2016-06-29T09:54:21Z

    [SPARK-16289][SQL] Implement posexplode table generating function

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Oops. I added R, too. The exactly same semantic of current `explode` in R.
    Yep, please wait for two hours again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61492/consoleFull)** for PR 13971 at commit [`fcfccee`](https://github.com/apache/spark/commit/fcfccee5349cde657d7765d2e78948125a962045).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Do we have unit tests for explode expression? (not end-to-end tests)
    
    if not, do you mind looking into it?
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    I tried to add to Python/R.
    But, currently R explode is a little misleading. So, I just committed Python first.
    For R, I will clean up `explode` and `posexplode` later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030199
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    while you are at it, let's create a new suite for end-to-end generators


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61509/consoleFull)** for PR 13971 at commit [`0266052`](https://github.com/apache/spark/commit/0266052718ecf802f4d881941b5a99b262ebe754).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69621929
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -1637,6 +1637,27 @@ def explode(col):
         return Column(jc)
     
     
    +@since(2.1)
    +def posexplode(col):
    --- End diff --
    
    yea this one is probably fine.
    
    i wouldn't register the other ones.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Hi, @rxin . Thank you for review. I updated the followings.
    - Add function descriptions for `explode` and `posexplode`.
    - Add examples in comments.
    - Change indentation to make the fields clearer.
    
    For the `explode` and `posexplode` expressions, it seems that we don't have unit tests in `expression` level because they are generators.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69192174
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    Yep. Thank you. I'll investigate it later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61498/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61498/consoleFull)** for PR 13971 at commit [`e255873`](https://github.com/apache/spark/commit/e2558733e62b17da02035a504788d775abbf32ab).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    If I understand correctly, the remaining issue is `checkEvaluation`.
    I'm sorry for this, but I'm still not sure how to use `checkEvaluation` for the generators.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69031844
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    Is `sql.GeneratorSuite.scala` okay? The package is different from `expression.GeneratorSuite`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61512/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61492/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Yea let's start with that, and we can add more in the future. I'd also add it for the other ones you are implementing, e.g. inline, in those prs.
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030922
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    and move the existing ones over there


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030053
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +155,47 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   10
    + *   20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  10\n  20")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    +}
    +
    +/**
    + * Given an input array produces a sequence of rows for each position and value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   0  10
    + *   1  20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows with positions, or the elements of a map into multiple rows and columns with positions.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  0\t10\n  1\t20")
    +// scalastyle:on line.size.limit
    +case class PosExplode(child: Expression)
    +  extends ExplodeBase(child, position = true) with Serializable {
    --- End diff --
    
    case classes are automatically serializable so you don't need this i think


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69036220
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorSuite.scala ---
    @@ -0,0 +1,63 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorSuite extends SparkFunSuite with ExpressionEvalHelper {
    --- End diff --
    
    thsi one maybe GeneratorExpressionSuite


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61509 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61509/consoleFull)** for PR 13971 at commit [`0266052`](https://github.com/apache/spark/commit/0266052718ecf802f4d881941b5a99b262ebe754).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61504/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69008195
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -115,9 +112,19 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
     
       // hive-compatible default alias for explode function ("col" for array, "key", "value" for map)
       override def elementSchema: StructType = child.dataType match {
    -    case ArrayType(et, containsNull) => new StructType().add("col", et, containsNull)
    +    case ArrayType(et, containsNull) =>
    +      if (position) {
    +        new StructType().add("pos", IntegerType, false).add("col", et, containsNull)
    +      } else {
    +        new StructType().add("col", et, containsNull)
    +      }
         case MapType(kt, vt, valueContainsNull) =>
    -      new StructType().add("key", kt, false).add("value", vt, valueContainsNull)
    +      if (position) {
    +        new StructType().add("pos", IntegerType, false).add("key", kt, false)
    --- End diff --
    
    maybe 
    ```
    new StructType()
      .add("pos", IntegerType, nullable = false)
      .add("key", kt, nullable = false)
      .add("value", vt, valueContainsNull)
    ```
    so it is more clear we have three fields.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69141133
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    `checkEvaluation` takes `Any` as expected result, so I don't think `checkEvaluation` is only used for a single row.
    Have you tried to pass a `Seq[Row]` to `checkEvaluation`? If it doesn't work, is it possible to improve `checkEvaluation` so that it can work for this case? thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69190862
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    Let's not change it for now. We also don't want test code to become so complicated that is is no longer obvious what's going on.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69036837
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    Sorry. What do you mean by `the rest` for GeneratorFunctionsSuite?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    This looks pretty good. Let's fix the remaining minor issues and merge it.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030737
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorSuite.scala ---
    @@ -0,0 +1,63 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    +  }
    +
    +  private final val int_array = Seq(1, 2, 3)
    +  private final val str_array = Seq("a", "b", "c")
    +
    +  test("explode") {
    --- End diff --
    
    Yep. Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61501/consoleFull)** for PR 13971 at commit [`c5dee49`](https://github.com/apache/spark/commit/c5dee492ff71c4127ed0a6bb0e818d61d95525bd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class Explode(child: Expression) extends ExplodeBase(child, position = false)`
      * `case class PosExplode(child: Expression) extends ExplodeBase(child, position = true)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69183700
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    Sure. @cloud-fan . In fact, I try everything you told me in many ways because I trust you. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merging in master. Thanks!



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61504/consoleFull)** for PR 13971 at commit [`153e8eb`](https://github.com/apache/spark/commit/153e8eb89db5a0e33f2797f792e4f710bbe499c2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61501/consoleFull)** for PR 13971 at commit [`c5dee49`](https://github.com/apache/spark/commit/c5dee492ff71c4127ed0a6bb0e818d61d95525bd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    LGTM pending Jenkins.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61492 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61492/consoleFull)** for PR 13971 at commit [`fcfccee`](https://github.com/apache/spark/commit/fcfccee5349cde657d7765d2e78948125a962045).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    He added it, didn't he?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69074335
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    Oh, thank you for review, @cloud-fan , too.
    Do we have an example of `checkEvaluation` to check the generator, multiple InternalRows?
    I just thought `checkEvaluation` is just for a single row, e.g., values, arrays, maps.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69054569
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala ---
    @@ -0,0 +1,92 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.spark.sql.functions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +class GeneratorFunctionSuite extends QueryTest with SharedSQLContext {
    --- End diff --
    
    oh sorry I just realized it's not expression level unit test, but end-to-end test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69036197
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    GeneratorFunctionsSuite to match the rest


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030079
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +155,47 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   10
    + *   20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  10\n  20")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    --- End diff --
    
    also you don't need the `{ }`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69395230
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -1637,6 +1637,27 @@ def explode(col):
         return Column(jc)
     
     
    +@since(2.1)
    +def posexplode(col):
    --- End diff --
    
    cc @rxin , is `posexplode` a special hive fallback function that we need to register? other ones don't get registered in `functions`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61501/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Thank you for review and merging, @rxin and @cloud-fan .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69037002
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    Ah, instead of `sql.GeneratorSuite`. I see.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61512 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61512/consoleFull)** for PR 13971 at commit [`5f3a951`](https://github.com/apache/spark/commit/5f3a9514aab50310d388808895a067d1912bf0d8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61500/consoleFull)** for PR 13971 at commit [`1cf723a`](https://github.com/apache/spark/commit/1cf723a6084de399ad2217c746d5a799b45be460).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61461/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030058
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +155,47 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   10
    + *   20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  10\n  20")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    --- End diff --
    
    case classes are automatically serializable so you don't need this i think


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69030091
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +148,33 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    +}
    +
    +/**
    + * Given an input array produces a sequence of rows for each position and value in the array.
    --- End diff --
    
    also you don't need the `{ }`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61512 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61512/consoleFull)** for PR 13971 at commit [`5f3a951`](https://github.com/apache/spark/commit/5f3a9514aab50310d388808895a067d1912bf0d8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69029085
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorSuite.scala ---
    @@ -0,0 +1,63 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    +  }
    +
    +  private final val int_array = Seq(1, 2, 3)
    +  private final val str_array = Seq("a", "b", "c")
    +
    +  test("explode") {
    --- End diff --
    
    can you add a test case for empty input? 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61500/consoleFull)** for PR 13971 at commit [`1cf723a`](https://github.com/apache/spark/commit/1cf723a6084de399ad2217c746d5a799b45be460).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61461/consoleFull)** for PR 13971 at commit [`584eb9e`](https://github.com/apache/spark/commit/584eb9ebb15846d11c0c8d139de17183e26402a8).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69622986
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -1637,6 +1637,27 @@ def explode(col):
         return Column(jc)
     
     
    +@since(2.1)
    +def posexplode(col):
    --- End diff --
    
    Thank you for reconfirming!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    @rxin Thank you for intensive reviewing this PR.
    I will improve another PRs (adding new SQL functions) with the same level of quality!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69036718
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
    @@ -2722,6 +2722,14 @@ object functions {
       def explode(e: Column): Column = withExpr { Explode(e.expr) }
     
       /**
    +   * Creates a new row for each element with position in the given array or map column.
    +   *
    +   * @group collection_funcs
    +   * @since 2.1.0
    +   */
    +  def posexplode(e: Column): Column = withExpr { PosExplode(e.expr) }
    --- End diff --
    
    need to add this to python too


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69008018
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +148,33 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    +}
    +
    +/**
    + * Given an input array produces a sequence of rows for each position and value in the array.
    --- End diff --
    
    btw since the expression description might be difficult to see without line wrapping, it'd also be better to put an example here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61504/consoleFull)** for PR 13971 at commit [`153e8eb`](https://github.com/apache/spark/commit/153e8eb89db5a0e33f2797f792e4f710bbe499c2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper `
      * `class GeneratorFunctionSuite extends QueryTest with SharedSQLContext `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    LGTM except the unit test, @rxin do we need expression level unit test for it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    cc @rxin and @cloud-fan .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61509/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69031088
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +155,47 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   10
    + *   20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  10\n  20")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    +}
    +
    +/**
    + * Given an input array produces a sequence of rows for each position and value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   0  10
    + *   1  20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows with positions, or the elements of a map into multiple rows and columns with positions.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  0\t10\n  1\t20")
    +// scalastyle:on line.size.limit
    +case class PosExplode(child: Expression)
    +  extends ExplodeBase(child, position = true) with Serializable {
    --- End diff --
    
    Actually, this is needed. Without this, there occurs runtime errors. This happens usually when extending `abstract` base classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69075261
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    And, how to check the zero row? At Line 39, 
    https://github.com/apache/spark/pull/13971/files#diff-6715134a4e95980149a7600ecb71674cR41


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69055904
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    We have `checkEvaluation` for this purpose, how about using that?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13971


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69395876
  
    --- Diff: python/pyspark/sql/functions.py ---
    @@ -1637,6 +1637,27 @@ def explode(col):
         return Column(jc)
     
     
    +@since(2.1)
    +def posexplode(col):
    --- End diff --
    
    For this one, I thought the reason is `explode` is already registered. `posexplode` is a pair of that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Could you do me a favor?
    
    There is a tiny fix. Could you take a look at https://github.com/apache/spark/pull/13730 ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69188603
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    If I didn't misunderstand, it's definitely valuable issue to investigate more. If we can upgrade `checkEvaluation` later, we can unify the testcases of this PR with `checkEvaluation`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69054154
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -115,9 +112,26 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
     
       // hive-compatible default alias for explode function ("col" for array, "key", "value" for map)
       override def elementSchema: StructType = child.dataType match {
    -    case ArrayType(et, containsNull) => new StructType().add("col", et, containsNull)
    +    case ArrayType(et, containsNull) =>
    +      if (position) {
    --- End diff --
    
    instead of using if here, I'd like to implement `elementSchema` in `Explode` and `PosExplode `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    If you want that for `explode` and `posexplode`, sure!
    
    In general, `GeneratorTestSuite` seems to have not only `explode` and `posexplode`, but also `UserDefinedGenerator` and `HiveGenericUDTF`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61500/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69007928
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +148,33 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.")
    --- End diff --
    
    an example would be useful.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69037028
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ColumnExpressionSuite.scala ---
    @@ -129,6 +129,13 @@ class ColumnExpressionSuite extends QueryTest with SharedSQLContext {
           Row(1) :: Row(2) :: Row(3) :: Nil)
       }
     
    +  test("single posexplode") {
    --- End diff --
    
    the rest of the test cases that use explode in this file


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69007938
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +148,33 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    +}
    +
    +/**
    + * Given an input array produces a sequence of rows for each position and value in the array.
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    --- End diff --
    
    an example would be useful.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61461/consoleFull)** for PR 13971 at commit [`584eb9e`](https://github.com/apache/spark/commit/584eb9ebb15846d11c0c8d139de17183e26402a8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `abstract class ExplodeBase(child: Expression, position: Boolean)`
      * `case class Explode(child: Expression)`
      * `case class PosExplode(child: Expression)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Now, `GeneratorSuite` is added.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69031350
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala ---
    @@ -141,11 +155,47 @@ case class Explode(child: Expression) extends UnaryExpression with Generator wit
               val rows = new Array[InternalRow](inputMap.numElements())
               var i = 0
               inputMap.foreach(kt, vt, (k, v) => {
    -            rows(i) = InternalRow(k, v)
    +            rows(i) = if (position) InternalRow(i, k, v) else InternalRow(k, v)
                 i += 1
               })
               rows
             }
         }
       }
     }
    +
    +/**
    + * Given an input array produces a sequence of rows for each value in the array.
    + *
    + * {{{
    + *   SELECT explode(array(10,20)) ->
    + *   10
    + *   20
    + * }}}
    + */
    +// scalastyle:off line.size.limit
    +@ExpressionDescription(
    +  usage = "_FUNC_(a) - Separates the elements of array a into multiple rows, or the elements of map a into multiple rows and columns.",
    +  extended = "> SELECT _FUNC_(array(10,20));\n  10\n  20")
    +// scalastyle:on line.size.limit
    +case class Explode(child: Expression)
    +  extends ExplodeBase(child, position = false) with Serializable {
    --- End diff --
    
    Hmm. Okay. I will try to remove Serializable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Now, the followings are updated.
    - Make `sql/GeneratorSuite.scala` and moves the testcases from `ColumnExpressionSuite.scala`.
    - Remove redundant `Serializable` and braces.
    - Fix a typo in `PosExplode` example.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69054415
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/GeneratorFunctionSuite.scala ---
    @@ -0,0 +1,92 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql
    +
    +import org.apache.spark.sql.functions._
    +import org.apache.spark.sql.test.SharedSQLContext
    +
    +class GeneratorFunctionSuite extends QueryTest with SharedSQLContext {
    --- End diff --
    
    why do we put expression level unit test in sql core module instead of catalyst?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Sure. Thank you for fast feedback! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    **[Test build #61498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61498/consoleFull)** for PR 13971 at commit [`e255873`](https://github.com/apache/spark/commit/e2558733e62b17da02035a504788d775abbf32ab).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `class GeneratorSuite extends SparkFunSuite with ExpressionEvalHelper `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Can we create a suite for unit testing generators?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13971: [SPARK-16289][SQL] Implement posexplode table generating...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13971
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13971: [SPARK-16289][SQL] Implement posexplode table gen...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13971#discussion_r69187010
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/GeneratorExpressionSuite.scala ---
    @@ -0,0 +1,71 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.expressions
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.catalyst.InternalRow
    +import org.apache.spark.unsafe.types.UTF8String
    +
    +class GeneratorExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
    +  private def checkTuple(actual: ExplodeBase, expected: Seq[InternalRow]): Unit = {
    +    assert(actual.eval(null).toSeq === expected)
    --- End diff --
    
    As a evidence, let me write the results of the most simplest case.
    ```scala
    checkEvaluation(Explode(CreateArray(Seq.empty)), Seq.empty[Row])
    checkEvaluation(Explode(CreateArray(Seq.empty)), Seq.empty[InternalRow])
    checkEvaluation(Explode(CreateArray(Seq.empty)), Seq.empty)
    ```
    
    All the above returns the followings.
    ```
    Incorrect evaluation (codegen off): explode(array()), actual: InternalRow;(), expected: []
    ```
    
    Here is the body of `checkEvaluation`. The following comments are the limitation I found.
    ```scala
    // 1. This makes `Seq[Any]` into `GenericArrayData` generally.
    val catalystValue = CatalystTypeConverters.convertToCatalyst(expected)
    checkEvaluationWithoutCodegen(expression, catalystValue, inputRow)
    
    // 2. Here, `val actual = plan(inputRow).get(0, expression.dataType)` is called to try casting to `expression.dataType`.
    checkEvaluationWithGeneratedMutableProjection(expression, catalystValue, inputRow)
    
    if (GenerateUnsafeProjection.canSupport(expression.dataType)) {
          // 3. Here, `val unsafeRow = plan(inputRow)` with one row assumption.
          checkEvalutionWithUnsafeProjection(expression, catalystValue, inputRow)
    }
    
    // 4. Here, `val plan = Project(Alias(expression, s"Optimized($expression)")() :: Nil, OneRowRelation)`.
    checkEvaluationWithOptimization(expression, catalystValue, inputRow)
    ```
    
    In short, every steps of the `checkEvaluation` seem to depend on the **single row assumption** heavily. If we wan to change this. We should do in a separate issue since it's not trivial.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org