You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/10/24 04:27:39 UTC

[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/22812

    [SPARK-25817][SQL] Dataset encoder should support combination of map and product type

    ## What changes were proposed in this pull request?
    
    After https://github.com/apache/spark/pull/22745 , Dataset encoder supports the combination of java bean and map type. This PR is to fix the Scala side.
    
    The reason why it didn't work before is, `CatalystToExternalMap` tries to get the data type of the input map expression, while it can be unresolved and its data type is known. To fix it, we can follow `UnresolvedMapObjects`, to create a `UnresolvedCatalystToExternalMap`, and only create `CatalystToExternalMap` when the input map expression is resolved and the data type is known.
    
    ## How was this patch tested?
    
    enable a old test case

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark map

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22812.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22812
    
----
commit da31d2602b8e12eb8949336cf14b903c0df731cf
Author: Wenchen Fan <we...@...>
Date:   2018-10-24T04:21:44Z

    Dataset encoder should support combination of map and product type

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r227638902
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -1090,15 +1096,9 @@ case class CatalystToExternalMap private(
         val tupleLoopValue = ctx.freshName("tupleLoopValue")
         val builderValue = ctx.freshName("builderValue")
     
    -    val getLength = s"${genInputData.value}.numElements()"
    --- End diff --
    
    these are unrelated, but is a followup of https://github.com/apache/spark/pull/16986 to address the remaining comments.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4477/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98019/testReport)** for PR 22812 at commit [`bc39824`](https://github.com/apache/spark/commit/bc39824baa7310da7b2bd52dc4f8d19a8c2d8425).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98020/testReport)** for PR 22812 at commit [`cf5a01e`](https://github.com/apache/spark/commit/cf5a01e7de4ece9e78c017d331027f722460ccac).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22812


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97965/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r228388671
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -2384,14 +2384,23 @@ class Analyzer(
                 case UnresolvedMapObjects(func, inputData, cls) if inputData.resolved =>
                   inputData.dataType match {
                     case ArrayType(et, cn) =>
    -                  val expr = MapObjects(func, inputData, et, cn, cls) transformUp {
    +                  MapObjects(func, inputData, et, cn, cls) transformUp {
                         case UnresolvedExtractValue(child, fieldName) if child.resolved =>
                           ExtractValue(child, fieldName, resolver)
                       }
    -                  expr
                     case other =>
                       throw new AnalysisException("need an array field but got " + other.catalogString)
                   }
    +            case u: UnresolvedCatalystToExternalMap if u.child.resolved =>
    +              u.child.dataType match {
    +                case _: MapType =>
    +                  CatalystToExternalMap(u) transformUp {
    +                    case UnresolvedExtractValue(child, fieldName) if child.resolved =>
    --- End diff --
    
    Yea I think so. The `UnresolvedExtractValue` might appear in `CatalystToExternalMap.keyLambdaFunction` and `valueLambdaFunction`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98147/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r228195016
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala ---
    @@ -119,10 +119,9 @@ object ExpressionEncoder {
         }
     
         val childrenDeserializers = encoders.zipWithIndex.map { case (enc, index) =>
    -      val getColumnsByOrdinals = enc.objDeserializer.collect { case c: GetColumnByOrdinal => c }
    -        .distinct
    -      assert(getColumnsByOrdinals.size == 1, "object deserializer should have only one " +
    -        s"`GetColumnByOrdinal`, but there are ${getColumnsByOrdinals.size}")
    +      val getColExprs = enc.objDeserializer.collect { case c: GetColumnByOrdinal => c }.distinct
    --- End diff --
    
    unrelated but to fix minor code style issues in https://github.com/apache/spark/pull/22749


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98147/testReport)** for PR 22812 at commit [`517bebf`](https://github.com/apache/spark/commit/517bebfb1e49f2315019696a50b657dcf715778c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r227638941
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala ---
    @@ -1837,8 +1837,6 @@ case class GetArrayFromMap private(
         arrayGetter: MapData => ArrayData,
         elementTypeGetter: MapType => DataType) extends UnaryExpression with NonSQLExpression {
     
    -  private lazy val encodedFunctionName: String = TermName(functionName).encodedName.toString
    --- End diff --
    
    this is to address https://github.com/apache/spark/pull/22745#discussion_r227407344


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98062/testReport)** for PR 22812 at commit [`517bebf`](https://github.com/apache/spark/commit/517bebfb1e49f2315019696a50b657dcf715778c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4482/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98026/testReport)** for PR 22812 at commit [`cf5a01e`](https://github.com/apache/spark/commit/cf5a01e7de4ece9e78c017d331027f722460ccac).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r228380982
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala ---
    @@ -216,7 +215,6 @@ case class ExpressionEncoder[T](
           }
           nullSafeSerializer match {
             case If(_: IsNull, _, s: CreateNamedStruct) => s
    -        case s: CreateNamedStruct => s
    --- End diff --
    
    Thanks!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98019/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98062/testReport)** for PR 22812 at commit [`517bebf`](https://github.com/apache/spark/commit/517bebfb1e49f2315019696a50b657dcf715778c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97959/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4429/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98019/testReport)** for PR 22812 at commit [`bc39824`](https://github.com/apache/spark/commit/bc39824baa7310da7b2bd52dc4f8d19a8c2d8425).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98026/testReport)** for PR 22812 at commit [`cf5a01e`](https://github.com/apache/spark/commit/cf5a01e7de4ece9e78c017d331027f722460ccac).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #97959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97959/testReport)** for PR 22812 at commit [`da31d26`](https://github.com/apache/spark/commit/da31d2602b8e12eb8949336cf14b903c0df731cf).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4435/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r228391626
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -2384,14 +2384,23 @@ class Analyzer(
                 case UnresolvedMapObjects(func, inputData, cls) if inputData.resolved =>
                   inputData.dataType match {
                     case ArrayType(et, cn) =>
    -                  val expr = MapObjects(func, inputData, et, cn, cls) transformUp {
    +                  MapObjects(func, inputData, et, cn, cls) transformUp {
                         case UnresolvedExtractValue(child, fieldName) if child.resolved =>
                           ExtractValue(child, fieldName, resolver)
                       }
    -                  expr
                     case other =>
                       throw new AnalysisException("need an array field but got " + other.catalogString)
                   }
    +            case u: UnresolvedCatalystToExternalMap if u.child.resolved =>
    +              u.child.dataType match {
    +                case _: MapType =>
    +                  CatalystToExternalMap(u) transformUp {
    +                    case UnresolvedExtractValue(child, fieldName) if child.resolved =>
    --- End diff --
    
    TBH I don't quite remember why I did this for `MapObjects`, so I just follow it here. Maybe we can remove it in a followup PR.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #97965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97965/testReport)** for PR 22812 at commit [`da31d26`](https://github.com/apache/spark/commit/da31d2602b8e12eb8949336cf14b903c0df731cf).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98062/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4562/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #97959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97959/testReport)** for PR 22812 at commit [`da31d26`](https://github.com/apache/spark/commit/da31d2602b8e12eb8949336cf14b903c0df731cf).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class UnresolvedCatalystToExternalMap(`


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98020/testReport)** for PR 22812 at commit [`cf5a01e`](https://github.com/apache/spark/commit/cf5a01e7de4ece9e78c017d331027f722460ccac).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98020/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    cc @michalsenkyr @vofque @viirya 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r228381728
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -2384,14 +2384,23 @@ class Analyzer(
                 case UnresolvedMapObjects(func, inputData, cls) if inputData.resolved =>
                   inputData.dataType match {
                     case ArrayType(et, cn) =>
    -                  val expr = MapObjects(func, inputData, et, cn, cls) transformUp {
    +                  MapObjects(func, inputData, et, cn, cls) transformUp {
                         case UnresolvedExtractValue(child, fieldName) if child.resolved =>
                           ExtractValue(child, fieldName, resolver)
                       }
    -                  expr
                     case other =>
                       throw new AnalysisException("need an array field but got " + other.catalogString)
                   }
    +            case u: UnresolvedCatalystToExternalMap if u.child.resolved =>
    +              u.child.dataType match {
    +                case _: MapType =>
    +                  CatalystToExternalMap(u) transformUp {
    +                    case UnresolvedExtractValue(child, fieldName) if child.resolved =>
    --- End diff --
    
    When `u.child` is resolved, is there still `UnresolvedExtractValue`?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98026/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    **[Test build #98147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98147/testReport)** for PR 22812 at commit [`517bebf`](https://github.com/apache/spark/commit/517bebfb1e49f2315019696a50b657dcf715778c).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22812: [SPARK-25817][SQL] Dataset encoder should support...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22812#discussion_r228390150
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -2384,14 +2384,23 @@ class Analyzer(
                 case UnresolvedMapObjects(func, inputData, cls) if inputData.resolved =>
                   inputData.dataType match {
                     case ArrayType(et, cn) =>
    -                  val expr = MapObjects(func, inputData, et, cn, cls) transformUp {
    +                  MapObjects(func, inputData, et, cn, cls) transformUp {
                         case UnresolvedExtractValue(child, fieldName) if child.resolved =>
                           ExtractValue(child, fieldName, resolver)
                       }
    -                  expr
                     case other =>
                       throw new AnalysisException("need an array field but got " + other.catalogString)
                   }
    +            case u: UnresolvedCatalystToExternalMap if u.child.resolved =>
    +              u.child.dataType match {
    +                case _: MapType =>
    +                  CatalystToExternalMap(u) transformUp {
    +                    case UnresolvedExtractValue(child, fieldName) if child.resolved =>
    --- End diff --
    
    `ResolveReferences` might also process that, but it is also good to have them here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22812: [SPARK-25817][SQL] Dataset encoder should support combin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22812
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4517/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org