You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by liancheng <gi...@git.apache.org> on 2015/12/22 13:16:06 UTC

[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

GitHub user liancheng opened a pull request:

    https://github.com/apache/spark/pull/10431

    [SPARK-12478][SQL] Bugfix: Dataset fields of product types can't be null

    When creating extractors for product types (i.e. case classes and tuples), a null check is missing, thus we always assume input product values are non-null.
    
    This PR adds a null check in the extractor expression for product types. The null check is stripped off for top level product fields, which are mapped to the outermost `Row`s, since they can't be null.
    
    Thanks @cloud-fan for helping investigating this issue!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/liancheng/spark spark-12478.top-level-null-field

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10431.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10431
    
----
commit af0b368f80890153e39729abeefb28ae4df21e39
Author: Cheng Lian <li...@databricks.com>
Date:   2015-12-22T12:01:50Z

    Fixes SPARK-12478

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166706328
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48207/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166613033
  
    **[Test build #48195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48195/consoleFull)** for PR 10431 at commit [`383b4e8`](https://github.com/apache/spark/commit/383b4e80acd25f2ddfcc54968fc80dae57990ae4).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166607223
  
    **[Test build #48193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48193/consoleFull)** for PR 10431 at commit [`72a380d`](https://github.com/apache/spark/commit/72a380db79475df70dbb28d3253fcacd8f82f4b2).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166607113
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-178118573
  
    @marmbrus Should we backport this to branch-1.6? I missed @yhuai's comment above and haven't done it yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166607116
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48192/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166625063
  
    **[Test build #48193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48193/consoleFull)** for PR 10431 at commit [`72a380d`](https://github.com/apache/spark/commit/72a380db79475df70dbb28d3253fcacd8f82f4b2).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class AssertNotNull(`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166625244
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48193/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166706327
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166625242
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166706134
  
    **[Test build #48207 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48207/consoleFull)** for PR 10431 at commit [`47636db`](https://github.com/apache/spark/commit/47636db147fd8cf11bf9c8e0ab43a4d8c4f4f7c0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class AssertNotNull(`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166631234
  
    **[Test build #48195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48195/consoleFull)** for PR 10431 at commit [`383b4e8`](https://github.com/apache/spark/commit/383b4e80acd25f2ddfcc54968fc80dae57990ae4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:\n  * `case class AssertNotNull(`\n


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166783716
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10431#discussion_r48248538
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -467,13 +467,16 @@ object ScalaReflection extends ScalaReflection {
             case t if t <:< localTypeOf[Product] =>
               val params = getConstructorParameters(t)
     
    -          CreateNamedStruct(params.flatMap { case (fieldName, fieldType) =>
    -            val fieldValue = Invoke(inputObject, fieldName, dataTypeFor(fieldType))
    -            val clsName = getClassNameFromType(fieldType)
    -            val newPath = s"""- field (class: "$clsName", name: "$fieldName")""" +: walkedTypePath
    -
    -            expressions.Literal(fieldName) :: extractorFor(fieldValue, fieldType, newPath) :: Nil
    -          })
    +          expressions.If(
    +            IsNull(inputObject),
    +            expressions.Literal.create(null, silentSchemaFor(t).dataType),
    +            CreateNamedStruct(params.flatMap { case (fieldName, fieldType) =>
    +              val fieldValue = Invoke(inputObject, fieldName, dataTypeFor(fieldType))
    +              val clsName = getClassNameFromType(fieldType)
    +              val newPath = s"""- field (class: "$clsName", name: "$fieldName")""" +: walkedTypePath
    +              expressions.Literal(fieldName) :: extractorFor(fieldValue, fieldType, newPath) :: Nil
    +            })
    +          )
    --- End diff --
    
    nit: we can split this into 2 parts
    ```
    val row = CreateNamedStruct(...)
    expression.If(
      IsNull...
      expression.Literal.create(null, row.dataType),
      row
    )
    ```
    which looks arguably more clear.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166788637
  
    Thanks for the review! I'm merging this to master. We should also backport this to branch-1.6 after the release.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-178158109
  
    Yeah, since its labeled experimental, I'd like to fix as many Dataset bugs as we can for 1.6.1.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10431#discussion_r48251442
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -380,7 +380,7 @@ object ScalaReflection extends ScalaReflection {
         val clsName = getClassNameFromType(tpe)
         val walkedTypePath = s"""- root class: "${clsName}"""" :: Nil
         extractorFor(inputObject, tpe, walkedTypePath) match {
    -      case s: CreateNamedStruct => s
    +      case expressions.If(IsNull(_), expressions.Literal(null, _), s: CreateNamedStruct) => s
    --- End diff --
    
    how about checking the type?
    ```
    case i: expressions.If if tpe <:< localTypeOf[Product] => i.falseValue.asInstanceOf[CreateNamedStruct]
    ```
    it looks safer to me.
    
    as option of product will also 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10431#discussion_r48250571
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -467,13 +467,16 @@ object ScalaReflection extends ScalaReflection {
             case t if t <:< localTypeOf[Product] =>
               val params = getConstructorParameters(t)
     
    -          CreateNamedStruct(params.flatMap { case (fieldName, fieldType) =>
    -            val fieldValue = Invoke(inputObject, fieldName, dataTypeFor(fieldType))
    -            val clsName = getClassNameFromType(fieldType)
    -            val newPath = s"""- field (class: "$clsName", name: "$fieldName")""" +: walkedTypePath
    -
    -            expressions.Literal(fieldName) :: extractorFor(fieldValue, fieldType, newPath) :: Nil
    -          })
    +          expressions.If(
    +            IsNull(inputObject),
    +            expressions.Literal.create(null, silentSchemaFor(t).dataType),
    +            CreateNamedStruct(params.flatMap { case (fieldName, fieldType) =>
    +              val fieldValue = Invoke(inputObject, fieldName, dataTypeFor(fieldType))
    +              val clsName = getClassNameFromType(fieldType)
    +              val newPath = s"""- field (class: "$clsName", name: "$fieldName")""" +: walkedTypePath
    +              expressions.Literal(fieldName) :: extractorFor(fieldValue, fieldType, newPath) :: Nil
    +            })
    +          )
    --- End diff --
    
    Thanks, fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-178237590
  
    Oh, just found that it's covered by PR #10650 and has already been backported.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166616262
  
    overall LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166682348
  
    **[Test build #48207 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48207/consoleFull)** for PR 10431 at commit [`47636db`](https://github.com/apache/spark/commit/47636db147fd8cf11bf9c8e0ab43a4d8c4f4f7c0).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10431


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166631395
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48195/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166631394
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10431#discussion_r48276364
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -380,7 +380,7 @@ object ScalaReflection extends ScalaReflection {
         val clsName = getClassNameFromType(tpe)
         val walkedTypePath = s"""- root class: "${clsName}"""" :: Nil
         extractorFor(inputObject, tpe, walkedTypePath) match {
    -      case s: CreateNamedStruct => s
    +      case expressions.If(IsNull(_), expressions.Literal(null, _), s: CreateNamedStruct) => s
    --- End diff --
    
    Good point.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/10431#issuecomment-166980007
  
    Should we backport it now? I guess we do not need to wait for the release, right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12478][SQL] Bugfix: Dataset fields of p...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10431#discussion_r48251882
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala ---
    @@ -466,14 +466,14 @@ object ScalaReflection extends ScalaReflection {
     
             case t if t <:< localTypeOf[Product] =>
               val params = getConstructorParameters(t)
    -
    -          CreateNamedStruct(params.flatMap { case (fieldName, fieldType) =>
    +          val nullOutput = expressions.Literal.create(null, silentSchemaFor(t).dataType)
    +          val nonNullOutput = CreateNamedStruct(params.flatMap { case (fieldName, fieldType) =>
    --- End diff --
    
    we can move `nullOutput` after `nonNullOutput`, so that we can use `nonNullOutput.dataType` instead of `silentSchemaFor(t).dataType`, which saves one computattion of schema.(not a big deal though)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org