Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2018/10/31 05:55:16 UTC

[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/22898

    [SPARK-25746][SQL][followup] do not add unnecessary If expression

    ## What changes were proposed in this pull request?
    
    A followup of https://github.com/apache/spark/pull/22749.
    
    When we construct the new serializer in `ExpressionEncoder.tuple`, we don't need to add an `if(isnull ...)` check for each field. Each field is either a simple expression that propagates null correctly (e.g. `GetStructField(GetColumnByOrdinal(0, schema), index)`), or a complex expression that already has the isnull check.
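    
    A toy sketch of why the extra check is redundant for simple expressions. It assumes a simplified expression tree: the classes below only borrow Catalyst's names (`GetColumn` stands in for `GetColumnByOrdinal`, `GetField` for `GetStructField`) and are not Spark's API. Because field extraction already yields null for a null struct, wrapping it in `If(IsNull(...))` should not change the result.
    
    ```scala
    // Toy model only: these classes imitate Catalyst expression names,
    // they are hypothetical and not Spark's actual classes.
    object RedundantNullCheck {
      sealed trait Expr { def eval(row: Seq[Any]): Any }

      // Reads column `ordinal` of the input row.
      case class GetColumn(ordinal: Int) extends Expr {
        def eval(row: Seq[Any]): Any = row(ordinal)
      }

      // Extracts field `index` from a struct; a null struct yields null.
      case class GetField(child: Expr, index: Int) extends Expr {
        def eval(row: Seq[Any]): Any = child.eval(row) match {
          case null           => null
          case struct: Seq[_] => struct(index)
        }
      }

      case class IsNull(child: Expr) extends Expr {
        def eval(row: Seq[Any]): Any = child.eval(row) == null
      }

      case object NullLiteral extends Expr {
        def eval(row: Seq[Any]): Any = null
      }

      case class If(cond: Expr, ifTrue: Expr, ifFalse: Expr) extends Expr {
        def eval(row: Seq[Any]): Any =
          if (cond.eval(row) == true) ifTrue.eval(row) else ifFalse.eval(row)
      }

      // The field expression on its own, and the same expression with the guard.
      val field: Expr = GetField(GetColumn(0), 1)
      val guarded: Expr = If(IsNull(field), NullLiteral, field)

      // A populated struct, a struct with a null field, and a null struct.
      val rows: Seq[Seq[Any]] = Seq(Seq(Seq("a", 42)), Seq(Seq("a", null)), Seq(null))

      def main(args: Array[String]): Unit =
        rows.foreach(r => println(s"${field.eval(r)} vs ${guarded.eval(r)}"))
    }
    ```
    
    In this model both expressions evaluate to the same value on every row, so the guard only adds an extra node to the tree.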
    
    ## How was this patch tested?
    
    existing tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22898.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22898
    
----
commit 82664439318b72d8446230515abb882b89767bb9
Author: Wenchen Fan <we...@...>
Date:   2018-10-31T05:44:44Z

    do not add unnecessary If expression

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98302/
    Test FAILed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    **[Test build #98315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98315/testReport)** for PR 22898 at commit [`3101406`](https://github.com/apache/spark/commit/31014067abf7d60f94d8aec663e7c2a3e9b9d58c).


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    cc @viirya 


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    retest this please


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Merged build finished. Test FAILed.


---



[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22898#discussion_r229619306
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala ---
    @@ -124,14 +124,9 @@ object ExpressionEncoder {
             s"`GetColumnByOrdinal`, but there are ${getColExprs.size}")
     
           val input = GetStructField(GetColumnByOrdinal(0, schema), index)
    -      val newDeserializer = enc.objDeserializer.transformUp {
    +      enc.objDeserializer.transformUp {
             case GetColumnByOrdinal(0, _) => input
           }
    -      if (schema(index).nullable) {
    -        If(IsNull(input), Literal.create(null, newDeserializer.dataType), newDeserializer)
    -      } else {
    -        newDeserializer
    -      }
    --- End diff --
    
    Good catch!


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22898#discussion_r229631397
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala ---
    @@ -124,14 +124,9 @@ object ExpressionEncoder {
             s"`GetColumnByOrdinal`, but there are ${getColExprs.size}")
     
           val input = GetStructField(GetColumnByOrdinal(0, schema), index)
    -      val newDeserializer = enc.objDeserializer.transformUp {
    +      enc.objDeserializer.transformUp {
             case GetColumnByOrdinal(0, _) => input
           }
    -      if (schema(index).nullable) {
    -        If(IsNull(input), Literal.create(null, newDeserializer.dataType), newDeserializer)
    --- End diff --
    
    Oh, I see. If the child deserializer is a tuple deserializer, it is just
    
    ```scala
    val deserializer =
      NewInstance(cls, childrenDeserializers, ObjectType(cls), propagateNull = false)
    ```
    
    So it lacks the `If(IsNull(..), null, ...)` pattern. We should wrap the `NewInstance` with `If(IsNull(..), null, ...)` at L139.
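    
    A minimal sketch of the problem, assuming a toy expression model: `NewPair` and `IfNull` are hypothetical stand-ins for `NewInstance` and the `If(IsNull(..), null, ...)` wrapper, not Spark's classes. With `propagateNull = false`, the constructor call still runs when the input struct is null, so the result is a tuple of nulls rather than null.
    
    ```scala
    // Toy model only: hypothetical stand-ins for Catalyst expressions.
    object TupleNullCheck {
      sealed trait Expr { def eval(input: Any): Any }

      // The whole input value, e.g. a column holding a nested struct.
      case object Input extends Expr {
        def eval(input: Any): Any = input
      }

      // Extracts field `index` from a struct; a null struct yields null.
      case class GetField(child: Expr, index: Int) extends Expr {
        def eval(input: Any): Any = child.eval(input) match {
          case null      => null
          case s: Seq[_] => s(index)
        }
      }

      // Builds a pair from its arguments. When propagateNull is false the
      // pair is built even if some arguments are null.
      case class NewPair(first: Expr, second: Expr, propagateNull: Boolean) extends Expr {
        def eval(input: Any): Any = {
          val a = first.eval(input)
          val b = second.eval(input)
          if (propagateNull && (a == null || b == null)) null else (a, b)
        }
      }

      // Returns null when `check` is null, otherwise evaluates `otherwise`.
      case class IfNull(check: Expr, otherwise: Expr) extends Expr {
        def eval(input: Any): Any =
          if (check.eval(input) == null) null else otherwise.eval(input)
      }

      val bare: Expr =
        NewPair(GetField(Input, 0), GetField(Input, 1), propagateNull = false)
      val wrapped: Expr = IfNull(Input, bare)

      def main(args: Array[String]): Unit = {
        println(bare.eval(null))           // (null,null)
        println(wrapped.eval(null))        // null
        println(wrapped.eval(Seq(1, "x"))) // (1,x)
      }
    }
    ```
    
    Only the wrapped version maps a null struct back to a null object in this model, which is the behaviour the removed check was providing for nested tuples.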


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4647/
    Test PASSed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    **[Test build #98302 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98302/testReport)** for PR 22898 at commit [`8266443`](https://github.com/apache/spark/commit/82664439318b72d8446230515abb882b89767bb9).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22898#discussion_r229667653
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala ---
    @@ -124,14 +124,9 @@ object ExpressionEncoder {
             s"`GetColumnByOrdinal`, but there are ${getColExprs.size}")
     
           val input = GetStructField(GetColumnByOrdinal(0, schema), index)
    -      val newDeserializer = enc.objDeserializer.transformUp {
    +      enc.objDeserializer.transformUp {
             case GetColumnByOrdinal(0, _) => input
           }
    -      if (schema(index).nullable) {
    -        If(IsNull(input), Literal.create(null, newDeserializer.dataType), newDeserializer)
    --- End diff --
    
    good catch!


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    **[Test build #98292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98292/testReport)** for PR 22898 at commit [`8266443`](https://github.com/apache/spark/commit/82664439318b72d8446230515abb882b89767bb9).
     * This patch **fails due to an unknown error code, -9**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    **[Test build #98302 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98302/testReport)** for PR 22898 at commit [`8266443`](https://github.com/apache/spark/commit/82664439318b72d8446230515abb882b89767bb9).


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98315/
    Test PASSed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    LGTM


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4653/
    Test PASSed.


---



[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22898


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    thanks, merging to master!


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    **[Test build #98315 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98315/testReport)** for PR 22898 at commit [`3101406`](https://github.com/apache/spark/commit/31014067abf7d60f94d8aec663e7c2a3e9b9d58c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4663/
    Test PASSed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98292/
    Test FAILed.


---



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22898
  
    **[Test build #98292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98292/testReport)** for PR 22898 at commit [`8266443`](https://github.com/apache/spark/commit/82664439318b72d8446230515abb882b89767bb9).


---
