Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/28 12:29:04 UTC

[GitHub] [spark] HeartSaVioR opened a new pull request #30170: [SPARK-33267] Fix NPE issue on 'In' filter when one of values contains null

HeartSaVioR opened a new pull request #30170:
URL: https://github.com/apache/spark/pull/30170


   
   ### What changes were proposed in this pull request?
   
   This PR fixes the NPE thrown by the `In` filter when one of its values is null. In practice, the issue is triggered when a filter containing `in (..., null)` is pushed down to a V2 source table. `DataSourceStrategy` caches the mapping (filter instance -> expression) in a HashMap, which computes the hash code of the key, and `In.hashCode` throws the NPE when it reaches the null value.
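
The failure mode described above can be reproduced without Spark. The following is a minimal sketch, not Spark's actual code: `InSketch`, its `nullSafe` flag, and `cacheFilter` are hypothetical names introduced for illustration, and the cached `String` is a placeholder for the expression. Only the hash code loop mirrors the `In.hashCode` diff quoted later in this thread.

```scala
import scala.collection.mutable
import scala.util.Try

object InHashSketch {
  // Hypothetical stand-in for the `In` filter; not Spark's actual class.
  // `nullSafe = false` mimics the hash code before this PR, `true` the one after.
  final class InSketch(val attribute: String, val values: Array[Any], nullSafe: Boolean) {
    override def hashCode(): Int = {
      var h = attribute.hashCode
      values.foreach { v =>
        h *= 41
        // Without the null check, v.hashCode() is invoked on null and throws.
        h += (if (nullSafe && v == null) 0 else v.hashCode())
      }
      h
    }
  }

  // Stand-in for the (filter instance -> expression) cache.
  def cacheFilter(filter: InSketch): Try[Option[String]] = {
    val cache = mutable.HashMap.empty[InSketch, String]
    // Inserting a key makes the map compute the key's hash code.
    Try(cache.put(filter, "i IN (1, NULL)"))
  }

  def main(args: Array[String]): Unit = {
    val values = Array[Any](1, null)
    // Expected to be a Failure wrapping a NullPointerException.
    println(cacheFilter(new InSketch("i", values, nullSafe = false)))
    // Expected to be a Success.
    println(cacheFilter(new InSketch("i", values, nullSafe = true)))
  }
}
```

The point of the sketch is that the exception surfaces from the map insertion rather than from the filter evaluation itself, which is why the bug only shows up on the push down path.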
   
   ### Why are the changes needed?
   
   This is a clear bug: the `In` filter does not handle null values when calculating its hash code.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. Previously, a query with `null` in an `in` condition against a data source V2 table that supports filter push down failed with an NPE; after this PR, the query no longer fails.
   
   ### How was this patch tested?
   
   Added a unit test. The new test fails without this PR and passes with it.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717901573


   **[Test build #130367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130367/testReport)** for PR 30170 at commit [`4455edb`](https://github.com/apache/spark/commit/4455edbcfe80cba5aeb37eb9584ecc26f0fa78c3).




[GitHub] [spark] SparkQA removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717901573


   **[Test build #130367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130367/testReport)** for PR 30170 at commit [`4455edb`](https://github.com/apache/spark/commit/4455edbcfe80cba5aeb37eb9584ecc26f0fa78c3).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717989007








[GitHub] [spark] AmplabJenkins commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718076902








[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718075623


   **[Test build #130367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130367/testReport)** for PR 30170 at commit [`4455edb`](https://github.com/apache/spark/commit/4455edbcfe80cba5aeb37eb9584ecc26f0fa78c3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717913071


   **[Test build #130369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130369/testReport)** for PR 30170 at commit [`aae3e0c`](https://github.com/apache/spark/commit/aae3e0ccc426282463860c2e2a76bbe5860f572d).




[GitHub] [spark] AmplabJenkins commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717955276








[GitHub] [spark] AmplabJenkins commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717941035








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718076902








[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717974835


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34973/
   




[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717913071


   **[Test build #130369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130369/testReport)** for PR 30170 at commit [`aae3e0c`](https://github.com/apache/spark/commit/aae3e0ccc426282463860c2e2a76bbe5860f572d).




[GitHub] [spark] SparkQA removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717954061


   **[Test build #130371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130371/testReport)** for PR 30170 at commit [`aae3e0c`](https://github.com/apache/spark/commit/aae3e0ccc426282463860c2e2a76bbe5860f572d).




[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717941840


   **[Test build #130369 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130369/testReport)** for PR 30170 at commit [`aae3e0c`](https://github.com/apache/spark/commit/aae3e0ccc426282463860c2e2a76bbe5860f572d).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717923208


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34970/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717942211


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130369/
   Test FAILed.




[GitHub] [spark] HyukjinKwon commented on a change in pull request #30170: [SPARK-33267] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #30170:
URL: https://github.com/apache/spark/pull/30170#discussion_r513411944



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala
##########
@@ -413,6 +413,16 @@ class DataSourceV2Suite extends QueryTest with SharedSparkSession with AdaptiveS
       }
     }
   }
+
+  test("SPARK-33267: push down with condition 'in (..., null)' should not throw NPE") {
+    Seq(classOf[AdvancedDataSourceV2], classOf[JavaAdvancedDataSourceV2]).foreach { cls =>
+      withClue(cls.getName) {
+        val df = spark.read.format(cls.getName).load()
+        // before SPARK-33267 below query just threw NPE
+        df.select('i).where("i in (1, null)").show()

Review comment:
       nit: shall we use `collect()` or `count()` instead of `show()`? Just to make the console clean :-).






[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718154589


   **[Test build #130371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130371/testReport)** for PR 30170 at commit [`aae3e0c`](https://github.com/apache/spark/commit/aae3e0ccc426282463860c2e2a76bbe5860f572d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717989007








[GitHub] [spark] srowen commented on a change in pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
srowen commented on a change in pull request #30170:
URL: https://github.com/apache/spark/pull/30170#discussion_r513467006



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/sources/filters.scala
##########
@@ -164,7 +164,7 @@ case class In(attribute: String, values: Array[Any]) extends Filter {
     var h = attribute.hashCode
     values.foreach { v =>
       h *= 41
-      h += v.hashCode()
+      h += (if (v != null) v.hashCode() else 0)

Review comment:
       Totally fine; Objects.hashCode has the same logic
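
To illustrate the remark about `Objects.hashCode`, here is a small sketch comparing the patched expression with `java.util.Objects.hashCode`, which returns 0 for a null argument and the argument's own hash code otherwise. The object and helper names (`NullSafeHashSketch`, `patchedHash`, `objectsHash`) are hypothetical and exist only for this comparison.

```scala
import java.util.Objects

object NullSafeHashSketch {
  // The null check from the patch, lifted into a helper.
  def patchedHash(v: Any): Int = if (v != null) v.hashCode() else 0

  // The same value computed via java.util.Objects.hashCode.
  // The cast is needed because the Java method takes an Object (AnyRef).
  def objectsHash(v: Any): Int = Objects.hashCode(v.asInstanceOf[AnyRef])

  def main(args: Array[String]): Unit = {
    val samples = Seq[Any](1, "a", 2.5, null)
    samples.foreach { v =>
      println(s"$v -> patched=${patchedHash(v)}, objects=${objectsHash(v)}")
    }
  }
}
```

Either form would avoid the NPE; the explicit `if` in the patch avoids the cast to `AnyRef` that the Java helper requires when called from Scala with an `Any`.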






[GitHub] [spark] dongjoon-hyun commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718087300


   Thank you, @HeartSaVioR and all!




[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717954061


   **[Test build #130371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130371/testReport)** for PR 30170 at commit [`aae3e0c`](https://github.com/apache/spark/commit/aae3e0ccc426282463860c2e2a76bbe5860f572d).




[GitHub] [spark] HeartSaVioR commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717948990


   retest this, please




[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717941008


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34970/
   




[GitHub] [spark] AmplabJenkins commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717942201








[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717936667


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34972/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717941035








[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717988980


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34973/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718155578








[GitHub] [spark] AmplabJenkins commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-718155578








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717955276








[GitHub] [spark] HeartSaVioR commented on a change in pull request #30170: [SPARK-33267] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on a change in pull request #30170:
URL: https://github.com/apache/spark/pull/30170#discussion_r513416388



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala
##########
@@ -413,6 +413,16 @@ class DataSourceV2Suite extends QueryTest with SharedSparkSession with AdaptiveS
       }
     }
   }
+
+  test("SPARK-33267: push down with condition 'in (..., null)' should not throw NPE") {
+    Seq(classOf[AdvancedDataSourceV2], classOf[JavaAdvancedDataSourceV2]).foreach { cls =>
+      withClue(cls.getName) {
+        val df = spark.read.format(cls.getName).load()
+        // before SPARK-33267 below query just threw NPE
+        df.select('i).where("i in (1, null)").show()

Review comment:
       Good point, will fix.






[GitHub] [spark] dongjoon-hyun closed pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #30170:
URL: https://github.com/apache/spark/pull/30170


   




[GitHub] [spark] SparkQA commented on pull request #30170: [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30170:
URL: https://github.com/apache/spark/pull/30170#issuecomment-717955255


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34972/
   

