Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/02/24 05:27:00 UTC
[jira] [Assigned] (SPARK-34515) Fix NPE if InSet contains null value during getPartitionsByFilter
[ https://issues.apache.org/jira/browse/SPARK-34515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-34515:
------------------------------------
Assignee: (was: Apache Spark)
> Fix NPE if InSet contains null value during getPartitionsByFilter
> -----------------------------------------------------------------
>
> Key: SPARK-34515
> URL: https://issues.apache.org/jira/browse/SPARK-34515
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: ulysses you
> Priority: Minor
>
> During partition pruning, Spark converts an InSet into a `>=` and `<=` range predicate when the number of its values exceeds `spark.sql.hive.metastorePartitionPruningInSetThreshold`. In this case, if the values contain a null, the conversion throws the following exception:
>
> {code:java}
> java.lang.NullPointerException
> at org.apache.spark.unsafe.types.UTF8String.compareTo(UTF8String.java:1389)
> at org.apache.spark.unsafe.types.UTF8String.compareTo(UTF8String.java:50)
> at scala.math.LowPriorityOrderingImplicits$$anon$3.compare(Ordering.scala:153)
> at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
> at java.util.TimSort.sort(TimSort.java:220)
> at java.util.Arrays.sort(Arrays.java:1438)
> at scala.collection.SeqLike.sorted(SeqLike.scala:659)
> at scala.collection.SeqLike.sorted$(SeqLike.scala:647)
> at scala.collection.AbstractSeq.sorted(Seq.scala:45)
> at org.apache.spark.sql.hive.client.Shim_v0_13.convert$1(HiveShim.scala:772)
> at org.apache.spark.sql.hive.client.Shim_v0_13.$anonfun$convertFilters$4(HiveShim.scala:826)
> at scala.collection.immutable.Stream.flatMap(Stream.scala:489)
> at org.apache.spark.sql.hive.client.Shim_v0_13.convertFilters(HiveShim.scala:826)
> at org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:848)
> at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitionsByFilter$1(HiveClientImpl.scala:750)
> {code}
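
The trace above shows the NullPointerException coming from `sorted`, which compares each value against its neighbour and so dereferences the null element. The following is a minimal sketch of that failure mode and of one possible guard, written against plain Scala strings rather than Spark's `UTF8String`. `InSetRangeSketch`, `toRangeFilter` and `sortsWithoutNpe` are hypothetical names introduced for illustration; this is not the code in `HiveShim.scala`, and dropping nulls is an assumption based on the fact that a null element of an IN list can never make the predicate true.

```scala
// Hypothetical stand-in for the range conversion described above.
// This is NOT Spark's HiveShim code; all names here are illustrative.
object InSetRangeSketch {

  // Build a "col >= min and col <= max" filter string from the non-null values.
  // Returns None when no non-null value remains, so a caller could skip pushdown.
  def toRangeFilter(column: String, values: Seq[String]): Option[String] = {
    // Assumption: null elements are dropped before sorting, because a null
    // in an IN list cannot match any partition value.
    val nonNull = values.filter(_ != null)
    if (nonNull.isEmpty) None
    else {
      val sorted = nonNull.sorted
      Some(s"""($column >= "${sorted.head}" and $column <= "${sorted.last}")""")
    }
  }

  // Sort without the null filter to show the failure mode from the trace:
  // the default String ordering calls compareTo and fails on a null element.
  def sortsWithoutNpe(values: Seq[String]): Boolean =
    try { values.sorted; true }
    catch { case _: NullPointerException => false }
}
```

With this sketch, `InSetRangeSketch.toRangeFilter("p", Seq("b", null, "a"))` yields a range over `"a"` and `"b"`, while sorting the same sequence without the filter raises the exception.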
--
This message was sent by Atlassian Jira
(v8.3.4#803005)