Posted to commits@spark.apache.org by va...@apache.org on 2018/03/06 00:49:31 UTC

spark git commit: [SPARK-23604][SQL] Change Statistics.isEmpty to !Statistics.hasNonNul…

Repository: spark
Updated Branches:
  refs/heads/master f6b49f9d1 -> 8c5b34c42


[SPARK-23604][SQL] Change Statistics.isEmpty to !Statistics.hasNonNullValue

## What changes were proposed in this pull request?

Parquet 1.9 will change the semantics of Statistics.isEmpty slightly
to reflect whether the null value count has been set. That breaks a
timestamp interoperability test that cares only about whether there
are column values present in the statistics of a written file for an
INT96 column. Fix by using Statistics.hasNonNullValue instead.
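
To illustrate why the assertion has to change, here is a minimal sketch using a toy model rather than the real Parquet classes. `ToyStats`, `isEmptyNew`, and the `Option`-based null count are hypothetical names introduced only for this example. The modelled rule ("not empty once a null count has been set") is an assumption based on the description above, not a copy of Parquet's implementation.

```scala
// Toy model only: these are NOT the real Parquet classes or signatures.
final case class ToyStats(hasNonNullValue: Boolean, numNulls: Option[Long]) {
  // Assumed "new" semantics: statistics stop being empty as soon as
  // a null count has been recorded, even if no min/max values exist.
  def isEmptyNew: Boolean = !hasNonNullValue && numNulls.isEmpty
}

object StatsCheckSketch {
  def main(args: Array[String]): Unit = {
    // Models an INT96 column chunk: no min/max written, null count set to 0.
    val int96Stats = ToyStats(hasNonNullValue = false, numNulls = Some(0L))

    // An isEmpty-style check is sensitive to the recorded null count.
    println(s"isEmpty (modelled new semantics): ${int96Stats.isEmptyNew}")

    // The check the test switches to only asks whether values are present.
    println(s"!hasNonNullValue: ${!int96Stats.hasNonNullValue}")
  }
}
```

In this model the first check is false and the second is true, which is why asserting on `hasNonNullValue` keeps expressing what the test cares about: no column values are present in the statistics.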

## How was this patch tested?

Unit tests continue to pass against Parquet 1.8, and also pass against
a Parquet build including PARQUET-1217.

Author: Henry Robinson <he...@cloudera.com>

Closes #20740 from henryr/spark-23604.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8c5b34c4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8c5b34c4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8c5b34c4

Branch: refs/heads/master
Commit: 8c5b34c425bda2079a1ff969b12c067f2bb3f18f
Parents: f6b49f9
Author: Henry Robinson <he...@cloudera.com>
Authored: Mon Mar 5 16:49:24 2018 -0800
Committer: Marcelo Vanzin <va...@cloudera.com>
Committed: Mon Mar 5 16:49:24 2018 -0800

----------------------------------------------------------------------
 .../datasources/parquet/ParquetInteroperabilitySuite.scala         | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/8c5b34c4/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
index fbd83a0..9c75965 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala
@@ -184,7 +184,7 @@ class ParquetInteroperabilitySuite extends ParquetCompatibilityTest with SharedS
                 // when the data is read back as mentioned above, b/c int96 is unsigned.  This
                 // assert makes sure this holds even if we change parquet versions (if eg. there
                 // were ever statistics even on unsigned columns).
-                assert(columnStats.isEmpty)
+                assert(!columnStats.hasNonNullValue)
               }
 
               // These queries should return the entire dataset with the conversion applied,

