Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2020/09/02 09:45:25 UTC

[GitHub] [orc] pgaref commented on a change in pull request #508: ORC-629: PPD: Floating point NaN is not transitive across comparisons

pgaref commented on a change in pull request #508:
URL: https://github.com/apache/orc/pull/508#discussion_r481942384



##########
File path: java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java
##########
@@ -495,6 +495,14 @@ static TruthValue evaluatePredicateProto(OrcProto.ColumnStatistics statsProto,
                    " include ORC-517. Writer version: {}",
           predicate.getColumnName(), writerVersion);
       return TruthValue.YES_NO_NULL;
+    } else if (category == TypeDescription.Category.DOUBLE) {
+      DoubleColumnStatistics dstas = (DoubleColumnStatistics) cs;
+      if (!Double.isFinite(dstas.getMinimum()) || !Double.isFinite(dstas.getMaximum())

Review comment:
   Hey @wgtmac, this logic is replicated in both the C++ and Java versions.
   
   Regarding the broad check: the way we update the double statistics (with plain comparisons rather than Double.compare) can also produce wrong min/max values when NaN values are present (see the test case). That is something else we could improve. https://github.com/apache/orc/blob/ca33ce64bf1fa8b3696e2e44b32237d842c70df3/java/core/src/java/org/apache/orc/impl/ColumnStatisticsImpl.java#L563
   
   That said, we could probably replace the infinity check (Double.isFinite) with a NaN check.
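
   To illustrate the point about comparisons, here is a minimal standalone sketch. It is not the ORC code: the class and method names (NanStatsSketch, minMaxWithOperators, minMaxWithCompare, hasNan) are hypothetical, and the update loop only assumes that min/max are tracked with `<` and `>`. It shows that operator-based tracking gives order-dependent results once a NaN shows up, that Double.compare gives a consistent result, and that a NaN check is narrower than an isFinite check.

```java
import java.util.Arrays;

// Hypothetical sketch, not the ORC implementation.
class NanStatsSketch {

  // Tracks min/max with < and >. Every comparison against NaN is false,
  // so the result depends on where the NaN appears in the input.
  static double[] minMaxWithOperators(double[] values) {
    double min = values[0];
    double max = values[0];
    for (int i = 1; i < values.length; i++) {
      double v = values[i];
      if (v < min) {
        min = v;
      } else if (v > max) {
        max = v;
      }
    }
    return new double[] {min, max};
  }

  // Tracks min/max with Double.compare, which defines a total order
  // in which NaN is greater than every other value.
  static double[] minMaxWithCompare(double[] values) {
    double min = values[0];
    double max = values[0];
    for (int i = 1; i < values.length; i++) {
      double v = values[i];
      if (Double.compare(v, min) < 0) {
        min = v;
      } else if (Double.compare(v, max) > 0) {
        max = v;
      }
    }
    return new double[] {min, max};
  }

  // Narrow check: true only if one of the bounds is NaN.
  // Infinite bounds pass this check, unlike with Double.isFinite.
  static boolean hasNan(double min, double max) {
    return Double.isNaN(min) || Double.isNaN(max);
  }

  public static void main(String[] args) {
    double[] nanFirst = {Double.NaN, 1.0, 2.0};
    double[] nanLast = {1.0, 2.0, Double.NaN};

    // Operators: [NaN, NaN] vs [1.0, 2.0] for the same set of values.
    System.out.println(Arrays.toString(minMaxWithOperators(nanFirst)));
    System.out.println(Arrays.toString(minMaxWithOperators(nanLast)));

    // Double.compare: [1.0, NaN] in both orders.
    System.out.println(Arrays.toString(minMaxWithCompare(nanFirst)));
    System.out.println(Arrays.toString(minMaxWithCompare(nanLast)));

    // An infinite maximum is not NaN, so a NaN-only check would keep it.
    System.out.println(hasNan(1.0, Double.POSITIVE_INFINITY));
    System.out.println(Double.isFinite(Double.POSITIVE_INFINITY));
  }
}
```

   With the operator version, a NaN that arrives last is silently dropped, so the stored range looks valid even though the column contains a NaN. With Double.compare the NaN always ends up in the maximum, which a NaN check on the statistics could then detect.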




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org