Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2021/09/27 07:31:46 UTC

[GitHub] [orc] wgtmac commented on pull request #915: ORC-98: Add support for t-digests to ORC

wgtmac commented on pull request #915:
URL: https://github.com/apache/orc/pull/915#issuecomment-927607518


   > In some environments the overflow check exercised by the test does not behave as expected. I haven't figured out why, but I believe this overflow detection code is incorrect.
   > https://github.com/apache/orc/blob/6da96bb8ceb64528d082974efed411c4c29f3408/c%2B%2B/src/Statistics.cc#L180-L190
   > 
   > 
   > A counter-example is easy to give:
   > Assume `sum=1` and call `update(std::numeric_limits<int64_t>::max(), 3);`
   > `value * repetitions + _stats.getSum()` overflows, but the result is still a positive number: 9223372036854775806
   > @dongjoon-hyun @wgtmac What do you think? :)
   
   This piece of code was simply copied from the Java side:
   https://github.com/apache/orc/blob/main/java/core/src/java/org/apache/orc/impl/ColumnStatisticsImpl.java#L362-L379
   
   We need to make sure that `value * repetitions` does not overflow before checking the sum. :(
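
One way to express that ordering is sketched below. This is not the code from `Statistics.cc`: the helper names `mulOverflows`, `addOverflows`, and `safeAccumulate` are hypothetical, and the sketch assumes `repetitions` can be treated as an `int64_t`. It checks the multiplication first and only then the addition, so a product that wraps back to a positive value (as in the counter-example above) is still rejected.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if a * b would overflow int64_t.
// Uses division-based bounds so the overflowing product is never computed.
bool mulOverflows(int64_t a, int64_t b) {
  const int64_t maxV = std::numeric_limits<int64_t>::max();
  const int64_t minV = std::numeric_limits<int64_t>::min();
  if (a == 0 || b == 0) return false;
  if (a > 0) {
    return b > 0 ? a > maxV / b : b < minV / a;
  }
  return b > 0 ? a < minV / b : b < maxV / a;
}

// Hypothetical helper: true if a + b would overflow int64_t.
bool addOverflows(int64_t a, int64_t b) {
  const int64_t maxV = std::numeric_limits<int64_t>::max();
  const int64_t minV = std::numeric_limits<int64_t>::min();
  if (b > 0) return a > maxV - b;
  return a < minV - b;
}

// Hypothetical helper: computes sum + value * repetitions into `out`.
// Returns false (leaving `out` untouched) if either step would overflow.
bool safeAccumulate(int64_t sum, int64_t value, int64_t repetitions,
                    int64_t& out) {
  if (mulOverflows(value, repetitions)) return false;  // check product first
  const int64_t product = value * repetitions;
  if (addOverflows(sum, product)) return false;        // then check the sum
  out = sum + product;
  return true;
}
```

With `sum = 1`, `value = std::numeric_limits<int64_t>::max()` and `repetitions = 3`, `safeAccumulate` returns `false`, whereas a check that only inspects the sign of the final sum would miss the overflow. Compilers such as GCC and Clang also offer `__builtin_mul_overflow` and `__builtin_add_overflow`, which could replace the two helpers where portability to other compilers is not required.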

