Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/08/19 05:39:13 UTC

[GitHub] [druid] maytasm opened a new pull request #11617: Fix bug in Variance Buffer Aggregator resulting in intermittent NaN when druid.generic.useDefaultValueForNull=false

maytasm opened a new pull request #11617:
URL: https://github.com/apache/druid/pull/11617


   Fix bug in Variance Buffer Aggregator resulting in intermittent NaN when druid.generic.useDefaultValueForNull=false
   
   ### Description
   
   In the aggregate method of the Buffer ObjectVarianceAggregator, we should skip merging when the other ObjectVarianceAggregator (holder2) has count == 0.
   
   Here is an example of how NaN can be returned:
   Imagine two Buffer ObjectVarianceAggregators:
   the current ObjectVarianceAggregator has count = 2, sum = 0, nvariance = 0,
   and the other ObjectVarianceAggregator (holder2) has count = 0, sum = 0, nvariance = 0.
   The aggregate method would run as follows:
   `final double ratio = count / (double) holder2.count;`
   ratio becomes infinity (2 / 0.0),
   then
   `final double t = sum / ratio - holder2.sum;`
   t becomes 0 (0 / infinity - 0),
   and finally
   `nvariance += holder2.nvariance + (ratio / (count + holder2.count) * t * t);`
   is the same as
   `holder2.nvariance + (infinity / 2 * 0 * 0);`
   where infinity / 2 is still infinity, and infinity * 0 is NaN.
   Hence, variance will be NaN.
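
The arithmetic in the example above can be reproduced in isolation. The following is a minimal sketch, not the actual Druid code: the `VarianceMergeSketch` class, the `Holder` type, and the `mergeUnguarded` / `mergeGuarded` method names are hypothetical and exist only for illustration. Only the three quoted statements and the `holder2.count == 0` check come from this PR.

```java
public class VarianceMergeSketch {
    // Hypothetical holder with the three fields named in the description.
    static final class Holder {
        final long count;
        final double sum;
        final double nvariance;

        Holder(long count, double sum, double nvariance) {
            this.count = count;
            this.sum = sum;
            this.nvariance = nvariance;
        }
    }

    // Applies the three quoted statements with no guard and returns the
    // resulting nvariance of the current holder.
    static double mergeUnguarded(Holder current, Holder holder2) {
        // With holder2.count == 0 this is 2 / 0.0, which is Infinity.
        final double ratio = current.count / (double) holder2.count;
        // 0 / Infinity - 0 evaluates to 0.0.
        final double t = current.sum / ratio - holder2.sum;
        // Infinity / 2 is Infinity, and Infinity * 0.0 is NaN.
        return current.nvariance + holder2.nvariance
            + (ratio / (current.count + holder2.count) * t * t);
    }

    // Same merge, but skipped when the other holder is empty
    // (the check this PR adds).
    static double mergeGuarded(Holder current, Holder holder2) {
        if (holder2.count == 0) {
            return current.nvariance;
        }
        return mergeUnguarded(current, holder2);
    }

    public static void main(String[] args) {
        Holder current = new Holder(2, 0.0, 0.0);
        Holder empty = new Holder(0, 0.0, 0.0);
        System.out.println(mergeUnguarded(current, empty)); // prints NaN
        System.out.println(mergeGuarded(current, empty));   // prints 0.0
    }
}
```

Under these assumptions, the guarded version leaves the current holder's state untouched when there is nothing to merge, which is why the early return avoids the NaN.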
   
   This PR has:
   - [x] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [x] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #11617: Fix bug in Variance Buffer Aggregator resulting in intermittent NaN when druid.generic.useDefaultValueForNull=false

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #11617:
URL: https://github.com/apache/druid/pull/11617#discussion_r691807300



##########
File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceBufferAggregator.java
##########
@@ -238,7 +238,9 @@ public void aggregate(ByteBuffer buf, int position)
         buf.putDouble(position + NVARIANCE_OFFSET, holder2.nvariance);
         return;
       }
-
+      if (holder2.count == 0) {

Review comment:
       I think this early return could happen right after `Preconditions.checkState(holder2 != null);` on line 233






[GitHub] [druid] maytasm commented on a change in pull request #11617: Fix bug in Variance Buffer Aggregator resulting in intermittent NaN when druid.generic.useDefaultValueForNull=false

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #11617:
URL: https://github.com/apache/druid/pull/11617#discussion_r691808158



##########
File path: extensions-core/stats/src/main/java/org/apache/druid/query/aggregation/variance/VarianceBufferAggregator.java
##########
@@ -238,7 +238,9 @@ public void aggregate(ByteBuffer buf, int position)
         buf.putDouble(position + NVARIANCE_OFFSET, holder2.nvariance);
         return;
       }
-
+      if (holder2.count == 0) {

Review comment:
       Good catch. Done.






[GitHub] [druid] suneet-s merged pull request #11617: Fix bug in Variance Buffer Aggregator resulting in intermittent NaN when druid.generic.useDefaultValueForNull=false

Posted by GitBox <gi...@apache.org>.
suneet-s merged pull request #11617:
URL: https://github.com/apache/druid/pull/11617


   

