Posted to issues@calcite.apache.org by "Nishant Bangarwa (JIRA)" <ji...@apache.org> on 2017/03/03 14:32:45 UTC
[jira] [Updated] (CALCITE-1670) Count distinct on druid is translated to Cardinality aggregator which is approximate
[ https://issues.apache.org/jira/browse/CALCITE-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nishant Bangarwa updated CALCITE-1670:
--------------------------------------
Description:
Right now, count distinct on Druid is pushed down as a 'cardinality' aggregator, which uses HyperLogLog and returns approximate results. See the cardinality aggregator at http://druid.io/docs/latest/querying/aggregations.html for details.
https://github.com/apache/calcite/blob/master/druid/src/main/java/org/apache/calcite/adapter/druid/DruidQuery.java#L721
{code}
case COUNT:
  if (aggCall.isDistinct()) {
    return new JsonCardinalityAggregation("cardinality", name, list);
  }
  return new JsonAggregation("count", name, only);
{code}
The currently recommended way to get exact counts in Druid is a nested groupBy query.
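As a rough illustration of the nested groupBy approach, the sketch below builds the JSON for such a query: the inner groupBy emits one row per distinct value of the dimension, and the outer groupBy counts those rows, which gives an exact distinct count. The class and method names, the data source "wikiticker", the dimension "user", and the interval are all hypothetical. The JSON shape assumes Druid's query-type data source ({"type": "query", "query": ...}); the exact set of required fields may differ between Druid versions. This is not what the Calcite adapter currently generates.

```java
// Sketch only: builds a nested Druid groupBy query as a JSON string.
// All names here are hypothetical and not part of Calcite or Druid.
final class NestedGroupBySketch {
  private NestedGroupBySketch() {}

  /** Inner query: groups by the dimension, so each distinct value yields one row. */
  static String innerQuery(String dataSource, String dimension, String interval) {
    // Assumes the arguments contain no characters that need JSON escaping.
    return "{\"queryType\":\"groupBy\","
        + "\"dataSource\":\"" + dataSource + "\","
        + "\"granularity\":\"all\","
        + "\"dimensions\":[\"" + dimension + "\"],"
        + "\"intervals\":[\"" + interval + "\"]}";
  }

  /** Outer query: counts the rows produced by the inner query. */
  static String exactDistinctCountQuery(
      String dataSource, String dimension, String interval, String outputName) {
    return "{\"queryType\":\"groupBy\","
        + "\"dataSource\":{\"type\":\"query\",\"query\":"
        + innerQuery(dataSource, dimension, interval) + "},"
        + "\"granularity\":\"all\","
        + "\"dimensions\":[],"
        + "\"aggregations\":[{\"type\":\"count\",\"name\":\"" + outputName + "\"}],"
        + "\"intervals\":[\"" + interval + "\"]}";
  }

  public static void main(String[] args) {
    // Hypothetical data source, dimension, and interval.
    System.out.println(exactDistinctCountQuery(
        "wikiticker", "user", "2015-09-12/2015-09-13", "distinct_users"));
  }
}
```

The trade-off is cost: the inner query materializes one row per distinct value, so this is exact but can be much more expensive than the 'cardinality' aggregator on high-cardinality dimensions.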
was:
Right now, count distinct on Druid is pushed down as a 'cardinality' aggregator, which uses HyperLogLog and returns approximate results. See the cardinality aggregator at http://druid.io/docs/latest/querying/aggregations.html for details.
https://github.com/apache/calcite/blob/master/druid/src/main/java/org/apache/calcite/adapter/druid/DruidQuery.java#L721
{code}
case COUNT:
  if (aggCall.isDistinct()) {
    return new JsonCardinalityAggregation("cardinality", name, list);
  }
  return new JsonAggregation("count", name, only);
{code}
> Count distinct on druid is translated to Cardinality aggregator which is approximate
> ------------------------------------------------------------------------------------
>
> Key: CALCITE-1670
> URL: https://issues.apache.org/jira/browse/CALCITE-1670
> Project: Calcite
> Issue Type: Bug
> Reporter: Nishant Bangarwa
> Assignee: Julian Hyde