Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/05/17 16:18:01 UTC
[GitHub] [druid] jobar opened a new issue #11264: SQL query uses TopN when grouping by time and other dimension
jobar opened a new issue #11264:
URL: https://github.com/apache/druid/issues/11264
### Description
When querying the Druid SQL endpoint with a query that groups by two fields, one of which is time-related, the TopN query type could be used instead of the groupBy one, with the time grouping implemented as the query's "granularity".
Example:
```
EXPLAIN PLAN FOR
SELECT FLOOR("__time" TO MONTH) AS "__timestamp",
"my_field" AS "my_field",
SUM(my_value) AS "sum_my_value"
FROM my_data_source
WHERE "__time" >= '...'
AND "__time" < '...'
GROUP BY "my_field", FLOOR("__time" TO MONTH)
ORDER BY sum_my_value
LIMIT 500;
DruidQueryRel(query=[{"queryType":"groupBy","dataSource":{"type":"table","name":"my_data_source"}...
```
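For illustration, a native query of the kind requested here might look roughly like the sketch below. It is not output from Druid, and several parts are assumptions: the `doubleSum` aggregator type (the type of `my_value` is not stated), the `inverted` metric spec used to mirror the ascending `ORDER BY`, and the `intervals` value, which is left elided exactly as in the SQL above.

```json
{
  "queryType": "topN",
  "dataSource": "my_data_source",
  "intervals": ["..."],
  "granularity": "month",
  "dimension": "my_field",
  "metric": {
    "type": "inverted",
    "metric": { "type": "numeric", "metric": "sum_my_value" }
  },
  "threshold": 500,
  "aggregations": [
    { "type": "doubleSum", "name": "sum_my_value", "fieldName": "my_value" }
  ]
}
```

One caveat with this shape: a native topN applies `threshold` within each granularity bucket, so this sketch would return up to 500 rows per month rather than 500 rows overall, which may not match the SQL `LIMIT` exactly.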
### Motivation
Group-by queries are much more expensive than TopN queries, so this change would return results faster and at lower cost.