Posted to issues@hive.apache.org by "slim bouguerra (JIRA)" <ji...@apache.org> on 2018/02/17 05:57:00 UTC

[jira] [Assigned] (HIVE-16026) Generated query will timeout and/or kill the druid cluster.

     [ https://issues.apache.org/jira/browse/HIVE-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

slim bouguerra reassigned HIVE-16026:
-------------------------------------

    Assignee: slim bouguerra

> Generated query will timeout and/or kill the druid cluster.
> -----------------------------------------------------------
>
>                 Key: HIVE-16026
>                 URL: https://issues.apache.org/jira/browse/HIVE-16026
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Grouping by `__time` and another dimension generates a groupBy query with granularity NONE over an interval from 1900 to 3000. This can kill the Druid cluster, because the Druid groupBy strategy creates a cursor for every millisecond, and there are a lot of milliseconds between 1900 and 3000. Such a query can instead be turned into a select, with the group by then done within Hive. This should only happen when we don't know the `__time` granularity.
> {code}
> explain select `__time`, userid from login_druid group by `__time`, userid
>     > ;
> OK
> Plan optimized by CBO.
> Stage-0
>   Fetch Operator
>     limit:-1
>     Select Operator [SEL_1]
>       Output:["_col0","_col1"]
>       TableScan [TS_0]
>         Output:["__time","userid"],properties:{"druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_user_login\",\"granularity\":\"NONE\",\"dimensions\":[\"userid\"],\"limitSpec\":{\"type\":\"default\"},\"aggregations\":[{\"type\":\"longSum\",\"name\":\"dummy_agg\",\"fieldName\":\"dummy_agg\"}],\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"]}","druid.query.type":"groupBy"}
> {code}  
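
To put a rough number on "a lot of milliseconds", here is a small Python sketch that counts the millisecond buckets spanned by the interval in the generated query above. It assumes, as the description states, one bucket per millisecond under granularity NONE. The helper name `millis_in_interval` is made up for illustration and is not part of Hive or Druid.

```python
from datetime import datetime, timedelta


def millis_in_interval(interval: str) -> int:
    """Count the milliseconds covered by a Druid-style "start/end" ISO-8601 interval.

    Hypothetical helper for illustration only; not a Hive or Druid API.
    """
    start_text, end_text = interval.split("/")
    # fromisoformat() on older Python versions rejects a trailing "Z",
    # so rewrite it as an explicit UTC offset first.
    start = datetime.fromisoformat(start_text.replace("Z", "+00:00"))
    end = datetime.fromisoformat(end_text.replace("Z", "+00:00"))
    # Integer division of two timedeltas gives a whole number of milliseconds.
    return (end - start) // timedelta(milliseconds=1)


if __name__ == "__main__":
    # Interval copied from the "intervals" field of the druid.query.json above.
    interval = "1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z"
    print(f"{millis_in_interval(interval):,} potential millisecond buckets")
```

This is only an upper bound on the number of time buckets implied by the interval; how much work Druid actually does per bucket is not shown in this report.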


