Posted to issues@hive.apache.org by "slim bouguerra (JIRA)" <ji...@apache.org> on 2018/03/19 23:01:00 UTC

[jira] [Updated] (HIVE-18780) Improve schema discovery For Druid Storage Handler

     [ https://issues.apache.org/jira/browse/HIVE-18780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

slim bouguerra updated HIVE-18780:
----------------------------------
    Attachment: HIVE-18780.patch

> Improve schema discovery For Druid Storage Handler
> --------------------------------------------------
>
>                 Key: HIVE-18780
>                 URL: https://issues.apache.org/jira/browse/HIVE-18780
>             Project: Hive
>          Issue Type: Improvement
>          Components: Druid integration
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HIVE-18780.patch
>
>
> Currently, the Druid storage handler issues a segment metadata query every time the query is of type Select or Scan. In addition, every input split (map) does the same, since it uses the same SerDe. This is very expensive and puts a lot of pressure on the Druid cluster. The way to fix this is to take the schema from the Calcite plan instead of serializing the query itself as part of the Hive query context.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)