Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/08/07 20:24:00 UTC

[jira] [Comment Edited] (HIVE-20279) HiveContextAwareRecordReader slows down Druid Scan queries.

    [ https://issues.apache.org/jira/browse/HIVE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572272#comment-16572272 ] 

Gopal V edited comment on HIVE-20279 at 8/7/18 8:23 PM:
--------------------------------------------------------

[~ashutoshc]: this patch is probably too fragile - I chased this bug down to DruidQueryRecordReader not implementing ::getPos().

https://github.com/apache/hive/blob/master/druid-handler/src/java/org/apache/hadoop/hive/druid/serde/DruidQueryRecordReader.java#L145


was (Author: gopalv):
[~ashutoshc]: this patch is probably too fragile - I chased this bug down to DruidQueryRecordReader not implementing ::getPos().

> HiveContextAwareRecordReader slows down Druid Scan queries. 
> ------------------------------------------------------------
>
>                 Key: HIVE-20279
>                 URL: https://issues.apache.org/jira/browse/HIVE-20279
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Nishant Bangarwa
>            Assignee: Nishant Bangarwa
>            Priority: Major
>         Attachments: HIVE-20279.1.patch, HIVE-20279.patch, scan2.svg
>
>
> HiveContextAwareRecordReader adds a lot of overhead for Druid Scan queries. 
> See the attached flame graph. 
> It looks like the checks for the existence of a footer/header buffer take most of the time. For Druid and other storage handlers that do not have a footer buffer, we should at least skip the existence-check logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)