Posted to issues@hive.apache.org by "Vihang Karajgaonkar (JIRA)" <ji...@apache.org> on 2018/05/30 17:22:00 UTC

[jira] [Comment Edited] (HIVE-19605) TAB_COL_STATS table has no index on db/table name

    [ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495414#comment-16495414 ] 

Vihang Karajgaonkar edited comment on HIVE-19605 at 5/30/18 5:21 PM:
---------------------------------------------------------------------

Hi [~tlipcon], I was looking at this and noticed that {{getTableColumnStatistics}} actually fetches rows from {{TAB_COL_STATS}} using (CAT_NAME, DB_NAME, TBL_NAME, COL_NAME), so the index should also include COL_NAME unless you think otherwise.


was (Author: vihangk1):
Hi [~tlipcon] I was looking at this and I noticed that the {{getTableColumnStatistics}} in fact fetches information from {{The TAB_COL_STATS}} using (CAT_NAME, DB_NAME, TBL_NAME, COL_NAME). So the index should also include COL_NAME unless you think otherwise. 

> TAB_COL_STATS table has no index on db/table name
> -------------------------------------------------
>
>                 Key: HIVE-19605
>                 URL: https://issues.apache.org/jira/browse/HIVE-19605
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Todd Lipcon
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-19605.01.patch
>
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. This makes those queries take a significant amount of time in large metastores since they do a full table scan.
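
To make the full-table-scan point concrete, here is a minimal sketch that uses SQLite as a stand-in for the metastore's backing database. It is not taken from the attached patch. The trimmed-down table definition, the index name {{TAB_COL_STATS_IDX_SKETCH}}, and the sample parameter values are all assumptions for illustration, and the column names simply follow the spellings used in this thread, which may not match the real schema. Other databases have different planners, so the sketch shows only the general idea: without an index the lookup scans the table, and a composite index that ends in the column name can still serve lookups on the (CAT_NAME, DB_NAME, TABLE_NAME) prefix.

```python
import sqlite3

# Assumption: SQLite stands in for the metastore RDBMS, and this table is a
# hypothetical, trimmed-down version of TAB_COL_STATS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TAB_COL_STATS ("
    " CS_ID INTEGER PRIMARY KEY,"
    " CAT_NAME TEXT, DB_NAME TEXT, TABLE_NAME TEXT, COL_NAME TEXT)"
)

# Lookup on all four columns, as described in the comment.
FOUR_COL_QUERY = (
    "SELECT CS_ID FROM TAB_COL_STATS"
    " WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ? AND COL_NAME = ?"
)
# Lookup on the three-column tuple from the issue description.
THREE_COL_QUERY = (
    "SELECT CS_ID FROM TAB_COL_STATS"
    " WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ?"
)
PARAMS = ("hive", "default", "t1", "c1")  # illustrative values only


def query_plan(sql, params):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[3] for row in rows)


# No index yet: the planner has to scan the whole table.
plan_before = query_plan(FOUR_COL_QUERY, PARAMS)

# Hypothetical composite index that also includes the column name.
conn.execute(
    "CREATE INDEX TAB_COL_STATS_IDX_SKETCH"
    " ON TAB_COL_STATS (CAT_NAME, DB_NAME, TABLE_NAME, COL_NAME)"
)

plan_four_cols = query_plan(FOUR_COL_QUERY, PARAMS)
plan_three_cols = query_plan(THREE_COL_QUERY, PARAMS[:3])

print("before index:      ", plan_before)
print("four-column lookup:", plan_four_cols)
print("three-column lookup:", plan_three_cols)
```

In this sketch both lookups report the index in their plan once it exists, because the three-column predicate matches the leftmost columns of the four-column index.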


