Posted to issues@hive.apache.org by "Vihang Karajgaonkar (JIRA)" <ji...@apache.org> on 2018/05/30 17:22:00 UTC
[jira] [Comment Edited] (HIVE-19605) TAB_COL_STATS table has no index on db/table name
[ https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16495414#comment-16495414 ]
Vihang Karajgaonkar edited comment on HIVE-19605 at 5/30/18 5:21 PM:
---------------------------------------------------------------------
Hi [~tlipcon], I was looking at this and noticed that {{getTableColumnStatistics}} in fact fetches information from {{TAB_COL_STATS}} using (CAT_NAME, DB_NAME, TBL_NAME, COL_NAME). So the index should also include COL_NAME, unless you think otherwise.
was (Author: vihangk1):
Hi [~tlipcon] I was looking at this and I noticed that the {{getTableColumnStatistics}} in fact fetches information from {{The TAB_COL_STATS}} using (CAT_NAME, DB_NAME, TBL_NAME, COL_NAME). So the index should also include COL_NAME unless you think otherwise.
> TAB_COL_STATS table has no index on db/table name
> -------------------------------------------------
>
> Key: HIVE-19605
> URL: https://issues.apache.org/jira/browse/HIVE-19605
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Todd Lipcon
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Attachments: HIVE-19605.01.patch
>
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. This makes those queries take a significant amount of time in large metastores since they do a full table scan.
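
The effect described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the metastore's backing database, which is an assumption made purely for illustration. The table layout is a simplified guess, not the actual metastore schema, and the index name TAB_COL_STATS_IDX and the helper query_plan are hypothetical. The sketch compares the query plan for a lookup on (CAT_NAME, DB_NAME, TABLE_NAME) before and after adding a composite index that also includes the column name.

```python
import sqlite3


def query_plan(conn, sql, params):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last field of each EXPLAIN QUERY PLAN row is the readable detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")

# Simplified, assumed layout; the real TAB_COL_STATS table has more columns.
conn.execute(
    """
    CREATE TABLE TAB_COL_STATS (
        CS_ID       INTEGER PRIMARY KEY,
        CAT_NAME    TEXT NOT NULL,
        DB_NAME     TEXT NOT NULL,
        TABLE_NAME  TEXT NOT NULL,
        COLUMN_NAME TEXT NOT NULL,
        NUM_NULLS   INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO TAB_COL_STATS "
    "(CAT_NAME, DB_NAME, TABLE_NAME, COLUMN_NAME, NUM_NULLS) "
    "VALUES (?, ?, ?, ?, ?)",
    [("hive", "db%d" % (i % 10), "t%d" % (i % 50), "c%d" % i, 0)
     for i in range(1000)],
)

# Lookup shaped like the one the issue describes (illustrative only).
lookup = (
    "SELECT COLUMN_NAME, NUM_NULLS FROM TAB_COL_STATS "
    "WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ?"
)
params = ("hive", "db1", "t1")

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, lookup, params)

# Hypothetical index name; the column order puts the equality keys first.
conn.execute(
    "CREATE INDEX TAB_COL_STATS_IDX ON TAB_COL_STATS "
    "(CAT_NAME, DB_NAME, TABLE_NAME, COLUMN_NAME)"
)

# With the index the planner can search on the leading key columns.
plan_after = query_plan(conn, lookup, params)

print("before:", plan_before)
print("after: ", plan_after)
```

Because the three lookup columns form a prefix of the index, the same index can serve both the three-column lookup and a lookup that also filters on the column name. Other database engines may plan these queries differently, so the plans printed here only indicate the general behavior.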
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)