Posted to issues@hive.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2021/05/20 14:32:00 UTC
[jira] [Updated] (HIVE-24911) Metastore: Create index on SDS.CD_ID for Postgres
[ https://issues.apache.org/jira/browse/HIVE-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-24911:
--------------------------------
Fix Version/s: 4.0.0
> Metastore: Create index on SDS.CD_ID for Postgres
> -------------------------------------------------
>
> Key: HIVE-24911
> URL: https://issues.apache.org/jira/browse/HIVE-24911
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: command-output.txt
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> While investigating HIVE-24870, we found that during a long incremental replication, an index on SDS.CD_ID can improve performance.
> It was tested on Postgres as follows:
> {code}
> CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID");
> EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
> DROP INDEX IF EXISTS "SDS_N50";
> EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
> {code}
> Further results can be found in: [^command-output.txt]
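> The placeholder THE_MOST_FREQUENTLY_USED_CD_ID_HERE in the test above could be filled in with a query along these lines. This is only a sketch and not part of the original test run; it assumes nothing beyond the "SDS" table and "CD_ID" column already used above, and the alias sd_count is made up for illustration:
> {code}
> -- Find the CD_ID value referenced by the most rows in "SDS"
> -- (sd_count is a hypothetical alias, not a metastore column)
> SELECT "CD_ID", count(*) AS sd_count
> FROM "SDS"
> GROUP BY "CD_ID"
> ORDER BY sd_count DESC
> LIMIT 1;
> {code}
> Using the most frequent value presumably makes the comparison a worst case for the lookup, since it matches the largest number of rows.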
> After some investigation, I found that this index has been part of the schemas for a very long time:
> oracle: HIVE-2928
> mysql: HIVE-2246
> mssql: HIVE-6862 (or earlier)
> ...except Postgres.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)