You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2021/05/20 14:32:00 UTC

[jira] [Updated] (HIVE-24911) Metastore: Create index on SDS.CD_ID for Postgres

     [ https://issues.apache.org/jira/browse/HIVE-24911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated HIVE-24911:
--------------------------------
    Fix Version/s: 4.0.0

> Metastore: Create index on SDS.CD_ID for Postgres
> -------------------------------------------------
>
>                 Key: HIVE-24911
>                 URL: https://issues.apache.org/jira/browse/HIVE-24911
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: command-output.txt
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> While investigating HIVE-24870, we found that during a long incremental replication, an SDS.CD_ID can improve the performance.
> It was tested by postgres like below:
> {code}
> CREATE INDEX IF NOT EXISTS "SDS_N50" ON "SDS" USING btree ("CD_ID");
> EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
> DROP INDEX IF EXISTS "SDS_N50";
> EXPLAIN (ANALYZE,BUFFERS,TIMING) select count(*) from "SDS" where "CD_ID"=THE_MOST_FREQUENTLY_USED_CD_ID_HERE;
> {code}
> Further results can be found in:  [^command-output.txt] 
> After some investigation, I found that this index is also part of the schemas for a very long time:
> orcale: HIVE-2928
> mysql: HIVE-2246
> mssql: HIVE-6862 (or earlier)
> ...except Postgres.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)