You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Stamatis Zampetakis (Jira)" <ji...@apache.org> on 2022/07/13 18:03:00 UTC

[jira] [Assigned] (HIVE-26389) ALTER TABLE CASCADE is slow for tables with many partitions

     [ https://issues.apache.org/jira/browse/HIVE-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stamatis Zampetakis reassigned HIVE-26389:
------------------------------------------


> ALTER TABLE CASCADE is slow for tables with many partitions
> -----------------------------------------------------------
>
>                 Key: HIVE-26389
>                 URL: https://issues.apache.org/jira/browse/HIVE-26389
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore, Query Planning
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>         Attachments: native_sql_queries.txt, per_partition_sql_queries.txt
>
>
> Consider the following simplified scenario with a table having two partitions.
> {code:sql}
> CREATE TABLE student (fname string, lname string) PARTITIONED BY (department string);
> INSERT INTO student VALUES ('Alex','Dumas', 'Computer Science');
> INSERT INTO student VALUES ('Victor','Hugo', 'Physics');
> {code}
> Altering a column of this table and propagating the changes to the partitions (using the CASCADE) syntax is slow.
> {code:sql}
> ALTER TABLE student CHANGE lname lastname STRING CASCADE;
> {code}
> The seemingly simple ALTER statement outlined above triggers roughly 136 SQL queries in the underlying DBMS of the metastore (see native_sql_queries.txt).
> We can observe that some of these queries are recurring and appear as many times as there are partitions in the table (see per_partition_sql_queries.txt).
> As the number of partitions grows so does the number of queries so if we manage to reduce the number of queries send per partition or make them more efficient this will have a positive impact on performance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)