You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/06/04 22:46:00 UTC
[jira] [Created] (HIVE-19790) Metastore upgrade: 3.1.0 upgrade
script is slow and non-idempotent
Gopal V created HIVE-19790:
------------------------------
Summary: Metastore upgrade: 3.1.0 upgrade script is slow and non-idempotent
Key: HIVE-19790
URL: https://issues.apache.org/jira/browse/HIVE-19790
Project: Hive
Issue Type: Bug
Components: Standalone Metastore
Reporter: Gopal V
Because of the giant bit-vectors stored on mysql, the update of PART_COL_STATS is very slow and also is not idempotent.
{code}
--------------
UPDATE `PART_COL_STATS`
SET `CAT_NAME` = 'hive'
--------------
Query OK, 0 rows affected (4 min 1.57 sec)
Rows matched: 778025 Changed: 0 Warnings: 0
{code}
Adding a filter speeds it up because it will no longer overwrite
{code}
mysql> explain UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' where `CAT_NAME` ='';
--------------
explain UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' where `CAT_NAME` =''
--------------
+----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+
| 1 | SIMPLE | PART_COL_STATS | range | PCS_STATS_IDX | PCS_STATS_IDX | 258 | const | 1 | Using where; Using temporary |
+----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+
1 row in set (0.00 sec)
{code}
this would be much faster to re-run and would not accidentally overwrite any existing CAT_NAMEs.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)