You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Phabricator (JIRA)" <ji...@apache.org> on 2012/05/04 19:42:50 UTC
[jira] [Commented] (HIVE-2529) metastore 0.8 upgrade script for
PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268567#comment-13268567 ]
Phabricator commented on HIVE-2529:
-----------------------------------
zhenxiao has commented on the revision "HIVE-2529 [jira] metastore 0.8 upgrade script for PostgreSQL".
Manually tested Hive Metastore upgrade from 0.7 to 0.8, and from 0.8 to 0.9
Before the upgrade:
[~/Notes/metastore/postgres]hive
Hive history file=/tmp/cloudera/hive_job_log_cloudera_201205041018_1363096543.txt
hive> show tables;
OK
Time taken: 2.913 seconds
hive> create table test3(a string) partitioned by (b int);
OK
Time taken: 0.213 seconds
hive> create table test(c array<int>, b int);
OK
Time taken: 0.176 seconds
hive> create table test2(key string, value string) partitioned by (ds string) stored as textfile;
OK
Time taken: 0.088 seconds
hive> load data local inpath '/home/cloudera/Code/hive/data/files/kv1.txt' into table test2 partition (ds='2012-02-09');
Copying data from file:/home/cloudera/Code/hive/data/files/kv1.txt
Copying file: file:/home/cloudera/Code/hive/data/files/kv1.txt
Loading data to table default.test2 partition (ds=2012-02-09)
OK
Time taken: 0.658 seconds
hive> create table src(key string);
OK
Time taken: 0.075 seconds
hive> create index src_index_8 on table src(key) as 'compact' WITH DEFERRED REBUILD IDXPROPERTIES ("prop1"="val1", "prop2"="val2");
OK
Time taken: 0.23 seconds
hive> alter index src_index_8 on src set IDXPROPERTIES ("prop1"="val1_new", "prop3"="val3");
OK
Time taken: 0.078 seconds
hive> show tables;
OK
default__src_src_index_8__
src
test
test2
test3
Time taken: 0.142 seconds
In postgreSQL metastore:
metastore=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------------+-------+----------
public | BUCKETING_COLS | table | cloudera
public | COLUMNS | table | cloudera
public | DATABASE_PARAMS | table | cloudera
public | DBS | table | cloudera
public | DB_PRIVS | table | cloudera
public | GLOBAL_PRIVS | table | cloudera
public | IDXS | table | cloudera
public | INDEX_PARAMS | table | cloudera
public | NUCLEUS_TABLES | table | cloudera
public | PARTITIONS | table | cloudera
public | PARTITION_KEYS | table | cloudera
public | PARTITION_KEY_VALS | table | cloudera
public | PARTITION_PARAMS | table | cloudera
public | PART_COL_PRIVS | table | cloudera
public | PART_PRIVS | table | cloudera
public | ROLES | table | cloudera
public | ROLE_MAP | table | cloudera
public | SDS | table | cloudera
public | SD_PARAMS | table | cloudera
public | SEQUENCE_TABLE | table | cloudera
public | SERDES | table | cloudera
public | SERDE_PARAMS | table | cloudera
public | SORT_COLS | table | cloudera
public | TABLE_PARAMS | table | cloudera
public | TBLS | table | cloudera
public | TBL_COL_PRIVS | table | cloudera
public | TBL_PRIVS | table | cloudera
public | TYPES | table | cloudera
public | TYPE_FIELDS | table | cloudera
(29 rows)
metastore=# select * from "TBLS";
TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIG
INAL_TEXT
--------+-------------+-------+------------------+----------+-----------+-------+----------------------------+---------------+--------------------+----------
----------
1 | 1336151986 | 1 | 0 | cloudera | 0 | 1 | test3 | MANAGED_TABLE | |
2 | 1336152012 | 1 | 0 | cloudera | 0 | 2 | test | MANAGED_TABLE | |
3 | 1336152016 | 1 | 0 | cloudera | 0 | 3 | test2 | MANAGED_TABLE | |
4 | 1336152032 | 1 | 0 | cloudera | 0 | 5 | src | MANAGED_TABLE | |
5 | 1336152039 | 1 | 0 | | 0 | 6 | default__src_src_index_8__ | INDEX_TABLE | |
(5 rows)
metastore=# select * from "COLUMNS";
SD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX
-------+---------+-------------+---------------+-------------
1 | | a | string | 0
2 | | c | array<int> | 0
2 | | b | int | 1
3 | | key | string | 0
3 | | value | string | 1
4 | | key | string | 0
4 | | value | string | 1
5 | | key | string | 0
6 | | key | string | 0
6 | | _bucketname | string | 1
6 | | _offsets | array<bigint> | 2
7 | | key | string | 0
(12 rows)
metastore=# select * from "PARTITIONS";
PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID
---------+-------------+------------------+---------------+-------+--------
1 | 1336152022 | 0 | ds=2012-02-09 | 4 | 3
(1 row)
metastore=# select * from "IDXS";
INDEX_ID | CREATE_TIME | DEFERRED_REBUILD | INDEX_HANDLER_CLASS | INDEX_NAME | INDEX_TBL_ID | LAST_ACCESS_TIME | OR
IG_TBL_ID | SD_ID
----------+-------------+------------------+-------------------------------------------------------------+-------------+--------------+------------------+---
----------+-------
1 | 1336152039 | t | org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler | src_index_8 | 5 | 1336152039 |
4 | 7
(1 row)
metastore=# select * from "SDS";
SD_ID | INPUT_FORMAT | IS_COMPRESSED | LOCATION | NUM_BUCKETS |
OUTPUT_FORMAT | SERDE_ID
-------+------------------------------------------+---------------+-----------------------------------------------------------------+-------------+----------
--------------------------------------------------+----------
1 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test3 | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 1
2 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 2
3 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test2 | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 3
4 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test2/ds=2012-02-09 | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 4
5 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/src | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 5
6 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/default__src_src_index_8__ | -1 | org.apach
e.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat | 6
7 | org.apache.hadoop.mapred.TextInputFormat | f | | -1 | org.apach
e.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat | 7
(7 rows)
metastore=# select * from "PARTITION_KEYS";
TBL_ID | PKEY_COMMENT | PKEY_NAME | PKEY_TYPE | INTEGER_IDX
--------+--------------+-----------+-----------+-------------
1 | | b | int | 0
3 | | ds | string | 0
(2 rows)
metastore=# select * from "SERDES";
SERDE_ID | NAME | SLIB
----------+------+----------------------------------------------------
1 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
2 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
3 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
4 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
5 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
6 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
7 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
(7 rows)
metastore=# select * from "BUCKETING_COLS";
SD_ID | BUCKET_COL_NAME | INTEGER_IDX
-------+-----------------+-------------
(0 rows)
Upgrade from hive0.7 to hive0.8:
[~/Code/hive/metastore/scripts/upgrade/postgres]psql -f upgrade-0.7.0-to-0.8.0.postgres.sql metastore
?column?
------------------------------------------------
Upgrading MetaStore schema from 0.7.0 to 0.8.0
(1 row)
SET
?column?
---------------------------------------------------------------------------------
< HIVE-2246: Dedupe tables column schemas from partitions in the metastore db >
(1 row)
CREATE TABLE
CREATE TABLE
ALTER TABLE
?column?
----------------
Tables Created
(1 row)
ALTER TABLE
ALTER TABLE
?column?
-------------------
SDS Table Updated
(1 row)
INSERT 0 5
?column?
---------------
CDS populated
(1 row)
UPDATE 5
INSERT 0 9
?column?
-----------------
Tables Migrated
(1 row)
CREATE TABLE
INSERT 0 1
UPDATE 1
?column?
---------------------
Partitions Migrated
(1 row)
INSERT 0 1
UPDATE 1
INSERT 0 1
?column?
------------------
Indexes Migrated
(1 row)
ALTER TABLE
?column?
-----------------------
Columns Table Renamed
(1 row)
?column?
-------------------------------------------------------------------------
< HIVE-2215 Add api for marking querying set of partitions for events >
(1 row)
CREATE TABLE
CREATE INDEX
?column?
---------------------------------------------------------
Finished upgrading MetaStore schema from 0.7.0 to 0.8.0
(1 row)
The postgreSQL metastore after upgrading from 0.7 to 0.8:
metastore=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------------+-------+----------
public | BUCKETING_COLS | table | cloudera
public | CDS | table | cloudera
public | COLUMNS_OLD | table | cloudera
public | COLUMNS_V2 | table | cloudera
public | DATABASE_PARAMS | table | cloudera
public | DBS | table | cloudera
public | DB_PRIVS | table | cloudera
public | GLOBAL_PRIVS | table | cloudera
public | IDXS | table | cloudera
public | INDEX_PARAMS | table | cloudera
public | NUCLEUS_TABLES | table | cloudera
public | PARTITIONS | table | cloudera
public | PARTITION_EVENTS | table | cloudera
public | PARTITION_KEYS | table | cloudera
public | PARTITION_KEY_VALS | table | cloudera
public | PARTITION_PARAMS | table | cloudera
public | PART_COL_PRIVS | table | cloudera
public | PART_PRIVS | table | cloudera
public | ROLES | table | cloudera
public | ROLE_MAP | table | cloudera
public | SDS | table | cloudera
public | SD_PARAMS | table | cloudera
public | SEQUENCE_TABLE | table | cloudera
public | SERDES | table | cloudera
public | SERDE_PARAMS | table | cloudera
public | SORT_COLS | table | cloudera
public | TABLE_PARAMS | table | cloudera
public | TBLS | table | cloudera
public | TBL_COL_PRIVS | table | cloudera
public | TBL_PRIVS | table | cloudera
public | TYPES | table | cloudera
public | TYPE_FIELDS | table | cloudera
(32 rows)
metastore=# select * from "TBLS";
TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIG
INAL_TEXT
--------+-------------+-------+------------------+----------+-----------+-------+----------------------------+---------------+--------------------+----------
----------
1 | 1336151986 | 1 | 0 | cloudera | 0 | 1 | test3 | MANAGED_TABLE | |
2 | 1336152012 | 1 | 0 | cloudera | 0 | 2 | test | MANAGED_TABLE | |
3 | 1336152016 | 1 | 0 | cloudera | 0 | 3 | test2 | MANAGED_TABLE | |
4 | 1336152032 | 1 | 0 | cloudera | 0 | 5 | src | MANAGED_TABLE | |
5 | 1336152039 | 1 | 0 | | 0 | 6 | default__src_src_index_8__ | INDEX_TABLE | |
(5 rows)
metastore=# select * from "COLUMNS_V2";
CD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX
-------+---------+-------------+---------------+-------------
1 | | a | string | 0
2 | | c | array<int> | 0
2 | | b | int | 1
3 | | key | string | 0
3 | | value | string | 1
5 | | key | string | 0
6 | | key | string | 0
6 | | _bucketname | string | 1
6 | | _offsets | array<bigint> | 2
7 | | key | string | 0
(10 rows)
metastore=# select * from "COLUMNS_OLD";
SD_ID | COMMENT | COLUMN_NAME | TYPE_NAME | INTEGER_IDX
-------+---------+-------------+---------------+-------------
1 | | a | string | 0
2 | | c | array<int> | 0
2 | | b | int | 1
3 | | key | string | 0
3 | | value | string | 1
4 | | key | string | 0
4 | | value | string | 1
5 | | key | string | 0
6 | | key | string | 0
6 | | _bucketname | string | 1
6 | | _offsets | array<bigint> | 2
7 | | key | string | 0
(12 rows)
metastore=# select * from "PARTITIONS";
PART_ID | CREATE_TIME | LAST_ACCESS_TIME | PART_NAME | SD_ID | TBL_ID
---------+-------------+------------------+---------------+-------+--------
1 | 1336152022 | 0 | ds=2012-02-09 | 4 | 3
(1 row)
metastore=# select * from "IDXS";
INDEX_ID | CREATE_TIME | DEFERRED_REBUILD | INDEX_HANDLER_CLASS | INDEX_NAME | INDEX_TBL_ID | LAST_ACCESS_TIME | OR
IG_TBL_ID | SD_ID
----------+-------------+------------------+-------------------------------------------------------------+-------------+--------------+------------------+---
----------+-------
1 | 1336152039 | t | org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler | src_index_8 | 5 | 1336152039 |
4 | 7
(1 row)
metastore=# select * from "SDS";
SD_ID | INPUT_FORMAT | IS_COMPRESSED | LOCATION | NUM_BUCKETS |
OUTPUT_FORMAT | SERDE_ID | CD_ID
-------+------------------------------------------+---------------+-----------------------------------------------------------------+-------------+----------
--------------------------------------------------+----------+-------
1 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test3 | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 1 | 1
2 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 2 | 2
3 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test2 | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 3 | 3
5 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/src | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 5 | 5
6 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/default__src_src_index_8__ | -1 | org.apach
e.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat | 6 | 6
4 | org.apache.hadoop.mapred.TextInputFormat | f | hdfs://localhost/user/hive/warehouse/test2/ds=2012-02-09 | -1 | org.apach
e.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 4 | 3
7 | org.apache.hadoop.mapred.TextInputFormat | f | | -1 | org.apach
e.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat | 7 | 7
(7 rows)
metastore=# select * from "CDS";
CD_ID
-------
1
2
3
5
6
7
(6 rows)
metastore=# select * from "PARTITION_KEYS";
TBL_ID | PKEY_COMMENT | PKEY_NAME | PKEY_TYPE | INTEGER_IDX
--------+--------------+-----------+-----------+-------------
1 | | b | int | 0
3 | | ds | string | 0
(2 rows)
metastore=# select * from "SERDES";
SERDE_ID | NAME | SLIB
----------+------+----------------------------------------------------
1 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
2 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
3 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
4 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
5 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
6 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
7 | | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
(7 rows)
Upgrading to 0.9 is similar, and there is no schema object change:
[~/Code/hive/metastore/scripts/upgrade/postgres]psql -f upgrade-0.8.0-to-0.9.0.postgres.sql metastore
?column?
------------------------------------------------
Upgrading MetaStore schema from 0.8.0 to 0.9.0
(1 row)
?column?
---------------------------------------------------------
Finished upgrading MetaStore schema from 0.8.0 to 0.9.0
(1 row)
REVISION DETAIL
https://reviews.facebook.net/D3027
> metastore 0.8 upgrade script for PostgreSQL
> --------------------------------------------
>
> Key: HIVE-2529
> URL: https://issues.apache.org/jira/browse/HIVE-2529
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 0.8.0
> Reporter: John Sichi
> Assignee: Zhenxiao Luo
> Priority: Blocker
> Attachments: HIVE-2529.1.patch.txt, HIVE-2529.D3027.1.patch
>
>
> I think you mentioned that this was in the works.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira