Posted to issues@hive.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2017/11/16 15:49:00 UTC
[jira] [Commented] (HIVE-18083) Support UTF8 in MySQL Metastore Backend
[ https://issues.apache.org/jira/browse/HIVE-18083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255504#comment-16255504 ]
Alan Gates commented on HIVE-18083:
-----------------------------------
Conceptually I like this, but I'm worried about compatibility and testing. In particular, what do we do for the many users who already have a metastore with tables set to latin1? We need to make sure that the new tables they create (e.g. the workload manager tables in Hive 3) match their existing tables. I also suspect this will create a whole different set of bugs around table names, column names, etc. We will need a plan to test it very thoroughly.
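As a rough sketch of how an existing deployment could be audited before any such change, the query below lists the character set and collation of every string column in the metastore database. The schema name {{metastore}} is an assumption made for illustration; substitute the actual database name.
{code:sql}
-- Assumption: the Hive metastore lives in a MySQL schema named 'metastore'.
-- CHARACTER_SET_NAME is NULL for non-string columns, so those are skipped.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'metastore'
  AND CHARACTER_SET_NAME IS NOT NULL
ORDER BY TABLE_NAME, ORDINAL_POSITION;
{code}
Output like this would show whether an installation is uniformly latin1, which is the case any newly created tables would have to match.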
> Support UTF8 in MySQL Metastore Backend
> ---------------------------------------
>
> Key: HIVE-18083
> URL: https://issues.apache.org/jira/browse/HIVE-18083
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Standalone Metastore
> Affects Versions: 3.0.0, 2.4.0
> Reporter: BELUGA BEHR
>
> {code:sql|title=hive-schema-2.2.0.mysql.sql}
> CREATE TABLE IF NOT EXISTS `COLUMNS_V2` (
> `CD_ID` bigint(20) NOT NULL,
> `COMMENT` varchar(256) CHARACTER SET latin1 COLLATE latin1_bin DEFAULT NULL,
> `COLUMN_NAME` varchar(767) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL,
> `TYPE_NAME` varchar(4000) DEFAULT NULL,
> `INTEGER_IDX` int(11) NOT NULL,
> PRIMARY KEY (`CD_ID`,`COLUMN_NAME`),
> KEY `COLUMNS_V2_N49` (`CD_ID`),
> CONSTRAINT `COLUMNS_V2_FK1` FOREIGN KEY (`CD_ID`) REFERENCES `CDS` (`CD_ID`)
> ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
> {code}
> Hive explicitly defines {{CHARACTER SET latin1 COLLATE latin1_bin}} in the schema design. This explicit definition should either be removed, so that columns fall back on the database administrator's defaults, or changed to {{CHARACTER SET utf8 COLLATE utf8_bin}}.
> This will allow Hive to support UTF8 characters in MySQL backend databases for our international friends.
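> As a hedged sketch of the second option applied to an existing table, the statement below converts only the {{COMMENT}} column. It is an illustration, not the proposed patch. {{COLUMN_NAME}} is deliberately left alone because it is part of the primary key, and a 767-character utf8 column can exceed InnoDB's index key length limit under some row formats. For the first option, the table-level {{DEFAULT CHARSET=latin1}} would also have to be dropped, since a column without an explicit character set inherits the table default.
> {code:sql}
> -- Illustration only: switch one non-indexed column from latin1 to utf8.
> -- The column length and nullability are copied from the schema above.
> ALTER TABLE `COLUMNS_V2`
>   MODIFY `COMMENT` varchar(256) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;
> {code}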