Posted to issues@hive.apache.org by "Nikhil Gupta (Jira)" <ji...@apache.org> on 2021/01/12 05:34:00 UTC

[jira] [Assigned] (HIVE-24621) TEXT and varchar datatype does not support unicode encoding in MSSQL

     [ https://issues.apache.org/jira/browse/HIVE-24621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikhil Gupta reassigned HIVE-24621:
-----------------------------------


> TEXT and varchar datatype does not support unicode encoding in MSSQL
> --------------------------------------------------------------------
>
>                 Key: HIVE-24621
>                 URL: https://issues.apache.org/jira/browse/HIVE-24621
>             Project: Hive
>          Issue Type: Bug
>          Components: Standalone Metastore
>    Affects Versions: 4.0.0
>            Reporter: Nikhil Gupta
>            Assignee: Nikhil Gupta
>            Priority: Critical
>
> Why is Unicode required?
> In the following example, the Chinese characters in the view definition are not stored correctly and come back as '??'.
> {noformat}
> CREATE VIEW `test_view` AS select `test_tbl_char`.`col1` from `test_db5`.`test_tbl_char` where `test_tbl_char`.`col1`='你好'; 
> show create table test_view;
> +----------------------------------------------------+
> |                   createtab_stmt                   |
> +----------------------------------------------------+
> | CREATE VIEW `test_view` AS select `test_tbl_char`.`col1` from `test_db5`.`test_tbl_char` where `test_tbl_char`.`col1`='??' |
> +----------------------------------------------------+ {noformat}
>  
> This issue arises because TBLS is defined as follows:
>  
> CREATE TABLE TBLS
> (
>  TBL_ID bigint NOT NULL,
>  CREATE_TIME int NOT NULL,
>  DB_ID bigint NULL,
>  LAST_ACCESS_TIME int NOT NULL,
>  OWNER nvarchar(767) NULL,
>  OWNER_TYPE nvarchar(10) NULL,
>  RETENTION int NOT NULL,
>  SD_ID bigint NULL,
>  TBL_NAME nvarchar(256) NULL,
>  TBL_TYPE nvarchar(128) NULL,
>  VIEW_EXPANDED_TEXT text NULL,
>  VIEW_ORIGINAL_TEXT text NULL,
>  IS_REWRITE_ENABLED bit NOT NULL DEFAULT 0,
>  WRITE_ID bigint NOT NULL DEFAULT 0
> );
> The text data type does not support Unicode encoding, irrespective of collation.
> The varchar data type does not support Unicode encoding prior to SQL Server 2019. From SQL Server 2019 onward, a UTF-8-enabled collation must also be defined before Unicode characters can be stored.
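
The '??' in the stored view text can be reproduced outside the database. The sketch below is only a rough model, not the metastore's or SQL Server's actual code path: it assumes the non-Unicode column uses a single-byte code page (cp1252 is picked here as an assumption, since the issue does not state the collation), and the helper name store_as_non_unicode is hypothetical. Characters with no mapping in the code page are replaced with '?', which matches the output shown in the description.

```python
def store_as_non_unicode(value: str, code_page: str = "cp1252") -> str:
    """Round-trip a string through a code page, as a rough model of
    writing to and reading from a non-Unicode column.

    The default code page (cp1252) is an assumption for illustration.
    """
    # Characters that have no mapping in the code page become '?'.
    stored_bytes = value.encode(code_page, errors="replace")
    return stored_bytes.decode(code_page)


view_predicate = "where col1='你好'"

# Single-byte code page: the two Chinese characters are lost.
print(store_as_non_unicode(view_predicate))  # where col1='??'

# A Unicode-capable encoding preserves the text unchanged.
print(store_as_non_unicode(view_predicate, code_page="utf-8") == view_predicate)  # True
```

The second call illustrates why a Unicode-capable column type, or a UTF-8-enabled collation on SQL Server 2019 and later, avoids the loss.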



--
This message was sent by Atlassian Jira
(v8.3.4#803005)