Posted to dev@hive.apache.org by "Lefty Leverenz (JIRA)" <ji...@apache.org> on 2014/11/11 03:46:34 UTC
[jira] [Commented] (HIVE-7142) Hive multi serialization encoding support
[ https://issues.apache.org/jira/browse/HIVE-7142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205856#comment-14205856 ]
Lefty Leverenz commented on HIVE-7142:
--------------------------------------
[~chengxiang li], did you document this in the wiki yet? If so, we can remove the TODOC14 label.
If not, suggested doc locations are listed in a [previous comment| https://issues.apache.org/jira/browse/HIVE-7142?focusedCommentId=14096642&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14096642].
> Hive multi serialization encoding support
> -----------------------------------------
>
> Key: HIVE-7142
> URL: https://issues.apache.org/jira/browse/HIVE-7142
> Project: Hive
> Issue Type: Improvement
> Components: Serializers/Deserializers
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7142.1.patch.txt, HIVE-7142.2.patch, HIVE-7142.3.patch, HIVE-7142.4.patch
>
>
> Currently Hive can only serialize data to UTF-8 bytes and deserialize data from UTF-8 bytes, but real-world users may want to load data in other encodings into Hive directly. This JIRA adds support for serializing and deserializing data in other character encodings at the SerDe layer.
> Users only need to configure the serialization encoding at the table level by setting the "serialization.encoding" SerDe property, for example:
> {code:sql}
> CREATE TABLE person(id INT, name STRING, desc STRING)
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES ('serialization.encoding'='GBK');
> {code}
> or
> {code:sql}
> ALTER TABLE person SET SERDEPROPERTIES ('serialization.encoding'='GBK');
> {code}
> LIMITATIONS: In this patch, only LazySimpleSerDe supports the "serialization.encoding" property.
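For readers unfamiliar with what an encoding-aware SerDe has to do, the following is a minimal Java sketch of the byte-level conversion implied by the description above: bytes stored in the table's configured charset (GBK here) are decoded into a string and can be re-encoded as UTF-8. It is not taken from the HIVE-7142 patch; the class name EncodingSketch and its methods are hypothetical and exist only for illustration, and it assumes a JDK that ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the code from the HIVE-7142 patch.
final class EncodingSketch {

    // Decode bytes stored in the configured charset into a Java String.
    static String decode(byte[] stored, String encoding) {
        return new String(stored, Charset.forName(encoding));
    }

    // Encode a String into the configured charset, as a writer would on output.
    static byte[] encode(String value, String encoding) {
        return value.getBytes(Charset.forName(encoding));
    }

    // Transcode stored bytes from the configured charset to UTF-8.
    static byte[] toUtf8(byte[] stored, String encoding) {
        return decode(stored, encoding).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "\u5f20\u4e09"; // a two-character Chinese name
        byte[] gbk = encode(name, "GBK");
        byte[] utf8 = toUtf8(gbk, "GBK");
        System.out.println(gbk.length + " GBK bytes, " + utf8.length + " UTF-8 bytes");
        System.out.println("round trip ok: " + decode(gbk, "GBK").equals(name));
    }
}
```

The same text occupies a different number of bytes in each charset, which is why reading GBK data as if it were UTF-8 produces garbled strings rather than a simple offset error.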
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)