Posted to issues@hive.apache.org by "Niklaus Xiao (JIRA)" <ji...@apache.org> on 2016/09/30 10:44:20 UTC

[jira] [Commented] (HIVE-14867) "serialization.last.column.takes.rest" does not work for MultiDelimitSerDe

    [ https://issues.apache.org/jira/browse/HIVE-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535660#comment-15535660 ] 

Niklaus Xiao commented on HIVE-14867:
-------------------------------------

{{LazySimpleSerDe}} works as expected:
{code}
create table t__2(a string, b string) row format delimited fields terminated by ',' stored as textfile;
{code}

load data into table t__2:
{code}
1,Lily,HW,abc
2,Lucy,LX,asdf
3,Lilei,XX,ss
{code}

select from t__2:
{code}
select * from t__2;
+---------+---------+--+
| t__2.a  | t__2.b  |
+---------+---------+--+
| 1       | Lily    |
| 2       | Lucy    |
| 3       | Lilei   |
+---------+---------+--+
3 rows selected (0.382 seconds)
{code}
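
A minimal sketch of the behavior shown above, for comparison with the MultiDelimitSerDe result quoted below. This is not Hive's SerDe code: the class {{LastColumnSplitSketch}} and its {{split}} method are hypothetical names used only for illustration, and the row is treated as a plain string split on a literal delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of Hive.
final class LastColumnSplitSketch {

    /**
     * Splits a row on a literal delimiter into exactly columnCount fields.
     * When lastColumnTakesRest is false, fields beyond columnCount are dropped.
     * When it is true, the last column keeps the remainder, delimiters included.
     * Missing trailing fields are filled with null.
     */
    static List<String> split(String row, String delim, int columnCount,
                              boolean lastColumnTakesRest) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        while (fields.size() < columnCount) {
            boolean lastColumn = fields.size() == columnCount - 1;
            int next = row.indexOf(delim, start);
            if (next < 0 || (lastColumn && lastColumnTakesRest)) {
                // No more delimiters, or the last column absorbs the rest.
                fields.add(row.substring(start));
                break;
            }
            fields.add(row.substring(start, next));
            start = next + delim.length();
        }
        while (fields.size() < columnCount) {
            fields.add(null);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Flag off: extra fields are dropped, as in the t__2 output above.
        System.out.println(split("1,Lily,HW,abc", ",", 2, false));
        // Flag on: the last column keeps everything after the first delimiter.
        System.out.println(split("1|@|Lily|@|HW|@|abc", "|@|", 2, true));
    }
}
```

With the flag off, the first call prints {{[1, Lily]}}, which matches the LazySimpleSerDe result. With the flag on, the second call prints {{[1, Lily|@|HW|@|abc]}}.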


> "serialization.last.column.takes.rest" does not work for MultiDelimitSerDe
> --------------------------------------------------------------------------
>
>                 Key: HIVE-14867
>                 URL: https://issues.apache.org/jira/browse/HIVE-14867
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 1.3.0
>            Reporter: Niklaus Xiao
>            Assignee: Niklaus Xiao
>
> Create table with MultiDelimitSerDe:
> {code}
> CREATE TABLE foo (a string, b string) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="|@|","collection.delim"=":","mapkey.delim"="@") stored as textfile;
> {code}
> load data into table:
> {code}
> 1|@|Lily|@|HW|@|abc
> 2|@|Lucy|@|LX|@|123
> 3|@|Lilei|@|XX|@|3434
> {code}
> select data from this table:
> {code}
> select * from foo;
> +--------+------------------+--+
> | foo.a  | foo.b            |
> +--------+------------------+--+
> | 1      | Lily^AHW^Aabc    |
> | 2      | Lucy^ALX^A123    |
> | 3      | Lilei^AXX^A3434  |
> +--------+------------------+--+
> 3 rows selected (0.905 seconds)
> {code}
> As you can see, the last column takes all of the remaining data, and the delimiter is replaced with the default ^A.
> lastColumnTakesRest should be false by default:
> {code}
>     String lastColumnTakesRestString = tbl
>         .getProperty(serdeConstants.SERIALIZATION_LAST_COLUMN_TAKES_REST);
>     lastColumnTakesRest = (lastColumnTakesRestString != null && lastColumnTakesRestString
>         .equalsIgnoreCase("true"));
> {code}
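
The default implied by the snippet above can be checked in isolation. The following is a sketch only, assuming the table properties arrive as a plain {{java.util.Properties}} and using the property name from the issue title as a string literal; the class {{LastColumnTakesRestFlagSketch}} is a hypothetical name and not part of Hive.

```java
import java.util.Properties;

// Illustrative only: mirrors the null-safe parse quoted above without Hive on the classpath.
final class LastColumnTakesRestFlagSketch {

    // Property name taken from the issue title.
    static final String KEY = "serialization.last.column.takes.rest";

    /** Returns true only when the property is set to "true" (case-insensitive). */
    static boolean lastColumnTakesRest(Properties tbl) {
        String value = tbl.getProperty(KEY);
        return value != null && value.equalsIgnoreCase("true");
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        // Property unset: the flag is false.
        System.out.println(lastColumnTakesRest(tbl));
        // Property set explicitly: the flag is true regardless of case.
        tbl.setProperty(KEY, "TRUE");
        System.out.println(lastColumnTakesRest(tbl));
    }
}
```

Under these assumptions, a table created without the property, like {{foo}} above, would get {{false}} from this parse.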


