Posted to issues@hive.apache.org by "Eric Lin (JIRA)" <ji...@apache.org> on 2015/11/24 07:05:11 UTC

[jira] [Updated] (HIVE-12506) SHOW CREATE TABLE command creates a table that does not work for RCFile format

     [ https://issues.apache.org/jira/browse/HIVE-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Lin updated HIVE-12506:
----------------------------
    Description: 
See the following test case:

1) Create a table with RCFile format:

{code}
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' 
STORED AS RCFILE;
{code}

2) Run "DESC FORMATTED test":

{code}
# Storage Information
SerDe Library:      	org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe
InputFormat:        	org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat:       	org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}

This shows that the SerDe used is "ColumnarSerDe".

3) Run "SHOW CREATE TABLE" and get the output:

{code}
CREATE TABLE `test`(
  `a` int)
PARTITIONED BY (
  `p` int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  'hdfs://node5.lab.cloudera.com:8020/user/hive/warehouse/case_78732.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1448343875')
{code}

Note that there is no mention of "ColumnarSerDe".

4) Drop the table, then create it again using the output from 3).

5) Check the output of "DESC FORMATTED test":

{code}
# Storage Information
SerDe Library:      	org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:        	org.apache.hadoop.hive.ql.io.RCFileInputFormat
OutputFormat:       	org.apache.hadoop.hive.ql.io.RCFileOutputFormat
{code}

The SerDe falls back to "LazySimpleSerDe", which is not correct.

Any subsequent query that tries to INSERT into or SELECT from this table fails with errors.

I suspect that ROW FORMAT DELIMITED and ROW FORMAT SERDE cannot be specified at the same time at table creation. This confuses end users, because copying a table structure using "SHOW CREATE TABLE" does not work.
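
A possible workaround, given only as an untested sketch: name the SerDe explicitly with ROW FORMAT SERDE and pass the delimiter as a SerDe property instead of using ROW FORMAT DELIMITED. The 'field.delim' and 'serialization.format' property keys are assumptions about how the '|' delimiter is stored for this table; the class names are the ones reported by "DESC FORMATTED test" in step 2.

{code}
-- Sketch only: recreate the table with the SerDe stated explicitly.
DROP TABLE IF EXISTS test;
CREATE TABLE test (a int) PARTITIONED BY (p int)
-- SerDe class taken from the DESC FORMATTED output in step 2
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
-- Assumed property keys standing in for FIELDS TERMINATED BY '|'
WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.RCFileOutputFormat';
{code}

If this works as expected, "DESC FORMATTED test" should again report ColumnarSerDe as the SerDe Library. For a table that has already been recreated from the SHOW CREATE TABLE output, ALTER TABLE test SET SERDE with the same class name may be an alternative. Neither approach has been verified against 1.1.1.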




> SHOW CREATE TABLE command creates a table that does not work for RCFile format
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-12506
>                 URL: https://issues.apache.org/jira/browse/HIVE-12506
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 1.1.1
>            Reporter: Eric Lin


