Posted to dev@hive.apache.org by "Yin Huai (JIRA)" <ji...@apache.org> on 2013/12/20 20:16:12 UTC
[jira] [Commented] (HIVE-6083) User provided table properties are
not assigned to the TableDesc of the FileSinkDesc in a CTAS query
[ https://issues.apache.org/jira/browse/HIVE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13854449#comment-13854449 ]
Yin Huai commented on HIVE-6083:
--------------------------------
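With the .1 patch (HIVE-6083.1.patch.txt) applied, the orc.compress value given in tblproperties is honored by the writer: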
* Snappy compression
{code}
create table web_sales_correct_orc_snappy
stored as orc tblproperties ("orc.compress"="SNAPPY")
as select * from web_sales;
{code}
{code}
describe formatted web_sales_correct_orc_snappy;
....
Location: hdfs://localhost:54310/user/hive/warehouse/web_sales_correct_orc_snappy
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 719384
orc.compress SNAPPY
rawDataSize 97815412
totalSize 51042245
transient_lastDdlTime 1387566737
....
{code}
{code}
bin/hive --orcfiledump /user/hive/warehouse/web_sales_correct_orc_snappy/000000_0
Rows: 719384
Compression: SNAPPY
Compression size: 262144
...
{code}
* No compression
{code}
create table web_sales_correct_orc_none
stored as orc tblproperties ("orc.compress"="NONE")
as select * from web_sales;
{code}
{code}
describe formatted web_sales_correct_orc_none;
....
Location: hdfs://localhost:54310/user/hive/warehouse/web_sales_correct_orc_none
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 719384
orc.compress NONE
rawDataSize 97815412
totalSize 53968823
transient_lastDdlTime 1387566788
....
{code}
{code}
bin/hive --orcfiledump /user/hive/warehouse/web_sales_correct_orc_none/000000_0
Rows: 719384
Compression: NONE
...
{code}
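The manual check above (compare the orc.compress value reported by describe formatted with the Compression line printed by --orcfiledump) could be automated roughly as follows. This is only a sketch: the class and method names are hypothetical, it is not part of Hive, and it assumes the two outputs have already been captured as plain strings in the format shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: compares the codec declared in the
// table parameters with the codec actually reported for the ORC file.
class CompressionCheckSketch {

    // Matches a "orc.compress <VALUE>" line in `describe formatted` output.
    private static final Pattern DECLARED =
            Pattern.compile("(?m)^[ \\t]*orc\\.compress[ \\t]+(\\S+)[ \\t]*$");

    // Matches "Compression: <VALUE>" in --orcfiledump output. The colon must
    // follow "Compression" directly, so "Compression size: ..." is skipped.
    private static final Pattern ACTUAL =
            Pattern.compile("(?m)^[ \\t]*Compression:[ \\t]*(\\S+)[ \\t]*$");

    // Returns the first captured value, or null if the pattern never matches.
    static String firstGroup(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // True only when both outputs name a codec and the names agree.
    static boolean codecMatches(String describeOutput, String dumpOutput) {
        String declared = firstGroup(DECLARED, describeOutput);
        String actual = firstGroup(ACTUAL, dumpOutput);
        return declared != null && declared.equalsIgnoreCase(actual);
    }

    public static void main(String[] args) {
        String describe = "numRows 719384\norc.compress SNAPPY\nrawDataSize 97815412\n";
        String patchedDump = "Rows: 719384\nCompression: SNAPPY\nCompression size: 262144\n";
        String unpatchedDump = "Rows: 719384\nCompression: ZLIB\nCompression size: 262144\n";

        System.out.println("patched matches:   " + codecMatches(describe, patchedDump));
        System.out.println("unpatched matches: " + codecMatches(describe, unpatchedDump));
    }
}
```

The sample strings in main are trimmed copies of the outputs in this thread; a mismatch corresponds to the behavior reported in the issue description.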
> User provided table properties are not assigned to the TableDesc of the FileSinkDesc in a CTAS query
> ----------------------------------------------------------------------------------------------------
>
> Key: HIVE-6083
> URL: https://issues.apache.org/jira/browse/HIVE-6083
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Yin Huai
> Assignee: Yin Huai
> Attachments: HIVE-6083.1.patch.txt
>
>
> I was trying to use a CTAS query to create a table stored as ORC with orc.compress set to SNAPPY. However, the data was still compressed with ZLIB (although the output of DESCRIBE still shows orc.compress as SNAPPY). For a CTAS query, SemanticAnalyzer.genFileSinkPlan uses the CreateTableDesc to generate the TableDesc for the FileSinkDesc by calling PlanUtils.getTableDesc. However, PlanUtils.getTableDesc does not appear to assign the user-provided table properties to the returned TableDesc (CreateTableDesc.getTblProps is not called in this method).
> btw, I only checked the code of 0.12 and trunk.
> Two examples:
> * Snappy compression
> {code}
> create table web_sales_wrong_orc_snappy
> stored as orc tblproperties ("orc.compress"="SNAPPY")
> as select * from web_sales;
> {code}
> {code}
> describe formatted web_sales_wrong_orc_snappy;
> ....
> Location: hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_snappy
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE true
> numFiles 1
> numRows 719384
> orc.compress SNAPPY
> rawDataSize 97815412
> totalSize 40625243
> transient_lastDdlTime 1387566015
> ....
> {code}
> {code}
> bin/hive --orcfiledump /user/hive/warehouse/web_sales_wrong_orc_snappy/000000_0
> Rows: 719384
> Compression: ZLIB
> Compression size: 262144
> ...
> {code}
> * No compression
> {code}
> create table web_sales_wrong_orc_none
> stored as orc tblproperties ("orc.compress"="NONE")
> as select * from web_sales;
> {code}
> {code}
> describe formatted web_sales_wrong_orc_none;
> ....
> Location: hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_none
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE true
> numFiles 1
> numRows 719384
> orc.compress NONE
> rawDataSize 97815412
> totalSize 40625243
> transient_lastDdlTime 1387566064
> ....
> {code}
> {code}
> bin/hive --orcfiledump /user/hive/warehouse/web_sales_wrong_orc_none/000000_0
> Rows: 719384
> Compression: ZLIB
> Compression size: 262144
> ...
> {code}
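The root cause described in the issue (user-provided table properties never reaching the properties that the file sink's writer reads) can be illustrated with a small standalone sketch. Everything here is hypothetical and is not Hive's PlanUtils or the attached patch: the class, the method names, and the ZLIB fallback are assumptions chosen to mirror the behavior observed in the outputs above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in, not Hive code: shows why a writer falls back to its
// default codec when TBLPROPERTIES are not copied into its Properties.
class TblPropsSketch {

    // Returns a copy of baseProps with every user-provided table property added.
    // A null map (no TBLPROPERTIES clause) leaves the base properties unchanged.
    static Properties withUserTblProps(Properties baseProps, Map<String, String> tblProps) {
        Properties merged = new Properties();
        merged.putAll(baseProps);
        if (tblProps != null) {
            for (Map.Entry<String, String> e : tblProps.entrySet()) {
                merged.setProperty(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    // Mimics a writer choosing its codec; assumes ZLIB when the property is absent,
    // matching what the report observed.
    static String codecFor(Properties props) {
        return props.getProperty("orc.compress", "ZLIB");
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        Map<String, String> userProps = new LinkedHashMap<>();
        userProps.put("orc.compress", "SNAPPY");

        // Properties built without the user's table properties (the reported bug).
        System.out.println("without tblproperties: " + codecFor(base));
        // Properties built with the user's table properties merged in.
        System.out.println("with tblproperties:    " + codecFor(withUserTblProps(base, userProps)));
    }
}
```

The metastore still records orc.compress in both cases, which is why DESCRIBE shows SNAPPY even when the file itself was written with the default codec.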