Posted to issues@hive.apache.org by "Deepak Jaiswal (JIRA)" <ji...@apache.org> on 2017/12/07 20:12:00 UTC
[jira] [Resolved] (HIVE-17923) 'cluster by' should not be needed for a bucketed table
[ https://issues.apache.org/jira/browse/HIVE-17923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Deepak Jaiswal resolved HIVE-17923.
-----------------------------------
Resolution: Duplicate
Duplicate of https://issues.apache.org/jira/browse/HIVE-18157
> 'cluster by' should not be needed for a bucketed table
> ------------------------------------------------------
>
> Key: HIVE-17923
> URL: https://issues.apache.org/jira/browse/HIVE-17923
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Assignee: Deepak Jaiswal
> Priority: Blocker
>
> Given the following table:
> {noformat}
> CREATE TABLE over10k_orc_bucketed(t tinyint,
> si smallint,
> i int,
> b bigint,
> f float,
> d double,
> bo boolean,
> s string,
> ts timestamp,
> `dec` decimal(4,2),
> bin binary) CLUSTERED BY(si) INTO 4 BUCKETS STORED AS ORC;
> {noformat}
> {noformat}
> insert into over10k_orc_bucketed select * from over10k
> {noformat}
> produces 1 data file (bucket 0). It should produce 4 based on input data.
> {noformat}
> insert into over10k_orc_bucketed select * from over10k cluster by si
> {noformat}
> does the right thing.
> The full script is in acid_vectorization_original.q (HIVE-17458).
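> One way to check how many data files an insert produced is sketched below. It assumes Hive's INPUT__FILE__NAME virtual column is available and groups the table's rows by the file they were read from. With the behavior described above, the plain insert would be expected to show a single file, while the 'cluster by si' variant would be expected to show up to 4.
> {noformat}
> -- Sketch only: count rows per underlying data file of the bucketed table.
> -- INPUT__FILE__NAME is a Hive virtual column holding the file path of each row.
> SELECT INPUT__FILE__NAME AS data_file,
>        count(*)          AS row_count
> FROM over10k_orc_bucketed
> GROUP BY INPUT__FILE__NAME;
> {noformat}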
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)