Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2013/03/07 17:46:13 UTC

[jira] [Created] (HIVE-4136) hive should optimize the scenario when the input and output are bucketed/sorted on the same keys

Namit Jain created HIVE-4136:
--------------------------------

             Summary: hive should optimize the scenario when the input and output are bucketed/sorted on the same keys 
                 Key: HIVE-4136
                 URL: https://issues.apache.org/jira/browse/HIVE-4136
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain


Consider a common scenario like:

create table T1 (...) clustered by (key) sorted by (key) into 2 buckets;
create table T2 (...) clustered by (key) sorted by (key) into 2 buckets;


SET hive.enforce.sorting=true;
SET hive.enforce.bucketing=true;

insert overwrite table T2
select * from T1;


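The above query adds a reducer to make sure that T2 is bucketed and sorted.
That reducer is not needed: T1 is already bucketed and sorted on the same key into the same number of buckets, so a map-only job can write each bucket of T1 directly to the corresponding bucket of T2.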
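
To illustrate why the reduce stage is redundant here, the following is a rough toy model in Python, not Hive code. It assumes non-negative integer keys and uses key modulo the bucket count as a stand-in for Hive's bucketing function; all names (bucket_of, map_only_copy, shuffle_and_sort, and so on) are hypothetical. A table is modelled as a list of bucket files, each a list of rows sorted by key.

```python
# Toy model only: a bucketed/sorted table is a list of buckets,
# and each bucket is a list of (key, value) rows sorted by key.

NUM_BUCKETS = 2


def bucket_of(key, num_buckets=NUM_BUCKETS):
    # Stand-in for the real bucketing function (assumption: int keys >= 0).
    return key % num_buckets


def write_bucketed_sorted(rows, num_buckets=NUM_BUCKETS):
    # What the reduce stage achieves: route rows to buckets, sort each bucket.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(row[0], num_buckets)].append(row)
    return [sorted(b, key=lambda r: r[0]) for b in buckets]


def is_bucketed_sorted(buckets):
    # Check both invariants: rows sit in the right bucket, in key order.
    n = len(buckets)
    for i, bucket in enumerate(buckets):
        keys = [row[0] for row in bucket]
        if any(bucket_of(k, n) != i for k in keys):
            return False
        if keys != sorted(keys):
            return False
    return True


def map_only_copy(buckets):
    # Proposed plan: one mapper per input bucket, rows written unchanged.
    return [list(bucket) for bucket in buckets]


def shuffle_and_sort(buckets):
    # Current plan: flatten every row, then re-bucket and re-sort.
    rows = [row for bucket in buckets for row in bucket]
    return write_bucketed_sorted(rows, len(buckets))


# T1 is already bucketed and sorted on key into 2 buckets.
t1 = write_bucketed_sorted([(5, "e"), (2, "b"), (4, "d"), (1, "a"), (3, "c")])

t2_map_only = map_only_copy(t1)
t2_reduced = shuffle_and_sort(t1)
```

In this model the bucket-by-bucket copy already satisfies T2's bucketing and sorting invariants and yields the same buckets as the shuffle-and-sort path. That holds only because both tables share the same key and the same number of buckets.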
