Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2012/07/20 06:16:33 UTC

[jira] [Created] (HIVE-3283) bucket information should be used from the partition instead of the table

Namit Jain created HIVE-3283:
--------------------------------

             Summary: bucket information should be used from the partition instead of the table
                 Key: HIVE-3283
                 URL: https://issues.apache.org/jira/browse/HIVE-3283
             Project: Hive
          Issue Type: Bug
            Reporter: Namit Jain


Current Hive uses the number of buckets from the table object.
Ideally, the number of buckets from the partition should be used

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3283:
--------------------------------

    Attachment: HIVE-3283.1.patch.txt
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452345#comment-13452345 ] 

Kevin Wilfong commented on HIVE-3283:
-------------------------------------

https://reviews.facebook.net/D5319
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3283:
--------------------------------

    Attachment: HIVE-3283.2.patch.txt
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453392#comment-13453392 ] 

Kevin Wilfong commented on HIVE-3283:
-------------------------------------

Updated according to comments on phabricator.
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456322#comment-13456322 ] 

Namit Jain commented on HIVE-3283:
----------------------------------

+1 
running tests
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt, HIVE-3283.3.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3283:
-----------------------------

    Status: Open  (was: Patch Available)

comments on phabricator
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449502#comment-13449502 ] 

Namit Jain commented on HIVE-3283:
----------------------------------

Once https://issues.apache.org/jira/browse/HIVE-3171 is in, it would be useful to have the partition metadata be used for bucketing information.
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3283:
-----------------------------

    Description: 
Currently Hive uses the number of buckets from the table object.
Ideally, the number of buckets from the partition should be used

  was:
Current Hive uses the number of buckets from the table object.
Ideally, the number of buckets from the partition should be used

    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


        

[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456499#comment-13456499 ] 

Hudson commented on HIVE-3283:
------------------------------

Integrated in Hive-trunk-h0.21 #1671 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1671/])
    HIVE-3283 bucket information should be used from the partition instead of the table
(Kevin Wilfong via namit) (Revision 1385084)

     Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1385084
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientnegative/alter_numbuckets_partitioned_table.q
* /hive/trunk/ql/src/test/queries/clientpositive/alter_numbuckets_partitioned_table.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketmapjoin10.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketmapjoin11.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketmapjoin12.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketmapjoin8.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketmapjoin9.q
* /hive/trunk/ql/src/test/queries/clientpositive/sort_merge_join_desc_5.q
* /hive/trunk/ql/src/test/queries/clientpositive/sort_merge_join_desc_6.q
* /hive/trunk/ql/src/test/queries/clientpositive/sort_merge_join_desc_7.q
* /hive/trunk/ql/src/test/results/clientnegative/alter_numbuckets_partitioned_table.q.out
* /hive/trunk/ql/src/test/results/clientpositive/alter_numbuckets_partitioned_table.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin10.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin11.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin12.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin8.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin9.q.out
* /hive/trunk/ql/src/test/results/clientpositive/sort_merge_join_desc_5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/sort_merge_join_desc_6.q.out
* /hive/trunk/ql/src/test/results/clientpositive/sort_merge_join_desc_7.q.out

                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt, HIVE-3283.3.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3283:
--------------------------------

    Status: Patch Available  (was: Open)
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Assigned] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong reassigned HIVE-3283:
-----------------------------------

    Assignee: Kevin Wilfong
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3283:
-----------------------------

    Status: Open  (was: Patch Available)

comments on phabricator
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3283:
--------------------------------

    Attachment: HIVE-3283.3.patch.txt
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt, HIVE-3283.3.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452346#comment-13452346 ] 

Kevin Wilfong commented on HIVE-3283:
-------------------------------------

With the recent improvements to bucketing and sorting, made primarily by Namit and Navis, this already seems to be supported; it's just a matter of making the switch to use partition metadata.

I re-enabled the ability for users to change the number of buckets, and the bucketed and sorted columns, of a partitioned table that contains data (otherwise this change won't provide much benefit).
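
To illustrate why the bucket count has to come from the partition once the table-level count can change, here is a minimal sketch. It is not the patch and does not use Hive's classes: TableMeta, PartitionMeta, BucketCountSketch, effectiveBuckets and bucketFor are hypothetical names, the fallback to the table for a missing partition is an assumption, and the "non-negative hash modulo bucket count" rule is assumed only for the sake of the example.

```java
// Hypothetical stand-in for table-level metadata; not Hive's real class.
final class TableMeta {
    final int numBuckets; // current table-level bucket count
    TableMeta(int numBuckets) { this.numBuckets = numBuckets; }
}

// Hypothetical stand-in for partition-level metadata; not Hive's real class.
final class PartitionMeta {
    final String name;
    final int numBuckets; // bucket count in effect when this partition was written
    PartitionMeta(String name, int numBuckets) {
        this.name = name;
        this.numBuckets = numBuckets;
    }
}

final class BucketCountSketch {
    // Prefer the partition's own bucket count; fall back to the table
    // only when there is no partition (assumed: unpartitioned table).
    static int effectiveBuckets(TableMeta table, PartitionMeta partition) {
        return partition != null ? partition.numBuckets : table.numBuckets;
    }

    // Assumed bucketing rule for illustration: non-negative hash modulo bucket count.
    static int bucketFor(int keyHash, int numBuckets) {
        return (keyHash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        // The table was altered to 8 buckets after the old partition was written with 4.
        TableMeta table = new TableMeta(8);
        PartitionMeta oldPartition = new PartitionMeta("ds=2012-07-19", 4);

        int keyHash = 6;
        int fromTable = bucketFor(keyHash, table.numBuckets);
        int fromPartition = bucketFor(keyHash, effectiveBuckets(table, oldPartition));

        // The two answers differ, so a reader using the table's count would
        // look for the key in a bucket the old partition does not have.
        System.out.println("bucket using table count:     " + fromTable);
        System.out.println("bucket using partition count: " + fromPartition);
    }
}
```

In this sketch the old partition keeps answering with 4 buckets even after the table moves to 8, which is the behaviour that makes changing the bucket count on a populated partitioned table safe.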
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Commented] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455971#comment-13455971 ] 

Kevin Wilfong commented on HIVE-3283:
-------------------------------------

Updated according to comments on phabricator.
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt, HIVE-3283.3.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3283:
--------------------------------

    Status: Patch Available  (was: Open)
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt, HIVE-3283.3.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3283:
-----------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Committed. Thanks Kevin
                
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt, HIVE-3283.2.patch.txt, HIVE-3283.3.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used


[jira] [Updated] (HIVE-3283) bucket information should be used from the partition instead of the table

Posted by "Kevin Wilfong (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3283:
--------------------------------

    Affects Version/s: 0.10.0
               Status: Patch Available  (was: Open)
    
> bucket information should be used from the partition instead of the table
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3283
>                 URL: https://issues.apache.org/jira/browse/HIVE-3283
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Namit Jain
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3283.1.patch.txt
>
>
> Currently Hive uses the number of buckets from the table object.
> Ideally, the number of buckets from the partition should be used
