Posted to dev@hive.apache.org by "Gang Tim Liu (JIRA)" <ji...@apache.org> on 2012/05/31 20:53:22 UTC

[jira] [Created] (HIVE-3072) Hive List Bucketing - DDL support

Gang Tim Liu created HIVE-3072:
----------------------------------

             Summary: Hive List Bucketing - DDL support
                 Key: HIVE-3072
                 URL: https://issues.apache.org/jira/browse/HIVE-3072
             Project: Hive
          Issue Type: New Feature
          Components: SQL
            Reporter: Gang Tim Liu
            Assignee: Gang Tim Liu


If a Hive table column has skewed keys, query performance on the non-skewed keys is always impacted. The Hive List Bucketing feature will address this:

https://cwiki.apache.org/Hive/listbucketing.html

This JIRA issue will track the DDL changes for the feature.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3072:
---------------------------------

    Status: Open  (was: Patch Available)

I left comments on phabricator. Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293082#comment-13293082 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

Yes, we are heading to a single patch approach.

Yes, this feature requires metastore change.

                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3072:
---------------------------------

    Status: Open  (was: Patch Available)

@Tim: please see my comments on phabricator. Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.2
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3072:
-----------------------------

    Status: Open  (was: Patch Available)

Comments on Phabricator.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.4
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433859#comment-13433859 ] 

Namit Jain commented on HIVE-3072:
----------------------------------

Also, the DML support would require complete support for running tests on Hadoop 23 (or some version of Hadoop
that supports nested directories). I know there are a couple of JIRAs for making that work, like HIVE-3341,
but they are not done yet.
Also, these are independent, and we should reduce dependencies on HIVE-3341 as far as possible.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3072:
-----------------------------

    Status: Open  (was: Patch Available)

Comments on phabricator.

I think we should limit this JIRA to the skew join DDL and not try to club
Hive code cleanup in here.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433903#comment-13433903 ] 

Carl Steinbach commented on HIVE-3072:
--------------------------------------

Are you planning to submit this patch for review?
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

Patch is available on both jira and phabricator.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435503#comment-13435503 ] 

Carl Steinbach commented on HIVE-3072:
--------------------------------------

How far away is HIVE-3073 from completion? Is it possible to post a work-in-progress patch for HIVE-3073 so we can see how this all fits together?
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293008#comment-13293008 ] 

Carl Steinbach commented on HIVE-3072:
--------------------------------------

If this feature requires metastore changes then I'd like to request that the first patch contain only changes to the metastore schema and metastore Thrift API. I would also prefer that the DML and DDL changes go in as a single patch since it a) prevents half-implemented features from showing up in releases and b) demonstrates that the feature actually works.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436084#comment-13436084 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

Thank you for providing comments. I will address them and provide a patch. Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435747#comment-13435747 ] 

Namit Jain commented on HIVE-3072:
----------------------------------

The changes look good to me.
I will start the tests.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440461#comment-13440461 ] 

Namit Jain commented on HIVE-3072:
----------------------------------

Some minor comments.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.1
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.3
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.6
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433326#comment-13433326 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

https://reviews.facebook.net/D4599
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.5
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292987#comment-13292987 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

We have rethought the "release approach". We can deliver DDL and DML as separate patches or as a single patch. Each option has pros and cons, and neither is perfect. Separate patches make the release more manageable. A single patch makes the release more meaningful, because with DDL but no DML you cannot actually use list bucketing.

We have to pick one, and we choose the single-patch approach. It reduces the overhead of a multiple-patch release, gives the community more time to review the proposal, and leaves room for us to adjust based on that review.

I will call for proposal review again today.


                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3072:
-----------------------------

    Status: Open  (was: Patch Available)

I had some minor comments on the patch.
Otherwise, it looks good to me.

@Carl, do you have any additional comments?
Otherwise, I will start testing once Tim has addressed the new comments.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435527#comment-13435527 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

Good points. A work-in-progress patch might not be available for a few days, but I will post it to HIVE-3073 once it is ready, which should be soon.

In the meantime, the interface between DDL and DML is the data structure in the metastore: DML takes the information saved by DDL and uses it. I hope the data structure helps show how it all fits together.
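
To make the DDL-to-DML handoff concrete, here is a minimal Java sketch. It is not the actual metastore schema from the patch: the class SkewHandoff, its methods, and the in-memory map are all hypothetical stand-ins, and the only thing taken from the comment above is that DDL saves the skew information and DML reads it back.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the metastore record shared by DDL and DML. */
public class SkewHandoff {
    // Table name -> declared skewed values (one inner list per skewed key).
    private final Map<String, List<List<String>>> skewByTable = new HashMap<>();

    /** DDL side: persist what the CREATE TABLE statement declared. */
    public void saveFromDdl(String table, List<List<String>> skewedValues) {
        skewByTable.put(table, skewedValues);
    }

    /** DML side: read back the saved values; an unknown table has no skew. */
    public List<List<String>> loadForDml(String table) {
        List<List<String>> saved = skewByTable.get(table);
        return saved == null ? Collections.<List<String>>emptyList() : saved;
    }

    /** DML side: decide whether a row's key was declared as skewed. */
    public boolean isSkewedKey(String table, List<String> key) {
        return loadForDml(table).contains(key);
    }

    public static void main(String[] args) {
        SkewHandoff metastore = new SkewHandoff();
        // What a DDL statement with skew values ('x1') on one column would save.
        metastore.saveFromDdl("T", Arrays.asList(Arrays.asList("x1")));

        System.out.println(metastore.isSkewedKey("T", Arrays.asList("x1")));  // true
        System.out.println(metastore.isSkewedKey("T", Arrays.asList("z9")));  // false
    }
}
```

The point of the sketch is only that the DML path never parses the DDL statement itself; it depends on the saved structure alone, which is why that structure can be reviewed before the DML patch exists.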
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-3072:
---------------------------------

    Status: Open  (was: Patch Available)

@Tim: Please see my comments on phabricator. Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

The patch is available on both JIRA and Phabricator. Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Description: 
If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:

https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.

  was:
If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:

https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track DDL change for the feature. It's for single skewed column.

    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

Patch available in phabricator and jira.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

Patch is in both jira and phabricator.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433328#comment-13433328 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

There are a few reasons to release the DDL as its own patch:
1. The content is getting bigger. It might be better to get it reviewed now so that we can catch issues earlier.
2. The skewed grammar will benefit not only the list bucketing feature but also other features such as skewed join. Releasing the grammar patch will unblock others' development.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

Patch is available for review.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

Patch is available.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6, HIVE-3072.patch.7
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292978#comment-13292978 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

Making progress on DML. The following syntax has started to work:
        create table T (c1 string, c2 string) list bucketed by (c1) with skew ('x1');
        create table T (c1 string, c2 string, c3 string) list bucketed by (c1, c2) with skew (('x1', 'x2'), ('y1', 'y2'));
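
The two statements differ in the shape of the skew list: with one column each skewed value is a single literal, and with several columns each skewed value is a tuple with one entry per column. A minimal Java sketch of that rule follows. The class SkewSpec is hypothetical and is not taken from the patch; it only illustrates the arity check such a clause implies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory form of a "list bucketed by ... with skew ..." clause. */
public class SkewSpec {
    private final List<String> columns;
    private final List<List<String>> skewedValues;

    public SkewSpec(List<String> columns, List<List<String>> skewedValues) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one skewed column is required");
        }
        List<List<String>> copy = new ArrayList<>();
        for (List<String> tuple : skewedValues) {
            // Each skewed value must supply exactly one entry per skewed column.
            if (tuple.size() != columns.size()) {
                throw new IllegalArgumentException(
                    "skewed value " + tuple + " does not match columns " + columns);
            }
            copy.add(Collections.unmodifiableList(new ArrayList<>(tuple)));
        }
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
        this.skewedValues = Collections.unmodifiableList(copy);
    }

    public List<String> getColumns() { return columns; }

    public List<List<String>> getSkewedValues() { return skewedValues; }

    public static void main(String[] args) {
        // list bucketed by (c1) with skew ('x1')
        SkewSpec single = new SkewSpec(
            Arrays.asList("c1"),
            Arrays.asList(Arrays.asList("x1")));

        // list bucketed by (c1, c2) with skew (('x1', 'x2'), ('y1', 'y2'))
        SkewSpec multi = new SkewSpec(
            Arrays.asList("c1", "c2"),
            Arrays.asList(Arrays.asList("x1", "x2"), Arrays.asList("y1", "y2")));

        System.out.println(single.getSkewedValues().size());  // 1
        System.out.println(multi.getSkewedValues().size());   // 2
    }
}
```

A clause such as skew ('x1') on two columns (c1, c2) would be rejected by this check, since the value has one entry but two columns are named.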

                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch.7
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6, HIVE-3072.patch.7
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Summary: Hive List Bucketing - DDL support  (was: Hive List Bucketing - DDL support (single column))
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for single skewed column.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440644#comment-13440644 ] 

Carl Steinbach commented on HIVE-3072:
--------------------------------------

@Namit: I'll make another pass through the patch later today. One thing I'd like to request is that we add an internal configuration property that disables the new DDL by default. We can remove this once the rest of the DML changes get committed, but in the meantime I don't think it makes sense to make the DDL visible to users. So to recap, I'm proposing the following:

* Add a configuration property named "hive.internal.ddl.list.bucketing.enable" and set the default value to false.
* Add a comment in HiveConf explaining that this will be removed once the rest of the DML changes are committed.
* Do *not* add this property to hive-default.xml.template since we don't want users messing with it.
* Throw an error if the user tries to use the DDL with hive.internal.ddl.list.bucketing.enable set to false.
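
The proposal above could be sketched roughly as follows. This is an illustration only: java.util.Properties stands in for Hive's configuration object, and the class ListBucketingGate and its method names are hypothetical. Only the property name and its default of false come from the list above.

```java
import java.util.Properties;

/** Hypothetical guard; Properties stands in for Hive's configuration object. */
public class ListBucketingGate {
    public static final String ENABLE_KEY = "hive.internal.ddl.list.bucketing.enable";

    /** Reads the flag, treating a missing property as false (the proposed default). */
    public static boolean isEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(ENABLE_KEY, "false"));
    }

    /** Call before accepting list bucketing DDL; rejects it while the flag is off. */
    public static void checkDdlAllowed(Properties conf) {
        if (!isEnabled(conf)) {
            throw new IllegalStateException(
                "List bucketing DDL is disabled: " + ENABLE_KEY + " is false");
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        try {
            checkDdlAllowed(conf);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        // A developer testing the feature flips the internal flag explicitly.
        conf.setProperty(ENABLE_KEY, "true");
        checkDdlAllowed(conf);
        System.out.println("accepted");
    }
}
```

Defaulting to false when the property is absent matches the intent of keeping it out of hive-default.xml.template: a user who never sets it gets the error rather than the half-finished feature.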

                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Work started] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-3072 started by Gang Tim Liu.

> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Attachment: HIVE-3072.patch
    
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support (single column)

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Description: 
If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:

https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track DDL change for the feature. It's for single skewed column.

  was:
If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:

https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track DDL change for the feature.

        Summary: Hive List Bucketing - DDL support (single column)  (was: Hive List Bucketing - DDL support)
    
> Hive List Bucketing - DDL support (single column)
> -------------------------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for single skewed column.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440881#comment-13440881 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

@Carl, yes, I will code it after dinner. Thanks a lot.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440845#comment-13440845 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

Patch is available on both JIRA and Phabricator.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433904#comment-13433904 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

yes, I am.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu updated HIVE-3072:
-------------------------------

    Status: Patch Available  (was: In Progress)

Patch is ready for review. It's available on both JIRA and Phabricator.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440875#comment-13440875 ] 

Carl Steinbach commented on HIVE-3072:
--------------------------------------

@Tim: Can you please add the configuration property to disable this DDL? Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3, HIVE-3072.patch.4, HIVE-3072.patch.5, HIVE-3072.patch.6
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.


        

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

Posted by "Gang Tim Liu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438451#comment-13438451 ] 

Gang Tim Liu commented on HIVE-3072:
------------------------------------

@Carl, thank you very much for the quick review. I will address the comments right now and post a patch tonight. Thanks.
                
> Hive List Bucketing - DDL support
> ---------------------------------
>
>                 Key: HIVE-3072
>                 URL: https://issues.apache.org/jira/browse/HIVE-3072
>             Project: Hive
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3072.patch, HIVE-3072.patch.1, HIVE-3072.patch.2, HIVE-3072.patch.3
>
>
> If a hive table column has skewed keys, query performance on non-skewed key is always impacted. Hive List Bucketing feature will address it:
> https://cwiki.apache.org/Hive/listbucketing.html
> This jira issue will track DDL change for the feature. It's for both single skewed column and multiple columns.
