Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/08/27 03:22:59 UTC

[jira] Created: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Support deletion of partitions based on a prefix partition specification
------------------------------------------------------------------------

                 Key: HIVE-804
                 URL: https://issues.apache.org/jira/browse/HIVE-804
             Project: Hadoop Hive
          Issue Type: New Feature
            Reporter: Zheng Shao


Sometimes users create partitions like (date='...', time='...'). It would be useful if a user could delete all the partitions for the same day (across different times) with a single command:

{code}
ALTER TABLE test DROP PARTITION (date='2009-08-26');
{code}



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-804:
----------------------------

    Attachment: HIVE-804.1.patch

This patch allows deletion of partitions by specifying any subset of the partition names.
This should be good enough for most cases for now. We can add support for arbitrary expressions later if we see such a need.




[jira] Commented: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774835#action_12774835 ] 

Namit Jain commented on HIVE-804:
---------------------------------

+1

Looks good. Will commit if the tests pass.



[jira] Commented: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749032#action_12749032 ] 

Prasad Chakka commented on HIVE-804:
------------------------------------

* Do we ever want to support arbitrary partition key/vals instead of just a prefix? What about range queries?

# for dropping partitions that match a given prefix,
## 2A is good enough
# for dropping partitions whose keys match the given key/vals
## partition key/values are stored in the model.Partition.values attribute, which is a list. So you can possibly generate the query based on which partition key is needed.
# for dropping ranges
## if the range is applied on a prefix, then 2A can be used; otherwise we need to bring in all partitions
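
The cases above turn on whether the given spec is a true prefix of the table's partition columns or an arbitrary subset of them. As a rough sketch only (the class and method names are made up for illustration and are not part of Hive or of the attached patch), that check could look like this, assuming the partition columns are distinct and listed in "create table" order:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not Hive code.
class PrefixSpecCheck {

    // Returns true when specKeys is exactly the first specKeys.size()
    // partition columns, i.e. the spec is a prefix of the partition keys.
    // Assumes partitionColumns contains no duplicates.
    static boolean isPrefix(List<String> partitionColumns, Set<String> specKeys) {
        if (specKeys.size() > partitionColumns.size()) {
            return false;
        }
        for (int i = 0; i < specKeys.size(); i++) {
            if (!specKeys.contains(partitionColumns.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("date", "time");
        // (date=...) is a prefix of (date, time).
        System.out.println(isPrefix(columns, new HashSet<>(Arrays.asList("date"))));
        // (time=...) alone is a subset of the keys but not a prefix.
        System.out.println(isPrefix(columns, new HashSet<>(Arrays.asList("time"))));
    }
}
```

A spec that fails this check would fall into the "arbitrary key/vals" case rather than the prefix case.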

IMO, the trickier problem is to make sure the partition data directories get deleted atomically along with the partitions. Even for a prefix, the partition data directories can differ depending on how the partition was created. We need to get those semantics right.






[jira] Commented: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749035#action_12749035 ] 

Zheng Shao commented on HIVE-804:
---------------------------------

bq. IMO, the trickier problem is to make sure the partition data directories get deleted atomically along with the partitions. Even for a prefix, the partition data directories can differ depending on how the partition was created. We need to get those semantics right.

Can you explain what you mean by "the partition data directories can be different based on how the partition is created"? Is that only for external tables? I thought internal tables always create the directories in the order specified in "create table", don't they?





[jira] Commented: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749037#action_12749037 ] 

Prasad Chakka commented on HIVE-804:
------------------------------------

I think you can give a location to 'alter table add partition' even if the table is not external.



[jira] Assigned: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-804:
-------------------------------

    Assignee: Zheng Shao



[jira] Updated: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-804:
--------------------------------

    Component/s: Metastore
                 Query Processor



[jira] Commented: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749022#action_12749022 ] 

Zheng Shao commented on HIVE-804:
---------------------------------

There are 2 questions:

1. Where to match the partitions against the prefix and do the deletion:
A. In DDLTask
B. In ObjectStore

B is better not only because of efficiency but also because other clients can share this functionality.

2. How to match the partitions against the prefix.
A. Get the name of the partition, then convert the String name to Map<String,String>, then match against the prefix (also Map<String,String>). If every entry matches (there can be misses) then it's a match.
B. Retrieve the whole partition object.

A is much more efficient.
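
Option 2A could be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patch: the class and method names are hypothetical, partition names are assumed to look like "date=2009-08-26/time=12" with no escaped characters, and "there can be misses" is read here as "the given spec may leave out some partition keys".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not Hive code.
class PartitionSpecMatcher {

    // Converts "date=2009-08-26/time=12" into {date=2009-08-26, time=12}.
    // Assumes no escaping inside partition names.
    static Map<String, String> parseName(String partitionName) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String component : partitionName.split("/")) {
            int eq = component.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                    "Malformed partition component: " + component);
            }
            spec.put(component.substring(0, eq), component.substring(eq + 1));
        }
        return spec;
    }

    // True when every entry of partialSpec agrees with the partition.
    // Partition keys that partialSpec does not mention are ignored.
    static boolean matches(String partitionName, Map<String, String> partialSpec) {
        Map<String, String> full = parseName(partitionName);
        for (Map.Entry<String, String> entry : partialSpec.entrySet()) {
            if (!entry.getValue().equals(full.get(entry.getKey()))) {
                return false;
            }
        }
        return true;
    }

    // Returns the names of the partitions that a drop would select.
    static List<String> selectMatching(List<String> names,
                                       Map<String, String> partialSpec) {
        List<String> selected = new ArrayList<>();
        for (String name : names) {
            if (matches(name, partialSpec)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "date=2009-08-26/time=00",
            "date=2009-08-26/time=12",
            "date=2009-08-27/time=00");
        Map<String, String> spec = new HashMap<>();
        spec.put("date", "2009-08-26");
        // prints [date=2009-08-26/time=00, date=2009-08-26/time=12]
        System.out.println(selectMatching(names, spec));
    }
}
```

Only the partition names are needed for the match, which is why this is cheaper than retrieving whole partition objects as in 2B.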





[jira] Resolved: (HIVE-804) Support deletion of partitions based on a prefix partition specification

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain resolved HIVE-804.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.5.0
     Hadoop Flags: [Reviewed]

Committed. Thanks, Zheng.
