Posted to dev@hive.apache.org by "Sohan Jain (JIRA)" <ji...@apache.org> on 2011/08/11 01:30:36 UTC
[jira] [Created] (HIVE-2368) Determining whether a Column
Descriptor is unused may take too long
Determining whether a Column Descriptor is unused may take too long
-------------------------------------------------------------------
Key: HIVE-2368
URL: https://issues.apache.org/jira/browse/HIVE-2368
Project: Hive
Issue Type: Bug
Components: Metastore
Reporter: Sohan Jain
Attachments: HIVE-2368.1.patch
To determine whether a column descriptor (CD) is unused, we call listStorageDescriptorsWithCD(), which may return a very large list of storage descriptors (SDs). This can severely slow down dropping partitions.
We can add a parameter for the maximum number of SDs to return and ask for just 1 SD, since this is only an existence check.
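The idea can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the attached patch: only the method name listStorageDescriptorsWithCD() comes from the issue, while the StorageDescriptor class, the maxSDs parameter, the in-memory list standing in for the metastore table, and the rowsScanned counter are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnDescriptorCheck {

    // Hypothetical stand-in for a storage descriptor row; not the Hive class.
    static final class StorageDescriptor {
        final long id;
        final long cdId;

        StorageDescriptor(long id, long cdId) {
            this.id = id;
            this.cdId = cdId;
        }
    }

    // Counts inspected rows so the effect of the cap is visible (illustration only).
    static int rowsScanned = 0;

    // Returns at most maxSDs descriptors that reference cdId.
    // A negative maxSDs means "no limit" (an assumption of this sketch).
    static List<StorageDescriptor> listStorageDescriptorsWithCD(
            List<StorageDescriptor> table, long cdId, long maxSDs) {
        List<StorageDescriptor> result = new ArrayList<>();
        for (StorageDescriptor sd : table) {
            if (maxSDs >= 0 && result.size() >= maxSDs) {
                break; // enough matches collected, stop scanning
            }
            rowsScanned++;
            if (sd.cdId == cdId) {
                result.add(sd);
            }
        }
        return result;
    }

    // Existence check: one matching SD is enough to know the CD is still in use.
    static boolean isColumnDescriptorUnused(List<StorageDescriptor> table, long cdId) {
        return listStorageDescriptorsWithCD(table, cdId, 1).isEmpty();
    }

    public static void main(String[] args) {
        List<StorageDescriptor> table = new ArrayList<>();
        for (long i = 0; i < 100_000; i++) {
            table.add(new StorageDescriptor(i, 7L));
        }
        System.out.println("CD 7 unused? " + isColumnDescriptorUnused(table, 7L));
        System.out.println("CD 8 unused? " + isColumnDescriptorUnused(table, 8L));
    }
}
```

In a real metastore the cap would presumably be pushed down into the query itself rather than applied while iterating in Java, so that the database stops after the first matching row.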
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2368) Slow dropping of partitions caused
by full listing of storage descriptors
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135623#comment-13135623 ]
Hudson commented on HIVE-2368:
------------------------------
Integrated in Hive-trunk-h0.21 #1034 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1034/])
HIVE-2368. Slow dropping of partitions caused by full listing of storage descriptors (Sohan Jain via pauly)
pauly : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188886
Files :
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
[jira] [Commented] (HIVE-2368) Determining whether a Column
Descriptor is unused may take too long
Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082765#comment-13082765 ]
jiraposter@reviews.apache.org commented on HIVE-2368:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1462/
-----------------------------------------------------------
Review request for hive and Paul Yang.
Summary
-------
To determine if a column descriptor is unused, we call listStorageDescriptorsWithCD(), which may return a big list of SDs. This can severely slow down dropping partitions.
We can add a maximum number of SDs to return, and just ask for 1 SD, since we are just doing an existential check.
This addresses bug HIVE-2368.
https://issues.apache.org/jira/browse/HIVE-2368
Diffs
-----
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1156401
Diff: https://reviews.apache.org/r/1462/diff
Testing
-------
Thanks,
Sohan
[jira] [Commented] (HIVE-2368) Slow dropping of partitions caused
by full listing of storage descriptors
Posted by "Paul Yang (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135387#comment-13135387 ]
Paul Yang commented on HIVE-2368:
---------------------------------
Committed. Thanks Sohan!
[jira] [Commented] (HIVE-2368) Determining whether a Column
Descriptor is unused may take too long
Posted by "Paul Yang (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132215#comment-13132215 ]
Paul Yang commented on HIVE-2368:
---------------------------------
The patch looks good; will test and commit.
[jira] [Assigned] (HIVE-2368) Slow dropping of partitions caused by
full listing of storage descriptors
Posted by "Paul Yang (Assigned) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Yang reassigned HIVE-2368:
-------------------------------
Assignee: Sohan Jain
[jira] [Commented] (HIVE-2368) Slow dropping of partitions caused
by full listing of storage descriptors
Posted by "Paul Yang (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135386#comment-13135386 ]
Paul Yang commented on HIVE-2368:
---------------------------------
+1
[jira] [Commented] (HIVE-2368) Slow dropping of partitions caused
by full listing of storage descriptors
Posted by "Paul Yang (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138065#comment-13138065 ]
Paul Yang commented on HIVE-2368:
---------------------------------
Backporting this to branch-0.8 as well.
[jira] [Commented] (HIVE-2368) Determining whether a Column
Descriptor is unused may take too long
Posted by "Ning Zhang (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132188#comment-13132188 ]
Ning Zhang commented on HIVE-2368:
----------------------------------
@Paul Yang, this looks good to me. Can you also review the patch? We'll need this soon.
[jira] [Resolved] (HIVE-2368) Slow dropping of partitions caused by
full listing of storage descriptors
Posted by "Paul Yang (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Yang resolved HIVE-2368.
-----------------------------
Resolution: Fixed
Fix Version/s: 0.9.0
[jira] [Updated] (HIVE-2368) Determining whether a Column
Descriptor is unused may take too long
Posted by "Sohan Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sohan Jain updated HIVE-2368:
-----------------------------
Attachment: HIVE-2368.1.patch
[jira] [Updated] (HIVE-2368) Slow dropping of partitions caused by
full listing of storage descriptors
Posted by "Carl Steinbach (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-2368:
---------------------------------
Fix Version/s: 0.8.0 (was: 0.9.0)
[jira] [Updated] (HIVE-2368) Slow dropping of partitions caused by
full listing of storage descriptors
Posted by "Paul Yang (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Yang updated HIVE-2368:
----------------------------
Summary: Slow dropping of partitions caused by full listing of storage descriptors (was: Determining whether a Column Descriptor is unused may take too long)
[jira] [Commented] (HIVE-2368) Slow dropping of partitions caused
by full listing of storage descriptors
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13147389#comment-13147389 ]
Hudson commented on HIVE-2368:
------------------------------
Integrated in Hive-0.8.0-SNAPSHOT-h0.21 #89 (See [https://builds.apache.org/job/Hive-0.8.0-SNAPSHOT-h0.21/89/])
HIVE-2368. Slow dropping of partitions caused by full listing of storage descriptors (Sohan Jain via pauly)
pauly : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1199915
Files :
* /hive/branches/branch-0.8/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java