Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2016/05/31 18:42:12 UTC

[jira] [Commented] (HIVE-13884) Disallow queries fetching more than a configured number of partitions in PartitionPruner

    [ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308319#comment-15308319 ] 

Sergey Shelukhin commented on HIVE-13884:
-----------------------------------------

Should the limit instead be passed to the metastore, to avoid two network round trips in the normal case?

> Disallow queries fetching more than a configured number of partitions in PartitionPruner
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-13884
>                 URL: https://issues.apache.org/jira/browse/HIVE-13884
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>
> Currently the PartitionPruner requests either all partitions or the partitions matching a filter expression. In either scenario, if the number of partitions accessed is large, there can be significant memory pressure on the HMS server.
> We already have a config, {{hive.limit.query.max.table.partition}}, that enforces a limit on the number of partitions that may be scanned per operator. But this check happens after the PartitionPruner has already fetched all partitions.
> We should add an option at the PartitionPruner level to disallow queries that attempt to access more partitions than a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallows queries without a partition filter in PartitionPruner, but this check accepts any query with a pruning condition, even if the number of partitions fetched is large. In multi-tenant environments, admins could use more control over the number of partitions allowed, based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names (instead of partition specs) and throw an exception if the number of partitions exceeds the configured value. Otherwise, fetch the partition specs.
> It looks like the existing {{listPartitionNames}} call could be used if it were extended to take partition filter expressions, as the {{getPartitionsByExpr}} call does.
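
The two-step check proposed in the description (fetch names first, fail if over the limit, otherwise fetch specs) could look roughly like the sketch below. This is only an illustration under stated assumptions: the MetastoreClient interface, the InMemoryClient, the exception type, and the convention that a negative limit means "no limit" are all hypothetical and are not the actual Hive or metastore API. The toy "filter" is a plain name prefix, not a real partition filter expression.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionLimitSketch {

    /** Hypothetical stand-in for a metastore client; NOT the real Hive API. */
    public interface MetastoreClient {
        // Cheap call: returns only the names of partitions matching the filter.
        List<String> listPartitionNames(String table, String filter);

        // Expensive call: returns full specs for the named partitions.
        List<String> getPartitionSpecs(String table, List<String> names);
    }

    /** Thrown when a query would access more partitions than allowed. */
    public static class TooManyPartitionsException extends RuntimeException {
        public TooManyPartitionsException(String message) {
            super(message);
        }
    }

    /**
     * Fetches names first and fetches specs only if the count is within the limit.
     * Assumption: a negative maxPartitions disables the check.
     */
    public static List<String> fetchPartitions(MetastoreClient client, String table,
                                               String filter, int maxPartitions) {
        List<String> names = client.listPartitionNames(table, filter);
        if (maxPartitions >= 0 && names.size() > maxPartitions) {
            throw new TooManyPartitionsException(
                "Query on " + table + " would access " + names.size()
                    + " partitions; limit is " + maxPartitions);
        }
        // Second round trip: only reached when the count is acceptable.
        return client.getPartitionSpecs(table, names);
    }

    /** Toy in-memory client; the "filter" here is just a partition-name prefix. */
    public static class InMemoryClient implements MetastoreClient {
        private final List<String> partitions;

        public InMemoryClient(List<String> partitions) {
            this.partitions = partitions;
        }

        @Override
        public List<String> listPartitionNames(String table, String filter) {
            List<String> out = new ArrayList<>();
            for (String p : partitions) {
                if (p.startsWith(filter)) {
                    out.add(p);
                }
            }
            return out;
        }

        @Override
        public List<String> getPartitionSpecs(String table, List<String> names) {
            List<String> out = new ArrayList<>();
            for (String n : names) {
                out.add(table + "/" + n); // placeholder for a real partition spec
            }
            return out;
        }
    }
}
```

Note that this shape costs two calls whenever the query is within the limit, which is the trade-off the comment above asks about; passing the limit to the metastore would let the server reject oversized requests in a single call.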


