Posted to dev@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2013/08/02 04:01:50 UTC

[jira] [Updated] (HIVE-4983) Hive metastore client doesn't use batching for filter pushdown

     [ https://issues.apache.org/jira/browse/HIVE-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-4983:
-----------------------------------

    Description: 
When getting partitions the usual way (get names, filter, get partitions by filtered names), the MS client batches the partition get requests; the default batch size is 300. However, for filter pushdown there is no such logic.
This can already cause problems (the metastore can OOM when getting many partitions if it is not given enough memory).
When filter pushdown is improved to be used in more cases, the problem will become worse; and if the name filtering is moved to the server (to avoid roundtrips with a large number of names going to the client, or to decide on pushdown on the server), batching will disappear entirely.
We might want to introduce it for filter pushdown, direct SQL, and all these other cases.

  was:
When getting partitions the usual way (get names, filter, get partitions by filtered names), the MS client batches the partition get requests; the default batch size is 300.
This can already cause problems (the metastore can OOM when getting many partitions if it is not given enough memory).
When filter pushdown is improved to be used in more cases, the problem will become worse; and if the name filtering is moved to the server (to avoid roundtrips with a large number of names going to the client, or to decide on pushdown on the server), batching will disappear entirely.
We might want to introduce it for filter pushdown, direct SQL, and all these other cases.

    
> Hive metastore client doesn't use batching for filter pushdown
> --------------------------------------------------------------
>
>                 Key: HIVE-4983
>                 URL: https://issues.apache.org/jira/browse/HIVE-4983
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Minor
>
> When getting partitions the usual way (get names, filter, get partitions by filtered names), the MS client batches the partition get requests; the default batch size is 300. However, for filter pushdown there is no such logic.
> This can already cause problems (the metastore can OOM when getting many partitions if it is not given enough memory).
> When filter pushdown is improved to be used in more cases, the problem will become worse; and if the name filtering is moved to the server (to avoid roundtrips with a large number of names going to the client, or to decide on pushdown on the server), batching will disappear entirely.
> We might want to introduce it for filter pushdown, direct SQL, and all these other cases.
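
For illustration only, the client-side batching described above (split the filtered names into chunks, issue one get request per chunk) could be sketched roughly as follows. This is not the actual Hive metastore client code: the class name PartitionBatching, the method fetchInBatches, the constant DEFAULT_BATCH_SIZE and the fetchBatch callback are all hypothetical names chosen for this sketch, and only the batch size of 300 comes from the issue text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class PartitionBatching {
    // Value taken from the issue description; the constant name is hypothetical.
    static final int DEFAULT_BATCH_SIZE = 300;

    /**
     * Calls fetchBatch once per chunk of at most batchSize names and
     * concatenates the results, so that no single request has to carry
     * (or return) the full set of partitions at once.
     * fetchBatch stands in for whatever call retrieves partitions by name.
     */
    static <T> List<T> fetchInBatches(List<String> names, int batchSize,
                                      Function<List<String>, List<T>> fetchBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> result = new ArrayList<>(names.size());
        for (int start = 0; start < names.size(); start += batchSize) {
            int end = Math.min(start + batchSize, names.size());
            // subList is a view over [start, end); addAll copies its results out.
            result.addAll(fetchBatch.apply(names.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            names.add("ds=" + i);
        }
        List<Integer> batchSizes = new ArrayList<>();
        // Stand-in fetcher: records the size of each request and echoes the names.
        List<String> fetched = fetchInBatches(names, DEFAULT_BATCH_SIZE, batch -> {
            batchSizes.add(batch.size());
            return batch;
        });
        System.out.println(fetched.size() + " partitions in batches " + batchSizes);
    }
}
```

With 1000 names and a batch size of 300, the stand-in fetcher is called four times, with 300, 300, 300 and 100 names. A filter pushdown path has no list of names on the client to split this way, which is why it would need a different mechanism to bound the size of each response.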

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira