Posted to issues@hive.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2017/09/28 19:13:00 UTC

[jira] [Commented] (HIVE-17638) SparkDynamicPartitionPruner loads all partition metadata into memory

    [ https://issues.apache.org/jira/browse/HIVE-17638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184690#comment-16184690 ] 

Sahil Takiar commented on HIVE-17638:
-------------------------------------

CC: [~janulatha]

> SparkDynamicPartitionPruner loads all partition metadata into memory
> --------------------------------------------------------------------
>
>                 Key: HIVE-17638
>                 URL: https://issues.apache.org/jira/browse/HIVE-17638
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>
> The {{SparkDynamicPartitionPruner}} first loads the contents of every partition pruning file into memory, and only then prunes partitions from the {{MapWork}}. Because all of the partition metadata has to be buffered at once, this increases memory pressure on the Hive-on-Spark (HoS) Remote Driver. It would be more efficient to prune partitions while scanning the files, so that the partition metadata never needs to be fully buffered in memory.
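
To illustrate the difference described above, here is a minimal, self-contained sketch. It does not use Hive's real classes or reflect the actual {{SparkDynamicPartitionPruner}} code. The class {{StreamingPruneSketch}}, both method names, and the simplified data model are hypothetical: a map from partition path to partition column value stands in for the {{MapWork}}, and an iterator of values stands in for the contents of the pruning files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hive.
class StreamingPruneSketch {

    // Buffered approach: collect every value from the pruning files first,
    // then prune. Memory grows with the number of distinct values read.
    static Map<String, String> pruneBuffered(Map<String, String> pathToValue,
                                             Iterator<String> pruningValues) {
        Set<String> allValues = new HashSet<>();
        while (pruningValues.hasNext()) {
            allValues.add(pruningValues.next());
        }
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : pathToValue.entrySet()) {
            if (allValues.contains(e.getValue())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    // Streaming approach: index the partitions we already hold, then mark
    // matches as each value is scanned. Values that match no partition are
    // discarded immediately, so extra memory is bounded by the partition count.
    static Map<String, String> pruneStreaming(Map<String, String> pathToValue,
                                              Iterator<String> pruningValues) {
        Map<String, List<String>> valueToPaths = new HashMap<>();
        for (Map.Entry<String, String> e : pathToValue.entrySet()) {
            valueToPaths.computeIfAbsent(e.getValue(), v -> new ArrayList<>())
                        .add(e.getKey());
        }
        Set<String> keptPaths = new HashSet<>();
        while (pruningValues.hasNext()) {
            List<String> paths = valueToPaths.get(pruningValues.next());
            if (paths != null) {
                keptPaths.addAll(paths);
            }
        }
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : pathToValue.entrySet()) {
            if (keptPaths.contains(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```

Both methods return the same surviving partitions. The streaming variant simply never holds the full set of scanned values at once. Whether this maps cleanly onto the real pruner depends on details not covered in this issue.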


