Posted to issues@hive.apache.org by "Vineet Garg (JIRA)" <ji...@apache.org> on 2018/06/27 18:31:03 UTC

[jira] [Updated] (HIVE-16904) during repl load for large number of partitions the metadata file can be huge and can lead to out of memory

     [ https://issues.apache.org/jira/browse/HIVE-16904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-16904:
-------------------------------
    Fix Version/s:     (was: 3.1.0)
                   3.2.0

Deferring this to 3.2.0 since the branch for 3.1.0 has already been cut.

> during repl load for large number of partitions the metadata file can be huge and can lead to out of memory 
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16904
>                 URL: https://issues.apache.org/jira/browse/HIVE-16904
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Major
>             Fix For: 3.2.0
>
>
> The metadata pertaining to a table and its partitions is stored in a single file. During repl load, all of this data is read into memory in one shot before the individual partitions are processed, which can cause a huge memory overhead since the entire file is held in memory at once. Try to deserialize the partition objects with some sort of streaming JSON deserializer instead.
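
A minimal sketch of the streaming idea described above, using Jackson's streaming JsonParser so that only one partition object is materialized at a time. The file layout assumed here (a top-level "partitions" array) and the PartitionHandler callback are illustrative assumptions for this sketch, not Hive's actual repl metadata format or API:

    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.core.JsonToken;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    import java.io.File;
    import java.io.IOException;

    public class StreamingPartitionReader {

        // Hypothetical callback, invoked once per partition object.
        public interface PartitionHandler {
            void handle(JsonNode partition);
        }

        // Reads a metadata file assumed to look like {"partitions": [ {...}, {...}, ... ]}
        // and hands each partition object to the handler one at a time, instead of
        // deserializing the whole file into memory in one shot.
        public static void readPartitions(File metadataFile, PartitionHandler handler)
                throws IOException {
            ObjectMapper mapper = new ObjectMapper();
            JsonFactory factory = mapper.getFactory();
            try (JsonParser parser = factory.createParser(metadataFile)) {
                while (parser.nextToken() != null) {
                    // Advance to the "partitions" field, then stream its array elements.
                    if (parser.getCurrentToken() == JsonToken.FIELD_NAME
                            && "partitions".equals(parser.getCurrentName())) {
                        if (parser.nextToken() != JsonToken.START_ARRAY) {
                            throw new IOException("Expected an array of partition objects");
                        }
                        // Each iteration reads exactly one partition subtree from the stream.
                        while (parser.nextToken() == JsonToken.START_OBJECT) {
                            JsonNode partition = mapper.readTree(parser);
                            handler.handle(partition);
                        }
                        break;
                    }
                }
            }
        }
    }

With this shape of API, the repl load path could create and add each partition inside the handler and let it become garbage-collectible before the next one is parsed, keeping peak memory roughly proportional to one partition rather than to the whole metadata file.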



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)