Posted to issues@hive.apache.org by "Vineet Garg (JIRA)" <ji...@apache.org> on 2018/06/27 18:31:02 UTC

[jira] [Updated] (HIVE-17814) Reduce Memory footprint for large database bootstrap replication load

     [ https://issues.apache.org/jira/browse/HIVE-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17814:
-------------------------------
    Fix Version/s:     (was: 3.1.0)
                   3.2.0

Deferring this to 3.2.0 since the branch for 3.1.0 has been cut off.

> Reduce Memory footprint for large database bootstrap replication load 
> ----------------------------------------------------------------------
>
>                 Key: HIVE-17814
>                 URL: https://issues.apache.org/jira/browse/HIVE-17814
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: anishek
>            Assignee: anishek
>            Priority: Major
>             Fix For: 3.2.0
>
>
> As part of HIVE-16896 we generate Query Tasks dynamically for bootstrap repl load. This was done because a large database would otherwise produce a very large task graph with hundreds of thousands of objects, which would put additional memory pressure on Hive.
> The execution hooks, however, still keep a reference to the query plan, which gets dynamically modified, so at the end of all task execution Hive will have the whole DAG in memory, which is what we have to prevent. Additionally, for PostExecution Hive hooks we store the TaskRunner objects for each task that is executed.
> We have to handle these issues to prevent excessive memory usage for replication, specifically bootstrap replication.
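
The retention problem described above can be illustrated with a small toy model. This is only a sketch and not Hive code: `Task`, `RetainingHook`, `SummarizingHook`, `run_load` and `live_count` are hypothetical names made up for the illustration. It assumes that a hook which stores the executed task objects keeps all of them reachable, while a hook that records only lightweight data lets each finished task be reclaimed.

```python
import gc
import weakref


class Task:
    """Hypothetical stand-in for one dynamically generated load task."""

    def __init__(self, name):
        self.name = name


class RetainingHook:
    """Stores every executed task object, so none of them can be freed."""

    def __init__(self):
        self.executed = []

    def on_task_done(self, task):
        self.executed.append(task)


class SummarizingHook:
    """Records only the task name, so the task object itself can be freed."""

    def __init__(self):
        self.names = []

    def on_task_done(self, task):
        self.names.append(task.name)


def run_load(hook, num_tasks):
    """Create and 'run' tasks one at a time; return weak references to them."""
    refs = []
    for i in range(num_tasks):
        task = Task(f"load-table-{i}")
        hook.on_task_done(task)
        refs.append(weakref.ref(task))
        del task  # drop the local strong reference
    gc.collect()
    return refs


def live_count(refs):
    """Count how many tasks are still reachable through a strong reference."""
    return sum(1 for r in refs if r() is not None)
```

In this model, running the load with `RetainingHook` leaves every task alive for as long as the hook exists, whereas `SummarizingHook` still sees every task but does not prevent it from being collected. Whether the actual fix in Hive takes this shape is not stated in the issue.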



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)