Posted to issues@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2018/05/09 21:28:00 UTC

[jira] [Commented] (HIVE-19480) Implement and Incorporate MAPREDUCE-207

    [ https://issues.apache.org/jira/browse/HIVE-19480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469539#comment-16469539 ] 

Gopal V commented on HIVE-19480:
--------------------------------

This is from the HiveConf in Hive 2 and 3:

{code}
While MR remains the default engine for historical reasons, it is itself a historical engine
and is deprecated in Hive 2 line. It may be removed without further warning.
{code}

> Implement and Incorporate MAPREDUCE-207
> ---------------------------------------
>
>                 Key: HIVE-19480
>                 URL: https://issues.apache.org/jira/browse/HIVE-19480
>             Project: Hive
>          Issue Type: New Feature
>          Components: HiveServer2
>    Affects Versions: 1.2.3
>            Reporter: BELUGA BEHR
>            Priority: Major
>
> * HiveServer2 has the ability to run many MapReduce jobs in parallel.
>  * Each MapReduce application calculates the job's file splits at the client level
>  * Result: HiveServer2 loads many file splits at the same time, putting pressure on memory
> {quote}"The client running the job calculates the splits for the job by calling getSplits(), then sends them to the application master, which uses their storage locations to schedule map tasks that will process them on the cluster."
>  - "Hadoop: The Definitive Guide"{quote}
> MAPREDUCE-207 should address this memory pressure by moving split calculations into the ApplicationMaster. Spark and Tez already take this approach.
> Once MAPREDUCE-207 is completed, leverage the capability in HiveServer2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)