Posted to issues@tez.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2015/01/14 17:11:35 UTC

[jira] [Commented] (TEZ-1945) Remove 2 GB memlimit restriction in MergeManager

    [ https://issues.apache.org/jira/browse/TEZ-1945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277132#comment-14277132 ] 

Rajesh Balamohan commented on TEZ-1945:
---------------------------------------

*Job:*
1. 10 TB scale
2. Hive query on Tez: "create table testData as select * from lineitem distribute by l_shipdate;"

Saw a 5-7% improvement in job runtime. The counters below show a substantial reduction in resource usage during shuffle (e.g. NUM_MEM_TO_DISK_MERGES, ADDITIONAL_SPILLS_BYTES_WRITTEN, SPILLED_RECORDS).

Counter details for TaskCounter_Reducer_2_INPUT_Map_1:

||Counter||4 GB, tez.runtime.shuffle.fetch.buffer.percent=0.5, tez.runtime.shuffle.merge.percent=0.5, application_1421164610335_0059||4 GB, tez.runtime.shuffle.fetch.buffer.percent=0.8, tez.runtime.shuffle.merge.percent=0.8, application_1421164610335_0064||8 GB, tez.runtime.shuffle.memory.limit.percent=0.1, tez.runtime.shuffle.fetch.buffer.percent=0.14, application_1421164610335_0063||8 GB, tez.runtime.shuffle.memory.limit.percent=0.2, tez.runtime.shuffle.fetch.buffer.percent=0.5, application_1421164610335_0058||
|ADDITIONAL_SPILLS_BYTES_READ|200812472683|125413261965|331929593129|31373505945|
|ADDITIONAL_SPILLS_BYTES_WRITTEN|181649974257|106277188725|312660112747|12149251314|
|COMBINE_INPUT_RECORDS|0|0|0|0|
|FIRST_EVENT_RECEIVED|10292|12048|7404|6012|
|LAST_EVENT_RECEIVED|31296182|28215975|10513984|7342057|
|MERGED_MAP_OUTPUTS|244976|244976|244976|244976|
|MERGE_PHASE_TIME|39177076|36337714|15940783|11425071|
|NUM_DISK_TO_DISK_MERGES|0|0|0|0|
|NUM_FAILED_SHUFFLE_INPUTS|0|0|0|0|
|NUM_MEM_TO_DISK_MERGES|491|3|4537|0|
|NUM_SHUFFLED_INPUTS|244976|244976|244976|244976|
|NUM_SKIPPED_INPUTS|8283|8283|8283|8283|
|REDUCE_INPUT_GROUPS|0|0|0|0|
|REDUCE_INPUT_RECORDS|5999989709|5999989709|5999989709|5999989709|
|SHUFFLE_BYTES|365219732545|365204956818|365241417228|365215810254|
|SHUFFLE_BYTES_DECOMPRESSED|801646699974|801646699974|801646699974|801646699974|
|SHUFFLE_BYTES_DISK_DIRECT|19162498426|19136073240|19269480382|19224254631|
|SHUFFLE_BYTES_TO_DISK|0|0|0|0|
|SHUFFLE_BYTES_TO_MEM|346057234119|346068883578|345971936846|345991555623|
|SHUFFLE_PHASE_TIME|38339256|34248855|15332154|11018423|
|SPILLED_RECORDS|3272861488|2042317909|5452541545|511585624|
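
The relative change between any two columns can be computed directly from the table. The following is a minimal sketch that compares the two 4 GB runs (application_1421164610335_0059 as baseline, application_1421164610335_0064 as candidate) for a few of the counters; the function and variable names are illustrative and not part of Tez.

```python
# Counter values copied from the two 4 GB columns of the table above.
RUN_0059 = {  # fetch.buffer.percent=0.5, merge.percent=0.5
    "ADDITIONAL_SPILLS_BYTES_WRITTEN": 181649974257,
    "SPILLED_RECORDS": 3272861488,
    "NUM_MEM_TO_DISK_MERGES": 491,
    "SHUFFLE_PHASE_TIME": 38339256,
}
RUN_0064 = {  # fetch.buffer.percent=0.8, merge.percent=0.8
    "ADDITIONAL_SPILLS_BYTES_WRITTEN": 106277188725,
    "SPILLED_RECORDS": 2042317909,
    "NUM_MEM_TO_DISK_MERGES": 3,
    "SHUFFLE_PHASE_TIME": 34248855,
}


def percent_reduction(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before


def compare(baseline, candidate):
    """Percent reduction for every counter present in the baseline run."""
    return {
        name: percent_reduction(baseline[name], candidate[name])
        for name in baseline
    }


if __name__ == "__main__":
    for name, pct in compare(RUN_0059, RUN_0064).items():
        print(f"{name}: {pct:.1f}% lower")
```

The same helper can be pointed at the 8 GB columns by swapping in their values.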

*Merge memory details for the above runs (applicationIds for reference)*

4 GB container runs:
application_1421164610335_0059:
MergerManager: memoryLimit=1564475392, maxSingleShuffleLimit=391118848, mergeThreshold=782237696, ioSortFactor=200, memToMemMergeOutputsThreshold=200

application_1421164610335_0064:
MergerManager: memoryLimit=2296271339, maxSingleShuffleLimit=574067840, mergeThreshold=1837017088, ioSortFactor=200, memToMemMergeOutputsThreshold=200

8 GB container runs:
application_1421164610335_0058:
MergerManager: memoryLimit=4437280030, maxSingleShuffleLimit=1109320064, mergeThreshold=3993552128, ioSortFactor=200, memToMemMergeOutputsThreshold=200

application_1421164610335_0063:
MergerManager: memoryLimit=891079872, maxSingleShuffleLimit=89107992, mergeThreshold=139008464, ioSortFactor=200, memToMemMergeOutputsThreshold=200
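
The logged line for application_1421164610335_0059 is consistent with both thresholds being simple fractions of memoryLimit. The sketch below assumes maxSingleShuffleLimit is 0.25 of memoryLimit (a ratio inferred from the logged numbers, not stated in this comment) and that mergeThreshold is memoryLimit multiplied by tez.runtime.shuffle.merge.percent. The function name is hypothetical, and the arithmetic has only been checked against that one log line.

```python
def derive_thresholds(memory_limit, merge_percent, single_shuffle_fraction=0.25):
    """Derive the two merge thresholds from the shuffle memory limit.

    single_shuffle_fraction=0.25 is an assumption inferred from the
    logged ratio maxSingleShuffleLimit / memoryLimit for run 0059.
    """
    # Largest single map output that may be fetched into memory.
    max_single_shuffle_limit = int(memory_limit * single_shuffle_fraction)
    # In-memory usage at which a merge would be triggered.
    merge_threshold = int(memory_limit * merge_percent)
    return max_single_shuffle_limit, merge_threshold


if __name__ == "__main__":
    # memoryLimit and merge.percent for application_1421164610335_0059
    max_single, threshold = derive_thresholds(1564475392, 0.5)
    print(f"maxSingleShuffleLimit={max_single}, mergeThreshold={threshold}")
```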


> Remove 2 GB memlimit restriction in MergeManager
> ------------------------------------------------
>
>                 Key: TEZ-1945
>                 URL: https://issues.apache.org/jira/browse/TEZ-1945
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-1945.1.patch
>
>
> In certain situations (data coming in larger chunks, but yet to complete), fetchers might wait in MergeManager.waitForShuffleToMergeMemory() for memory to become available.
> Removing the 2 GB restriction on MergeManager.memlimit would help in such situations.
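
To illustrate the effect described in the issue, here is a minimal sketch. It assumes the restriction takes the form of clamping the available memory to 2^31 - 1 bytes before the fetch buffer fraction is applied; the description only says "2 GB", so the exact form of the cap is an assumption. The function name and the available-memory figure are hypothetical; the figure is back-computed from the logged memoryLimit=1564475392 at tez.runtime.shuffle.fetch.buffer.percent=0.5.

```python
INT_MAX = 2**31 - 1  # assumed form of the 2 GB cap


def shuffle_memory_limit(available_bytes, fetch_buffer_percent, cap_at_2gb):
    """Memory budget for in-memory shuffle, with or without the 2 GB cap."""
    base = min(available_bytes, INT_MAX) if cap_at_2gb else available_bytes
    return int(base * fetch_buffer_percent)


# Hypothetical available memory: 1564475392 / 0.5, from the 4 GB run 0059.
AVAILABLE = 3128950784

if __name__ == "__main__":
    capped = shuffle_memory_limit(AVAILABLE, 0.5, cap_at_2gb=True)
    uncapped = shuffle_memory_limit(AVAILABLE, 0.5, cap_at_2gb=False)
    print(f"capped={capped}, uncapped={uncapped}")
```

Under these assumptions the capped budget is smaller than the uncapped one, so fetchers would reach the limit and block sooner when outputs arrive in large chunks.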


