Posted to issues@tez.apache.org by "TezQA (Jira)" <ji...@apache.org> on 2020/08/03 06:56:00 UTC

[jira] [Commented] (TEZ-4211) Optimise MergeManager final merge

    [ https://issues.apache.org/jira/browse/TEZ-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169763#comment-17169763 ] 

TezQA commented on TEZ-4211:
----------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 56s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 14s{color} | {color:orange} tez-runtime-library: The patch generated 1 new + 53 unchanged - 3 fixed = 54 total (was 56) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 45s{color} | {color:green} tez-runtime-library in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m  8s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 55s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-TEZ-Build/506/artifact/out/Dockerfile |
| JIRA Issue | TEZ-4211 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008918/TEZ-4211.2.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 4e6ec696d0fc 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 69a2bdeb1 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/506/artifact/out/diff-checkstyle-tez-runtime-library.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/506/testReport/ |
| Max. process+thread count | 152 (vs. ulimit of 5500) |
| modules | C: tez-runtime-library U: tez-runtime-library |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/506/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> Optimise MergeManager final merge
> ---------------------------------
>
>                 Key: TEZ-4211
>                 URL: https://issues.apache.org/jira/browse/TEZ-4211
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>         Attachments: TEZ-4211.2.patch, TEZ-4211.wip.patch
>
>
> There are cases where the entire data set is held in memory and no disk segments are present in MergeManager. Currently, MergeManager spills the in-memory segments to disk before proceeding.
>  
> [https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/MergeManager.java#L1184]
>  
> {code:java}
> if (numMemDiskSegments > 0 && ioSortFactor > onDiskMapOutputs.size()) {
> ...
> ..
> TezMerger.writeFile(rIter, writer, progressable, TezRuntimeConfiguration.TEZ_RUNTIME_RECORDS_BEFORE_PROGRESS_DEFAULT);
> ...
> ..
>  {code}
> This can be optimised to avoid spilling to disk when only in-memory segments are present.
> Snippet from the logs of one of the apps (Q78):
> {noformat}
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=839646500 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=859378362 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=856145179 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=849878734 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=842666749 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=839533127 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=860448335 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=844468505 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=850099810 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=849206236 and #onDiskOutputs=0, size=0
>  [ShuffleAndMergeRunner {Map_1} ()] org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: finalMerge with #inMemoryOutputs=4112, size=840238680 and #onDiskOutputs=0, size=0
> {noformat}
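
To illustrate the idea in the description, here is a minimal sketch of the spill decision. It is not the MergeManager code and not the contents of TEZ-4211.2.patch: the class FinalMergeSketch and both method names are hypothetical, the extra "at least one on-disk output" guard is only one assumed way to express "do not spill when only mem segments are present", and the ioSortFactor value of 100 is made up for the example. Only the first condition mirrors the one quoted in the description.

```java
/**
 * Hypothetical sketch only. Not the real MergeManager and not the TEZ-4211 patch.
 */
final class FinalMergeSketch {

  /** Mirrors the condition quoted in the issue description. */
  static boolean spillsWithQuotedCondition(int numMemDiskSegments,
                                           int ioSortFactor,
                                           int numOnDiskOutputs) {
    return numMemDiskSegments > 0 && ioSortFactor > numOnDiskOutputs;
  }

  /**
   * Assumed variant: spill in-memory segments only when on-disk outputs
   * exist to merge them with; otherwise keep everything in memory.
   */
  static boolean spillsWithDiskGuard(int numMemDiskSegments,
                                     int ioSortFactor,
                                     int numOnDiskOutputs) {
    return numOnDiskOutputs > 0
        && spillsWithQuotedCondition(numMemDiskSegments, ioSortFactor, numOnDiskOutputs);
  }

  public static void main(String[] args) {
    // Counts taken from the log snippet: 4112 in-memory outputs, 0 on disk.
    // Treating all 4112 as candidate segments is an assumption,
    // and ioSortFactor = 100 is an assumed value.
    int memSegments = 4112;
    int ioSortFactor = 100;
    int onDiskOutputs = 0;

    System.out.println("quoted condition spills: "
        + spillsWithQuotedCondition(memSegments, ioSortFactor, onDiskOutputs));
    System.out.println("guarded condition spills: "
        + spillsWithDiskGuard(memSegments, ioSortFactor, onDiskOutputs));
  }
}
```

With the counts from the log lines (no on-disk outputs), the quoted condition still chooses to spill, while the guarded variant does not. Whether skipping the spill is safe for a given memory budget is outside the scope of this sketch.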



--
This message was sent by Atlassian Jira
(v8.3.4#803005)