Posted to issues@tez.apache.org by "Saikat (JIRA)" <ji...@apache.org> on 2015/09/08 22:42:45 UTC

[jira] [Comment Edited] (TEZ-2643) Minimize number of empty spills in Pipelined Sorter

    [ https://issues.apache.org/jira/browse/TEZ-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14735571#comment-14735571 ] 

Saikat edited comment on TEZ-2643 at 9/8/15 8:41 PM:
-----------------------------------------------------

Thanks [~rajesh.balamohan] for the review comments.
I made the following changes in patchset 2643.1.
Comment 2: Moved the spillrecords init to after the check for ignoreSpillIfNeeded.

Comment 1: 
I didn't want to put sendPipelinedShuffleEvents() inside spill() because of the following:
a. In flush(), if sendPipelinedShuffleEvents() is called from spill(), then we need to change the if (!isFinalMergeEnabled) {} block, where only one event is sent out for the last spill.
b. Also, sendPipelinedShuffleEvents() hard-codes isLastEvent to false, so we would need to pass a lastEvent flag to sendPipelinedShuffleEvents().
That would be too many changes, so I return a boolean value from spill() to let the caller know whether there was actually a spill. The caller can then decide whether to send events, whether it is the last event, etc.
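
To illustrate the approach, here is a minimal sketch of a spill() that reports whether it actually spilled, so that the caller decides about events. This is not the code from the patch: the class SpillSketch, its fields, and the helpers write(), sendEvent() and spillAndNotify() are hypothetical names invented for this example, and the real sorter's handling of an empty final spill is not modelled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the sorter; NOT the real PipelinedSorter. */
class SpillSketch {
    private final List<String> buffer = new ArrayList<>();
    // Records the events that would be sent, for inspection.
    final List<String> events = new ArrayList<>();
    private int numSpills = 0;

    void write(String record) {
        buffer.add(record);
    }

    /** Returns true only if data was actually spilled. */
    boolean spill() {
        if (buffer.isEmpty()) {
            return false; // nothing buffered: skip the empty spill
        }
        buffer.clear();   // stands in for writing the data out
        numSpills++;
        return true;
    }

    /** Assumed helper: the caller, not spill(), supplies isLastEvent. */
    private void sendEvent(boolean isLastEvent) {
        events.add("spill=" + (numSpills - 1) + " last=" + isLastEvent);
    }

    /** Intermediate spill: send an event only if something was spilled. */
    void spillAndNotify() {
        if (spill()) {
            sendEvent(false);
        }
    }

    /** Final spill: the caller knows this is the last event. */
    void flush() {
        if (spill()) {
            sendEvent(true);
        }
    }
}
```

Because spill() only returns a flag, neither the existing event-sending method nor the branch in flush() needs a new parameter in this sketch; each call site picks the isLastEvent value it already knows.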


> Minimize number of empty spills in Pipelined Sorter
> ---------------------------------------------------
>
>                 Key: TEZ-2643
>                 URL: https://issues.apache.org/jira/browse/TEZ-2643
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Saikat
>            Assignee: Saikat
>         Attachments: TEZ-2643.1.patch, TEZ-2643.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)