Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2015/11/19 07:03:11 UTC

[jira] [Updated] (TEZ-2950) Poor performance of UnorderedPartitionedKVWriter

     [ https://issues.apache.org/jira/browse/TEZ-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-2950:
---------------------------------
    Assignee:     (was: Jonathan Eagles)

> Poor performance of UnorderedPartitionedKVWriter
> ------------------------------------------------
>
>                 Key: TEZ-2950
>                 URL: https://issues.apache.org/jira/browse/TEZ-2950
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>
> Came across a job that was spending a long time in UnorderedPartitionedKVWriter.mergeAll. It was decompressing and reading data from the spill files (8500 spills) and then writing the final compressed merged file. Why do we need spill files for UnorderedPartitionedKVWriter? Why not just buffer and write directly to the final file, which would save a lot of time?
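
For illustration only, the spill-then-merge pattern described in the report can be sketched roughly as follows. This is not the Tez implementation: the class and method names (SpillMergeSketch, mergeAll) are hypothetical, spills are modelled as in-memory byte arrays instead of compressed files, and the sketch assumes the final output must hold each partition's data contiguously. Under that assumption every spill is revisited once per partition, which hints at why a merge over thousands of spills can become slow.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual UnorderedPartitionedKVWriter code.
public class SpillMergeSketch {

    /**
     * Concatenates spills into one partition-ordered output.
     * Each spill is a byte[][] where spill[p] holds the bytes that were
     * buffered for partition p when that spill was written.
     * Assumption: the final output stores partition 0 first, then 1, and so on.
     */
    static byte[] mergeAll(List<byte[][]> spills, int numPartitions) {
        ByteArrayOutputStream finalFile = new ByteArrayOutputStream();
        for (int p = 0; p < numPartitions; p++) {
            // Every spill is read again for every partition:
            // spills.size() * numPartitions segment reads in total.
            for (byte[][] spill : spills) {
                byte[] segment = spill[p];
                finalFile.write(segment, 0, segment.length);
            }
        }
        return finalFile.toByteArray();
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<byte[][]> spills = new ArrayList<>();
        // Two spills, two partitions each.
        spills.add(new byte[][] { bytes("a"), bytes("b") });
        spills.add(new byte[][] { bytes("c"), bytes("d") });

        byte[] merged = mergeAll(spills, 2);
        // Partition 0 segments ("a", "c") come first, then partition 1 ("b", "d").
        System.out.println(new String(merged, StandardCharsets.UTF_8));
    }
}
```

With 8500 spills, the inner loop in this sketch would touch 8500 segments for each partition; a real writer would also pay decompression and recompression costs on top of that.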



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)