Posted to issues@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2014/11/05 05:23:33 UTC
[jira] [Commented] (TEZ-1733) TezMerger should sort FileChunks on size when merging
[ https://issues.apache.org/jira/browse/TEZ-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197466#comment-14197466 ]
Gopal V commented on TEZ-1733:
------------------------------
This is sufficient & low-risk enough to satisfy the merger IO issues.
LGTM - +1.
> TezMerger should sort FileChunks on size when merging
> -----------------------------------------------------
>
> Key: TEZ-1733
> URL: https://issues.apache.org/jira/browse/TEZ-1733
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.2
> Reporter: Gopal V
> Assignee: Prakash Ramachandran
> Priority: Critical
> Attachments: TEZ-1733.1.patch, TEZ-1733.1.patch, TEZ-1733.2.patch, TEZ-1733.3.patch
>
>
> MAPREDUCE-3685 fixed the Merger sort order for file chunks to use the decompressed size, to cut down on CPU and IO costs.
> TezMerger needs an equivalent sorted TreeSet that sorts the data by size.
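
For illustration only, here is a minimal sketch of the idea described in the issue: a TreeSet ordered by decompressed size, so that the smallest chunks come first in the merge order. The class and field names (ChunkSortSketch, Chunk, rawLength) are hypothetical stand-ins and are not taken from the Tez code base or from the attached patches. The tie-break on path is an assumption added here because a TreeSet silently drops elements that its comparator treats as equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ChunkSortSketch {

    // Hypothetical stand-in for a file chunk; NOT the actual Tez FileChunk class.
    static final class Chunk {
        final String path;
        final long rawLength; // assumed to hold the decompressed size in bytes

        Chunk(String path, long rawLength) {
            this.path = path;
            this.rawLength = rawLength;
        }
    }

    // Order by decompressed size first; break ties on path so that two
    // chunks of equal size are both kept by the TreeSet.
    static final Comparator<Chunk> BY_SIZE_THEN_PATH = (a, b) -> {
        int bySize = Long.compare(a.rawLength, b.rawLength);
        return bySize != 0 ? bySize : a.path.compareTo(b.path);
    };

    // Returns the chunk paths in the order a size-sorted merge would visit them.
    static List<String> mergeOrder(List<Chunk> chunks) {
        TreeSet<Chunk> sorted = new TreeSet<>(BY_SIZE_THEN_PATH);
        sorted.addAll(chunks);
        List<String> order = new ArrayList<>();
        for (Chunk c : sorted) {
            order.add(c.path);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Chunk> chunks = Arrays.asList(
            new Chunk("spill-2", 4096L),
            new Chunk("spill-0", 512L),
            new Chunk("spill-1", 4096L));
        System.out.println(mergeOrder(chunks));
    }
}
```

Sorting on the decompressed size rather than the on-disk size follows the approach the issue attributes to MAPREDUCE-3685; how the actual patch obtains that size is not shown here.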