Posted to issues@tez.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2015/06/25 02:45:04 UTC

[jira] [Commented] (TEZ-2575) Handle KeyValue pairs size which donot fit in a single block

    [ https://issues.apache.org/jira/browse/TEZ-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600449#comment-14600449 ] 

Rajesh Balamohan commented on TEZ-2575:
---------------------------------------

In regular use of PipelinedSorter, the requested "blkSize" is 0, and in that case the min block size is 2 GB.  So if the total memory is 1 MB, it ends up creating a single block.  If a single KV pair is larger than this limit, it throws a buffer exception, which is the same behavior as in DefaultSorter.

> Handle KeyValue pairs size which donot fit in a single block
> ------------------------------------------------------------
>
>                 Key: TEZ-2575
>                 URL: https://issues.apache.org/jira/browse/TEZ-2575
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Saikat
>            Assignee: Saikat
>
> In the present implementation, the available buffer is divided into blocks (specified in the constructor for pipeline sort), and a linked list of these block byte buffers is maintained.
> A span is created out of the buffers.
> The present logic does not handle the scenario where a single key-value pair does not fit into any of the blocks.
> For example, if 1 MB of total memory is divided into 4 blocks (256 KB each)
> and a single KV pair is larger than the block size (ignoring metadata size),
> then it fails with buffer exceptions.
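
The scenario in the description can be reproduced outside Tez with plain java.nio buffers. The sketch below is only an illustration, not the PipelinedSorter code: the class BlockFitSketch and the helpers allocateBlocks and tryWrite are hypothetical names, and the equal-sized split is an assumption taken from the 1 MB / 4 blocks example above. It shows that a KV pair larger than one block overflows that block even though the total memory would be large enough.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these names are hypothetical and not part of Tez.
class BlockFitSketch {

    // Split totalBytes into numBlocks equally sized byte buffers (assumed layout).
    static List<ByteBuffer> allocateBlocks(int totalBytes, int numBlocks) {
        List<ByteBuffer> blocks = new ArrayList<>();
        int blockSize = totalBytes / numBlocks;
        for (int i = 0; i < numBlocks; i++) {
            blocks.add(ByteBuffer.allocate(blockSize));
        }
        return blocks;
    }

    // Try to write one KV pair into a single block.
    // Returns false, and restores the block position, if the pair does not fit.
    static boolean tryWrite(ByteBuffer block, byte[] key, byte[] value) {
        int start = block.position();
        try {
            block.put(key).put(value);
            return true;
        } catch (BufferOverflowException e) {
            block.position(start); // undo a partially written key
            return false;
        }
    }

    public static void main(String[] args) {
        int totalBytes = 1024 * 1024; // 1 MB total memory, as in the description
        List<ByteBuffer> blocks = allocateBlocks(totalBytes, 4); // 256 KB each

        byte[] key = new byte[16];
        byte[] smallValue = new byte[100 * 1024]; // fits in one 256 KB block
        byte[] largeValue = new byte[300 * 1024]; // larger than one block

        System.out.println("small pair written: " + tryWrite(blocks.get(0), key, smallValue));
        System.out.println("large pair written: " + tryWrite(blocks.get(1), key, largeValue));

        // With a single block covering all memory, the 300 KB pair fits,
        // but a pair larger than the total memory would still overflow.
        ByteBuffer single = allocateBlocks(totalBytes, 1).get(0);
        System.out.println("large pair in single block: " + tryWrite(single, key, largeValue));
    }
}
```

In this sketch the overflow is caught and reported as a boolean. Whether the sorter should instead spill, grow a block, or span the pair across blocks is the open question of this issue and is not decided here.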



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)