Posted to issues@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2014/04/19 04:12:15 UTC

[jira] [Commented] (TEZ-500) RLE in IFile does not seem to work correctly

    [ https://issues.apache.org/jira/browse/TEZ-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974689#comment-13974689 ] 

Gopal V commented on TEZ-500:
-----------------------------

Is this still a problem?

I turned on RLE today to test out my theory and did a few ETL loads with it.

> RLE in IFile does not seem to work correctly
> --------------------------------------------
>
>                 Key: TEZ-500
>                 URL: https://issues.apache.org/jira/browse/TEZ-500
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>
> The compressed length reported by the writer is typically larger than the uncompressed length. The size of the generated output file matches the uncompressed length.
> The Shuffle fetchers allocate buffers based on the compressed length and pull that much data. As a result, the entire contents are not pulled in.
> Also, even if the entire content is pulled in, nextRawKey and nextRawValue fail the moment a repeated key is hit.
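For readers unfamiliar with the idea, here is a minimal Java sketch of run-length encoding repeated keys in a key/value stream. It is not the actual Tez IFile format: the class name `RleKeySketch`, the `SAME_KEY` marker value, and the record layout (int length prefix, then bytes) are all assumptions made for illustration. It only shows why the reader must recognise the repeat marker and why the byte count reported by the writer has to match what was actually written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the Tez IFile implementation.
class RleKeySketch {
    // Hypothetical marker written in place of a key length when the key repeats.
    static final int SAME_KEY = -2;

    // Each record is a {key, value} pair; consecutive equal keys are written once.
    static byte[] encode(List<String[]> records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        String previousKey = null;
        for (String[] kv : records) {
            if (kv[0].equals(previousKey)) {
                out.writeInt(SAME_KEY);            // key omitted, marker only
            } else {
                byte[] key = kv[0].getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);
                out.write(key);
                previousKey = kv[0];
            }
            byte[] value = kv[1].getBytes(StandardCharsets.UTF_8);
            out.writeInt(value.length);
            out.write(value);
        }
        out.flush();
        return bytes.toByteArray();               // length here is the true on-disk size
    }

    static List<String[]> decode(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        List<String[]> records = new ArrayList<>();
        String previousKey = null;
        while (in.available() > 0) {
            int keyLength = in.readInt();
            if (keyLength == SAME_KEY) {
                // Repeated key: reuse the previous one instead of reading bytes.
                if (previousKey == null) {
                    throw new IOException("repeat marker before any key");
                }
            } else {
                byte[] key = new byte[keyLength];
                in.readFully(key);
                previousKey = new String(key, StandardCharsets.UTF_8);
            }
            byte[] value = new byte[in.readInt()];
            in.readFully(value);
            records.add(new String[] {previousKey, new String(value, StandardCharsets.UTF_8)});
        }
        return records;
    }
}
```

In this sketch, a reader that is handed fewer bytes than `encode` produced fails with an `EOFException` from `readInt` or `readFully`, and a reader that does not handle `SAME_KEY` would try to allocate a negative-length key array. Both are analogous to, though not necessarily the same as, the failures described above.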
