Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2016/03/11 04:26:40 UTC

[jira] [Commented] (TEZ-3163) Reuse and tune Inflaters and Deflaters to speed DME processing

    [ https://issues.apache.org/jira/browse/TEZ-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190419#comment-15190419 ] 

Jonathan Eagles commented on TEZ-3163:
--------------------------------------

In a task attempt with 16M DME events, I noticed 35,000 Inflater objects in memory (unreachable, awaiting finalization). This patch reuses the Inflater/Deflater objects, which reduces GC and CPU overhead for large jobs. The attached PERF patch (TEZ-3163.PERF.patch) has two tests showing the value of reusing the Inflaters/Deflaters. In addition, I added the NOWRAP flag, which saves 6 bytes per DME message (the 2-byte zlib header and 4-byte trailer).
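To illustrate the reuse pattern described above, here is a minimal sketch built on java.util.zip. It is not the code from the patch: the class name ReusableCodec, the 4096-byte buffer, and the BEST_SPEED level are assumptions made for illustration only. One Deflater and one Inflater are created up front with nowrap=true and reset() between messages, rather than being allocated per message.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper (not from the TEZ-3163 patch): holds one Deflater and
// one Inflater and reuses them across messages. Not thread-safe.
final class ReusableCodec {
    // nowrap=true produces raw DEFLATE data without the zlib header/trailer.
    // BEST_SPEED is an assumed level, chosen only for this sketch.
    private final Deflater deflater = new Deflater(Deflater.BEST_SPEED, true);
    private final Inflater inflater = new Inflater(true);
    private final byte[] buffer = new byte[4096]; // assumed scratch size

    byte[] compress(byte[] input) {
        deflater.reset(); // clear state left over from the previous message
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    byte[] decompress(byte[] input) {
        inflater.reset();
        // The Inflater Javadoc asks for one extra "dummy" input byte when
        // nowrap is used, so pad the input by a single zero byte.
        inflater.setInput(Arrays.copyOf(input, input.length + 1));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed data", e);
        }
        return out.toByteArray();
    }

    // Release native zlib memory promptly instead of waiting for finalization.
    void close() {
        deflater.end();
        inflater.end();
    }
}
```

A caller would create one ReusableCodec per thread (or per processing loop), call compress/decompress for each message, and call close() when done. Both sides must agree on nowrap: data written with a nowrap Deflater cannot be read by a default Inflater, and vice versa.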

References for these improvements:
https://github.com/ning/jvm-compressor-benchmark/blob/master/src/main/java/com/ning/jcbm/gzip/JDKGzipDriver.java
http://stackoverflow.com/questions/13059533/how-to-use-java-deflateroutputstream
http://java-performance.info/performance-general-compression/

> Reuse and tune Inflaters and Deflaters to speed DME processing
> --------------------------------------------------------------
>
>                 Key: TEZ-3163
>                 URL: https://issues.apache.org/jira/browse/TEZ-3163
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3163.1-branch-0.7.patch, TEZ-3163.1.patch, TEZ-3163.PERF.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)