Posted to common-issues@hadoop.apache.org by "Colin Patrick McCabe (JIRA)" <ji...@apache.org> on 2014/06/21 01:36:24 UTC

[jira] [Updated] (HADOOP-10591) Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed

     [ https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HADOOP-10591:
------------------------------------------

    Attachment: HADOOP-10591.001.patch

This patch makes the one-argument forms of {{Codec#createOutputStream}} and {{Codec#createInputStream}} take their compressors and decompressors from the global {{CodecPool}}, rather than allocating new ones each time.
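
The borrow-on-create, return-on-close pattern described above can be sketched roughly as follows. This is not the patch itself and does not use Hadoop classes: {{SimpleCompressorPool}} and {{PooledDeflaterOutputStream}} are hypothetical names, and {{java.util.zip.Deflater}} stands in for a Hadoop compressor so that the example is self-contained.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Hypothetical stand-in for a global compressor pool; not Hadoop's CodecPool. */
final class SimpleCompressorPool {
    private static final ConcurrentLinkedQueue<Deflater> IDLE = new ConcurrentLinkedQueue<>();
    /** Counts how many compressors were ever constructed. */
    static final AtomicInteger CREATED = new AtomicInteger();

    /** Reuse an idle compressor if one exists, otherwise allocate a new one. */
    static Deflater borrow() {
        Deflater d = IDLE.poll();
        if (d == null) {
            CREATED.incrementAndGet();
            d = new Deflater();
        }
        return d;
    }

    /** Reset the compressor so it can be reused, then make it available again. */
    static void giveBack(Deflater d) {
        d.reset();
        IDLE.add(d);
    }
}

/** One-argument stream that borrows its compressor and returns it on close. */
final class PooledDeflaterOutputStream extends DeflaterOutputStream {
    private boolean returned;

    PooledDeflaterOutputStream(OutputStream out) {
        // Passing an explicit Deflater means super.close() will not end() it.
        super(out, SimpleCompressorPool.borrow());
    }

    @Override
    public void close() throws IOException {
        if (returned) {
            return; // guard against returning the same compressor twice
        }
        returned = true;
        try {
            super.close();
        } finally {
            SimpleCompressorPool.giveBack(def);
        }
    }
}
```

With this shape, callers keep using the one-argument constructor and only need to close the stream; sequential streams then share a single compressor instead of allocating one each.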

> Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10591
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10591
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Hari Shreedharan
>            Assignee: Colin Patrick McCabe
>         Attachments: HADOOP-10591.001.patch
>
>
> Currently, direct buffers allocated by compression codecs such as Gzip (which allocates 2 direct buffers per instance) are not deallocated when the stream is closed. In long-running processes that create a huge number of files, these direct buffers are left hanging until a full GC, which may or may not happen in a reasonable amount of time, especially if the process uses little heap and so rarely triggers one.
> Either these buffers should be pooled or they should be deallocated when the stream is closed.
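
As an illustration of the pooling option from the description, here is a minimal sketch of a direct-buffer pool. All names ({{DirectBufferPoolSketch}}, {{TwoBufferStream}}) are hypothetical and chosen for this example; the two buffers per stream mirror the Gzip figure quoted in the report, and nothing here is taken from the Hadoop source.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pool of same-sized direct buffers, for illustration only. */
final class DirectBufferPoolSketch {
    private final ConcurrentLinkedQueue<ByteBuffer> idle = new ConcurrentLinkedQueue<>();
    private final AtomicInteger allocated = new AtomicInteger();
    private final int bufferSize;

    DirectBufferPoolSketch(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Hand out an idle buffer, allocating off-heap memory only when none is idle. */
    ByteBuffer take() {
        ByteBuffer b = idle.poll();
        if (b == null) {
            allocated.incrementAndGet();
            b = ByteBuffer.allocateDirect(bufferSize);
        }
        return b;
    }

    /** Reset position and limit, then make the buffer reusable. */
    void giveBack(ByteBuffer b) {
        b.clear();
        idle.add(b);
    }

    /** Number of direct buffers allocated over the pool's lifetime. */
    int allocatedCount() {
        return allocated.get();
    }
}

/** Stand-in for a codec stream that holds two direct buffers while open. */
final class TwoBufferStream implements AutoCloseable {
    private final DirectBufferPoolSketch pool;
    private ByteBuffer in;
    private ByteBuffer out;

    TwoBufferStream(DirectBufferPoolSketch pool) {
        this.pool = pool;
        this.in = pool.take();
        this.out = pool.take();
    }

    @Override
    public void close() {
        if (in == null) {
            return; // already closed
        }
        pool.giveBack(in);
        pool.giveBack(out);
        in = null;
        out = null;
    }
}
```

In this sketch, a process that opens and closes many streams one after another keeps reusing the same two buffers, so off-heap usage no longer depends on when a full GC happens to run.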



--
This message was sent by Atlassian JIRA
(v6.2#6252)