Posted to common-issues@hadoop.apache.org by "Colin Patrick McCabe (JIRA)" <ji...@apache.org> on 2014/06/21 01:36:24 UTC
[jira] [Updated] (HADOOP-10591) Compression codecs must use pooled
direct buffers or deallocate direct buffers when stream is closed
[ https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated HADOOP-10591:
------------------------------------------
Attachment: HADOOP-10591.001.patch
This patch makes the one-argument forms of {{Codec#createOutputStream}} and {{Codec#createInputStream}} take their codecs from the global {{CodecPool}} rather than allocating a new one on each call.
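The pooling idea behind {{CodecPool}} can be sketched with a minimal, self-contained pool. All names below ({{BufferPool}}, {{DirectBufferHolder}}) are hypothetical stand-ins, not the real Hadoop classes; the actual {{CodecPool}} keys its pools per codec class and hands out {{Compressor}}/{{Decompressor}} instances.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical holder standing in for a compressor that owns an
// expensive direct buffer.
class DirectBufferHolder {
    final ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
}

// Minimal sketch of the pooled pattern: reuse objects that own direct
// buffers instead of allocating fresh ones for every stream.
class BufferPool {
    private final ConcurrentLinkedDeque<DirectBufferHolder> pool =
            new ConcurrentLinkedDeque<>();

    // Borrow a holder, allocating only when the pool is empty.
    DirectBufferHolder get() {
        DirectBufferHolder h = pool.pollFirst();
        return (h != null) ? h : new DirectBufferHolder();
    }

    // Return the holder on stream close so its direct buffer is reused
    // rather than abandoned to await garbage collection.
    void release(DirectBufferHolder h) {
        h.buf.clear();
        pool.addFirst(h);
    }

    int size() { return pool.size(); }
}
```

With this shape, closing a stream hands its buffers back to the pool, so the direct memory footprint stays bounded by the number of concurrently open streams instead of growing with the number of streams ever opened.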
> Compression codecs must use pooled direct buffers or deallocate direct buffers when stream is closed
> ----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-10591
> URL: https://issues.apache.org/jira/browse/HADOOP-10591
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Hari Shreedharan
> Assignee: Colin Patrick McCabe
> Attachments: HADOOP-10591.001.patch
>
>
> Currently, direct buffers allocated by compression codecs like Gzip (which allocates two direct buffers per instance) are not deallocated when the stream is closed. For long-running processes that create a huge number of files, these direct buffers are left hanging until a full GC, which may or may not happen in a reasonable amount of time, especially if the process does not use much heap.
> Either these buffers should be pooled, or they should be deallocated when the stream is closed.
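The failure mode described above can be sketched in plain Java, with no Hadoop dependencies. The {{openStreamBuffers}} helper is hypothetical; it stands in for a codec stream that allocates two direct buffers per instance, as the Gzip codec does. The key fact is that {{ByteBuffer.allocateDirect}} memory lives off-heap and is reclaimed only when the buffer object itself is garbage-collected, so dropping the references on close frees nothing immediately.

```java
import java.nio.ByteBuffer;

public class DirectBufferChurn {
    static final int BUF_SIZE = 4 * 1024;

    // Hypothetical stand-in for a codec stream that allocates two
    // direct buffers per instance on open.
    static ByteBuffer[] openStreamBuffers() {
        return new ByteBuffer[] {
            ByteBuffer.allocateDirect(BUF_SIZE),
            ByteBuffer.allocateDirect(BUF_SIZE),
        };
    }

    public static void main(String[] args) {
        long allocated = 0;
        for (int i = 0; i < 1000; i++) {
            ByteBuffer[] bufs = openStreamBuffers();
            allocated += (long) bufs.length * BUF_SIZE;
            // "Closing" the stream just drops the references here;
            // the native memory is only freed once the buffers are
            // eventually garbage-collected.
        }
        System.out.println(allocated + " bytes of off-heap churn");
    }
}
```

In a heap-light process, a full GC may not run for a long time, which is exactly why the native memory behind these abandoned buffers accumulates.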
--
This message was sent by Atlassian JIRA
(v6.2#6252)