Posted to notifications@accumulo.apache.org by "Eric Newton (JIRA)" <ji...@apache.org> on 2015/12/28 19:13:49 UTC
[jira] [Commented] (ACCUMULO-624) iterators may open lots of compressors
[ https://issues.apache.org/jira/browse/ACCUMULO-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072964#comment-15072964 ]
Eric Newton commented on ACCUMULO-624:
--------------------------------------
I wrote a little experiment: 10 threads allocate 100K decompressors each.
Using {{gz.returnDecompressor(gz.getDecompressor())}}, all threads complete in 1.4 seconds.
Using {{gz.getCodec().createDecompressor()}}, all threads complete in 20 seconds.
So the pool is quite a bit faster. But even without it, allocating a decompressor still takes less than a millisecond.
It seems we are not the only ones who think that [codec reuse may not be worth it|https://github.com/prestodb/presto-hive-apache/blob/master/src/main/java/org/apache/hadoop/hive/ql/io/CodecPool.java].
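The experiment itself is not attached to the ticket, but its shape can be sketched with plain JDK classes. This is a hypothetical reconstruction: it uses {{java.util.zip.Inflater}} as a stand-in for a Hadoop {{Decompressor}} (both hold native buffers, so construction has real cost), and a hand-rolled queue in place of Hadoop's {{CodecPool}}. The names {{InflaterPool}} and {{PoolBench}} are invented for illustration.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.zip.Inflater;

// Hypothetical stand-in for Hadoop's CodecPool: a queue that only
// ever grows, which is exactly the behavior the issue complains about.
class InflaterPool {
    private final ConcurrentLinkedQueue<Inflater> pool = new ConcurrentLinkedQueue<>();

    Inflater get() {
        Inflater inf = pool.poll();
        return inf != null ? inf : new Inflater();
    }

    void recycle(Inflater inf) {
        inf.reset();          // make the instance reusable
        pool.offer(inf);      // the pool never shrinks
    }

    int size() { return pool.size(); }
}

// Rough analogue of the comment's benchmark: N threads each perform
// M get/recycle cycles (pooled) or M fresh allocations (unpooled).
public class PoolBench {
    static long run(boolean usePool, int threads, int perThread) {
        InflaterPool pool = new InflaterPool();
        long start = System.nanoTime();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (usePool) {
                        Inflater inf = pool.get();
                        pool.recycle(inf);
                    } else {
                        new Inflater().end();  // fresh allocation, released immediately
                    }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return (System.nanoTime() - start) / 1_000_000;  // elapsed millis
    }

    public static void main(String[] args) {
        System.out.println("pooled:   " + run(true, 10, 100_000) + " ms");
        System.out.println("unpooled: " + run(false, 10, 100_000) + " ms");
    }
}
```

With the actual Hadoop codecs the gap was 1.4 s versus 20 s for 10 threads at 100K allocations each; the JDK numbers will differ, but the pooled path should still win while the unpooled cost stays well under a millisecond per allocation.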
> iterators may open lots of compressors
> --------------------------------------
>
> Key: ACCUMULO-624
> URL: https://issues.apache.org/jira/browse/ACCUMULO-624
> Project: Accumulo
> Issue Type: Bug
> Components: tserver
> Reporter: Eric Newton
>
> A large iterator tree may create many instances of Compressors. These instances are pulled from a pool that never decreases in size. So, if 50 simultaneous queries are run over dozens of files, each with a complex iterator stack, there will be thousands of compressors created. Each of these holds a large buffer. This can cause the server to run out of memory.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)