Posted to issues@orc.apache.org by "Dongjoon Hyun (Jira)" <ji...@apache.org> on 2021/08/27 03:53:00 UTC
[jira] [Resolved] (ORC-848) Recycle Internal Buffer in StringHashTableDictionary
[ https://issues.apache.org/jira/browse/ORC-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun resolved ORC-848.
-------------------------------
Fix Version/s: 1.7.0
Resolution: Fixed
This is resolved via https://github.com/apache/orc/pull/751
> Recycle Internal Buffer in StringHashTableDictionary
> ----------------------------------------------------
>
> Key: ORC-848
> URL: https://issues.apache.org/jira/browse/ORC-848
> Project: ORC
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
> Fix For: 1.7.0
>
>
> {code:java|title=StringHashTableDictionary.java}
> private void initHashBuckets(int capacity) {
>   DynamicIntArray[] buckets = new DynamicIntArray[capacity];
>   for (int i = 0; i < capacity; i++) {
>     // We don't need large bucket: If we have more than a handful of collisions,
>     // then the table is too small or the function isn't good.
>     buckets[i] = createBucket();
>   }
>   hashBuckets = buckets;
> }
> {code}
> This code was highlighted for me in a JMH run of the perf test. The {{Dictionary}} is regularly cleared and reset to its default state. I'm sure most of the time is spent creating the {{capacity}} buckets (buffers), but we can save one array allocation by creating a new {{buckets}} array only if the requested capacity differs from the current one (which is not the case with a {{clear()}}).
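The idea described above can be sketched roughly as follows. This is a minimal illustration, not the change merged for this issue: the class RecyclingBucketsSketch, the nested Bucket type (a stand-in for DynamicIntArray), the arrayAllocations counter, and the add/bucketSize/resize helpers are all hypothetical names introduced for this example. It assumes only that the outer array can be kept when the requested capacity equals the current length, while the individual buckets are still recreated.

```java
import java.util.Arrays;

// Hypothetical stand-in for StringHashTableDictionary; not the ORC implementation.
class RecyclingBucketsSketch {

    // Hypothetical stand-in for ORC's DynamicIntArray: a tiny growable int list.
    static final class Bucket {
        private int[] data = new int[4];
        private int size;

        void add(int value) {
            if (size == data.length) {
                data = Arrays.copyOf(data, size * 2);
            }
            data[size++] = value;
        }

        int size() {
            return size;
        }
    }

    private Bucket[] hashBuckets;

    // Counts outer-array allocations so the saving is observable (illustration only).
    int arrayAllocations = 0;

    RecyclingBucketsSketch(int capacity) {
        initHashBuckets(capacity);
    }

    private void initHashBuckets(int capacity) {
        // Allocate the outer array only if none exists yet or the capacity changed.
        if (hashBuckets == null || hashBuckets.length != capacity) {
            hashBuckets = new Bucket[capacity];
            arrayAllocations++;
        }
        // The buckets themselves are still recreated, as in the original method.
        for (int i = 0; i < capacity; i++) {
            hashBuckets[i] = new Bucket();
        }
    }

    void add(int bucketIndex, int value) {
        hashBuckets[bucketIndex].add(value);
    }

    int bucketSize(int bucketIndex) {
        return hashBuckets[bucketIndex].size();
    }

    // clear() requests the same capacity, so the outer array is recycled.
    void clear() {
        initHashBuckets(hashBuckets.length);
    }

    // A different capacity forces a fresh outer array.
    void resize(int newCapacity) {
        initHashBuckets(newCapacity);
    }
}
```

In this sketch, calling clear() leaves arrayAllocations unchanged and empties every bucket, whereas resize() with a different capacity increments the counter. Whether the per-bucket allocations could also be recycled is a separate question that this example does not address.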
--
This message was sent by Atlassian Jira
(v8.3.4#803005)