Posted to jira@arrow.apache.org by "Apache Arrow JIRA Bot (Jira)" <ji...@apache.org> on 2022/09/29 17:52:00 UTC

[jira] [Commented] (ARROW-16920) [Java] DictionaryProvider leaks memory while adding dictionaries with duplicate encoding

    [ https://issues.apache.org/jira/browse/ARROW-16920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611159#comment-17611159 ] 

Apache Arrow JIRA Bot commented on ARROW-16920:
-----------------------------------------------

This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per [project policy|https://arrow.apache.org/docs/dev/developers/bug_reports.html#issue-assignment]. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

> [Java] DictionaryProvider leaks memory while adding dictionaries with duplicate encoding
> ----------------------------------------------------------------------------------------
>
>                 Key: ARROW-16920
>                 URL: https://issues.apache.org/jira/browse/ARROW-16920
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 7.0.0
>            Reporter: Vimal Varghese
>            Assignee: David Dali Susanibar Arce
>            Priority: Major
>
> DictionaryProvider leaks memory when dictionaries with a duplicate encoding id are added. Is this expected? Should the provider release the memory of the existing dictionary vector when it accepts another one with the same encoding id?
> Sample code:
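> {code:scala}
> // ScalaTest fragment (Scala, not Java); it is meant to sit inside a FlatSpec with Matchers.
> import org.apache.arrow.memory.RootAllocator
> import org.apache.arrow.vector.complex.ListVector
> import org.apache.arrow.vector.complex.impl.UnionListWriter
> import org.apache.arrow.vector.dictionary.{Dictionary, DictionaryProvider}
> import org.apache.arrow.vector.types.pojo.DictionaryEncoding
> import scala.collection.JavaConverters._ // supplies asScala
>
> "dictionaryProvider" should "not leak memory while adding dictionaries with duplicate encoding" in {
>   val allocator: RootAllocator = new RootAllocator()
>   val vector: ListVector = ListVector.empty("vector", allocator)
>   val dictionaryVector1: ListVector = ListVector.empty("dict1", allocator)
>   val dictionaryVector2: ListVector = ListVector.empty("dict2", allocator)
>   // Allocate buffers for the data vector and for both dictionary vectors.
>   val writer1: UnionListWriter = vector.getWriter
>   writer1.allocate
>   writer1.setValueCount(1)
>   val dictWriter1: UnionListWriter = dictionaryVector1.getWriter
>   dictWriter1.allocate
>   dictWriter1.setValueCount(1)
>   val dictWriter2: UnionListWriter = dictionaryVector2.getWriter
>   dictWriter2.allocate
>   dictWriter2.setValueCount(1)
>   // Both dictionaries share encoding id 1L; the index type is left null.
>   val dictionary1: Dictionary = new Dictionary(dictionaryVector1, new DictionaryEncoding(1L, false, null))
>   val dictionary2: Dictionary = new Dictionary(dictionaryVector2, new DictionaryEncoding(1L, false, null))
>   val provider = new DictionaryProvider.MapDictionaryProvider
>   provider.put(dictionary1)
>   provider.put(dictionary2) // same id as dictionary1
>   vector.clear()
>   // Clear every dictionary vector that is still reachable through the provider.
>   provider.getDictionaryIds.asScala.foreach(id => provider.lookup(id).getVector.clear())
>   // This is the assertion the report says fails: memory is still allocated.
>   allocator.getAllocatedMemory shouldBe 0
> } {code}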
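
The mechanism behind the report can be illustrated without Arrow on the classpath. What follows is only a minimal sketch, assuming the map-backed provider keeps one entry per encoding id and that put() overwrites that entry without releasing the vector it displaces. Allocator, Vec, Provider and putAndRelease are hypothetical stand-ins invented for the illustration; none of them is an Arrow class or method.

```java
import java.util.HashMap;
import java.util.Map;

public class DuplicateIdLeakSketch {

    /** Hypothetical stand-in for an allocator that tracks outstanding bytes. */
    static final class Allocator {
        private long allocated;

        long getAllocatedMemory() {
            return allocated;
        }
    }

    /** Hypothetical stand-in for a dictionary vector that owns allocator memory. */
    static final class Vec {
        private final Allocator allocator;
        private long size;

        Vec(Allocator allocator, long size) {
            this.allocator = allocator;
            this.size = size;
            allocator.allocated += size;
        }

        /** Returns this vector's bytes to the allocator; safe to call twice. */
        void clear() {
            allocator.allocated -= size;
            size = 0;
        }
    }

    /** Assumed behaviour: one entry per id, and put() only overwrites. */
    static final class Provider {
        private final Map<Long, Vec> byId = new HashMap<>();

        void put(long id, Vec vec) {
            byId.put(id, vec);
        }

        /** Variant that releases the vector displaced by the new one. */
        void putAndRelease(long id, Vec vec) {
            Vec previous = byId.put(id, vec);
            if (previous != null && previous != vec) {
                previous.clear();
            }
        }

        /** Clears every vector still reachable through the provider. */
        void clearAll() {
            byId.values().forEach(Vec::clear);
        }
    }

    /** Two puts with the same id, then clear what the provider still holds. */
    public static long leakedAfterOverwrite() {
        Allocator allocator = new Allocator();
        Provider provider = new Provider();
        provider.put(1L, new Vec(allocator, 64));
        provider.put(1L, new Vec(allocator, 64)); // first vector becomes unreachable
        provider.clearAll();
        return allocator.getAllocatedMemory();
    }

    /** Same sequence, but the displaced vector is released on the second put. */
    public static long leakedAfterRelease() {
        Allocator allocator = new Allocator();
        Provider provider = new Provider();
        provider.putAndRelease(1L, new Vec(allocator, 64));
        provider.putAndRelease(1L, new Vec(allocator, 64));
        provider.clearAll();
        return allocator.getAllocatedMemory();
    }

    public static void main(String[] args) {
        System.out.println("overwrite only: " + leakedAfterOverwrite()); // prints 64
        System.out.println("release on put: " + leakedAfterRelease());   // prints 0
    }
}
```

In this model the first vector's 64 bytes stay allocated because nothing that iterates the provider can reach it any more, which matches the non-zero getAllocatedMemory in the sample test. A caller-side analogue with Arrow would presumably be to call provider.lookup(id) and clear the returned dictionary's vector before the second put; whether the provider itself should take on that responsibility is the question this issue leaves open.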


