Posted to jira@arrow.apache.org by "niranda perera (Jira)" <ji...@apache.org> on 2021/05/11 20:19:00 UTC

[jira] [Comment Edited] (ARROW-12301) [C++][Compute] Use generic hash-aggregate for DictionaryArrays

    [ https://issues.apache.org/jira/browse/ARROW-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342842#comment-17342842 ] 

niranda perera edited comment on ARROW-12301 at 5/11/21, 8:18 PM:
------------------------------------------------------------------

[~rokm] do you think this is similar to ARROW-9773?


was (Author: niranda):
[~rokm] do you think this is related to ARROW-9773?

> [C++][Compute] Use generic hash-aggregate for DictionaryArrays
> --------------------------------------------------------------
>
>                 Key: ARROW-12301
>                 URL: https://issues.apache.org/jira/browse/ARROW-12301
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Rok Mihevc
>            Priority: Major
>
> When computing unique over a chunked DictionaryArray, we currently iterate over all chunks, unify their dictionaries, and then collect the indices of each chunk. We could avoid the dictionary unification by using a generic hash-aggregate.
> [See discussion here|https://github.com/apache/arrow/pull/9683] and [here|https://issues.apache.org/jira/browse/ARROW-10403]
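
The difference between the two approaches described above can be sketched in plain Python. This is only an illustration under stated assumptions, not Arrow's C++ implementation: a chunk is modelled as a hypothetical (dictionary, indices) pair, nulls are ignored, and the function names are made up for this example.

```python
from typing import Hashable, List, Sequence, Tuple

# Hypothetical stand-in for one dictionary-encoded chunk: (dictionary, indices).
Chunk = Tuple[Sequence[Hashable], Sequence[int]]


def unique_via_unification(chunks: List[Chunk]) -> List[Hashable]:
    """Unify all chunk dictionaries first, then collect the remapped indices."""
    unified: dict = {}  # value -> position in the unified dictionary
    transposed: List[List[int]] = []
    for dictionary, indices in chunks:
        # Map each position in this chunk's dictionary to a unified position.
        mapping = [unified.setdefault(v, len(unified)) for v in dictionary]
        transposed.append([mapping[i] for i in indices])
    values = list(unified)
    seen: dict = {}  # dict keeps first-seen order of unified indices
    for chunk_indices in transposed:
        for i in chunk_indices:
            seen.setdefault(i, None)
    return [values[i] for i in seen]


def unique_via_hash(chunks: List[Chunk]) -> List[Hashable]:
    """Hash the referenced values directly, with no unification pass."""
    seen: dict = {}  # dict keeps first-seen order of values
    for dictionary, indices in chunks:
        for i in indices:
            seen.setdefault(dictionary[i], None)
    return list(seen)


if __name__ == "__main__":
    chunks = [(["a", "b"], [0, 1, 0]), (["b", "c", "d"], [1, 0, 1])]
    print(unique_via_unification(chunks))
    print(unique_via_hash(chunks))
```

Both functions return the same values in first-seen order. The second skips building a unified dictionary and the remapped index lists, which is the saving the issue is after.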



--
This message was sent by Atlassian Jira
(v8.3.4#803005)