Posted to jira@arrow.apache.org by "niranda perera (Jira)" <ji...@apache.org> on 2021/05/11 20:19:00 UTC
[jira] [Comment Edited] (ARROW-12301) [C++][Compute] Use generic hash-aggregate for DictionaryArrays
[ https://issues.apache.org/jira/browse/ARROW-12301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342842#comment-17342842 ]
niranda perera edited comment on ARROW-12301 at 5/11/21, 8:18 PM:
------------------------------------------------------------------
[~rokm] do you think this is similar to ARROW-9773?
was (Author: niranda):
[~rokm] do you think this is related to ARROW-9773?
> [C++][Compute] Use generic hash-aggregate for DictionaryArrays
> --------------------------------------------------------------
>
> Key: ARROW-12301
> URL: https://issues.apache.org/jira/browse/ARROW-12301
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Rok Mihevc
> Priority: Major
>
> When computing unique values for chunked DictionaryArrays, we currently iterate over all chunks, unify their dictionaries, and then collect the chunk indices. We could avoid the dictionary unification by using a generic hash.
> [See discussion here|https://github.com/apache/arrow/pull/9683] and [here|https://issues.apache.org/jira/browse/ARROW-10403]
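
For illustration only, here is a minimal standalone sketch of one way to read the idea in the description: hash the decoded values directly, so the per-chunk dictionaries never need to be unified. It does not use the Arrow API. DictChunk and UniqueAcrossChunks are hypothetical names invented for this sketch, the values are assumed to be strings, and nulls are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for one chunk of a dictionary-encoded array:
// each chunk carries its own dictionary plus indices into it.
struct DictChunk {
  std::vector<std::string> dictionary;
  std::vector<int32_t> indices;
};

// Collects the distinct values across all chunks in first-seen order.
// Each index is decoded through its own chunk's dictionary and the
// decoded value is hashed, so no unified dictionary is ever built.
std::vector<std::string> UniqueAcrossChunks(const std::vector<DictChunk>& chunks) {
  std::unordered_set<std::string> seen;
  std::vector<std::string> result;
  for (const DictChunk& chunk : chunks) {
    for (int32_t index : chunk.indices) {
      // Indices must point inside the chunk's own dictionary.
      assert(index >= 0 &&
             static_cast<std::size_t>(index) < chunk.dictionary.size());
      const std::string& value = chunk.dictionary[index];
      if (seen.insert(value).second) {
        result.push_back(value);  // first time this value is seen
      }
    }
  }
  return result;
}
```

The sketch trades the dictionary unification pass for one hash lookup per element. Whether that is faster in practice depends on chunk sizes and dictionary overlap, which the sketch does not measure.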