Posted to jira@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/06/21 18:40:00 UTC

[jira] [Updated] (ARROW-12632) [C++][Dataset][Compute] Add support for dictionary_encode to Expression

     [ https://issues.apache.org/jira/browse/ARROW-12632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Kietzman updated ARROW-12632:
---------------------------------
    Fix Version/s:     (was: 5.0.0)
                   6.0.0

> [C++][Dataset][Compute] Add support for dictionary_encode to Expression
> -----------------------------------------------------------------------
>
>                 Key: ARROW-12632
>                 URL: https://issues.apache.org/jira/browse/ARROW-12632
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>    Affects Versions: 4.0.0
>            Reporter: Ben Kietzman
>            Assignee: Ben Kietzman
>            Priority: Major
>              Labels: dataset
>             Fix For: 6.0.0
>
>
> dictionary_encode should be usable in the context of ExecuteScalarExpression, but it is not currently supported because it requires mutable state (the hash table). Scanning currently assumes that Expression state will not be mutated, so only one instance is initialized and shared between all threads of execution. Supporting dictionary_encode will require adding support for multiple states to Expression and having dataset scans make use of them.
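
The constraint described in the issue can be illustrated with a minimal sketch in standard C++. It does not use Arrow's API: DictionaryState and EncodeBatches are hypothetical names standing in for the kernel's hash table state and for a multi-threaded scan. The sketch assumes one simple way of providing "multiple states", namely a private state per thread, so that no hash table is mutated concurrently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the mutable state (the hash table) that
// dictionary encoding needs. This is not Arrow's implementation.
struct DictionaryState {
  std::unordered_map<std::string, int32_t> index_of;
  std::vector<std::string> dictionary;

  // Returns the dictionary index of `value`, inserting it if unseen.
  // This mutates the state, so it is unsafe to call from several
  // threads on the same instance without synchronization.
  int32_t Encode(const std::string& value) {
    auto it = index_of.find(value);
    if (it != index_of.end()) return it->second;
    int32_t index = static_cast<int32_t>(dictionary.size());
    index_of.emplace(value, index);
    dictionary.push_back(value);
    return index;
  }
};

// Encodes each batch on its own thread. Every thread gets a private
// DictionaryState (an assumption of this sketch) instead of sharing one.
std::vector<std::vector<int32_t>> EncodeBatches(
    const std::vector<std::vector<std::string>>& batches) {
  std::vector<std::vector<int32_t>> indices(batches.size());
  std::vector<DictionaryState> states(batches.size());
  std::vector<std::thread> workers;
  for (std::size_t i = 0; i < batches.size(); ++i) {
    workers.emplace_back([&, i] {
      for (const auto& value : batches[i]) {
        indices[i].push_back(states[i].Encode(value));
      }
    });
  }
  for (auto& worker : workers) worker.join();
  return indices;
}
```

In this sketch the indices produced for a batch are only meaningful relative to that batch's own dictionary; reconciling dictionaries across states is a separate problem that the sketch does not address.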



--
This message was sent by Atlassian Jira
(v8.3.4#803005)