Posted to jira@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2021/02/20 03:41:00 UTC

[jira] [Closed] (ARROW-5345) [C++] Relax Field hashing in DictionaryMemo

     [ https://issues.apache.org/jira/browse/ARROW-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney closed ARROW-5345.
-------------------------------
    Resolution: Later

There isn't a clear enough problem here to solve, so I'm closing this.

> [C++] Relax Field hashing in DictionaryMemo
> -------------------------------------------
>
>                 Key: ARROW-5345
>                 URL: https://issues.apache.org/jira/browse/ARROW-5345
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> Follow-up to ARROW-3144.
> Currently we associate dictionaries with fields through a hash table that maps a Field's memory address to a dictionary id. This is a problem when two RecordBatches are equal (equal field names, equal types) but were instantiated separately, because their Field objects live at different addresses. We don't have a hash function for Field in C++, so we should consider implementing one and using it instead (if it is not too expensive). That way, fields that are the same but "different" (different C++ objects) won't blow up in the user's face with an unintuitive error. This did in fact occur once in the Python test suite; I'm not sure exactly why it wasn't a problem before, but I think it worked "by accident".
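
To illustrate the difference described above, here is a minimal sketch that contrasts address-based and value-based lookup. It does not use Arrow's actual classes: SimpleField, SimpleFieldHash, SimpleFieldEqual, AddressMemo and ValueMemo are hypothetical stand-ins invented for this example, and the type is reduced to a plain string. A real Field hash would also have to decide how to treat nested types and metadata.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a field: a name, a type name, and nullability.
struct SimpleField {
  std::string name;
  std::string type_name;
  bool nullable;
};

// Value-based hash: combines the hashes of all members (assumed design,
// using a boost-style hash_combine step).
struct SimpleFieldHash {
  std::size_t operator()(const SimpleField& f) const {
    std::size_t h = std::hash<std::string>{}(f.name);
    auto combine = [&h](std::size_t v) {
      h ^= v + 0x9e3779b9u + (h << 6) + (h >> 2);
    };
    combine(std::hash<std::string>{}(f.type_name));
    combine(std::hash<bool>{}(f.nullable));
    return h;
  }
};

// Value-based equality, consistent with the hash above.
struct SimpleFieldEqual {
  bool operator()(const SimpleField& a, const SimpleField& b) const {
    return a.name == b.name && a.type_name == b.type_name &&
           a.nullable == b.nullable;
  }
};

// Memo keyed by the field's memory address, as the issue describes.
class AddressMemo {
 public:
  void Add(const std::shared_ptr<SimpleField>& field, int64_t id) {
    assert(field != nullptr);
    ids_[field.get()] = id;
  }
  bool Find(const std::shared_ptr<SimpleField>& field, int64_t* id) const {
    auto it = ids_.find(field.get());
    if (it == ids_.end()) return false;
    *id = it->second;
    return true;
  }

 private:
  std::unordered_map<const SimpleField*, int64_t> ids_;
};

// Memo keyed by the field's value, so equal fields share one entry.
class ValueMemo {
 public:
  void Add(const SimpleField& field, int64_t id) { ids_[field] = id; }
  bool Find(const SimpleField& field, int64_t* id) const {
    auto it = ids_.find(field);
    if (it == ids_.end()) return false;
    *id = it->second;
    return true;
  }

 private:
  std::unordered_map<SimpleField, int64_t, SimpleFieldHash, SimpleFieldEqual>
      ids_;
};
```

With two separately allocated but equal SimpleField objects, AddressMemo finds only the object that was registered, while ValueMemo finds either one. The cost is that every lookup hashes and compares strings rather than a single pointer, which is the expense the description is cautious about.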



--
This message was sent by Atlassian Jira
(v8.3.4#803005)