Posted to jira@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2021/09/22 14:28:00 UTC

[jira] [Resolved] (ARROW-13532) [C++][Compute] Join: add set membership test method to the grouper

     [ https://issues.apache.org/jira/browse/ARROW-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Kietzman resolved ARROW-13532.
----------------------------------
    Resolution: Fixed

Issue resolved by pull request 10858
[https://github.com/apache/arrow/pull/10858]

> [C++][Compute] Join: add set membership test method to the grouper
> ------------------------------------------------------------------
>
>                 Key: ARROW-13532
>                 URL: https://issues.apache.org/jira/browse/ARROW-13532
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Michal Nowakiewicz
>            Assignee: Michal Nowakiewicz
>            Priority: Major
>              Labels: pull-request-available, query-engine
>             Fix For: 6.0.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The hash table used in group by provides one main method: map.
> This method finds an existing matching key in the hash table and outputs the corresponding group id if the key has already been inserted. Otherwise it inserts the new key and assigns a new group id to it.
> This interface is tailored for group by. To reuse the same hash table implementation in join, there must be a way to skip insertion of new keys when looking up existing keys. When the join processes the probe side, it needs to filter input rows based on whether a match is found in the hash table, while keeping the hash table immutable and not automatically adding missing keys to it.
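
The difference between the two lookups described above can be sketched roughly as follows. This is a minimal illustration only, not the code from the pull request and not Arrow's Grouper API: the class `ToyGrouper`, its methods `Map` and `Find`, and the helper `FilterProbeRows` are hypothetical names, and a `std::unordered_map` over string keys stands in for the real hash table.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy grouper, for illustration only (not Arrow's implementation).
class ToyGrouper {
 public:
  // Group-by style lookup: return the existing group id for the key,
  // or insert the key and assign it the next group id.
  uint32_t Map(const std::string& key) {
    // The candidate id is computed before insertion, so it equals the
    // number of keys already present.
    auto result = ids_.try_emplace(key, static_cast<uint32_t>(ids_.size()));
    return result.first->second;
  }

  // Join-probe style lookup: report the group id if the key is present,
  // and never insert. Being const, it cannot modify the table.
  std::optional<uint32_t> Find(const std::string& key) const {
    auto it = ids_.find(key);
    if (it == ids_.end()) return std::nullopt;
    return it->second;
  }

  std::size_t num_groups() const { return ids_.size(); }

 private:
  std::unordered_map<std::string, uint32_t> ids_;
};

// Keep only the probe-side rows whose key exists in the build-side table.
// The table is taken by const reference, so missing keys are not added.
std::vector<std::string> FilterProbeRows(const ToyGrouper& build,
                                         const std::vector<std::string>& probe) {
  std::vector<std::string> matched;
  for (const auto& key : probe) {
    if (build.Find(key).has_value()) matched.push_back(key);
  }
  return matched;
}
```

In this sketch the build side would call `Map` for each key, and the probe side would call only `Find`, so the number of groups stays unchanged during probing.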



--
This message was sent by Atlassian Jira
(v8.3.4#803005)