Posted to jira@arrow.apache.org by "Daniël Heres (Jira)" <ji...@apache.org> on 2021/04/07 17:46:00 UTC

[jira] [Created] (ARROW-12266) [Rust][DataFusion] Fix null handling hash join

Daniël Heres created ARROW-12266:
------------------------------------

             Summary: [Rust][DataFusion] Fix null handling hash join
                 Key: ARROW-12266
                 URL: https://issues.apache.org/jira/browse/ARROW-12266
             Project: Apache Arrow
          Issue Type: Bug
          Components: Rust - DataFusion
            Reporter: Daniël Heres
            Assignee: Daniël Heres


Improve null handling in the hash join. For example, this query:


SELECT id1, id2 FROM (SELECT null AS id1) t1
LEFT JOIN (SELECT 0 AS id2) t2 ON id1 = id2

> NULL, NULL

(should be empty result set)

We should filter out null join keys before the join to make this result correct. This can also be more efficient, because the non-null filter can be pushed down: the data set becomes smaller, the join no longer has to deal with nullable data, and entire files can be skipped when they contain only nulls.
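
A minimal sketch of the idea in plain Rust, not DataFusion's actual implementation: it assumes inner-join matching semantics on a single nullable integer key, and the function name inner_join_indices is hypothetical. Null keys are dropped before they reach the hash table and skipped again on the probe side, so a NULL never matches anything, not even another NULL. Emitting unmatched rows for outer joins is left out of the sketch.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not a DataFusion API): equi-join two columns of
/// nullable keys and return the (left_index, right_index) pairs that match.
fn inner_join_indices(left: &[Option<i64>], right: &[Option<i64>]) -> Vec<(usize, usize)> {
    // Build side: filter out NULL keys up front, so the hash table only
    // ever stores non-null values.
    let mut build: HashMap<i64, Vec<usize>> = HashMap::new();
    for (i, key) in left.iter().enumerate() {
        if let Some(k) = key {
            build.entry(*k).or_default().push(i);
        }
    }

    // Probe side: a NULL key can never satisfy `id1 = id2`, so skip it too.
    let mut matches = Vec::new();
    for (j, key) in right.iter().enumerate() {
        if let Some(k) = key {
            if let Some(rows) = build.get(k) {
                matches.extend(rows.iter().map(|&i| (i, j)));
            }
        }
    }
    matches
}
```

With the data from the query above, inner_join_indices(&[None], &[Some(0)]) produces no matches. In a real query engine the same filter would more likely be expressed as an IS NOT NULL predicate on each join input, which is what allows it to be pushed down to the scan.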


