Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2015/05/09 00:20:02 UTC

[jira] [Updated] (DRILL-2046) Merge join inconsistent results

     [ https://issues.apache.org/jira/browse/DRILL-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-2046:
----------------------------------
    Fix Version/s:     (was: 1.0.0)
                   1.1.0

> Merge join inconsistent results
> -------------------------------
>
>                 Key: DRILL-2046
>                 URL: https://issues.apache.org/jira/browse/DRILL-2046
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>            Reporter: Rahul Challapalli
>            Assignee: Aman Sinha
>            Priority: Critical
>             Fix For: 1.1.0
>
>         Attachments: widestrings_small.parquet
>
>
> git.commit.id.abbrev=a418af1
> The queries below should all return the same number of records. However, the counts do not match when merge join is used.
> {code}
> alter session set `planner.enable_hashjoin` = false;
> select ws1.* from widestrings_small ws1 INNER JOIN widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty where ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not null and ws1.tinyint_var > 120;
> 6 records
> select count(*) from widestrings_small ws1 INNER JOIN widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty where ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not null and ws1.tinyint_var > 120;
> 60 records
> select count(ws1.str_var) from widestrings_small ws1 INNER JOIN widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty where ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not null and ws1.tinyint_var > 120;
> 4 records
> {code}
> With hash join, all of the above queries return 60 records. I have attached the dataset used. Let me know if you have any questions.
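>
> A minimal sketch of the hash join comparison run, for reference. It assumes the same session and dataset as above, and that `planner.enable_mergejoin` is available as a session option in this build; the exact settings used for the original hash join run are not stated in this report.
> {code}
> -- Assumption: re-enable hash join and turn merge join off so the planner picks hash join
> alter session set `planner.enable_hashjoin` = true;
> alter session set `planner.enable_mergejoin` = false;
> -- Same count(*) query as above; only the join implementation differs
> select count(*) from widestrings_small ws1 INNER JOIN widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty where ws1.str_fixed_null_empty is not null and ws2.str_var_null_empty is not null and ws1.tinyint_var > 120;
> {code}
> Prefixing either variant of the query with `explain plan for` should show which join operator the planner actually chose.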



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)