
Posted to dev@drill.apache.org by arina-ielchiieva <gi...@git.apache.org> on 2017/03/23 14:19:05 UTC

[GitHub] drill pull request #794: DRILL-5375: Nested loop join: return correct result...

GitHub user arina-ielchiieva opened a pull request:

    https://github.com/apache/drill/pull/794

    DRILL-5375: Nested loop join: return correct result for left join

    With this fix, nested loop join correctly processes INNER and LEFT joins with non-equality conditions.
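
For illustration only, the semantics of a LEFT join with a non-equality condition can be sketched as a plain nested loop. This is not the code from the pull request: the class `NestedLoopLeftJoinSketch`, the method `leftJoin`, and the row representation (`Object[]` pairs) are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch, not Drill code: a nested loop LEFT join over in-memory lists.
final class NestedLoopLeftJoinSketch {

    // Returns one {left, right} pair per match; a left row with no match
    // is emitted once with null on the right side (LEFT join semantics).
    static <L, R> List<Object[]> leftJoin(List<L> left, List<R> right,
                                          BiPredicate<L, R> condition) {
        List<Object[]> out = new ArrayList<>();
        for (L l : left) {
            boolean matched = false;
            for (R r : right) {
                // The condition is arbitrary, so non-equality predicates work too.
                if (condition.test(l, r)) {
                    out.add(new Object[] {l, r});
                    matched = true;
                }
            }
            if (!matched) {
                out.add(new Object[] {l, null});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Join condition "left > right" is a non-equality condition.
        List<Object[]> rows = leftJoin(Arrays.asList(1, 5), Arrays.asList(3, 4),
                                       (l, r) -> l > r);
        for (Object[] row : rows) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

An INNER join would follow the same loop without the final unmatched-row step.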

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/arina-ielchiieva/drill DRILL-5375

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/794.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #794
    
----
commit 71628e70a525d9bd27b4f5f56259dce84c75154d
Author: Arina Ielchiieva <ar...@gmail.com>
Date:   2017-03-22T15:07:23Z

    DRILL-5375: Nested loop join: return correct result for left join

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #794: DRILL-5375: Nested loop join: return correct result for le...

Posted by arina-ielchiieva <gi...@git.apache.org>.

GitHub user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/794

Thanks for bringing up this point. I did some investigation and found that implicit casts for nested loop join are already added during materialization.
The join condition is transformed into a FunctionCall [1], which is later materialized using ExpressionTreeMaterializer [2].
`ExpressionTreeMaterializer.visitFunctionCall` includes a section that adds implicit casts [3].
In fact, these casts are broader than those applied for hash and merge joins.
Hash and merge joins support implicit casts only between numeric types, between date and timestamp, and between varchar and varbinary, so a join on int and varchar columns is not performed.
Instead, the following error is returned: `Join only supports implicit casts between 1. Numeric data 2. Varchar, Varbinary data 3. Date, Timestamp data Left type: INT, Right type: VARCHAR. Add explicit casts to avoid this error`.
Nested loop join, by contrast, can join on int and varchar columns without explicit casts.
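
As a rough illustration of the hash and merge join restriction quoted in the error message, the sketch below models the three allowed cast groups. It is a simplified model and not Drill's implementation: the class `JoinCastRuleSketch`, the enum `SketchType`, and the method `hashOrMergeJoinAllows` are hypothetical, and the type list is deliberately reduced.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not Drill code: models the three implicit-cast groups
// named in the hash/merge join error message.
final class JoinCastRuleSketch {

    // Reduced, illustrative set of type names (assumption for this example).
    enum SketchType { INT, BIGINT, FLOAT8, VARCHAR, VARBINARY, DATE, TIMESTAMP }

    private static final Set<SketchType> NUMERIC =
        EnumSet.of(SketchType.INT, SketchType.BIGINT, SketchType.FLOAT8);
    private static final Set<SketchType> CHARACTER =
        EnumSet.of(SketchType.VARCHAR, SketchType.VARBINARY);
    private static final Set<SketchType> TEMPORAL =
        EnumSet.of(SketchType.DATE, SketchType.TIMESTAMP);

    private static boolean bothIn(Set<SketchType> group, SketchType a, SketchType b) {
        return group.contains(a) && group.contains(b);
    }

    // True when the two join key types are identical or fall into the same group.
    static boolean hashOrMergeJoinAllows(SketchType left, SketchType right) {
        return left == right
            || bothIn(NUMERIC, left, right)
            || bothIn(CHARACTER, left, right)
            || bothIn(TEMPORAL, left, right);
    }

    public static void main(String[] args) {
        // Same group: allowed.
        System.out.println(hashOrMergeJoinAllows(SketchType.INT, SketchType.BIGINT));  // true
        // Different groups: the case that triggers the error quoted above.
        System.out.println(hashOrMergeJoinAllows(SketchType.INT, SketchType.VARCHAR)); // false
    }
}
```

Under this model, the int and varchar pair is rejected for hash and merge joins, which is the case that nested loop join handles through the casts added during materialization.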

[1] https://github.com/arina-ielchiieva/drill/blob/71628e70a525d9bd27b4f5f56259dce84c75154d/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/NestedLoopJoinPrel.java#L95
[2] https://github.com/arina-ielchiieva/drill/blob/71628e70a525d9bd27b4f5f56259dce84c75154d/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/NestedLoopJoinBatch.java#L265
[3] https://github.com/apache/drill/blob/9411b26ece34ed8b2f498deea5e41f1901eb1013/exec/java-exec/src/main/java/org/apache/drill/exec/expr/ExpressionTreeMaterializer.java#L362
