Posted to issues@calcite.apache.org by "Ruben Quesada Lopez (Jira)" <ji...@apache.org> on 2019/08/21 14:11:00 UTC

[jira] [Commented] (CALCITE-2979) Add a block-based nested loop join algorithm

    [ https://issues.apache.org/jira/browse/CALCITE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912315#comment-16912315 ] 

Ruben Quesada Lopez commented on CALCITE-2979:
----------------------------------------------

Hi everyone,
I think the PR is in pretty good shape; if anyone has the opportunity to take a look at it, that would be very helpful.
In any case, I believe we can safely push this into 1.21; even though this is a "brand new" join implementation, the risk is very limited since the new rule that generates the batch nested loop operator is not part of the "default" Calcite rule set, so this change should not break anything.

> Add a block-based nested loop join algorithm
> --------------------------------------------
>
>                 Key: CALCITE-2979
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2979
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Khawla Mouhoubi
>            Priority: Major
>              Labels: performance, pull-request-available
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently, Calcite provides a tuple-based nested loop join algorithm implemented through EnumerableCorrelate and EnumerableDefaults.correlateJoin. This means that for each tuple of the outer relation we probe the inner relation once (after setting the correlation variables).
> The goal of this issue is to add a new algorithm (or extend the correlateJoin method) which first gathers blocks (batches) of tuples from the outer relation and then probes the inner relation once per block.
> There are cases (e.g., indexes) where the inner relation can be accessed by more than one value at a time, which can greatly improve performance, in particular when the outer relation is big.
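
To make the difference between the two strategies described above concrete, here is a minimal sketch in plain Java. It is not Calcite's implementation and does not use any Calcite API: the class BlockNestedLoopJoinSketch, the IndexedInner helper, and the tupleJoin and blockJoin methods are all hypothetical names introduced for illustration. The inner relation is assumed to be an in-memory index that can be probed with several keys at once, and the sketch only covers inner-join semantics on a single integer key.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class BlockNestedLoopJoinSketch {

  /** Hypothetical inner relation: an index from join key to rows that counts its probes. */
  public static final class IndexedInner {
    private final Map<Integer, List<String>> index;
    public int probeCount = 0;

    public IndexedInner(Map<Integer, List<String>> index) {
      this.index = index;
    }

    /** A single probe that may look up several keys at once. */
    public Map<Integer, List<String>> probe(Collection<Integer> keys) {
      probeCount++;
      Map<Integer, List<String>> matches = new HashMap<>();
      for (Integer key : keys) {
        List<String> rows = index.get(key);
        if (rows != null) {
          matches.put(key, rows);
        }
      }
      return matches;
    }
  }

  /** Tuple-based variant: one probe of the inner relation per outer tuple. */
  public static List<String> tupleJoin(List<Integer> outer, IndexedInner inner) {
    List<String> result = new ArrayList<>();
    for (Integer key : outer) {
      Map<Integer, List<String>> matches = inner.probe(Collections.singletonList(key));
      for (String row : matches.getOrDefault(key, Collections.<String>emptyList())) {
        result.add(key + ":" + row);
      }
    }
    return result;
  }

  /** Block-based variant: gather blockSize outer tuples, then probe once per block. */
  public static List<String> blockJoin(List<Integer> outer, IndexedInner inner, int blockSize) {
    if (blockSize < 1) {
      throw new IllegalArgumentException("blockSize must be at least 1");
    }
    List<String> result = new ArrayList<>();
    for (int start = 0; start < outer.size(); start += blockSize) {
      List<Integer> block = outer.subList(start, Math.min(start + blockSize, outer.size()));
      // Send the distinct keys of the whole block to the inner relation in one probe.
      Map<Integer, List<String>> matches = inner.probe(new LinkedHashSet<>(block));
      // Emit results in outer order so both variants produce the same output.
      for (Integer key : block) {
        for (String row : matches.getOrDefault(key, Collections.<String>emptyList())) {
          result.add(key + ":" + row);
        }
      }
    }
    return result;
  }
}
```

With an outer relation of N tuples, tupleJoin probes the inner relation N times, while blockJoin probes it ceil(N / blockSize) times. Whether that translates into a real speedup depends on the inner relation actually being able to serve several values per access, as in the index case mentioned in the description.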



--
This message was sent by Atlassian Jira
(v8.3.2#803003)