You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by James Starr <ja...@gmail.com> on 2021/03/31 04:27:16 UTC

Supporting subqueries in ON clauses

I have been working on supporting subqueries in ON clauses for some time.

My current PR supports non-correlated subqueries in ON clauses of left and
inner joins where subquery rewrite does not attempt to use the root to do
the rewrite.  For correlated subqueries and  subqueries for joins other
than inner or left an exception will be thrown.

https://github.com/apache/calcite/pull/2364

The subqueries cannot just be applied on top of the join as a filter
because ON clauses for non-inner joins are not just filters.  I am
currently supporting left joins by rewriting the subqueries on the right
side of the join, allowing multiplicity to be maintained.  To support outer
joins, I plan on rewriting the query to a left join + union of all values
from the right that did not join to the left.  In order to support
correlated queries in On clauses, calcite's correlate type is insufficient
because the join being created could be the correlate join but would also
have a filter that needs to be applied.  I could either add a new node or
enhance correlate/join.

Concerning Blackboard.register() vs mutating Blackboard.root to capture the
subqueries after they have been rewritten, unless there is an easy way to
do the necessary rewrite for reworking the query, then subquery rewrite
needs to move away from directly mutating the root node.

James

Re: Supporting subqueries in ON clauses

Posted by Julian Hyde <jh...@apache.org>.
Thanks for taking this on.

There are two ways of handling sub-queries: one is to have
SqlToRelConverter expand them during translation, and the other is to
wrap them in RexSubQuery and treat them as scalar expressions until
they are handled by SubQueryRemoveRule. I couldn't tell which approach
you are taking, but I recommend the latter because it is less
disruptive to the intricate machinery of SqlToRelConverter.

I see you have some SqlToRelConverter tests checking that particular
exceptions are thrown. I know that the exceptions reflect current
limitations of SqlToRelConverter, but still, we shouldn't expose such
implementation details to the user. I think those exceptions should be
thrown by the validator, and tested in SqlValidatorTest. Feel free to
have SqlToRelConverter throw AssertionError if it hits queries that it
can't handle; SqlValidator should have caught them first.

I've added the same comments to the JIRA case [1]; let's continue the
conversation there.

Julian

[1] https://issues.apache.org/jira/browse/CALCITE-4340

On Tue, Mar 30, 2021 at 9:27 PM James Starr <ja...@gmail.com> wrote:
>
> I have been working on supporting subqueries in ON clauses for some time.
>
> My current PR supports non-correlated subqueries in ON clauses of left and
> inner joins where subquery rewrite does not attempt to use the root to do
> the rewrite.  For correlated subqueries and  subqueries for joins other
> than inner or left an exception will be thrown.
>
> https://github.com/apache/calcite/pull/2364
>
> The subqueries cannot just be applied on top of the join as a filter
> because ON clauses for non-inner joins are not just filters.  I am
> currently supporting left joins by rewriting the subqueries on the right
> side of the join, allowing multiplicity to be maintained.  To support outer
> joins, I plan on rewriting the query to a left join + union of all values
> from the right that did not join to the left.  In order to support
> correlated queries in On clauses, calcite's correlate type is insufficient
> because the join being created could be the correlate join but would also
> have a filter that needs to be applied.  I could either add a new node or
> enhance correlate/join.
>
> Concerning Blackboard.register() vs mutating Blackboard.root to capture the
> subqueries after they have been rewritten, unless there is an easy way to
> do the necessary rewrite for reworking the query, then subquery rewrite
> needs to move away from directly mutating the root node.
>
> James