You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Jiajun Xie <ji...@gmail.com> on 2023/01/12 08:46:36 UTC

Why we cannot get all predicates for outer join query?

Hello, all.

I try to use RelMetadataQuery#getAllPredicates get predicate,
but I get null for outer join query that left column name is same as right
column name.
```
final RelNode rel = sql("select name as dname from emp left outer join dept"
    + " on emp.deptno = dept.deptno").toRel();
final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
final RelOptPredicateList r = mq.getAllPredicates(rel);
assertNull(r);
```


After commenting on two pieces of code:
1. RelMdAllPredicates#getAllPredicates
```
if (join.getJoinType().isOuterJoin()) {
  // We cannot map origin of this expression.
  return null;
}
2. RelMdExpressionLineage#getExpressionLineage
```
if (rel.getJoinType().isOuterJoin()) {
  // If we reference the inner side, we will bail out
  if (rel.getJoinType() == JoinRelType.LEFT) {
    ImmutableBitSet rightFields = ImmutableBitSet.range(
        nLeftColumns, rel.getRowType().getFieldCount());
    if (inputFieldsUsed.intersects(rightFields)) {
      // We cannot map origin of this expression.
      return null;
    }
  } else if (rel.getJoinType() == JoinRelType.RIGHT) {
    ImmutableBitSet leftFields = ImmutableBitSet.range(
        0, nLeftColumns);
    if (inputFieldsUsed.intersects(leftFields)) {
      // We cannot map origin of this expression.
      return null;
    }
  } else {
    // We cannot map origin of this expression.
    return null;
  }
}
I can get the results I need
```
final RelNode rel = sql("select name as dname from emp left outer join dept"
    + " on emp.deptno = dept.deptno").toRel();
final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
final RelOptPredicateList r = mq.getAllPredicates(rel);
assertThat(r.pulledUpPredicates.get(0).toString(),
    equalTo("=([CATALOG, SALES, EMP].#0.$7, [CATALOG, SALES,
DEPT].#0.$0)"));
```


It seems that we deliberately return null
in RelMetadataQuery#getAllPredicates . Can someone tell me why? Thanks!

Re: Why we cannot get all predicates for outer join query?

Posted by Jiajun Xie <ji...@gmail.com>.
Hi, Benchao and Renhe
Thanks for your information. They are useful for me.

Re: Why we cannot get all predicates for outer join query?

Posted by 徐仁和 <xu...@gmail.com>.
Hi jiajun
Here is a jira about this problem.
JIRA: https://issues.apache.org/jira/browse/CALCITE-5109
I've been too busy for a while, and I will continue to improve this jira in
the next month.

Best,
XuRenhe

Benchao Li <li...@apache.org> 于2023年1月17日周二 10:21写道:

> The comment for `AllPredicates`[1] has explained the difference. Can this
> answer your question?
>
> The difference with respect to BuiltInMetadata.Predicates provider is that
> > this provider tries to extract ALL predicates even if they are not
> applied
> > on the output expressions of the relational expression; we rely on
> > RexTableInputRef to reference origin columns in
> > org.apache.calcite.rel.core.TableScan for the result predicates.
>
>
> [1]
>
> https://github.com/apache/calcite/blob/9a07b1194bf82410c8583292f4eef7b0f289c594/core/src/main/java/org/apache/calcite/rel/metadata/BuiltInMetadata.java#L670-L675
>
> Jiajun Xie <ji...@gmail.com> 于2023年1月16日周一 16:40写道:
>
> > Hi Benchao,
> > Thanks very much for your email.
> >
> > I understand `RelMetadataQuery#getAllPredicates` cannot get all
> predicates
> > because some optimized cases cannot push/pull predicates.
> >
> > But I think the function name and code comment[1] are misleading,
> > *getAllPredicates* cannot get all predicates.
> > And your case should be handled by
> > `RelMetadataQuery#getPulledUpPredicates`[2]
> > instead of `RelMetadataQuery#getAllPredicates`.
> >
> > [1]
> >
> >
> https://github.com/apache/calcite/blob/413eded693a9087402cc1a6eefeca7a29445d337/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L855
> > [2]
> >
> >
> https://github.com/apache/calcite/blob/413eded693a9087402cc1a6eefeca7a29445d337/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L836
> >
> > On Sun, 15 Jan 2023 at 21:12, Benchao Li <li...@apache.org> wrote:
> >
> > > Hi Jiajun,
> > >
> > > For outer join, the semantic is different for predicates in condition
> and
> > > where, for example:
> > > Q1: select * from emp left join dept on emp.deptno = dept.deptno
> > > Q2: select * from emp left join dept on true where emp.deptno =
> > dept.deptno
> > >
> > > The semantic is different for Q1 and Q2. Q1 will output all the records
> > > from emp, including the records which fail to join from dept. However,
> Q2
> > > will only output the records which successfully join some records from
> > > dept.
> > >
> > > This is the reason why we cannot push/pull the predicates from outer
> > joins
> > > conditions. Hope this helps.
> > >
> > > Jiajun Xie <ji...@gmail.com> 于2023年1月12日周四 16:47写道:
> > >
> > > > Hello, all.
> > > >
> > > > I try to use RelMetadataQuery#getAllPredicates get predicate,
> > > > but I get null for outer join query that left column name is same as
> > > right
> > > > column name.
> > > > ```
> > > > final RelNode rel = sql("select name as dname from emp left outer
> join
> > > > dept"
> > > >     + " on emp.deptno = dept.deptno").toRel();
> > > > final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> > > > final RelOptPredicateList r = mq.getAllPredicates(rel);
> > > > assertNull(r);
> > > > ```
> > > >
> > > >
> > > > After commenting on two pieces of code:
> > > > 1. RelMdAllPredicates#getAllPredicates
> > > > ```
> > > > if (join.getJoinType().isOuterJoin()) {
> > > >   // We cannot map origin of this expression.
> > > >   return null;
> > > > }
> > > > 2. RelMdExpressionLineage#getExpressionLineage
> > > > ```
> > > > if (rel.getJoinType().isOuterJoin()) {
> > > >   // If we reference the inner side, we will bail out
> > > >   if (rel.getJoinType() == JoinRelType.LEFT) {
> > > >     ImmutableBitSet rightFields = ImmutableBitSet.range(
> > > >         nLeftColumns, rel.getRowType().getFieldCount());
> > > >     if (inputFieldsUsed.intersects(rightFields)) {
> > > >       // We cannot map origin of this expression.
> > > >       return null;
> > > >     }
> > > >   } else if (rel.getJoinType() == JoinRelType.RIGHT) {
> > > >     ImmutableBitSet leftFields = ImmutableBitSet.range(
> > > >         0, nLeftColumns);
> > > >     if (inputFieldsUsed.intersects(leftFields)) {
> > > >       // We cannot map origin of this expression.
> > > >       return null;
> > > >     }
> > > >   } else {
> > > >     // We cannot map origin of this expression.
> > > >     return null;
> > > >   }
> > > > }
> > > > I can get the results I need
> > > > ```
> > > > final RelNode rel = sql("select name as dname from emp left outer
> join
> > > > dept"
> > > >     + " on emp.deptno = dept.deptno").toRel();
> > > > final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> > > > final RelOptPredicateList r = mq.getAllPredicates(rel);
> > > > assertThat(r.pulledUpPredicates.get(0).toString(),
> > > >     equalTo("=([CATALOG, SALES, EMP].#0.$7, [CATALOG, SALES,
> > > > DEPT].#0.$0)"));
> > > > ```
> > > >
> > > >
> > > > It seems that we deliberately return null
> > > > in RelMetadataQuery#getAllPredicates . Can someone tell me why?
> Thanks!
> > > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>

Re: Why we cannot get all predicates for outer join query?

Posted by Benchao Li <li...@apache.org>.
The comment for `AllPredicates`[1] has explained the difference. Can this
answer your question?

The difference with respect to BuiltInMetadata.Predicates provider is that
> this provider tries to extract ALL predicates even if they are not applied
> on the output expressions of the relational expression; we rely on
> RexTableInputRef to reference origin columns in
> org.apache.calcite.rel.core.TableScan for the result predicates.


[1]
https://github.com/apache/calcite/blob/9a07b1194bf82410c8583292f4eef7b0f289c594/core/src/main/java/org/apache/calcite/rel/metadata/BuiltInMetadata.java#L670-L675

Jiajun Xie <ji...@gmail.com> 于2023年1月16日周一 16:40写道:

> Hi Benchao,
> Thanks very much for your email.
>
> I understand `RelMetadataQuery#getAllPredicates` cannot get all predicates
> because some optimized cases cannot push/pull predicates.
>
> But I think the function name and code comment[1] are misleading,
> *getAllPredicates* cannot get all predicates.
> And your case should be handled by
> `RelMetadataQuery#getPulledUpPredicates`[2]
> instead of `RelMetadataQuery#getAllPredicates`.
>
> [1]
>
> https://github.com/apache/calcite/blob/413eded693a9087402cc1a6eefeca7a29445d337/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L855
> [2]
>
> https://github.com/apache/calcite/blob/413eded693a9087402cc1a6eefeca7a29445d337/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L836
>
> On Sun, 15 Jan 2023 at 21:12, Benchao Li <li...@apache.org> wrote:
>
> > Hi Jiajun,
> >
> > For outer join, the semantic is different for predicates in condition and
> > where, for example:
> > Q1: select * from emp left join dept on emp.deptno = dept.deptno
> > Q2: select * from emp left join dept on true where emp.deptno =
> dept.deptno
> >
> > The semantic is different for Q1 and Q2. Q1 will output all the records
> > from emp, including the records which fail to join from dept. However, Q2
> > will only output the records which successfully join some records from
> > dept.
> >
> > This is the reason why we cannot push/pull the predicates from outer
> joins
> > conditions. Hope this helps.
> >
> > Jiajun Xie <ji...@gmail.com> 于2023年1月12日周四 16:47写道:
> >
> > > Hello, all.
> > >
> > > I try to use RelMetadataQuery#getAllPredicates get predicate,
> > > but I get null for outer join query that left column name is same as
> > right
> > > column name.
> > > ```
> > > final RelNode rel = sql("select name as dname from emp left outer join
> > > dept"
> > >     + " on emp.deptno = dept.deptno").toRel();
> > > final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> > > final RelOptPredicateList r = mq.getAllPredicates(rel);
> > > assertNull(r);
> > > ```
> > >
> > >
> > > After commenting on two pieces of code:
> > > 1. RelMdAllPredicates#getAllPredicates
> > > ```
> > > if (join.getJoinType().isOuterJoin()) {
> > >   // We cannot map origin of this expression.
> > >   return null;
> > > }
> > > 2. RelMdExpressionLineage#getExpressionLineage
> > > ```
> > > if (rel.getJoinType().isOuterJoin()) {
> > >   // If we reference the inner side, we will bail out
> > >   if (rel.getJoinType() == JoinRelType.LEFT) {
> > >     ImmutableBitSet rightFields = ImmutableBitSet.range(
> > >         nLeftColumns, rel.getRowType().getFieldCount());
> > >     if (inputFieldsUsed.intersects(rightFields)) {
> > >       // We cannot map origin of this expression.
> > >       return null;
> > >     }
> > >   } else if (rel.getJoinType() == JoinRelType.RIGHT) {
> > >     ImmutableBitSet leftFields = ImmutableBitSet.range(
> > >         0, nLeftColumns);
> > >     if (inputFieldsUsed.intersects(leftFields)) {
> > >       // We cannot map origin of this expression.
> > >       return null;
> > >     }
> > >   } else {
> > >     // We cannot map origin of this expression.
> > >     return null;
> > >   }
> > > }
> > > I can get the results I need
> > > ```
> > > final RelNode rel = sql("select name as dname from emp left outer join
> > > dept"
> > >     + " on emp.deptno = dept.deptno").toRel();
> > > final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> > > final RelOptPredicateList r = mq.getAllPredicates(rel);
> > > assertThat(r.pulledUpPredicates.get(0).toString(),
> > >     equalTo("=([CATALOG, SALES, EMP].#0.$7, [CATALOG, SALES,
> > > DEPT].#0.$0)"));
> > > ```
> > >
> > >
> > > It seems that we deliberately return null
> > > in RelMetadataQuery#getAllPredicates . Can someone tell me why? Thanks!
> > >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>


-- 

Best,
Benchao Li

Re: Why we cannot get all predicates for outer join query?

Posted by Jiajun Xie <ji...@gmail.com>.
Hi Benchao,
Thanks very much for your email.

I understand `RelMetadataQuery#getAllPredicates` cannot get all predicates
because some optimized cases cannot push/pull predicates.

But I think the function name and code comment[1] are misleading,
*getAllPredicates* cannot get all predicates.
And your case should be handled by `RelMetadataQuery#getPulledUpPredicates`[2]
instead of `RelMetadataQuery#getAllPredicates`.

[1]
https://github.com/apache/calcite/blob/413eded693a9087402cc1a6eefeca7a29445d337/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L855
[2]
https://github.com/apache/calcite/blob/413eded693a9087402cc1a6eefeca7a29445d337/core/src/main/java/org/apache/calcite/rel/metadata/RelMetadataQuery.java#L836

On Sun, 15 Jan 2023 at 21:12, Benchao Li <li...@apache.org> wrote:

> Hi Jiajun,
>
> For outer join, the semantic is different for predicates in condition and
> where, for example:
> Q1: select * from emp left join dept on emp.deptno = dept.deptno
> Q2: select * from emp left join dept on true where emp.deptno = dept.deptno
>
> The semantic is different for Q1 and Q2. Q1 will output all the records
> from emp, including the records which fail to join from dept. However, Q2
> will only output the records which successfully join some records from
> dept.
>
> This is the reason why we cannot push/pull the predicates from outer joins
> conditions. Hope this helps.
>
> Jiajun Xie <ji...@gmail.com> 于2023年1月12日周四 16:47写道:
>
> > Hello, all.
> >
> > I try to use RelMetadataQuery#getAllPredicates get predicate,
> > but I get null for outer join query that left column name is same as
> right
> > column name.
> > ```
> > final RelNode rel = sql("select name as dname from emp left outer join
> > dept"
> >     + " on emp.deptno = dept.deptno").toRel();
> > final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> > final RelOptPredicateList r = mq.getAllPredicates(rel);
> > assertNull(r);
> > ```
> >
> >
> > After commenting on two pieces of code:
> > 1. RelMdAllPredicates#getAllPredicates
> > ```
> > if (join.getJoinType().isOuterJoin()) {
> >   // We cannot map origin of this expression.
> >   return null;
> > }
> > 2. RelMdExpressionLineage#getExpressionLineage
> > ```
> > if (rel.getJoinType().isOuterJoin()) {
> >   // If we reference the inner side, we will bail out
> >   if (rel.getJoinType() == JoinRelType.LEFT) {
> >     ImmutableBitSet rightFields = ImmutableBitSet.range(
> >         nLeftColumns, rel.getRowType().getFieldCount());
> >     if (inputFieldsUsed.intersects(rightFields)) {
> >       // We cannot map origin of this expression.
> >       return null;
> >     }
> >   } else if (rel.getJoinType() == JoinRelType.RIGHT) {
> >     ImmutableBitSet leftFields = ImmutableBitSet.range(
> >         0, nLeftColumns);
> >     if (inputFieldsUsed.intersects(leftFields)) {
> >       // We cannot map origin of this expression.
> >       return null;
> >     }
> >   } else {
> >     // We cannot map origin of this expression.
> >     return null;
> >   }
> > }
> > I can get the results I need
> > ```
> > final RelNode rel = sql("select name as dname from emp left outer join
> > dept"
> >     + " on emp.deptno = dept.deptno").toRel();
> > final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> > final RelOptPredicateList r = mq.getAllPredicates(rel);
> > assertThat(r.pulledUpPredicates.get(0).toString(),
> >     equalTo("=([CATALOG, SALES, EMP].#0.$7, [CATALOG, SALES,
> > DEPT].#0.$0)"));
> > ```
> >
> >
> > It seems that we deliberately return null
> > in RelMetadataQuery#getAllPredicates . Can someone tell me why? Thanks!
> >
>
>
> --
>
> Best,
> Benchao Li
>

Re: Why we cannot get all predicates for outer join query?

Posted by Benchao Li <li...@apache.org>.
Hi Jiajun,

For outer join, the semantic is different for predicates in condition and
where, for example:
Q1: select * from emp left join dept on emp.deptno = dept.deptno
Q2: select * from emp left join dept on true where emp.deptno = dept.deptno

The semantic is different for Q1 and Q2. Q1 will output all the records
from emp, including the records which fail to join from dept. However, Q2
will only output the records which successfully join some records from dept.

This is the reason why we cannot push/pull the predicates from outer joins
conditions. Hope this helps.

Jiajun Xie <ji...@gmail.com> 于2023年1月12日周四 16:47写道:

> Hello, all.
>
> I try to use RelMetadataQuery#getAllPredicates get predicate,
> but I get null for outer join query that left column name is same as right
> column name.
> ```
> final RelNode rel = sql("select name as dname from emp left outer join
> dept"
>     + " on emp.deptno = dept.deptno").toRel();
> final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> final RelOptPredicateList r = mq.getAllPredicates(rel);
> assertNull(r);
> ```
>
>
> After commenting on two pieces of code:
> 1. RelMdAllPredicates#getAllPredicates
> ```
> if (join.getJoinType().isOuterJoin()) {
>   // We cannot map origin of this expression.
>   return null;
> }
> 2. RelMdExpressionLineage#getExpressionLineage
> ```
> if (rel.getJoinType().isOuterJoin()) {
>   // If we reference the inner side, we will bail out
>   if (rel.getJoinType() == JoinRelType.LEFT) {
>     ImmutableBitSet rightFields = ImmutableBitSet.range(
>         nLeftColumns, rel.getRowType().getFieldCount());
>     if (inputFieldsUsed.intersects(rightFields)) {
>       // We cannot map origin of this expression.
>       return null;
>     }
>   } else if (rel.getJoinType() == JoinRelType.RIGHT) {
>     ImmutableBitSet leftFields = ImmutableBitSet.range(
>         0, nLeftColumns);
>     if (inputFieldsUsed.intersects(leftFields)) {
>       // We cannot map origin of this expression.
>       return null;
>     }
>   } else {
>     // We cannot map origin of this expression.
>     return null;
>   }
> }
> I can get the results I need
> ```
> final RelNode rel = sql("select name as dname from emp left outer join
> dept"
>     + " on emp.deptno = dept.deptno").toRel();
> final RelMetadataQuery mq = rel.getCluster().getMetadataQuery();
> final RelOptPredicateList r = mq.getAllPredicates(rel);
> assertThat(r.pulledUpPredicates.get(0).toString(),
>     equalTo("=([CATALOG, SALES, EMP].#0.$7, [CATALOG, SALES,
> DEPT].#0.$0)"));
> ```
>
>
> It seems that we deliberately return null
> in RelMetadataQuery#getAllPredicates . Can someone tell me why? Thanks!
>


-- 

Best,
Benchao Li