You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Igor Lozynskyi <lo...@gmail.com> on 2020/10/09 16:23:06 UTC

Potential issue with RexSimplify/Sarg in Calcite 1.26

Hi all!

In a downstream project, we are trying to migrate from Calcite 1.25 to 1.26.
The migration took us a bit longer than usual, so we could not give feedback during the release voting time frame. The migration seems to be successful except for one issue with filter simplification.

Now, the Rex expressions like follows:
`(deptno = 20 OR deptno IS NULL) AND deptno = 10`
Are simplified to:
`deptno IS NULL`
Instead of:
`FALSE`

Similarly, the following expression:
`(deptno <> 20 OR deptno IS NULL) AND deptno = 10`
Is simplified to:
`deptno = 10 OR deptno IS NULL`
Instead of:
`deptno = 10`

These discrepancies were identified by our test suites.

A diff with related unit tests (for RelBuilderTest.java) is in the attachment.

Is this change is desired, or this is an issue?

With best regards,
Igor Lozynskyi

Re: Potential issue with RexSimplify/Sarg in Calcite 1.26

Posted by Igor Lozynskyi <lo...@gmail.com>.
Hi!

I created the following issue: https://issues.apache.org/jira/browse/CALCITE-4352

Regards, Igor

Re: Potential issue with RexSimplify/Sarg in Calcite 1.26

Posted by Julian Hyde <jh...@apache.org>.
Thanks for raising this. Please open a new JIRA case.

The attachments didn't come through (Apache mail strips them). Best
thing is to put the test case in a github PR.

Julian

On Thu, Oct 22, 2020 at 1:34 PM Igor Lozynskyi <lo...@gmail.com> wrote:
>
> Hi Julian!
>
> Thank you for the quick fix of CALCITE-4325.
> It fixed our issue with filter simplification. I was able to proceed with our migration to the latest and greatest Calcite (now 1.27-SNAPSHOT).
> However, after some additional testing, I suppose, I found one more issue with filters.
>
> Now, the following query:
>
> SELECT *
> FROM emp
> WHERE deptno > 20 AND deptno < 30 AND mgr IS NOT NULL
>
> Is simplified to:
>
> SELECT *
> FROM emp
> WHERE deptno > 20 AND deptno < 30
>
> It seems that the simplification of complex `AND` filters incorrectly simplify `IS NOT NULL` to `TRUE`.
> A simpler query does not show this issue (stays the same):
>
> SELECT *
> FROM emp
> WHERE  mgr IS NOT NULL
>
> Should I report a new issue or better reopen CALCITE-4325?
>
> The patch with the corresponding test cases is in attachments.
>
> Regards, Igor
>
> On 10 Oct 2020, 12:56 +0300, Igor Lozynskyi <lo...@gmail.com>, wrote:
>
> Hi!
>
> I created a JIRA ticket regarding this issue:
> https://issues.apache.org/jira/browse/CALCITE-4325
>
> Regards,
> Igor Lozynskyi
> On 9 Oct 2020, 23:24 +0300, Julian Hyde <jh...@apache.org>, wrote:
>
> Those cases you have found are indeed regressions. Please log a JIRA case.
>
> For the record, I am still a strong believer in the Sarg approach.
> Expressions such as
>
> (deptno <> 20 OR deptno IS NULL) AND deptno = 10
>
> are complex when expressed as ANDs and ORs but can become a single
> Sarg and therefore are easy to optimize. I suspect that we regressed
> on these particular expressions because we did not have tests, and
> converting part of the expression to a Sarg broke some delicate
> simplification that was looking for a particular pattern of ANDs and
> ORs.
>
> Julian
>
> On Fri, Oct 9, 2020 at 9:23 AM Igor Lozynskyi <lo...@gmail.com> wrote:
>
>
> Hi all!
>
> In a downstream project, we are trying to migrate from Calcite 1.25 to 1.26.
> The migration took us a bit longer than usual, so we could not give feedback during the release voting time frame. The migration seems to be successful except for one issue with filter simplification.
>
> Now, the Rex expressions like follows:
> `(deptno = 20 OR deptno IS NULL) AND deptno = 10`
> Are simplified to:
> `deptno IS NULL`
> Instead of:
> `FALSE`
>
> Similarly, the following expression:
> `(deptno <> 20 OR deptno IS NULL) AND deptno = 10`
> Is simplified to:
> `deptno = 10 OR deptno IS NULL`
> Instead of:
> `deptno = 10`
>
> These discrepancies were identified by our test suites.
>
> A diff with related unit tests (for RelBuilderTest.java) is in the attachment.
>
> Is this change is desired, or this is an issue?
>
> With best regards,
> Igor Lozynskyi

Re: Potential issue with RexSimplify/Sarg in Calcite 1.26

Posted by Igor Lozynskyi <lo...@gmail.com>.
Hi Julian!

Thank you for the quick fix of CALCITE-4325.
It fixed our issue with filter simplification. I was able to proceed with our migration to the latest and greatest Calcite (now 1.27-SNAPSHOT).
However, after some additional testing, I suppose, I found one more issue with filters.

Now, the following query:

SELECT *
FROM emp
WHERE deptno > 20 AND deptno < 30 AND mgr IS NOT NULL

Is simplified to:

SELECT *
FROM emp
WHERE deptno > 20 AND deptno < 30

It seems that the simplification of complex `AND` filters incorrectly simplify `IS NOT NULL` to `TRUE`.
A simpler query does not show this issue (stays the same):

SELECT *
FROM emp
WHERE  mgr IS NOT NULL

Should I report a new issue or better reopen CALCITE-4325?

The patch with the corresponding test cases is in attachments.

Regards, Igor

On 10 Oct 2020, 12:56 +0300, Igor Lozynskyi <lo...@gmail.com>, wrote:
> Hi!
>
> I created a JIRA ticket regarding this issue:
> https://issues.apache.org/jira/browse/CALCITE-4325
>
> Regards,
> Igor Lozynskyi
> On 9 Oct 2020, 23:24 +0300, Julian Hyde <jh...@apache.org>, wrote:
> > Those cases you have found are indeed regressions. Please log a JIRA case.
> >
> > For the record, I am still a strong believer in the Sarg approach.
> > Expressions such as
> >
> > (deptno <> 20 OR deptno IS NULL) AND deptno = 10
> >
> > are complex when expressed as ANDs and ORs but can become a single
> > Sarg and therefore are easy to optimize. I suspect that we regressed
> > on these particular expressions because we did not have tests, and
> > converting part of the expression to a Sarg broke some delicate
> > simplification that was looking for a particular pattern of ANDs and
> > ORs.
> >
> > Julian
> >
> > On Fri, Oct 9, 2020 at 9:23 AM Igor Lozynskyi <lo...@gmail.com> wrote:
> > >
> > > Hi all!
> > >
> > > In a downstream project, we are trying to migrate from Calcite 1.25 to 1.26.
> > > The migration took us a bit longer than usual, so we could not give feedback during the release voting time frame. The migration seems to be successful except for one issue with filter simplification.
> > >
> > > Now, the Rex expressions like follows:
> > > `(deptno = 20 OR deptno IS NULL) AND deptno = 10`
> > > Are simplified to:
> > > `deptno IS NULL`
> > > Instead of:
> > > `FALSE`
> > >
> > > Similarly, the following expression:
> > > `(deptno <> 20 OR deptno IS NULL) AND deptno = 10`
> > > Is simplified to:
> > > `deptno = 10 OR deptno IS NULL`
> > > Instead of:
> > > `deptno = 10`
> > >
> > > These discrepancies were identified by our test suites.
> > >
> > > A diff with related unit tests (for RelBuilderTest.java) is in the attachment.
> > >
> > > Is this change is desired, or this is an issue?
> > >
> > > With best regards,
> > > Igor Lozynskyi

Re: Potential issue with RexSimplify/Sarg in Calcite 1.26

Posted by Igor Lozynskyi <lo...@gmail.com>.
Hi!

I created a JIRA ticket regarding this issue:
https://issues.apache.org/jira/browse/CALCITE-4325

Regards,
Igor Lozynskyi
On 9 Oct 2020, 23:24 +0300, Julian Hyde <jh...@apache.org>, wrote:
> Those cases you have found are indeed regressions. Please log a JIRA case.
>
> For the record, I am still a strong believer in the Sarg approach.
> Expressions such as
>
> (deptno <> 20 OR deptno IS NULL) AND deptno = 10
>
> are complex when expressed as ANDs and ORs but can become a single
> Sarg and therefore are easy to optimize. I suspect that we regressed
> on these particular expressions because we did not have tests, and
> converting part of the expression to a Sarg broke some delicate
> simplification that was looking for a particular pattern of ANDs and
> ORs.
>
> Julian
>
> On Fri, Oct 9, 2020 at 9:23 AM Igor Lozynskyi <lo...@gmail.com> wrote:
> >
> > Hi all!
> >
> > In a downstream project, we are trying to migrate from Calcite 1.25 to 1.26.
> > The migration took us a bit longer than usual, so we could not give feedback during the release voting time frame. The migration seems to be successful except for one issue with filter simplification.
> >
> > Now, the Rex expressions like follows:
> > `(deptno = 20 OR deptno IS NULL) AND deptno = 10`
> > Are simplified to:
> > `deptno IS NULL`
> > Instead of:
> > `FALSE`
> >
> > Similarly, the following expression:
> > `(deptno <> 20 OR deptno IS NULL) AND deptno = 10`
> > Is simplified to:
> > `deptno = 10 OR deptno IS NULL`
> > Instead of:
> > `deptno = 10`
> >
> > These discrepancies were identified by our test suites.
> >
> > A diff with related unit tests (for RelBuilderTest.java) is in the attachment.
> >
> > Is this change is desired, or this is an issue?
> >
> > With best regards,
> > Igor Lozynskyi

Re: Potential issue with RexSimplify/Sarg in Calcite 1.26

Posted by Julian Hyde <jh...@apache.org>.
Those cases you have found are indeed regressions. Please log a JIRA case.

For the record, I am still a strong believer in the Sarg approach.
Expressions such as

  (deptno <> 20 OR deptno IS NULL) AND deptno = 10

are complex when expressed as ANDs and ORs but can become a single
Sarg and therefore are easy to optimize. I suspect that we regressed
on these particular expressions because we did not have tests, and
converting part of the expression to a Sarg broke some delicate
simplification that was looking for a particular pattern of ANDs and
ORs.

Julian

On Fri, Oct 9, 2020 at 9:23 AM Igor Lozynskyi <lo...@gmail.com> wrote:
>
> Hi all!
>
> In a downstream project, we are trying to migrate from Calcite 1.25 to 1.26.
> The migration took us a bit longer than usual, so we could not give feedback during the release voting time frame. The migration seems to be successful except for one issue with filter simplification.
>
> Now, the Rex expressions like follows:
> `(deptno = 20 OR deptno IS NULL) AND deptno = 10`
> Are simplified to:
> `deptno IS NULL`
> Instead of:
> `FALSE`
>
> Similarly, the following expression:
> `(deptno <> 20 OR deptno IS NULL) AND deptno = 10`
> Is simplified to:
> `deptno = 10 OR deptno IS NULL`
> Instead of:
> `deptno = 10`
>
> These discrepancies were identified by our test suites.
>
> A diff with related unit tests (for RelBuilderTest.java) is in the attachment.
>
> Is this change is desired, or this is an issue?
>
> With best regards,
> Igor Lozynskyi