You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Gavin Ray <ra...@gmail.com> on 2021/12/27 18:09:28 UTC

Converting GraphQL query predicates AST -> Calcite "RelNode" mostly working, but "RelBuilder.or()" incorrectly generating an AND condition

Hi all,

Sometime earlier I emailed asking about how it might be possible to
auto-generate a GraphQL API for Calcite schemas, and to convert GraphQL
queries
into Calcite queries.

Stamatis Zampetakis replied pointing me towards the Pig examples (Thank
you! I
wasn't subscribed to the list and didn't know how to reply) and I'm happy to
report I've made what I consider good progress =)

Currently, I am able to:

- Consume an "org.calcite.schema.Schema" and generate a "GraphQLSchema"
from it
- Consume a GraphQL query for the generated GraphQLSchema, and convert the
"where" clause into the corresponding RelNode expressions

However, I've been stuck on a small issue and I've not been able to figure
it
out.

The RelNode I am generating is incorrectly translating an "OR" expression
into
an "AND" expression. I am using "RelBuilder.or()":

https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L197-L262

For reference, here is the GraphQL query (for the "HR" schema I am testing
with):
https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L460-L486

And the SQL I expect it to generate roughly:
https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L447-L458

For the expected SQL, the plan that is generated is:
===============================================================
LogicalProject(inputs=[0], exprs=[[$2, $3, $1, $4]])
  LogicalFilter(condition=[AND(=(CAST($1):INTEGER NOT NULL, 20), >($3,
8000),
  <($3, 10000), OR(=(CAST($2):VARCHAR, 'Eric'), =(CAST($4):INTEGER, 10)))])
    LogicalTableScan(table=[[hr, emps]])

But for the GraphQL query, what is generated is the below plan:
===============================================================
LogicalProject(empid=[$0], name=[$2])
  LogicalFilter(condition=[AND(=($1, 20), SEARCH($3, Sarg[[8000..10000]]),
=($2,
  'Eric'), =($4, 10))])
    LogicalTableScan(table=[[emps]])


Using "RelToSqlConverter", it gives:
===============================================================
SELECT
    "empid", "name"
FROM
    "emps"
WHERE
    "deptno" = 20 AND ("salary" >= 8000 AND "salary" <= 10000) AND "name" =
    'Eric' AND "commission" = 10

The last line should be "AND (name = 'Eric' OR commission = 10)" If anyone
has
any ideas, I would be grateful for them. Probably I am using some part of
the API
wrong I assume.

Thank you =)

Re: Converting GraphQL query predicates AST -> Calcite "RelNode" mostly working, but "RelBuilder.or()" incorrectly generating an AND condition

Posted by Gavin Ray <ra...@gmail.com>.
Thanks Julian, your comment:

"...  and you are passing it a single argument (therefore a trivial OR)."

Made me double check what my code was saying there.
You were dead on with that, it was as simple as swapping the order.

This says "Make an OR statement with 1 condition for every value":
===================================================
orPredicates.values.map { innerValue ->
    builder.or(
        recursiveGQLWherePredicateToRexNode(
            builder,
            innerValue as ObjectValue
        )
    )
}

What I meant to say was, "Make 1 OR condition, with a merged condition
unioning all the values"
===================================================
builder.or(
    orPredicates.values.flatMap { innerValue ->
        recursiveGQLWherePredicateToRexNode(
            builder,
            innerValue as ObjectValue
        )
    }
)

Sure enough, it works now! Really neat to see.
Now I get another error, which seems much easier to debug =)
===================================================

Exception in thread "main"
org.apache.calcite.plan.RelOptPlanner$CannotPlanException:
There are not enough rules to produce a node with desired properties:
convention=NONE, sort=[0]. All the inputs have relevant nodes, however the
cost is still infinite.

Root: rel#34:RelSubset#2.NONE.[0]
Original rel:
LogicalProject(empid=[$0], name=[$2]): rowcount = 1.0, cumulative cost =
{6.0 rows, 11.0 cpu, 0.0 io}, id = 28
  LogicalFilter(condition=[AND(=($1, 20), SEARCH($3, Sarg[[8000..10000]]),
OR(=($2, 'Eric'), =($4, 10)))]): rowcount = 1.0, cumulative cost = {5.0
rows, 9.0 cpu, 0.0 io}, id = 27
    LogicalTableScan(table=[[emps]]): rowcount = 4.0, cumulative cost =
{4.0 rows, 5.0 cpu, 0.0 io}, id = 26

On Mon, Dec 27, 2021 at 3:02 PM Julian Hyde <jh...@gmail.com> wrote:

> I can think of three possibilities:
>  * You are calling the wrong overload of RelBuilder.or and you are passing
> it a single argument (therefore a trivial OR).
>  * RelBuilder.or does the right thing, but Calcite has a bug in
> simplifying SEARCH (see https://issues.apache.org/jira/browse/CALCITE-4173
> <https://issues.apache.org/jira/browse/CALCITE-4173> and Sarg). (Are you
> using 1.28 or higher?)
>  * I have found the semantics of JSON/pattern-based query languages
> (especially MongoDB, which doesn’t really have ANDs and Ors) can be a bit
> squishy. I don’t know whether GraphQL is better. Check that the semantics
> are well-defined.
>
> Steps:
>  * Print out the expression returned by RelBuilder.or and make sure that
> it looks OK
>  * Try to make your GraphQL simpler by removing terms, and see at what
> step the bug disappears. (In other words, make a minimal test case.)
>  * Try to write a test case in RexProgramTest
>
> Julian
>
>
> > On Dec 27, 2021, at 10:09 AM, Gavin Ray <ra...@gmail.com> wrote:
> >
> > Hi all,
> >
> > Sometime earlier I emailed asking about how it might be possible to
> > auto-generate a GraphQL API for Calcite schemas, and to convert GraphQL
> > queries
> > into Calcite queries.
> >
> > Stamatis Zampetakis replied pointing me towards the Pig examples (Thank
> > you! I
> > wasn't subscribed to the list and didn't know how to reply) and I'm
> happy to
> > report I've made what I consider good progress =)
> >
> > Currently, I am able to:
> >
> > - Consume an "org.calcite.schema.Schema" and generate a "GraphQLSchema"
> > from it
> > - Consume a GraphQL query for the generated GraphQLSchema, and convert
> the
> > "where" clause into the corresponding RelNode expressions
> >
> > However, I've been stuck on a small issue and I've not been able to
> figure
> > it
> > out.
> >
> > The RelNode I am generating is incorrectly translating an "OR" expression
> > into
> > an "AND" expression. I am using "RelBuilder.or()":
> >
> >
> https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L197-L262
> >
> > For reference, here is the GraphQL query (for the "HR" schema I am
> testing
> > with):
> >
> https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L460-L486
> >
> > And the SQL I expect it to generate roughly:
> >
> https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L447-L458
> >
> > For the expected SQL, the plan that is generated is:
> > ===============================================================
> > LogicalProject(inputs=[0], exprs=[[$2, $3, $1, $4]])
> >  LogicalFilter(condition=[AND(=(CAST($1):INTEGER NOT NULL, 20), >($3,
> > 8000),
> >  <($3, 10000), OR(=(CAST($2):VARCHAR, 'Eric'), =(CAST($4):INTEGER,
> 10)))])
> >    LogicalTableScan(table=[[hr, emps]])
> >
> > But for the GraphQL query, what is generated is the below plan:
> > ===============================================================
> > LogicalProject(empid=[$0], name=[$2])
> >  LogicalFilter(condition=[AND(=($1, 20), SEARCH($3, Sarg[[8000..10000]]),
> > =($2,
> >  'Eric'), =($4, 10))])
> >    LogicalTableScan(table=[[emps]])
> >
> >
> > Using "RelToSqlConverter", it gives:
> > ===============================================================
> > SELECT
> >    "empid", "name"
> > FROM
> >    "emps"
> > WHERE
> >    "deptno" = 20 AND ("salary" >= 8000 AND "salary" <= 10000) AND "name"
> =
> >    'Eric' AND "commission" = 10
> >
> > The last line should be "AND (name = 'Eric' OR commission = 10)" If
> anyone
> > has
> > any ideas, I would be grateful for them. Probably I am using some part of
> > the API
> > wrong I assume.
> >
> > Thank you =)
>
>

Re: Converting GraphQL query predicates AST -> Calcite "RelNode" mostly working, but "RelBuilder.or()" incorrectly generating an AND condition

Posted by Julian Hyde <jh...@gmail.com>.
I can think of three possibilities:
 * You are calling the wrong overload of RelBuilder.or and you are passing it a single argument (therefore a trivial OR).
 * RelBuilder.or does the right thing, but Calcite has a bug in simplifying SEARCH (see https://issues.apache.org/jira/browse/CALCITE-4173 <https://issues.apache.org/jira/browse/CALCITE-4173> and Sarg). (Are you using 1.28 or higher?)
 * I have found the semantics of JSON/pattern-based query languages (especially MongoDB, which doesn’t really have ANDs and Ors) can be a bit squishy. I don’t know whether GraphQL is better. Check that the semantics are well-defined.

Steps:
 * Print out the expression returned by RelBuilder.or and make sure that it looks OK
 * Try to make your GraphQL simpler by removing terms, and see at what step the bug disappears. (In other words, make a minimal test case.)
 * Try to write a test case in RexProgramTest

Julian


> On Dec 27, 2021, at 10:09 AM, Gavin Ray <ra...@gmail.com> wrote:
> 
> Hi all,
> 
> Sometime earlier I emailed asking about how it might be possible to
> auto-generate a GraphQL API for Calcite schemas, and to convert GraphQL
> queries
> into Calcite queries.
> 
> Stamatis Zampetakis replied pointing me towards the Pig examples (Thank
> you! I
> wasn't subscribed to the list and didn't know how to reply) and I'm happy to
> report I've made what I consider good progress =)
> 
> Currently, I am able to:
> 
> - Consume an "org.calcite.schema.Schema" and generate a "GraphQLSchema"
> from it
> - Consume a GraphQL query for the generated GraphQLSchema, and convert the
> "where" clause into the corresponding RelNode expressions
> 
> However, I've been stuck on a small issue and I've not been able to figure
> it
> out.
> 
> The RelNode I am generating is incorrectly translating an "OR" expression
> into
> an "AND" expression. I am using "RelBuilder.or()":
> 
> https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L197-L262
> 
> For reference, here is the GraphQL query (for the "HR" schema I am testing
> with):
> https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L460-L486
> 
> And the SQL I expect it to generate roughly:
> https://gist.github.com/GavinRay97/b40b3359674e94dfa7dd9c26146f8573#file-main-kt-L447-L458
> 
> For the expected SQL, the plan that is generated is:
> ===============================================================
> LogicalProject(inputs=[0], exprs=[[$2, $3, $1, $4]])
>  LogicalFilter(condition=[AND(=(CAST($1):INTEGER NOT NULL, 20), >($3,
> 8000),
>  <($3, 10000), OR(=(CAST($2):VARCHAR, 'Eric'), =(CAST($4):INTEGER, 10)))])
>    LogicalTableScan(table=[[hr, emps]])
> 
> But for the GraphQL query, what is generated is the below plan:
> ===============================================================
> LogicalProject(empid=[$0], name=[$2])
>  LogicalFilter(condition=[AND(=($1, 20), SEARCH($3, Sarg[[8000..10000]]),
> =($2,
>  'Eric'), =($4, 10))])
>    LogicalTableScan(table=[[emps]])
> 
> 
> Using "RelToSqlConverter", it gives:
> ===============================================================
> SELECT
>    "empid", "name"
> FROM
>    "emps"
> WHERE
>    "deptno" = 20 AND ("salary" >= 8000 AND "salary" <= 10000) AND "name" =
>    'Eric' AND "commission" = 10
> 
> The last line should be "AND (name = 'Eric' OR commission = 10)" If anyone
> has
> any ideas, I would be grateful for them. Probably I am using some part of
> the API
> wrong I assume.
> 
> Thank you =)