You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by "Daniel (Spotify) Ståhl" <st...@spotify.com.INVALID> on 2021/11/16 13:30:19 UTC

UNNEST in relational algebra

Hi
We are trying to generate relational algebra for SQL that uses UNNEST of
arrays in BigQuery. Here are three examples of SQL (in Big Query standard
query syntax) that we want to be able to handle and generate from
relational algebra. Any tips on how you would do that in Calcite?

SELECT id,
  (SELECT h.is_active FROM UNNEST(history.all_of_history) h
    WHERE start_date <= "2021-06-01" AND (end_date >= "2021-06-01" OR
end_date IS NULL))
FROM `table`;

SELECT id, h.is_active
FROM `table`
CROSS JOIN UNNEST(history.all_of_history) h;

SELECT id, count(h) as h
from `table`
CROSS JOIN UNNEST(history.all_of_history) h
GROUP BY id;

The table "table" has two columns: "id" and "history". history is a record
and "history.all_of_history"
is a repeated record with three fields ("is_active", "start_date" and
"end_date").

Thanks,
Daniel Ståhl

Re: UNNEST in relational algebra

Posted by "Daniel (Spotify) Ståhl" <st...@spotify.com.INVALID>.
Hi Stamatis
I created a JIRA bug report for this.
https://issues.apache.org/jira/browse/CALCITE-4904. We discovered that it
happened for all SqlDialects that return false for the
"supportsAliasedValues" method.

Regards,
/Daniel

On Wed, Nov 24, 2021 at 4:14 PM Stamatis Zampetakis <za...@gmail.com>
wrote:

> Hi Daniel,
>
> If you are able to create a valid SQL for the plan using the
> CalciteSqlDialect then the fact that it is not possible to do so using
> BigQuerySqlDialect is likely a bug.
> Please log a JIRA if that's the case.
>
> What happens when you parse some of the example SQL queries you mentioned
> above. What's the generated plan after SqlToRelConverter?
> Did you generate a similar plan when you directly used the RelBuilder API?
> Maybe you can take some inspiration on how to use the RelBuilder by looking
> into tests that introduce an uncollect operator in SqlToRelConverterTest
> [1].
>
> Best,
> Stamatis
>
> [1]
>
> https://github.com/apache/calcite/blob/7c423ef23878271b1c50c03629ebfff674985681/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1908
>
> On Tue, Nov 16, 2021 at 3:18 PM Daniel (Spotify) Ståhl
> <st...@spotify.com.invalid> wrote:
>
> > Hi,
> > Sorry for being unclear. We have tried to generate SQL similar to the
> three
> > examples using the RelBuilder and the Uncollect expression. Something
> like
> > this.
> >
> > relBuilder.scan("table");
> > final Holder<RexCorrelVariable> correlVar = Holder.empty();
> > relBuilder.variable(correlVar);
> > relBuilder.values(new String[] {"dummy"}, "");
> > relBuilder.project(
> >     relBuilder.alias(
> >         relBuilder.dot(relBuilder.field(correlVar.get(), "history"),
> > "all_of_history"),
> >         "unnest_all_of_history"));
> > relBuilder.uncollect(List.of("unnest_all_of_history"), false);
> > relBuilder.correlate(JoinRelType.INNER, correlVar.get().id);
> > RexNode userIdNode = relBuilder.alias(relBuilder.field("id"), "id");
> > RexNode isActive =
> >     relBuilder.alias(
> >         relBuilder.dot(relBuilder.field("unnest_all_of_history"),
> > "is_active"),
> >         "is_active");
> > relBuilder.project(userIdNode, reportingRegion);
> > RelNode result = relBuilder.build();
> >
> > We can generate SQL using the CalciteSqlDialect but we get an exception
> > when using the BigQuerySqlDialect. We wonder what the best way to work
> with
> > UNNEST in relational algebra using RelBuilder for BigQuery.
> >
> > Thanks.
> >
> >
> >
> > On Tue, Nov 16, 2021 at 2:49 PM Thomas Rebele <trebele@tibco.com.invalid
> >
> > wrote:
> >
> > > Hello,
> > >
> > > I don't understand what you want to do here. There's an UNNEST operator
> > in
> > > Calcite, see the comment at the bottom of this section:
> > > https://calcite.apache.org/docs/reference.html#collection-functions.
> > >
> > > Cordialement / Best Regards,
> > > *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com
> > >
> > >
> > > On Tue, Nov 16, 2021 at 2:30 PM Daniel (Spotify) Ståhl
> > > <st...@spotify.com.invalid> wrote:
> > >
> > > > Hi
> > > > We are trying to generate relational algebra for SQL that uses UNNEST
> > of
> > > > arrays in BigQuery. Here are three examples of SQL (in Big Query
> > standard
> > > > query syntax) that we want to be able to handle and generate from
> > > > relational algebra. Any tips on how you would do that in Calcite?
> > > >
> > > > SELECT id,
> > > >   (SELECT h.is_active FROM UNNEST(history.all_of_history) h
> > > >     WHERE start_date <= "2021-06-01" AND (end_date >= "2021-06-01" OR
> > > > end_date IS NULL))
> > > > FROM `table`;
> > > >
> > > > SELECT id, h.is_active
> > > > FROM `table`
> > > > CROSS JOIN UNNEST(history.all_of_history) h;
> > > >
> > > > SELECT id, count(h) as h
> > > > from `table`
> > > > CROSS JOIN UNNEST(history.all_of_history) h
> > > > GROUP BY id;
> > > >
> > > > The table "table" has two columns: "id" and "history". history is a
> > > record
> > > > and "history.all_of_history"
> > > > is a repeated record with three fields ("is_active", "start_date" and
> > > > "end_date").
> > > >
> > > > Thanks,
> > > > Daniel Ståhl
> > > >
> > >
> >
>

Re: UNNEST in relational algebra

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hi Daniel,

If you are able to create a valid SQL for the plan using the
CalciteSqlDialect then the fact that it is not possible to do so using
BigQuerySqlDialect is likely a bug.
Please log a JIRA if that's the case.

What happens when you parse some of the example SQL queries you mentioned
above. What's the generated plan after SqlToRelConverter?
Did you generate a similar plan when you directly used the RelBuilder API?
Maybe you can take some inspiration on how to use the RelBuilder by looking
into tests that introduce an uncollect operator in SqlToRelConverterTest
[1].

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/7c423ef23878271b1c50c03629ebfff674985681/core/src/test/java/org/apache/calcite/test/SqlToRelConverterTest.java#L1908

On Tue, Nov 16, 2021 at 3:18 PM Daniel (Spotify) Ståhl
<st...@spotify.com.invalid> wrote:

> Hi,
> Sorry for being unclear. We have tried to generate SQL similar to the three
> examples using the RelBuilder and the Uncollect expression. Something like
> this.
>
> relBuilder.scan("table");
> final Holder<RexCorrelVariable> correlVar = Holder.empty();
> relBuilder.variable(correlVar);
> relBuilder.values(new String[] {"dummy"}, "");
> relBuilder.project(
>     relBuilder.alias(
>         relBuilder.dot(relBuilder.field(correlVar.get(), "history"),
> "all_of_history"),
>         "unnest_all_of_history"));
> relBuilder.uncollect(List.of("unnest_all_of_history"), false);
> relBuilder.correlate(JoinRelType.INNER, correlVar.get().id);
> RexNode userIdNode = relBuilder.alias(relBuilder.field("id"), "id");
> RexNode isActive =
>     relBuilder.alias(
>         relBuilder.dot(relBuilder.field("unnest_all_of_history"),
> "is_active"),
>         "is_active");
> relBuilder.project(userIdNode, reportingRegion);
> RelNode result = relBuilder.build();
>
> We can generate SQL using the CalciteSqlDialect but we get an exception
> when using the BigQuerySqlDialect. We wonder what the best way to work with
> UNNEST in relational algebra using RelBuilder for BigQuery.
>
> Thanks.
>
>
>
> On Tue, Nov 16, 2021 at 2:49 PM Thomas Rebele <tr...@tibco.com.invalid>
> wrote:
>
> > Hello,
> >
> > I don't understand what you want to do here. There's an UNNEST operator
> in
> > Calcite, see the comment at the bottom of this section:
> > https://calcite.apache.org/docs/reference.html#collection-functions.
> >
> > Cordialement / Best Regards,
> > *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com
> >
> >
> > On Tue, Nov 16, 2021 at 2:30 PM Daniel (Spotify) Ståhl
> > <st...@spotify.com.invalid> wrote:
> >
> > > Hi
> > > We are trying to generate relational algebra for SQL that uses UNNEST
> of
> > > arrays in BigQuery. Here are three examples of SQL (in Big Query
> standard
> > > query syntax) that we want to be able to handle and generate from
> > > relational algebra. Any tips on how you would do that in Calcite?
> > >
> > > SELECT id,
> > >   (SELECT h.is_active FROM UNNEST(history.all_of_history) h
> > >     WHERE start_date <= "2021-06-01" AND (end_date >= "2021-06-01" OR
> > > end_date IS NULL))
> > > FROM `table`;
> > >
> > > SELECT id, h.is_active
> > > FROM `table`
> > > CROSS JOIN UNNEST(history.all_of_history) h;
> > >
> > > SELECT id, count(h) as h
> > > from `table`
> > > CROSS JOIN UNNEST(history.all_of_history) h
> > > GROUP BY id;
> > >
> > > The table "table" has two columns: "id" and "history". history is a
> > record
> > > and "history.all_of_history"
> > > is a repeated record with three fields ("is_active", "start_date" and
> > > "end_date").
> > >
> > > Thanks,
> > > Daniel Ståhl
> > >
> >
>

Re: UNNEST in relational algebra

Posted by "Daniel (Spotify) Ståhl" <st...@spotify.com.INVALID>.
Hi,
Sorry for being unclear. We have tried to generate SQL similar to the three
examples using the RelBuilder and the Uncollect expression. Something like
this.

relBuilder.scan("table");
final Holder<RexCorrelVariable> correlVar = Holder.empty();
relBuilder.variable(correlVar);
relBuilder.values(new String[] {"dummy"}, "");
relBuilder.project(
    relBuilder.alias(
        relBuilder.dot(relBuilder.field(correlVar.get(), "history"),
"all_of_history"),
        "unnest_all_of_history"));
relBuilder.uncollect(List.of("unnest_all_of_history"), false);
relBuilder.correlate(JoinRelType.INNER, correlVar.get().id);
RexNode userIdNode = relBuilder.alias(relBuilder.field("id"), "id");
RexNode isActive =
    relBuilder.alias(
        relBuilder.dot(relBuilder.field("unnest_all_of_history"),
"is_active"),
        "is_active");
relBuilder.project(userIdNode, reportingRegion);
RelNode result = relBuilder.build();

We can generate SQL using the CalciteSqlDialect but we get an exception
when using the BigQuerySqlDialect. We wonder what the best way to work with
UNNEST in relational algebra using RelBuilder for BigQuery.

Thanks.



On Tue, Nov 16, 2021 at 2:49 PM Thomas Rebele <tr...@tibco.com.invalid>
wrote:

> Hello,
>
> I don't understand what you want to do here. There's an UNNEST operator in
> Calcite, see the comment at the bottom of this section:
> https://calcite.apache.org/docs/reference.html#collection-functions.
>
> Cordialement / Best Regards,
> *Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com
>
>
> On Tue, Nov 16, 2021 at 2:30 PM Daniel (Spotify) Ståhl
> <st...@spotify.com.invalid> wrote:
>
> > Hi
> > We are trying to generate relational algebra for SQL that uses UNNEST of
> > arrays in BigQuery. Here are three examples of SQL (in Big Query standard
> > query syntax) that we want to be able to handle and generate from
> > relational algebra. Any tips on how you would do that in Calcite?
> >
> > SELECT id,
> >   (SELECT h.is_active FROM UNNEST(history.all_of_history) h
> >     WHERE start_date <= "2021-06-01" AND (end_date >= "2021-06-01" OR
> > end_date IS NULL))
> > FROM `table`;
> >
> > SELECT id, h.is_active
> > FROM `table`
> > CROSS JOIN UNNEST(history.all_of_history) h;
> >
> > SELECT id, count(h) as h
> > from `table`
> > CROSS JOIN UNNEST(history.all_of_history) h
> > GROUP BY id;
> >
> > The table "table" has two columns: "id" and "history". history is a
> record
> > and "history.all_of_history"
> > is a repeated record with three fields ("is_active", "start_date" and
> > "end_date").
> >
> > Thanks,
> > Daniel Ståhl
> >
>

Re: UNNEST in relational algebra

Posted by Thomas Rebele <tr...@tibco.com.INVALID>.
Hello,

I don't understand what you want to do here. There's an UNNEST operator in
Calcite, see the comment at the bottom of this section:
https://calcite.apache.org/docs/reference.html#collection-functions.

Cordialement / Best Regards,
*Thomas Rebele, PhD* | R&D Developer | Germany | www.tibco.com


On Tue, Nov 16, 2021 at 2:30 PM Daniel (Spotify) Ståhl
<st...@spotify.com.invalid> wrote:

> Hi
> We are trying to generate relational algebra for SQL that uses UNNEST of
> arrays in BigQuery. Here are three examples of SQL (in Big Query standard
> query syntax) that we want to be able to handle and generate from
> relational algebra. Any tips on how you would do that in Calcite?
>
> SELECT id,
>   (SELECT h.is_active FROM UNNEST(history.all_of_history) h
>     WHERE start_date <= "2021-06-01" AND (end_date >= "2021-06-01" OR
> end_date IS NULL))
> FROM `table`;
>
> SELECT id, h.is_active
> FROM `table`
> CROSS JOIN UNNEST(history.all_of_history) h;
>
> SELECT id, count(h) as h
> from `table`
> CROSS JOIN UNNEST(history.all_of_history) h
> GROUP BY id;
>
> The table "table" has two columns: "id" and "history". history is a record
> and "history.all_of_history"
> is a repeated record with three fields ("is_active", "start_date" and
> "end_date").
>
> Thanks,
> Daniel Ståhl
>