Posted to dev@calcite.apache.org by Igor Guzenko <ih...@gmail.com> on 2019/06/20 07:14:47 UTC

[DISCUSSION] Problem caused by flattening of struct fields

Hello everyone,

I've run into an issue while converting a query with a struct column. The
struct's fields are flattened during conversion to rel, but they aren't
collected back afterwards, so the query plan may produce an incorrect
result.

For example [1], consider a table *str_table* with just one column *str* of
type STRUCT<name VARCHAR(10), age INTEGER>.
The query SELECT *str* FROM *str_table* then produces the plan

LogicalProject(STR=[$0])
   LogicalProject(STR=[$0.name], STR1=[$0.age])
      LogicalTableScan(table=[[CATALOG, STRUCT, STR_TABLE]])

where the top-level project returns the nested field `name` as `str` instead
of the original struct column. My question is: what is the correct
way to collect the flattened fields back and produce the correct result for
the query?
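
To make the symptom concrete, here is a minimal toy sketch in plain Java. It does not use Calcite at all: a row is modelled as a List<Object> and a struct value as a nested List<Object>, and the names FlattenSketch, flatten and restructure are hypothetical, chosen only for illustration. It shows that after flattening, ordinal 0 refers to the `name` field alone, and that the struct has to be rebuilt explicitly from its flattened columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlattenSketch {
    // Toy model (an assumption, not Calcite's representation):
    // a row is a List<Object>; a struct value is a nested List<Object>.

    /** Replaces every struct column by its fields, in order, recursively. */
    public static List<Object> flatten(List<?> row) {
        List<Object> out = new ArrayList<>();
        for (Object col : row) {
            if (col instanceof List) {
                out.addAll(flatten((List<?>) col));
            } else {
                out.add(col);
            }
        }
        return out;
    }

    /** Rebuilds one struct from `width` consecutive flattened columns. */
    public static List<Object> restructure(List<Object> flatRow, int start, int width) {
        return new ArrayList<>(flatRow.subList(start, start + width));
    }

    public static void main(String[] args) {
        // One column `str` of type STRUCT<name, age>.
        List<Object> str = Arrays.<Object>asList("Alice", 30);
        List<Object> row = Arrays.<Object>asList(str);

        List<Object> flat = flatten(row);
        System.out.println(flat);        // [Alice, 30]
        // After flattening, ordinal 0 is only the `name` field.
        System.out.println(flat.get(0)); // Alice
        // Collecting both flattened columns restores the original struct.
        System.out.println(restructure(flat, 0, 2).equals(str)); // true
    }
}
```

In this toy model, the plan above corresponds to reading flat.get(0) where restructure(flat, 0, 2) was intended.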

[1] -
https://github.com/ihuzenko/calcite/commit/e24eaa22fbb5c950a0bd5290cc09ca56ea7f1e44

Thank you in advance,
Igor Guzenko

Re: Re: Re: [DISCUSSION] Problem caused by flattening of struct fields

Posted by Haisheng Yuan <h....@alibaba-inc.com>.
I am not familiar with the usage of NEW, but according to the comments on SqlNewOperator, the NEW operator accepts one argument, so I guess you are right. It still needs someone else's confirmation, though.

- Haisheng


Re: Re: [DISCUSSION] Problem caused by flattening of struct fields

Posted by Igor Guzenko <ih...@gmail.com>.
Thanks for the quick answer. I'll work on the CALCITE-3138
<https://issues.apache.org/jira/browse/CALCITE-3138> JIRA.

If I'm not wrong, collecting the fields back after flattening is supposed to
be done as a rex invocation that creates a new row.
For example, the collecting projection might look like:

LogicalProject(HOME_ADDRESS=[NEW(ROW($1, $2, $3, $4)):ObjectSqlType(ADDRESS) NOT NULL]).

Please let me know if I'm wrong and there is a more proper way to
restructure.
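
As a rough illustration of where ordinals such as $1 to $4 come from, here is a small plain-Java sketch. It assumes a hypothetical table layout of one scalar column followed by a four-field struct; the class and method names are made up for this example and are not part of Calcite. Each top-level column is assigned the run of flattened ordinals that its fields occupy, which is the list a restructuring projection would pass to the row constructor.

```java
import java.util.ArrayList;
import java.util.List;

public class RestructureOrdinals {
    /**
     * For each top-level column, returns the flattened ordinals it occupies.
     * fieldCounts[i] is the number of leaf fields of column i
     * (1 for a scalar column).
     */
    public static List<List<Integer>> ordinalsPerColumn(int[] fieldCounts) {
        List<List<Integer>> result = new ArrayList<>();
        int next = 0; // next free ordinal in the flattened row
        for (int count : fieldCounts) {
            List<Integer> ordinals = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                ordinals.add(next++);
            }
            result.add(ordinals);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical layout: a scalar column, then a four-field struct.
        int[] fieldCounts = {1, 4};
        System.out.println(ordinalsPerColumn(fieldCounts)); // [[0], [1, 2, 3, 4]]
    }
}
```

Under that assumed layout, the struct column maps to ordinals 1 through 4, matching the arguments of ROW in the projection above.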

Thanks,
Igor

Re: Re: [DISCUSSION] Problem caused by flattening of struct fields

Posted by Haisheng Yuan <h....@alibaba-inc.com>.
I have created 2 JIRA tickets to track the issues:
https://issues.apache.org/jira/browse/CALCITE-3137
https://issues.apache.org/jira/browse/CALCITE-3138

- Haisheng

Re: [DISCUSSION] Problem caused by flattening of struct fields

Posted by Haisheng Yuan <h....@alibaba-inc.com>.
Consider creating an ObjectSqlType like Fixture.addressType, which is a STRUCTURED type.
typeFactory.createStructType() actually creates a ROW type, for which reconstructing the fields is not supported at the moment; see the method RelStructuredTypeFlattener.restructureFields.

But even with ObjectSqlType, you will still hit an assertion error. The assertion should be removed.

- Haisheng
