You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Adam Rivelli <as...@gmail.com> on 2019/06/03 21:38:46 UTC

Extracting all columns used in a query

Hi all,

I'm trying to extract all of the (fully qualified) columns used by a query
- similar to the information provided by
RelMetadataQuery.getTableReferences()
<https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/RelMetadataQuery.html#getTableReferences-org.apache.calcite.rel.RelNode->,
but for column references. Is this possible to do using Calcite?

I've been looking through the API docs and experimenting with the API, but
I haven't found a straightforward way of doing this. Any help or
information is appreciated.

Adam

Re: Extracting all columns used in a query

Posted by "Rivelli, Adam" <as...@gmail.com>.
Haisheng,

Yes, in that case I'd want to extract r1, r3 and s1, s3. I'm doing this 
to help monitor queries made by others on databases I manage, and 
potentially remove columns that nobody ever uses in any capacity.

It looks like something along the lines of Ivan's code can do this, but 
with some modifications to handle subqueries.

-Adam


On 6/4/19 5:45 AM, Haisheng Yuan wrote:
> Hi Adam,
>
> Calcite defintely can do this. But can you first clarify what do you mean by all the columns in a query?
> e.g.
> R(r1, r2,r3), S(s1,s2,s3)
> SELECT r1+s1 FROM R,S WHERE r3=s3;
>
> What do you want to extract from this query? r1,r3 for R and s1, s3 for S?
> And why do you want do that?
>
> - Haisheng Yuan------------------------------------------------------------------
> 发件人:Stamatis Zampetakis<za...@gmail.com>
> 日 期:2019年06月04日 17:26:18
> 收件人:<de...@calcite.apache.org>
> 主 题:Re: Extracting all columns used in a query
>
> Hey Adam,
>
> I am not sure exactly what information you need, and at which level
> (SqlNode/RelNode), but maybe you can exploit what is present in RelRoot
> [1].
> Follow the calls to the constructor to see which APIs can provide you what
> you need (check for instance, SqlToRelConverter.convertQuery [2]).
>
> Best,
> Stamatis
>
> [1]
> https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/rel/RelRoot.java#L89
> [2]
> https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L553
>
> On Tue, Jun 4, 2019 at 12:39 AM Muhammad Gelbana <m....@gmail.com>
> wrote:
>
>> I don't konw if there is an API for that but visiting the  parsed/validated
>> SqlNode tree can do what you asked for.
>>
>> Thanks,
>> Gelbana
>>
>>
>> On Tue, Jun 4, 2019 at 12:12 AM Adam Rivelli <as...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to extract all of the (fully qualified) columns used by a
>> query
>>> - similar to the information provided by
>>> RelMetadataQuery.getTableReferences()
>>> <
>>>
>> https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/RelMetadataQuery.html#getTableReferences-org.apache.calcite.rel.RelNode-
>>>> ,
>>> but for column references. Is this possible to do using Calcite?
>>>
>>> I've been looking through the API docs and experimenting with the API,
>> but
>>> I haven't found a straightforward way of doing this. Any help or
>>> information is appreciated.
>>>
>>> Adam
>>>


Re: Re: Extracting all columns used in a query

Posted by Ivan Grgurina <Iv...@fer.hr>.
Hi Adam,

I developed a solution to your problem (and mine 🙂 ) as part of my master thesis. Code is available at [1].

If you have any questions regarding the code, feel free to ask me.

[1] https://github.com/igrgurina/multicloud_rewriter/blob/master/core/src/main/java/cloud/sec/core/adapter/jdbc/MultiCloudDataManager.java
[https://avatars0.githubusercontent.com/u/2978480?s=400&v=4]<https://github.com/igrgurina/multicloud_rewriter/blob/master/core/src/main/java/cloud/sec/core/adapter/jdbc/MultiCloudDataManager.java>
igrgurina/multicloud_rewriter<https://github.com/igrgurina/multicloud_rewriter/blob/master/core/src/main/java/cloud/sec/core/adapter/jdbc/MultiCloudDataManager.java>
Contribute to igrgurina/multicloud_rewriter development by creating an account on GitHub.
github.com




Ivan Grgurina
Research Assistant (ZEMRIS)
________________________________
[cid:f99e49a4-e831-4f9a-b440-6322e4462e75]<https://www.linkedin.com/in/igrgurina/> [cid:6299c77a-e109-43ef-b370-4abc0d34a3f1] <https://www.fer.unizg.hr/ivan.grgurina>

________________________________
From: Haisheng Yuan <h....@alibaba-inc.com>
Sent: Tuesday, June 4, 2019 11:45 AM
To: Stamatis Zampetakis; Apache Calcite dev list
Subject: Re: Re: Extracting all columns used in a query

Hi Adam,

Calcite defintely can do this. But can you first clarify what do you mean by all the columns in a query?
e.g.
R(r1, r2,r3), S(s1,s2,s3)
SELECT r1+s1 FROM R,S WHERE r3=s3;

What do you want to extract from this query? r1,r3 for R and s1, s3 for S?
And why do you want do that?

- Haisheng Yuan------------------------------------------------------------------
发件人:Stamatis Zampetakis<za...@gmail.com>
日 期:2019年06月04日 17:26:18
收件人:<de...@calcite.apache.org>
主 题:Re: Extracting all columns used in a query

Hey Adam,

I am not sure exactly what information you need, and at which level
(SqlNode/RelNode), but maybe you can exploit what is present in RelRoot
[1].
Follow the calls to the constructor to see which APIs can provide you what
you need (check for instance, SqlToRelConverter.convertQuery [2]).

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/rel/RelRoot.java#L89
[2]
https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L553

On Tue, Jun 4, 2019 at 12:39 AM Muhammad Gelbana <m....@gmail.com>
wrote:

> I don't konw if there is an API for that but visiting the  parsed/validated
> SqlNode tree can do what you asked for.
>
> Thanks,
> Gelbana
>
>
> On Tue, Jun 4, 2019 at 12:12 AM Adam Rivelli <as...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm trying to extract all of the (fully qualified) columns used by a
> query
> > - similar to the information provided by
> > RelMetadataQuery.getTableReferences()
> > <
> >
> https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/RelMetadataQuery.html#getTableReferences-org.apache.calcite.rel.RelNode-
> > >,
> > but for column references. Is this possible to do using Calcite?
> >
> > I've been looking through the API docs and experimenting with the API,
> but
> > I haven't found a straightforward way of doing this. Any help or
> > information is appreciated.
> >
> > Adam
> >
>


Re: Re: Extracting all columns used in a query

Posted by Haisheng Yuan <h....@alibaba-inc.com>.
Hi Adam,

Calcite defintely can do this. But can you first clarify what do you mean by all the columns in a query?
e.g.
R(r1, r2,r3), S(s1,s2,s3)
SELECT r1+s1 FROM R,S WHERE r3=s3;

What do you want to extract from this query? r1,r3 for R and s1, s3 for S?
And why do you want do that?

- Haisheng Yuan------------------------------------------------------------------
发件人:Stamatis Zampetakis<za...@gmail.com>
日 期:2019年06月04日 17:26:18
收件人:<de...@calcite.apache.org>
主 题:Re: Extracting all columns used in a query

Hey Adam,

I am not sure exactly what information you need, and at which level
(SqlNode/RelNode), but maybe you can exploit what is present in RelRoot
[1].
Follow the calls to the constructor to see which APIs can provide you what
you need (check for instance, SqlToRelConverter.convertQuery [2]).

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/rel/RelRoot.java#L89
[2]
https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L553

On Tue, Jun 4, 2019 at 12:39 AM Muhammad Gelbana <m....@gmail.com>
wrote:

> I don't konw if there is an API for that but visiting the  parsed/validated
> SqlNode tree can do what you asked for.
>
> Thanks,
> Gelbana
>
>
> On Tue, Jun 4, 2019 at 12:12 AM Adam Rivelli <as...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm trying to extract all of the (fully qualified) columns used by a
> query
> > - similar to the information provided by
> > RelMetadataQuery.getTableReferences()
> > <
> >
> https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/RelMetadataQuery.html#getTableReferences-org.apache.calcite.rel.RelNode-
> > >,
> > but for column references. Is this possible to do using Calcite?
> >
> > I've been looking through the API docs and experimenting with the API,
> but
> > I haven't found a straightforward way of doing this. Any help or
> > information is appreciated.
> >
> > Adam
> >
>


Re: Extracting all columns used in a query

Posted by Stamatis Zampetakis <za...@gmail.com>.
Hey Adam,

I am not sure exactly what information you need, and at which level
(SqlNode/RelNode), but maybe you can exploit what is present in RelRoot
[1].
Follow the calls to the constructor to see which APIs can provide you what
you need (check for instance, SqlToRelConverter.convertQuery [2]).

Best,
Stamatis

[1]
https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/rel/RelRoot.java#L89
[2]
https://github.com/apache/calcite/blob/7f33215ffaf9c0b8f4bef082913c910c77bf4427/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L553

On Tue, Jun 4, 2019 at 12:39 AM Muhammad Gelbana <m....@gmail.com>
wrote:

> I don't konw if there is an API for that but visiting the  parsed/validated
> SqlNode tree can do what you asked for.
>
> Thanks,
> Gelbana
>
>
> On Tue, Jun 4, 2019 at 12:12 AM Adam Rivelli <as...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm trying to extract all of the (fully qualified) columns used by a
> query
> > - similar to the information provided by
> > RelMetadataQuery.getTableReferences()
> > <
> >
> https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/RelMetadataQuery.html#getTableReferences-org.apache.calcite.rel.RelNode-
> > >,
> > but for column references. Is this possible to do using Calcite?
> >
> > I've been looking through the API docs and experimenting with the API,
> but
> > I haven't found a straightforward way of doing this. Any help or
> > information is appreciated.
> >
> > Adam
> >
>

Re: Extracting all columns used in a query

Posted by Muhammad Gelbana <m....@gmail.com>.
I don't konw if there is an API for that but visiting the  parsed/validated
SqlNode tree can do what you asked for.

Thanks,
Gelbana


On Tue, Jun 4, 2019 at 12:12 AM Adam Rivelli <as...@gmail.com> wrote:

> Hi all,
>
> I'm trying to extract all of the (fully qualified) columns used by a query
> - similar to the information provided by
> RelMetadataQuery.getTableReferences()
> <
> https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/RelMetadataQuery.html#getTableReferences-org.apache.calcite.rel.RelNode-
> >,
> but for column references. Is this possible to do using Calcite?
>
> I've been looking through the API docs and experimenting with the API, but
> I haven't found a straightforward way of doing this. Any help or
> information is appreciated.
>
> Adam
>