Posted to dev@calcite.apache.org by Hsuan Yi Chu <hy...@maprtech.com> on 2016/01/11 20:55:08 UTC

Customize type-check-mechanism for SqlOperator?

Hi all,
Since Calcite is used as the planner for many projects, each with its own
idea of how flexible type checking should be, I am wondering whether we
could make the type-check mechanism pluggable.

As a first example, the following filter currently triggers an
exception:

col - to_timestamp('XXXX-XX-XX XX:XX:XX','YYYY-MM-dd HH:mm:ss') <
interval 'X XX:XX:X' day to second

=> Cannot apply '-' to arguments of type '<ANY> - <TIMESTAMP(0)>'



Another example is the CAST function, where Calcite allows fewer type
conversions than Drill or MS-SQL.

Please let me know if this proposal makes sense.

Re: Customize type-check-mechanism for SqlOperator?

Posted by Julian Hyde <jh...@apache.org>.
If you take that approach you will end up copy-pasting pretty much the whole of SqlStdOperatorTable, because you will need a different variant of pretty much every operator. That’s OK, but it will be troublesome to maintain.

Also, you will need to make sure that every bit of code that creates, say, an = or AND or OR operator uses Drill’s version rather than the default version. Such code is spread throughout many planner rules and elsewhere.

This approach will work, if you know what you are letting yourself in for. However, I would be inclined to be more targeted: to change the policy rather than the operators themselves.
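
To illustrate what changing the policy rather than the operators could look like, here is a minimal, self-contained sketch. It does not use Calcite's API: SimpleType, OperandPolicy, Policies and BinaryOp are hypothetical stand-ins invented for this example. The point it tries to show is that the operator instance stays the same (so code that refers to it is unaffected), while the caller decides which type-check policy applies.

```java
import java.util.Arrays;
import java.util.List;

class PolicyDemo {
  public static void main(String[] args) {
    OperandPolicy strict = Policies.STRICT_SAME_TYPE;
    OperandPolicy lenient = Policies.allowAny(strict);
    // The operator is the same object in both calls; only the policy differs.
    System.out.println(
        BinaryOp.MINUS.checkOperands(strict, SimpleType.ANY, SimpleType.TIMESTAMP));
    System.out.println(
        BinaryOp.MINUS.checkOperands(lenient, SimpleType.ANY, SimpleType.TIMESTAMP));
  }
}

// Hypothetical, simplified stand-ins. None of these are Calcite classes.
enum SimpleType { ANY, INTEGER, TIMESTAMP, INTERVAL }

/** A pluggable policy deciding whether a list of operand types is acceptable. */
interface OperandPolicy {
  boolean accepts(List<SimpleType> operandTypes);
}

final class Policies {
  private Policies() {}

  /** Strict: exactly two operands of the same concrete (non-ANY) type. */
  static final OperandPolicy STRICT_SAME_TYPE = types ->
      types.size() == 2
          && types.get(0) == types.get(1)
          && types.get(0) != SimpleType.ANY;

  /** Wraps a base policy so that any ANY operand is accepted without further checks. */
  static OperandPolicy allowAny(OperandPolicy base) {
    return types -> types.contains(SimpleType.ANY) || base.accepts(types);
  }
}

/** A single shared operator instance; the policy is supplied by the caller. */
final class BinaryOp {
  static final BinaryOp MINUS = new BinaryOp("-");

  final String name;

  private BinaryOp(String name) {
    this.name = name;
  }

  boolean checkOperands(OperandPolicy policy, SimpleType left, SimpleType right) {
    return policy.accepts(Arrays.asList(left, right));
  }
}
```

With this shape, a lenient system would pass its own policy at validation time and nothing that references BinaryOp.MINUS would need to change. How such a policy would actually be threaded through Calcite's validator is left open here.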

Julian

> On Jan 18, 2016, at 4:15 PM, Hsuan Yi Chu <hy...@maprtech.com> wrote:
> 
> Hi Julian,
> Thanks for the feedback.
> 
> 1. After some investigation, I would like to propose going with the first
> option, (a). To be more specific, the customized SqlOperators would be
> substituted for the standard ones during validation.
> 
> In fact, this logic is already used in SqlFunction.deriveType(...), which
> calls SqlUtil.lookupRoutine(...) to get a function from the
> SqlOperatorTable (which can refer to a customized table). However, I do not
> see this logic in SqlOperator.deriveType(...), so I am wondering if we
> could just apply the same logic there?
> 
> If so, the customized operator from the given SqlOperatorTable would be
> plugged into the tree.
> 
> 2. If this is all we need to do to resolve the issue, the impact on
> Calcite master is limited: in theory, if the standard operator table is
> plugged in, the behavior should be the same as what we have now.
> 
> 3. Let me think more about this question. I will send a follow-up email to
> address it.


Re: Customize type-check-mechanism for SqlOperator?

Posted by Hsuan Yi Chu <hy...@maprtech.com>.
Hi Julian,
Thanks for the feedback.

1. After some investigation, I would like to propose going with the first
option, (a). To be more specific, the customized SqlOperators would be
substituted for the standard ones during validation.

In fact, this logic is already used in SqlFunction.deriveType(...), which
calls SqlUtil.lookupRoutine(...) to get a function from the
SqlOperatorTable (which can refer to a customized table). However, I do not
see this logic in SqlOperator.deriveType(...), so I am wondering if we
could just apply the same logic there?

If so, the customized operator from the given SqlOperatorTable would be
plugged into the tree.

2. If this is all we need to do to resolve the issue, the impact on
Calcite master is limited: in theory, if the standard operator table is
plugged in, the behavior should be the same as what we have now.

3. Let me think more about this question. I will send a follow-up email to
address it.
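
The substitution described in point 1 can be sketched roughly as follows. This is a self-contained model, not Calcite code: NamedOp, OpTable and OpResolver are hypothetical names made up for the example, and lookup is by operator name only, which is an assumption (a real lookup would likely also consider syntax and category). It shows the operator found by the parser being replaced by the table's operator of the same name, and left unchanged when the table has no match.

```java
import java.util.HashMap;
import java.util.Map;

class LookupDemo {
  public static void main(String[] args) {
    NamedOp stdMinus = new NamedOp("-", "standard");
    OpTable standard = new OpTable().register(stdMinus);
    OpTable custom = new OpTable().register(new NamedOp("-", "custom"));

    // With the standard table, resolution returns the operator we started with.
    System.out.println(OpResolver.resolve(stdMinus, standard).origin);
    // With a customized table, the customized operator is substituted.
    System.out.println(OpResolver.resolve(stdMinus, custom).origin);
  }
}

/** Hypothetical stand-in for an operator; not a Calcite class. */
final class NamedOp {
  final String name;
  final String origin;

  NamedOp(String name, String origin) {
    this.name = name;
    this.origin = origin;
  }
}

/** Hypothetical stand-in for an operator table, keyed by operator name. */
final class OpTable {
  private final Map<String, NamedOp> byName = new HashMap<>();

  OpTable register(NamedOp op) {
    byName.put(op.name, op);
    return this;
  }

  /** Returns the registered operator with this name, or null if there is none. */
  NamedOp lookup(String name) {
    return byName.get(name);
  }
}

final class OpResolver {
  private OpResolver() {}

  /** Returns the table's operator with the same name, else the parsed operator unchanged. */
  static NamedOp resolve(NamedOp parsed, OpTable table) {
    NamedOp found = table.lookup(parsed.name);
    return found != null ? found : parsed;
  }
}
```

The first call models the claim in point 2: when the table holds the standard operators, resolution is a no-op and behavior is the same as before.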

On Mon, Jan 18, 2016 at 1:38 PM, Julian Hyde <jh...@apache.org> wrote:

> At a high level, it sounds like a great idea. However, let’s drill into
> the details:
>
> 1. What code changes do you intend to make to the Calcite base? You could
> (a) define your own “<” operator (using it instead of
> SqlStdOperatorTable.LESS_THAN), or (b) modify
> OperandTypes.COMPARABLE_ORDERED_COMPARABLE_ORDERED (the policy used by
> LESS_THAN), or (c) modify something in the type system. Each of those could
> work, but would change a different piece of code, and that would affect how
> the code is maintained.
>
> 2. How do you intend to test this? When ANY was introduced, I was
> surprised how little testing was done, given that ANY has a pervasive
> impact: it can occur in any scalar expression, and therefore you need to
> devise a way to test every operator. I don’t know whether there is an easy
> way to test this, but if we don’t devise an effective way to test it, we
> will face an endless trickle of patches for this and that case that we
> forgot to address.
>
> Before I support this I will need to see a cogent plan for avoiding (or at
> least managing) large-scale code changes, and a plan to comprehensively
> test the full set of operators and functions.
>
> Also, a question about requirements: Will all users in a Calcite instance
> (a JVM) be using the same type system? Or might it vary by, say, schema or
> connection?
>
> Julian

Re: Customize type-check-mechanism for SqlOperator?

Posted by Julian Hyde <jh...@apache.org>.
At a high level, it sounds like a great idea. However, let’s drill into the details:

1. What code changes do you intend to make to the Calcite base? You could (a) define your own “<” operator (using it instead of SqlStdOperatorTable.LESS_THAN), or (b) modify OperandTypes.COMPARABLE_ORDERED_COMPARABLE_ORDERED (the policy used by LESS_THAN), or (c) modify something in the type system. Each of those could work, but would change a different piece of code, and that would affect how the code is maintained.

2. How do you intend to test this? When ANY was introduced, I was surprised how little testing was done, given that ANY has a pervasive impact: it can occur in any scalar expression, and therefore you need to devise a way to test every operator. I don’t know whether there is an easy way to test this, but if we don’t devise an effective way to test it, we will face an endless trickle of patches for this and that case that we forgot to address.

Before I support this I will need to see a cogent plan for avoiding (or at least managing) large-scale code changes, and a plan to comprehensively test the full set of operators and functions.

Also, a question about requirements: Will all users in a Calcite instance (a JVM) be using the same type system? Or might it vary by, say, schema or connection?
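
One possible shape for the comprehensive test asked for in point 2 is to enumerate every operator with every combination of operand types and compare two checkers over the whole matrix, so that no case is silently skipped. The sketch below is self-contained and purely illustrative: ArgType, TypeChecker and OperatorMatrix are hypothetical names, the operator list is a three-entry sample, and the two checkers are toy policies rather than Calcite's rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MatrixDemo {
  public static void main(String[] args) {
    // Toy policies: strict needs equal non-ANY types; lenient also accepts any ANY operand.
    TypeChecker strict = (op, l, r) -> l == r && l != ArgType.ANY;
    TypeChecker lenient =
        (op, l, r) -> l == ArgType.ANY || r == ArgType.ANY || strict.accepts(op, l, r);

    // Each reported line is a case where the two policies disagree.
    for (String diff : OperatorMatrix.differences(strict, lenient)) {
      System.out.println(diff);
    }
  }
}

// Hypothetical, simplified stand-ins. None of these are Calcite classes.
enum ArgType { ANY, INTEGER, TIMESTAMP, INTERVAL }

/** Decides whether a binary operator accepts the given operand types. */
interface TypeChecker {
  boolean accepts(String op, ArgType left, ArgType right);
}

final class OperatorMatrix {
  private OperatorMatrix() {}

  /** A small sample of operators; a real test would cover the full operator table. */
  static final List<String> OPS = Arrays.asList("-", "<", "=");

  /** Walks every operator and type pair, returning the cases where the checkers disagree. */
  static List<String> differences(TypeChecker a, TypeChecker b) {
    List<String> out = new ArrayList<>();
    for (String op : OPS) {
      for (ArgType left : ArgType.values()) {
        for (ArgType right : ArgType.values()) {
          if (a.accepts(op, left, right) != b.accepts(op, left, right)) {
            out.add(left + " " + op + " " + right);
          }
        }
      }
    }
    return out;
  }
}
```

Comparing a customized checker against the standard one this way makes the blast radius of a policy change explicit: every difference either matches an intended relaxation or is a bug.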

Julian



> On Jan 15, 2016, at 10:17 AM, Hsuan Yi Chu <hy...@maprtech.com> wrote:
> 
> Hi all,
> I am wondering if anybody has other ideas regarding this proposal?
> 
> In fact, I think this customization would be really beneficial, since different
> systems have their own type rules. Allowing it would help Calcite work
> with other systems.


Re: Customize type-check-mechanism for SqlOperator?

Posted by Hsuan Yi Chu <hy...@maprtech.com>.
Hi all,
I am wondering if anybody has other ideas regarding this proposal?

In fact, I think this customization would be really beneficial, since different
systems have their own type rules. Allowing it would help Calcite work
with other systems.

On Mon, Jan 11, 2016 at 11:55 AM, Hsuan Yi Chu <hy...@maprtech.com> wrote:

> Hi all,
> Since Calcite is used as the planner for many projects, each with its own
> idea of how flexible type checking should be, I am wondering whether we
> could make the type-check mechanism pluggable.
>
> As a first example, the following filter currently triggers an
> exception:
>
> col - to_timestamp('XXXX-XX-XX XX:XX:XX','YYYY-MM-dd HH:mm:ss') < interval 'X XX:XX:X' day to second
>
> => Cannot apply '-' to arguments of type '<ANY> - <TIMESTAMP(0)>'
>
>
>
> Another example is the CAST function, where Calcite allows fewer type
> conversions than Drill or MS-SQL.
>
> Please let me know if this proposal makes sense.
>