Posted to dev@drill.apache.org by Hsuan Yi Chu <hy...@maprtech.com> on 2016/02/02 08:42:44 UTC

Deterministic behavior of Negative Function?

All the variants of the Negative DrillFuncHolder are supposed to be
deterministic. However, when they are registered in
DrillOperatorTable, there is an if-statement:

https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFunctionRegistry.java#L116

which marks a DrillFuncHolder as non-deterministic when its return type is
Interval. Why is this if-condition necessary?

-------------------------

Making a DrillFuncHolder non-deterministic will prevent partition pruning or
constant folding from happening.

Re: Deterministic behavior of Negative Function?

Posted by Sean Hsuan-Yi Chu <hs...@usc.edu>.
@Jacques, it is DRILL-2060.
@All, marking the function as non-deterministic also prevents partition
pruning. Here is one case we observed: say we have a negative function in
the WHERE clause. Because the function is declared non-deterministic,
partition pruning cannot take place.
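
To illustrate the effect described above, here is a minimal sketch of the
kind of guard a planner might apply before attempting partition pruning. It
is not Drill's planner code: the class PruningSketch, the FunctionCall type,
and the canPrune method are hypothetical names invented for this example.
The only idea taken from the thread is that a single function flagged
non-deterministic in the filter is enough to block pruning.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only; not Drill's planner API.
final class PruningSketch {

    /** A function call in a filter, reduced to the one flag the guard reads. */
    static final class FunctionCall {
        final String name;
        final boolean deterministic;

        FunctionCall(String name, boolean deterministic) {
            this.name = name;
            this.deterministic = deterministic;
        }
    }

    /** Pruning is attempted only if every function in the filter is deterministic. */
    static boolean canPrune(List<FunctionCall> filterFunctions) {
        return filterFunctions.stream().allMatch(f -> f.deterministic);
    }

    public static void main(String[] args) {
        FunctionCall equal = new FunctionCall("equal", true);
        // Assumption: "negative" has been registered as non-deterministic.
        FunctionCall negative = new FunctionCall("negative", false);

        System.out.println(canPrune(Arrays.asList(equal)));           // true
        System.out.println(canPrune(Arrays.asList(equal, negative))); // false
    }
}
```

In this model the second filter is never evaluated at planning time, so
every partition would still be scanned.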

On Tue, Feb 2, 2016 at 9:41 AM, Zelaine Fong <zf...@maprtech.com> wrote:

> What was the motivation for supporting intervals spanning month to day in
> Drill?  As Julian has already noted, this isn't part of ANSI SQL, and that
> would explain why Jason wasn't able to find examples of this in other
> databases :).
>
> -- Zelaine

Re: Deterministic behavior of Negative Function?

Posted by Zelaine Fong <zf...@maprtech.com>.
What was the motivation for supporting intervals spanning month to day in
Drill?  As Julian has already noted, this isn't part of ANSI SQL, and that
would explain why Jason wasn't able to find examples of this in other
databases :).

-- Zelaine

On Tue, Feb 2, 2016 at 9:18 AM, Jason Altekruse <al...@gmail.com>
wrote:

> Thanks for the prompt reply, Julian. I will open a Calcite JIRA to continue
> the discussion. I was similarly confused about the semantics of these
> types of intervals and didn't want to reason through them when I was making
> these changes. It would be useful to discuss whether this Drill concept has
> a place in Calcite.

Re: Deterministic behavior of Negative Function?

Posted by Jason Altekruse <al...@gmail.com>.
Thanks for the prompt reply, Julian. I will open a Calcite JIRA to continue
the discussion. I was similarly confused about the semantics of these
types of intervals and didn't want to reason through them when I was making
these changes. It would be useful to discuss whether this Drill concept has
a place in Calcite.

On Tue, Feb 2, 2016 at 9:12 AM, Julian Hyde <jh...@apache.org> wrote:

> I don’t recall interval literals being discussed on the Calcite list. We
> do support interval literals of the standard types (day-to-second or
> year-to-month) but we don’t support interval literals (or interval values)
> of month-to-day type. I think there’s a good reason that that kind of
> literal isn’t in standard SQL - it is not very well-defined how to add such
> an interval to a date (or how to create one by subtracting two dates).
>
> There are a lot of assumptions-on-assumptions in this case. It would be
> helpful if JIRA cases were logged for missing functionality
> (even if we don’t end up adding that functionality).
>
> Julian

Re: Deterministic behavior of Negative Function?

Posted by Julian Hyde <jh...@apache.org>.
I don’t recall interval literals being discussed on the Calcite list. We do support interval literals of the standard types (day-to-second or year-to-month) but we don’t support interval literals (or interval values) of month-to-day type. I think there’s a good reason that that kind of literal isn’t in standard SQL - it is not very well-defined how to add such an interval to a date (or how to create one by subtracting two dates).
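
One way to see the ambiguity described above is to apply the month part and
the day part of such an interval in both possible orders. The sketch below
uses the standard java.time API purely as an illustration; it says nothing
about how Drill or Calcite evaluate intervals. The class name
MonthDayIntervalDemo and its two helper methods are made up for this example.

```java
import java.time.LocalDate;

// Illustration only: shows that "1 month and 1 day" has no single obvious
// meaning when added to a date, because the two parts do not commute.
final class MonthDayIntervalDemo {

    /** Apply the month part first, then the day part. */
    static LocalDate monthsThenDays(LocalDate start, int months, int days) {
        return start.plusMonths(months).plusDays(days);
    }

    /** Apply the day part first, then the month part. */
    static LocalDate daysThenMonths(LocalDate start, int months, int days) {
        return start.plusDays(days).plusMonths(months);
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2015, 1, 30);

        // Jan 30 + 1 month clamps to Feb 28 (2015 is not a leap year), + 1 day.
        System.out.println(monthsThenDays(start, 1, 1)); // 2015-03-01

        // Jan 30 + 1 day is Jan 31, + 1 month clamps to Feb 28.
        System.out.println(daysThenMonths(start, 1, 1)); // 2015-02-28
    }
}
```

The same interval added to the same date yields two different results
depending on the order chosen, which is the kind of underspecification that
year-to-month and day-to-second intervals avoid.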

There are a lot of assumptions-on-assumptions in this case. It would be helpful if JIRA cases were logged for missing functionality (even if we don’t end up adding that functionality).

Julian


> On Feb 2, 2016, at 7:33 AM, Jason Altekruse <al...@gmail.com> wrote:
> 
> I made this change. It was a bit of a hack to re-use the non-deterministic
> property as an indication of whether a function could be folded into a
> constant.
> 
> If you look at the definition of the NON_REDUCIBLE_TYPES constant,
> explanations are given for why each type is included.
> 
> The problem with the INTERVAL type is that Calcite did not allow creating a
> literal for this format. I believe this was intentional, so I didn't bother
> opening a Calcite JIRA. The issue is that our interval type spans
> months and days, which I wasn't even sure how to interpret. I could not
> find an example of this type in other databases; see the linked docs for
> examples in SQL Server and Oracle. [1] [2] Both of these systems support
> intervals that span years and months or days/millis, and the corresponding
> Drill types are both supported by the constant folding rule.
> 
> [1] - http://docs.oracle.com/cd/E11882_01/server.112/e41084/sql_elements001.htm#SQLRF30020
> [2] - https://msdn.microsoft.com/en-us/library/ms716506%28v=vs.85%29.aspx


Re: Deterministic behavior of Negative Function?

Posted by Jason Altekruse <al...@gmail.com>.
I made this change. It was a bit of a hack to re-use the non-deterministic
property as an indication of whether a function could be folded into a
constant.

If you look at the definition of the NON_REDUCIBLE_TYPES constant,
explanations are given for why each type is included.

The problem with the INTERVAL type is that Calcite did not allow creating a
literal for this format. I believe this was intentional, so I didn't bother
opening a Calcite JIRA. The issue is that our interval type spans
months and days, which I wasn't even sure how to interpret. I could not
find an example of this type in other databases; see the linked docs for
examples in SQL Server and Oracle. [1] [2] Both of these systems support
intervals that span years and months or days/millis, and the corresponding
Drill types are both supported by the constant folding rule.

[1] - http://docs.oracle.com/cd/E11882_01/server.112/e41084/sql_elements001.htm#SQLRF30020
[2] - https://msdn.microsoft.com/en-us/library/ms716506%28v=vs.85%29.aspx
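
The hack described in this message can be sketched roughly as follows. This
is a simplified model, not the code in DrillFunctionRegistry: the class
RegistrySketch, the ReturnType enum, and the registerAsDeterministic method
are hypothetical, and the set below holds only INTERVAL, whereas the real
NON_REDUCIBLE_TYPES list has other entries that are not reproduced here.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified, hypothetical model of the registration-time check.
final class RegistrySketch {

    /** Illustrative return types; the names are assumptions for this sketch. */
    enum ReturnType { INT, INTERVAL, INTERVAL_YEAR_MONTH, INTERVAL_DAY_TIME }

    /** Types for which no literal can be created, so results cannot be folded. */
    static final Set<ReturnType> NON_REDUCIBLE_TYPES = EnumSet.of(ReturnType.INTERVAL);

    /**
     * Returns the determinism flag the function is registered with. A function
     * that is truly deterministic is still registered as non-deterministic if
     * its return type cannot be folded into a constant.
     */
    static boolean registerAsDeterministic(boolean declaredDeterministic, ReturnType returnType) {
        return declaredDeterministic && !NON_REDUCIBLE_TYPES.contains(returnType);
    }

    public static void main(String[] args) {
        // negative(INTERVAL) is deterministic by nature but registered otherwise.
        System.out.println(registerAsDeterministic(true, ReturnType.INTERVAL));          // false
        // The year-month and day-time variants keep their declared flag.
        System.out.println(registerAsDeterministic(true, ReturnType.INTERVAL_DAY_TIME)); // true
    }
}
```

Because one flag carries two meanings here, anything else that reads it
(partition pruning, for instance) sees the function as non-deterministic
even though only constant folding was meant to be disabled.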

On Tue, Feb 2, 2016 at 6:11 AM, Jacques Nadeau <ja...@dremio.com> wrote:

> Which JIRA was this change added with?

Re: Deterministic behavior of Negative Function?

Posted by Jacques Nadeau <ja...@dremio.com>.
Which JIRA was this change added with?
On Feb 1, 2016 11:42 PM, "Hsuan Yi Chu" <hy...@maprtech.com> wrote:

> All the variants of the Negative DrillFuncHolder are supposed to be
> deterministic. However, when they are registered in
> DrillOperatorTable, there is an if-statement:
>
>
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillFunctionRegistry.java#L116
>
> which marks a DrillFuncHolder as non-deterministic when its return type is
> Interval. Why is this if-condition necessary?
>
> -------------------------
>
> Making a DrillFuncHolder non-deterministic will prevent partition pruning or
> constant folding from happening.
>