You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Li Yang <li...@apache.org> on 2015/04/27 12:19:42 UTC

How to group time by 15 min intervals

Hi

Given a timestamp column, I want to group by every 15 minutes like [1].
How can this be best done with calcite?

Looked at the reference, but answer is not obvious.


Thanks
Yang


[1]
http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
[2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md

Re: How to group time by 15 min intervals

Posted by Li Yang <li...@apache.org>.
The calendar table is Kylin's current solution. The date arithmetic is what
I'm looking for.

I'm thinking about pre-calculate the date expression as a hidden dimension
of cube. That can speed up queries with the same expression on group by.

Thanks Julian & Ethan!


On Tue, Apr 28, 2015 at 4:56 AM, Julian Hyde <ju...@gmail.com> wrote:

> By the way, if “ts” is sorted, then Calcite can deduce that cast((“ts" -
> timestamp '1970-01-01 00:00:00') minute as integer) / 15 is also sorted.
> Maybe that is something that Kylin can exploit in its execution plan.
>
> Julian
>
>
> On Apr 27, 2015, at 11:40 AM, Julian Hyde <ju...@gmail.com> wrote:
>
> > The answer on SO for MySQL is this*:
> >
> > SELECT   ROUND(UNIX_TIMESTAMP(ts)/(15 * 60)) AS timekey
> > FROM     table
> > GROUP BY timekey;
> > The nearest equivalent in Calcite is this:
> >
> > SELECT   cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as
> integer) / 15 AS timekey
> > FROM     table
> > GROUP BY cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as
> integer) / 15;
> >
> > But doing date arithmetic in queries is an anti-pattern in ROLAP systems
> such as Kylin. A time dimension table is the preferred solution. I suggest
> that you add minute to your time dimension table (values 0 .. 59). You
> could also consider adding a quarter_hour column (values 0 .. 3). Then to
> find events in the same 15 minute interval you could group by date_id,
> hour, quarter_hour.
> >
> > Julian
> >
> > * I renamed the column from “timestamp” to “ts” because TIMESTAMP is a
> reserved keyword.
> >
> > On Apr 27, 2015, at 6:13 AM, Ethan Wang <ca...@gmail.com> wrote:
> >
> >> Looks like to me based on the stuff calcite has, you may create a
> calendar table first:
> >>
> >> tbl_calendar
> >> id  min   max
> >> 1    0       15
> >> 2   16       30
> >> 3   31       45
> >> 4   46       60
> >>
> >>
> >> select * from
> >> main,
> >> tbl_calendar
> >> where
> >> main.timestamp >= tbl_calendar.min
> >> and  main.timestamp <  tbl_calendar.max
> >> group by
> >> tbl_calendar.id
> >>
> >>
> >>
> >>
> >>
> >>> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
> >>>
> >>> Hi
> >>>
> >>> Given a timestamp column, I want to group by every 15 minutes like [1].
> >>> How can this be best done with calcite?
> >>>
> >>> Looked at the reference, but answer is not obvious.
> >>>
> >>>
> >>> Thanks
> >>> Yang
> >>>
> >>>
> >>> [1]
> >>>
> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
> >>> [2]
> https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
> >>
> >
>
>

Re: How to group time by 15 min intervals

Posted by Julian Hyde <ju...@gmail.com>.
By the way, if “ts” is sorted, then Calcite can deduce that cast((“ts" - timestamp '1970-01-01 00:00:00') minute as integer) / 15 is also sorted. Maybe that is something that Kylin can exploit in its execution plan.

Julian


On Apr 27, 2015, at 11:40 AM, Julian Hyde <ju...@gmail.com> wrote:

> The answer on SO for MySQL is this*:
> 
> SELECT   ROUND(UNIX_TIMESTAMP(ts)/(15 * 60)) AS timekey
> FROM     table
> GROUP BY timekey;
> The nearest equivalent in Calcite is this:
> 
> SELECT   cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15 AS timekey
> FROM     table
> GROUP BY cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15;
> 
> But doing date arithmetic in queries is an anti-pattern in ROLAP systems such as Kylin. A time dimension table is the preferred solution. I suggest that you add minute to your time dimension table (values 0 .. 59). You could also consider adding a quarter_hour column (values 0 .. 3). Then to find events in the same 15 minute interval you could group by date_id, hour, quarter_hour.
> 
> Julian
> 
> * I renamed the column from “timestamp” to “ts” because TIMESTAMP is a reserved keyword.
> 
> On Apr 27, 2015, at 6:13 AM, Ethan Wang <ca...@gmail.com> wrote:
> 
>> Looks like to me based on the stuff calcite has, you may create a calendar table first:
>> 
>> tbl_calendar
>> id  min   max 
>> 1    0	15
>> 2   16	30
>> 3   31	45
>> 4   46	60
>> 
>> 
>> select * from 
>> main,
>> tbl_calendar
>> where
>> main.timestamp >= tbl_calendar.min
>> and  main.timestamp <  tbl_calendar.max
>> group by
>> tbl_calendar.id
>> 
>> 
>> 
>> 
>> 
>>> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
>>> 
>>> Hi
>>> 
>>> Given a timestamp column, I want to group by every 15 minutes like [1].
>>> How can this be best done with calcite?
>>> 
>>> Looked at the reference, but answer is not obvious.
>>> 
>>> 
>>> Thanks
>>> Yang
>>> 
>>> 
>>> [1]
>>> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
>>> [2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
>> 
> 


Re: How to group time by 15 min intervals

Posted by Julian Hyde <ju...@gmail.com>.
The answer on SO for MySQL is this*:

SELECT   ROUND(UNIX_TIMESTAMP(ts)/(15 * 60)) AS timekey
FROM     table
GROUP BY timekey;
The nearest equivalent in Calcite is this:

SELECT   cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15 AS timekey
FROM     table
GROUP BY cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15;

But doing date arithmetic in queries is an anti-pattern in ROLAP systems such as Kylin. A time dimension table is the preferred solution. I suggest that you add minute to your time dimension table (values 0 .. 59). You could also consider adding a quarter_hour column (values 0 .. 3). Then to find events in the same 15 minute interval you could group by date_id, hour, quarter_hour.

Julian

* I renamed the column from “timestamp” to “ts” because TIMESTAMP is a reserved keyword.

On Apr 27, 2015, at 6:13 AM, Ethan Wang <ca...@gmail.com> wrote:

> Looks like to me based on the stuff calcite has, you may create a calendar table first:
> 
> tbl_calendar
> id  min   max 
> 1    0	15
> 2   16	30
> 3   31	45
> 4   46	60
> 
> 
> select * from 
> main,
> tbl_calendar
> where
> main.timestamp >= tbl_calendar.min
> and  main.timestamp <  tbl_calendar.max
> group by
> tbl_calendar.id
> 
> 
> 
> 
> 
>> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
>> 
>> Hi
>> 
>> Given a timestamp column, I want to group by every 15 minutes like [1].
>> How can this be best done with calcite?
>> 
>> Looked at the reference, but answer is not obvious.
>> 
>> 
>> Thanks
>> Yang
>> 
>> 
>> [1]
>> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
>> [2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
> 


Re: How to group time by 15 min intervals

Posted by Ethan Wang <ca...@gmail.com>.
Looks like to me based on the stuff calcite has, you may create a calendar table first:

tbl_calendar
id  min   max 
1    0	15
2   16	30
3   31	45
4   46	60


select * from 
main,
tbl_calendar
where
main.timestamp >= tbl_calendar.min
and  main.timestamp <  tbl_calendar.max
group by
tbl_calendar.id





> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
> 
> Hi
> 
> Given a timestamp column, I want to group by every 15 minutes like [1].
> How can this be best done with calcite?
> 
> Looked at the reference, but answer is not obvious.
> 
> 
> Thanks
> Yang
> 
> 
> [1]
> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
> [2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md