You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Li Yang <li...@apache.org> on 2015/04/27 12:19:42 UTC
How to group time by 15 min intervals
Hi
Given a timestamp column, I want to group by every 15 minutes like [1].
How can this be best done with calcite?
Looked at the reference, but answer is not obvious.
Thanks
Yang
[1]
http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
[2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
Re: How to group time by 15 min intervals
Posted by Li Yang <li...@apache.org>.
The calendar table is Kylin's current solution. The date arithmetic is what
I'm looking for.
I'm thinking about pre-calculate the date expression as a hidden dimension
of cube. That can speed up queries with the same expression on group by.
Thanks Julian & Ethan!
On Tue, Apr 28, 2015 at 4:56 AM, Julian Hyde <ju...@gmail.com> wrote:
> By the way, if “ts” is sorted, then Calcite can deduce that cast((“ts" -
> timestamp '1970-01-01 00:00:00') minute as integer) / 15 is also sorted.
> Maybe that is something that Kylin can exploit in its execution plan.
>
> Julian
>
>
> On Apr 27, 2015, at 11:40 AM, Julian Hyde <ju...@gmail.com> wrote:
>
> > The answer on SO for MySQL is this*:
> >
> > SELECT ROUND(UNIX_TIMESTAMP(ts)/(15 * 60)) AS timekey
> > FROM table
> > GROUP BY timekey;
> > The nearest equivalent in Calcite is this:
> >
> > SELECT cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as
> integer) / 15 AS timekey
> > FROM table
> > GROUP BY cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as
> integer) / 15;
> >
> > But doing date arithmetic in queries is an anti-pattern in ROLAP systems
> such as Kylin. A time dimension table is the preferred solution. I suggest
> that you add minute to your time dimension table (values 0 .. 59). You
> could also consider adding a quarter_hour column (values 0 .. 3). Then to
> find events in the same 15 minute interval you could group by date_id,
> hour, quarter_hour.
> >
> > Julian
> >
> > * I renamed the column from “timestamp” to “ts” because TIMESTAMP is a
> reserved keyword.
> >
> > On Apr 27, 2015, at 6:13 AM, Ethan Wang <ca...@gmail.com> wrote:
> >
> >> Looks like to me based on the stuff calcite has, you may create a
> calendar table first:
> >>
> >> tbl_calendar
> >> id min max
> >> 1 0 15
> >> 2 16 30
> >> 3 31 45
> >> 4 46 60
> >>
> >>
> >> select * from
> >> main,
> >> tbl_calendar
> >> where
> >> main.timestamp >= tbl_calendar.min
> >> and main.timestamp < tbl_calendar.max
> >> group by
> >> tbl_calendar.id
> >>
> >>
> >>
> >>
> >>
> >>> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
> >>>
> >>> Hi
> >>>
> >>> Given a timestamp column, I want to group by every 15 minutes like [1].
> >>> How can this be best done with calcite?
> >>>
> >>> Looked at the reference, but answer is not obvious.
> >>>
> >>>
> >>> Thanks
> >>> Yang
> >>>
> >>>
> >>> [1]
> >>>
> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
> >>> [2]
> https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
> >>
> >
>
>
Re: How to group time by 15 min intervals
Posted by Julian Hyde <ju...@gmail.com>.
By the way, if “ts” is sorted, then Calcite can deduce that cast((“ts" - timestamp '1970-01-01 00:00:00') minute as integer) / 15 is also sorted. Maybe that is something that Kylin can exploit in its execution plan.
Julian
On Apr 27, 2015, at 11:40 AM, Julian Hyde <ju...@gmail.com> wrote:
> The answer on SO for MySQL is this*:
>
> SELECT ROUND(UNIX_TIMESTAMP(ts)/(15 * 60)) AS timekey
> FROM table
> GROUP BY timekey;
> The nearest equivalent in Calcite is this:
>
> SELECT cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15 AS timekey
> FROM table
> GROUP BY cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15;
>
> But doing date arithmetic in queries is an anti-pattern in ROLAP systems such as Kylin. A time dimension table is the preferred solution. I suggest that you add minute to your time dimension table (values 0 .. 59). You could also consider adding a quarter_hour column (values 0 .. 3). Then to find events in the same 15 minute interval you could group by date_id, hour, quarter_hour.
>
> Julian
>
> * I renamed the column from “timestamp” to “ts” because TIMESTAMP is a reserved keyword.
>
> On Apr 27, 2015, at 6:13 AM, Ethan Wang <ca...@gmail.com> wrote:
>
>> Looks like to me based on the stuff calcite has, you may create a calendar table first:
>>
>> tbl_calendar
>> id min max
>> 1 0 15
>> 2 16 30
>> 3 31 45
>> 4 46 60
>>
>>
>> select * from
>> main,
>> tbl_calendar
>> where
>> main.timestamp >= tbl_calendar.min
>> and main.timestamp < tbl_calendar.max
>> group by
>> tbl_calendar.id
>>
>>
>>
>>
>>
>>> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
>>>
>>> Hi
>>>
>>> Given a timestamp column, I want to group by every 15 minutes like [1].
>>> How can this be best done with calcite?
>>>
>>> Looked at the reference, but answer is not obvious.
>>>
>>>
>>> Thanks
>>> Yang
>>>
>>>
>>> [1]
>>> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
>>> [2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
>>
>
Re: How to group time by 15 min intervals
Posted by Julian Hyde <ju...@gmail.com>.
The answer on SO for MySQL is this*:
SELECT ROUND(UNIX_TIMESTAMP(ts)/(15 * 60)) AS timekey
FROM table
GROUP BY timekey;
The nearest equivalent in Calcite is this:
SELECT cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15 AS timekey
FROM table
GROUP BY cast(("hire_date" - timestamp '1970-01-01 00:00:00') minute as integer) / 15;
But doing date arithmetic in queries is an anti-pattern in ROLAP systems such as Kylin. A time dimension table is the preferred solution. I suggest that you add minute to your time dimension table (values 0 .. 59). You could also consider adding a quarter_hour column (values 0 .. 3). Then to find events in the same 15 minute interval you could group by date_id, hour, quarter_hour.
Julian
* I renamed the column from “timestamp” to “ts” because TIMESTAMP is a reserved keyword.
On Apr 27, 2015, at 6:13 AM, Ethan Wang <ca...@gmail.com> wrote:
> Looks like to me based on the stuff calcite has, you may create a calendar table first:
>
> tbl_calendar
> id min max
> 1 0 15
> 2 16 30
> 3 31 45
> 4 46 60
>
>
> select * from
> main,
> tbl_calendar
> where
> main.timestamp >= tbl_calendar.min
> and main.timestamp < tbl_calendar.max
> group by
> tbl_calendar.id
>
>
>
>
>
>> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
>>
>> Hi
>>
>> Given a timestamp column, I want to group by every 15 minutes like [1].
>> How can this be best done with calcite?
>>
>> Looked at the reference, but answer is not obvious.
>>
>>
>> Thanks
>> Yang
>>
>>
>> [1]
>> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
>> [2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md
>
Re: How to group time by 15 min intervals
Posted by Ethan Wang <ca...@gmail.com>.
Looks like to me based on the stuff calcite has, you may create a calendar table first:
tbl_calendar
id min max
1 0 15
2 16 30
3 31 45
4 46 60
select * from
main,
tbl_calendar
where
main.timestamp >= tbl_calendar.min
and main.timestamp < tbl_calendar.max
group by
tbl_calendar.id
> On Apr 27, 2015, at 5:19 AM, Li Yang <li...@apache.org> wrote:
>
> Hi
>
> Given a timestamp column, I want to group by every 15 minutes like [1].
> How can this be best done with calcite?
>
> Looked at the reference, but answer is not obvious.
>
>
> Thanks
> Yang
>
>
> [1]
> http://stackoverflow.com/questions/2793994/group-mysql-query-by-15-min-intervals
> [2] https://github.com/apache/incubator-calcite/blob/master/doc/REFERENCE.md