You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Sebastián Magrí <se...@gmail.com> on 2021/03/02 11:26:47 UTC

[Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

While using a simple query such as this

SELECT
   `ts`,
   FLOOR(`ts` TO WEEK) as `week_start`,
   CEIL(`ts` TO WEEK) as `week_end`
FROM some_table

I get some weird results like these:

2021-03-01T00:00|    2021-02-25T00:00|    2021-03-04T00:00

Which is obviously wrong since March 1st is on Monday, February 25th is
Thursday as well as March 04th.

I've tried different combinations of timezone configurations and with both
timestamps and dates, with the same results.

Is there anything obviously wrong in that query? Is there any configuration
to keep in mind for the start of week day?

-- 
Sebastián Ramírez Magrí

Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

Posted by Leonard Xu <xb...@gmail.com>.
Post the created Jira link

https://issues.apache.org/jira/browse/FLINK-21624 <https://issues.apache.org/jira/browse/FLINK-21624>

Best,
Leonard

> 
> Thanks a lot for the added context and pointers Julian and Leonard,
> 
> I've fixed it by going down to the arithmetics as suggested in one of the Calcite discussions.
> 
> The changes proposed by FLIP-126 definitely look good. I'll check its details further.
> 
> Best Regards,
> 
> On Thu, 4 Mar 2021 at 04:18, Leonard Xu <xbjtdcq@gmail.com <ma...@gmail.com>> wrote:
> Hi, Sebastián Ramírez Magrí 
> (Sorry for wrong name in above mail)
> 
> Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version which contains the patch, what we need is only to adapt it in Flink code, I’d like to make this as a part of FLIP-162 and fix it soon.
> 
> Thanks Julian and Timo for the reminder.
> 
> 
> Best,
> Leonard
> 
>> 在 2021年3月4日,12:14,Leonard Xu <xbjtdcq@gmail.com <ma...@gmail.com>> 写道:
>> 
>> Hi, Jaffe
>> 
>> Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version which contains the patch, what we need is only to adapt it in Flink code, I’d like to make this as a part of FLIP-162 and fix it soon.
>> 
>> Thanks Julian and Timo for the reminder.
>> 
>> 
>> Best,
>> Leonard
> 
> 
> 
> -- 
> Sebastián Ramírez Magrí


Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

Posted by Sebastián Magrí <se...@gmail.com>.
Thanks a lot for the added context and pointers Julian and Leonard,

I've fixed it by going down to the arithmetics as suggested in one of the
Calcite discussions.

The changes proposed by FLIP-126 definitely look good. I'll check its
details further.

Best Regards,

On Thu, 4 Mar 2021 at 04:18, Leonard Xu <xb...@gmail.com> wrote:

> Hi, Sebastián Ramírez Magrí
> (Sorry for wrong name in above mail)
>
> Flink follows old version calcite’s behaviour which lead to the wrong
> behavior. snd Julian is right that calcite has corrected FLOOR and CEIL
> functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version
> which contains the patch, what we need is only to adapt it in Flink code,
> I’d like to make this as a part of FLIP-162 and fix it soon.
>
> Thanks Julian and Timo for the reminder.
>
>
> Best,
> Leonard
>
> 在 2021年3月4日,12:14,Leonard Xu <xb...@gmail.com> 写道:
>
> Hi, Jaffe
>
> Flink follows old version calcite’s behaviour which lead to the wrong
> behavior. snd Julian is right that calcite has corrected FLOOR and CEIL
> functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version
> which contains the patch, what we need is only to adapt it in Flink code,
> I’d like to make this as a part of FLIP-162 and fix it soon.
>
> Thanks Julian and Timo for the reminder.
>
>
> Best,
> Leonard
>
>
>

-- 
Sebastián Ramírez Magrí

Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

Posted by Leonard Xu <xb...@gmail.com>.
Hi, Sebastián Ramírez Magrí 
(Sorry for wrong name in above mail)

Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version which contains the patch, what we need is only to adapt it in Flink code, I’d like to make this as a part of FLIP-162 and fix it soon.

Thanks Julian and Timo for the reminder.


Best,
Leonard

> 在 2021年3月4日,12:14,Leonard Xu <xb...@gmail.com> 写道:
> 
> Hi, Jaffe
> 
> Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version which contains the patch, what we need is only to adapt it in Flink code, I’d like to make this as a part of FLIP-162 and fix it soon.
> 
> Thanks Julian and Timo for the reminder.
> 
> 
> Best,
> Leonard


Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

Posted by Leonard Xu <xb...@gmail.com>.
Hi, Jaffe

Flink follows old version calcite’s behaviour which lead to the wrong behavior. snd Julian is right that calcite has corrected FLOOR and CEIL functions in CALCITE-3412, Flink has upgraded  calcite to 1.26 version which contains the patch, what we need is only to adapt it in Flink code, I’d like to make this as a part of FLIP-162 and fix it soon.

Thanks Julian and Timo for the reminder.


Best,
Leonard





> 在 2021年3月3日,03:35,Jaffe, Julian <Ju...@activision.com> 写道:
> 
> Calcite does not follow ISO-8601. Instead, until very recently Calcite weeks started on Thursdays[1].
> 
> (As an aside, Calcite somewhat abuses the WEEK time unit - converting a date to a week returns an integer representing the week of the year the date falls in while FLOORing or CEILing a timestamp to week returns a timestamp. This can cause integration issues with other systems if you're unaware)
> 
> Julian
> 
> 
> [1] https://issues.apache.org/jira/browse/CALCITE-3412
> 
> On 3/2/21, 4:12 AM, "Timo Walther" <tw...@apache.org> wrote:
> 
>    Hi Sebastián,
> 
>    it might be the case that some time functions are not correct due to the 
>    underlying refactoring of data structures. I will loop in Leonard in CC 
>    that currently works on improving this situation as part of FLIP-162 [1].
> 
>    @Leonard: Is this wrong behavior on your list?
> 
>    Regards,
>    Timo
> 
> 
>    [1] 
>    https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_FLINK_FLIP-2D162-253A-2BConsistent-2BFlink-2BSQL-2Btime-2Bfunction-2Bbehavior&d=DwIDaQ&c=qE8EibqjfXM-zBfebVhd4gtjNZbrDcrKYXvb1gt38s4&r=zKznthi6OTKpoJID9dIcyiJ28NX59JIQ2bD246nnMac&m=ZeBr2XK222xGShmn_0N2tF_qGbi7kvWg8WQrsLuquMk&s=3-zBtwyRTd7WiU63ZVwpTKW4vDnn-fjckRI9yjFjrNs&e= 
> 
> 
> 
> 
>    On 02.03.21 12:26, Sebastián Magrí wrote:
>> While using a simple query such as this
>> 
>> SELECT
>>    `ts`,
>>    FLOOR(`ts` TO WEEK) as `week_start`,
>>    CEIL(`ts` TO WEEK) as `week_end`
>> FROM some_table
>> 
>> I get some weird results like these:
>> 
>> 2021-03-01T00:00|    2021-02-25T00:00|    2021-03-04T00:00
>> 
>> Which is obviously wrong since March 1st is on Monday, February 25th is 
>> Thursday as well as March 04th.
>> 
>> I've tried different combinations of timezone configurations and with 
>> both timestamps and dates, with the same results.
>> 
>> Is there anything obviously wrong in that query? Is there any 
>> configuration to keep in mind for the start of week day?
>> 
>> -- 
>> Sebastián Ramírez Magrí
> 
> 


Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

Posted by "Jaffe, Julian" <Ju...@activision.com>.
Calcite does not follow ISO-8601. Instead, until very recently Calcite weeks started on Thursdays[1].

(As an aside, Calcite somewhat abuses the WEEK time unit - converting a date to a week returns an integer representing the week of the year the date falls in while FLOORing or CEILing a timestamp to week returns a timestamp. This can cause integration issues with other systems if you're unaware)

Julian


[1] https://issues.apache.org/jira/browse/CALCITE-3412

On 3/2/21, 4:12 AM, "Timo Walther" <tw...@apache.org> wrote:

    Hi Sebastián,

    it might be the case that some time functions are not correct due to the 
    underlying refactoring of data structures. I will loop in Leonard in CC 
    that currently works on improving this situation as part of FLIP-162 [1].

    @Leonard: Is this wrong behavior on your list?

    Regards,
    Timo


    [1] 
    https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_FLINK_FLIP-2D162-253A-2BConsistent-2BFlink-2BSQL-2Btime-2Bfunction-2Bbehavior&d=DwIDaQ&c=qE8EibqjfXM-zBfebVhd4gtjNZbrDcrKYXvb1gt38s4&r=zKznthi6OTKpoJID9dIcyiJ28NX59JIQ2bD246nnMac&m=ZeBr2XK222xGShmn_0N2tF_qGbi7kvWg8WQrsLuquMk&s=3-zBtwyRTd7WiU63ZVwpTKW4vDnn-fjckRI9yjFjrNs&e= 




    On 02.03.21 12:26, Sebastián Magrí wrote:
    > While using a simple query such as this
    > 
    > SELECT
    >     `ts`,
    >     FLOOR(`ts` TO WEEK) as `week_start`,
    >     CEIL(`ts` TO WEEK) as `week_end`
    > FROM some_table
    > 
    > I get some weird results like these:
    > 
    > 2021-03-01T00:00|    2021-02-25T00:00|    2021-03-04T00:00
    > 
    > Which is obviously wrong since March 1st is on Monday, February 25th is 
    > Thursday as well as March 04th.
    > 
    > I've tried different combinations of timezone configurations and with 
    > both timestamps and dates, with the same results.
    > 
    > Is there anything obviously wrong in that query? Is there any 
    > configuration to keep in mind for the start of week day?
    > 
    > -- 
    > Sebastián Ramírez Magrí



Re: [Flink-SQL] FLOORing OR CEILing a DATE or TIMESTAMP to WEEK uses Thursdays as week start

Posted by Timo Walther <tw...@apache.org>.
Hi Sebastián,

it might be the case that some time functions are not correct due to the 
underlying refactoring of data structures. I will loop in Leonard in CC 
that currently works on improving this situation as part of FLIP-162 [1].

@Leonard: Is this wrong behavior on your list?

Regards,
Timo


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior




On 02.03.21 12:26, Sebastián Magrí wrote:
> While using a simple query such as this
> 
> SELECT
>     `ts`,
>     FLOOR(`ts` TO WEEK) as `week_start`,
>     CEIL(`ts` TO WEEK) as `week_end`
> FROM some_table
> 
> I get some weird results like these:
> 
> 2021-03-01T00:00|    2021-02-25T00:00|    2021-03-04T00:00
> 
> Which is obviously wrong since March 1st is on Monday, February 25th is 
> Thursday as well as March 04th.
> 
> I've tried different combinations of timezone configurations and with 
> both timestamps and dates, with the same results.
> 
> Is there anything obviously wrong in that query? Is there any 
> configuration to keep in mind for the start of week day?
> 
> -- 
> Sebastián Ramírez Magrí