Posted to issues@spark.apache.org by "Wenchen Fan (Jira)" <ji...@apache.org> on 2020/06/03 19:04:00 UTC

[jira] [Reopened] (SPARK-31879) First day of week changed for non-MONDAY_START Locales

     [ https://issues.apache.org/jira/browse/SPARK-31879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reopened SPARK-31879:
---------------------------------

> First day of week changed for non-MONDAY_START Locales
> ------------------------------------------------------
>
>                 Key: SPARK-31879
>                 URL: https://issues.apache.org/jira/browse/SPARK-31879
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Kent Yao
>            Assignee: Kent Yao
>            Priority: Blocker
>             Fix For: 3.0.0
>
>
> h1. cases
> {code:sql}
> spark-sql> select to_timestamp('2020-1-1', 'YYYY-w-u');
> 2019-12-29 00:00:00
> spark-sql> set spark.sql.legacy.timeParserPolicy=legacy;
> spark.sql.legacy.timeParserPolicy	legacy
> spark-sql> select to_timestamp('2020-1-1', 'YYYY-w-u');
> 2019-12-30 00:00:00
> {code}
> h1. reasons
> These week-based fields need a Locale to express their semantics, because the first day of the week varies from country to country.
> From the Javadoc of WeekFields:
> {code:java}
>     /**
>      * Gets the first day-of-week.
>      * <p>
>      * The first day-of-week varies by culture.
>      * For example, the US uses Sunday, while France and the ISO-8601 standard use Monday.
>      * This method returns the first day using the standard {@code DayOfWeek} enum.
>      *
>      * @return the first day-of-week, not null
>      */
>     public DayOfWeek getFirstDayOfWeek() {
>         return firstDayOfWeek;
>     }
> {code}
> But for SimpleDateFormat, the day number of the week is not localized:
> {code}
> u	Day number of week (1 = Monday, ..., 7 = Sunday)	Number	1
> {code}
> Currently, the default locale we use is US, where the week starts on Sunday, so the result moved one day backward (2019-12-29 instead of 2019-12-30).
> For other countries, please refer to [First Day of the Week in Different Countries|http://chartsbin.com/view/41671]
> h1. solution options
> 1. Use new Locale("en", "GB") as the default locale.
> 2. On JDK 10 and later, set the locale Unicode extension 'fw' to 'mon'; this does not work on earlier JDKs.
> 3. Forbid 'u', give the user a proper exception, and enable and document 'e/c'. Currently, 'u' is internally substituted by 'e', but the two are not equivalent.
> Options 1 and 2 solve this for the default locale, but not for functions that support a custom locale.
> cc [~cloud_fan] [~dongjoon] [~maropu]
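
The cause described under "reasons" can be reproduced with plain java.time, outside Spark. The following is a minimal sketch, not Spark's actual parsing code: the class name FirstDayOfWeekDemo and the helper resolve are hypothetical, and the helper only approximates 'YYYY-w-u' by treating 'u' as the localized day-of-week (as happens when 'u' is substituted by 'e'). It also shows the locales from options 1 and 2; whether the 'fw' extension is honored depends on the JDK version, so no result is claimed for it.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;
import java.util.Locale;

class FirstDayOfWeekDemo {

    // Hypothetical helper: resolve (week-based-year, week, localized day-of-week)
    // to a date, using the week definition of the given locale.
    static LocalDate resolve(Locale locale, int weekYear, int week, int localizedDay) {
        WeekFields wf = WeekFields.of(locale);
        // Start from a mid-year date so the week-based-year equals the calendar year.
        return LocalDate.of(weekYear, 7, 1)
                .with(wf.weekBasedYear(), weekYear)
                .with(wf.weekOfWeekBasedYear(), week)
                .with(wf.dayOfWeek(), localizedDay);
    }

    public static void main(String[] args) {
        // The first day of the week differs between the two locales.
        System.out.println(WeekFields.of(Locale.US).getFirstDayOfWeek()); // SUNDAY
        System.out.println(WeekFields.of(Locale.UK).getFirstDayOfWeek()); // MONDAY

        // Same input as '2020-1-1' with 'YYYY-w-u': day 1 of week 1 of 2020.
        System.out.println(resolve(Locale.US, 2020, 1, 1)); // 2019-12-29
        System.out.println(resolve(Locale.UK, 2020, 1, 1)); // 2019-12-30

        // Option 2: Unicode extension 'fw' set to 'mon'.
        // The issue states this only takes effect on JDK 10 and later,
        // so the printed value depends on the runtime.
        Locale usMondayStart = Locale.forLanguageTag("en-US-u-fw-mon");
        System.out.println(WeekFields.of(usMondayStart).getFirstDayOfWeek());
    }
}
```

With Locale.US the localized day 1 is a Sunday, which reproduces the 2019-12-29 result from the "cases" section; with Locale.UK it is a Monday, matching the legacy result 2019-12-30.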


