Posted to issues@spark.apache.org by "Kevin Appel (Jira)" <ji...@apache.org> on 2022/03/16 14:32:00 UTC
[jira] [Created] (SPARK-38571) Week of month from a date is missing in spark3 for return values of 1 to 6
Kevin Appel created SPARK-38571:
-----------------------------------
Summary: Week of month from a date is missing in spark3 for return values of 1 to 6
Key: SPARK-38571
URL: https://issues.apache.org/jira/browse/SPARK-38571
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.1.2
Reporter: Kevin Appel
In Spark 2 we could use the date_format function with either the W or F pattern letter to compute the week of month from a date. These compute two different things: W returns values from 1 to 6, while F returns values from 1 to 5.
Sample code and expected output:
``` python
from datetime import date

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

df1 = spark.createDataFrame(
    [
        (1, date(2014, 3, 7)),
        (2, date(2014, 3, 8)),
        (3, date(2014, 3, 30)),
        (4, date(2014, 3, 31)),
        (5, date(2015, 3, 7)),
        (6, date(2015, 3, 8)),
        (7, date(2015, 3, 30)),
        (8, date(2015, 3, 31)),
    ],
    schema="a long, b date",
)
df1 = df1.withColumn("WEEKOFMONTH1-6", F.date_format(F.col("b"), "W"))
df1 = df1.withColumn("WEEKOFMONTH1-5", F.date_format(F.col("b"), "F"))
df1.show()
```
+---+----------+--------------+--------------+
| a| b|WEEKOFMONTH1-6|WEEKOFMONTH1-5|
+---+----------+--------------+--------------+
| 1|2014-03-07| 2| 1|
| 2|2014-03-08| 2| 2|
| 3|2014-03-30| 6| 5|
| 4|2014-03-31| 6| 5|
| 5|2015-03-07| 1| 1|
| 6|2015-03-08| 2| 2|
| 7|2015-03-30| 5| 5|
| 8|2015-03-31| 5| 5|
+---+----------+--------------+--------------+
With spark.sql.legacy.timeParserPolicy set to EXCEPTION (the default in Spark 3), this throws an error:
Caused by: java.lang.IllegalArgumentException: All week-based patterns are unsupported since Spark 3.0, detected: W, Please use the SQL function EXTRACT instead
However, the EXTRACT function offers no field that extracts the week of month with values 1 to 6.
The Spark 3 documentation says Spark defines its own patterns, listed at [https://spark.apache.org/docs/3.2.1/sql-ref-datetime-pattern.html], which are implemented via DateTimeFormatter under the hood:
[https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html]
That page lists both W and F for week of month:
Symbol  Meaning        Presentation  Examples
W       week-of-month  number        4
F       week-of-month  number        3
However, only F is implemented in the Spark datetime pattern reference.
Is there another way to compute the week of month for values 1 to 6 using only built-in functions in Spark 3? Currently we have to set spark.sql.legacy.timeParserPolicy to LEGACY in order to run this.
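For reference, one possible workaround is to derive the W-style value arithmetically from the day of month and the weekday of the first day of the month. This is a sketch only, assuming Sunday-start weeks as in the default Java locale used by date_format; the plain-Python function below shows the arithmetic, and the comment sketches an untested PySpark equivalent using built-ins:

``` python
from datetime import date

def week_of_month(d: date) -> int:
    """W-style week of month (1-6), assuming weeks start on Sunday."""
    first = d.replace(day=1)
    # Convert isoweekday (Mon=1..Sun=7) to Sunday=1..Saturday=7,
    # matching Spark's dayofweek() convention.
    dow_first = first.isoweekday() % 7 + 1
    # ceil((day + dow_first - 1) / 7) written with integer arithmetic
    return (d.day + dow_first - 2) // 7 + 1

# A possible (untested) PySpark expression using the same arithmetic:
#   F.ceil((F.dayofmonth("b") + F.dayofweek(F.trunc("b", "month")) - 1) / 7)
```

For the sample dates above this reproduces the Spark 2 "W" output, e.g. 2014-03-30 falls in week 6 while 2015-03-30 falls in week 5.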
Thank you,
Kevin
--
This message was sent by Atlassian Jira
(v8.20.1#820001)