Posted to issues@spark.apache.org by "Kevin Appel (Jira)" <ji...@apache.org> on 2022/03/16 14:32:00 UTC

[jira] [Created] (SPARK-38571) Week of month from a date is missing in spark3 for return values of 1 to 6

Kevin Appel created SPARK-38571:
-----------------------------------

             Summary: Week of month from a date is missing in spark3 for return values of 1 to 6
                 Key: SPARK-38571
                 URL: https://issues.apache.org/jira/browse/SPARK-38571
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.1.2
            Reporter: Kevin Appel


In Spark 2 we could use the date_format function with either the W or F pattern letters to compute the week of month from a date. These compute two different things: W returns values from 1 to 6, and F returns values from 1 to 5.

Sample code and expected output:
```python
from datetime import date
from pyspark.sql import functions as F

df1 = spark.createDataFrame(
    [
        (1, date(2014, 3, 7)),
        (2, date(2014, 3, 8)),
        (3, date(2014, 3, 30)),
        (4, date(2014, 3, 31)),
        (5, date(2015, 3, 7)),
        (6, date(2015, 3, 8)),
        (7, date(2015, 3, 30)),
        (8, date(2015, 3, 31)),
    ],
    schema="a long, b date",
)
df1 = df1.withColumn("WEEKOFMONTH1-6", F.date_format(F.col("b"), "W"))
df1 = df1.withColumn("WEEKOFMONTH1-5", F.date_format(F.col("b"), "F"))
df1.show()
```

+---+----------+--------------+--------------+                                                                                                                        
|  a|         b|WEEKOFMONTH1-6|WEEKOFMONTH1-5|
+---+----------+--------------+--------------+
|  1|2014-03-07|             2|             1|
|  2|2014-03-08|             2|             2|
|  3|2014-03-30|             6|             5|
|  4|2014-03-31|             6|             5|
|  5|2015-03-07|             1|             1|
|  6|2015-03-08|             2|             2|
|  7|2015-03-30|             5|             5|
|  8|2015-03-31|             5|             5|
+---+----------+--------------+--------------+
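For reference, the 1-to-5 column above (the F pattern) matches simple ceiling arithmetic on the day of month, since it counts how many times that weekday has occurred in the month so far. A minimal check of that observation (the helper name is ours, not a Spark API):

```python
import math
from datetime import date

def weekday_occurrence(d: date) -> int:
    # How many times this weekday has occurred in the month up to and
    # including d; reduces to ceil(day_of_month / 7).
    return math.ceil(d.day / 7)

# Matches the WEEKOFMONTH1-5 column in the table above, e.g.
# 2014-03-07 -> 1, 2014-03-08 -> 2, 2014-03-30 -> 5.
```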


With spark.sql.legacy.timeParserPolicy set to its Spark 3 default of EXCEPTION, this now throws an error:

Caused by: java.lang.IllegalArgumentException: All week-based patterns are unsupported since Spark 3.0, detected: W, Please use the SQL function EXTRACT instead

However, the EXTRACT function offers no field that returns the 1-to-6 week of month.

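For context on why EXTRACT does not help here: its WEEK field follows ISO 8601 week-of-year numbering (the same scheme Python's isocalendar implements), so its values cannot be remapped to a 1-to-6 week of month. A minimal illustration, assuming that ISO behavior:

```python
from datetime import date

# ISO week numbering counts Monday-start weeks within the *year*,
# so 2014-03-07 falls in week 10 of the year, not week 2 of March.
print(date(2014, 3, 7).isocalendar()[1])
```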
The Spark 3 documentation states that Spark defines its own datetime patterns, listed at [https://spark.apache.org/docs/3.2.1/sql-ref-datetime-pattern.html], and that they are implemented via DateTimeFormatter under the hood:
[https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html]

The DateTimeFormatter reference lists both letters for week of month:
 W week-of-month number 4
 F week-of-month number 3

However, only F is implemented in the Spark 3 datetime pattern reference.

Is there another way to compute this 1-to-6 week of month using only built-in functions in Spark 3? Currently we have to set spark.sql.legacy.timeParserPolicy to LEGACY in order to run this.

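One possible workaround, sketched below under the assumption that the Spark 2 "W" value was the Calendar-style week of month (weeks starting on Sunday, week 1 beginning on the 1st of the month): the same number can be derived with plain date arithmetic, so a Spark 3 column expression built from dayofmonth, dayofweek, and trunc should reproduce it without the LEGACY policy. The function name week_of_month is ours, not a Spark API.

```python
from datetime import date

def week_of_month(d: date) -> int:
    """Week of month 1..6, weeks starting on Sunday, week 1 beginning on
    the 1st of the month (matching the Spark 2 date_format 'W' output)."""
    first = date(d.year, d.month, 1)
    sunday_based = (first.weekday() + 1) % 7  # Sun=0 .. Sat=6
    return (d.day - 1 + sunday_based) // 7 + 1

# The same arithmetic as a Spark 3 column sketch (an assumption, untested
# here against every locale setting):
# F.floor((F.dayofmonth("b") - 1 + (F.dayofweek(F.trunc("b", "MM")) - 1)) / 7) + 1
```

Checked against the table above: 2014-03-07 gives 2, 2014-03-30 gives 6, 2015-03-07 gives 1, and 2015-03-30 gives 5.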
Thank you,
 
Kevin



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org