Posted to issues@spark.apache.org by "Liu Neng (Jira)" <ji...@apache.org> on 2020/12/04 03:47:00 UTC
[jira] [Comment Edited] (SPARK-33632) to_date doesn't behave as documented
[ https://issues.apache.org/jira/browse/SPARK-33632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17243681#comment-17243681 ]
Liu Neng edited comment on SPARK-33632 at 12/4/20, 3:46 AM:
------------------------------------------------------------
This is not an issue; you may have misunderstood the docs.
You should use the pattern M/d/yy (month is uppercase 'M'); the year's parse mode is determined by the count of the letter 'y'.
Below is the relevant source code from DateTimeFormatterBuilder:
!image-2020-12-04-11-45-10-379.png!
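The difference can be reproduced with plain java.time, which Spark 3.x's date parser is built on: a sketch showing that "yy" is a reduced year field (two digits interpreted relative to base 2000), while a single "y" parses the digits literally.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// "yy" is a reduced year field: two digits are read relative to base 2000
val twoDigitYear = DateTimeFormatter.ofPattern("M/d/yy")
val parsedReduced = LocalDate.parse("10/31/20", twoDigitYear) // 2020-10-31

// a single "y" parses the digits literally, so "20" becomes year 20
val literalYear = DateTimeFormatter.ofPattern("M/d/y")
val parsedLiteral = LocalDate.parse("10/31/20", literalYear) // 0020-10-31

println(parsedReduced)
println(parsedLiteral)
```

This matches the reporter's observed output: with one 'y' the input "20" becomes year 0020, not 2020.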
> to_date doesn't behave as documented
> ------------------------------------
>
> Key: SPARK-33632
> URL: https://issues.apache.org/jira/browse/SPARK-33632
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.0.1
> Reporter: Frank Oosterhuis
> Priority: Major
> Attachments: image-2020-12-04-11-45-10-379.png
>
>
> I'm trying to use to_date on a string formatted as "10/31/20".
> Expected output is "2020-10-31".
> Actual output is "0020-01-31".
> The [documentation|https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html] suggests 2020 or 20 as input for "y".
> Example below. Expected behaviour is included in the udf.
> {code:scala}
> import java.sql.Date
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.functions.{to_date, udf}
>
> object ToDate {
>   val toDate = udf((date: String) => {
>     val split = date.split("/")
>     val month = "%02d".format(split(0).toInt)
>     val day = "%02d".format(split(1).toInt)
>     val year = split(2).toInt + 2000
>     Date.valueOf(s"${year}-${month}-${day}")
>   })
>
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession.builder().master("local[2]").getOrCreate()
>     spark.sparkContext.setLogLevel("ERROR")
>     import spark.implicits._
>     Seq("1/1/20", "10/31/20")
>       .toDF("raw")
>       .withColumn("to_date", to_date($"raw", "m/d/y"))
>       .withColumn("udf", toDate($"raw"))
>       .show
>   }
> }
> {code}
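For completeness, the reporter's example behaves as documented once the pattern is corrected: a minimal sketch, assuming a local Spark 3.x session, where month is uppercase 'M' and the two-digit year uses "yy".

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.to_date

val spark = SparkSession.builder().master("local[2]").getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
import spark.implicits._

// With "M/d/yy" the reduced two-digit year resolves against base 2000,
// so "10/31/20" parses to 2020-10-31 instead of 0020-01-31.
val fixed = Seq("1/1/20", "10/31/20")
  .toDF("raw")
  .withColumn("to_date", to_date($"raw", "M/d/yy"))

fixed.show()
```

With this pattern the built-in to_date matches the workaround udf, which can then be dropped.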
--
This message was sent by Atlassian Jira
(v8.3.4#803005)