Posted to commits@spark.apache.org by ma...@apache.org on 2021/08/23 10:08:19 UTC
[spark] branch master updated:
[SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration
guide about using `CAST` in datetime parsing
This is an automated email from the ASF dual-hosted git repository.
maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 9f595c4 [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
9f595c4 is described below
commit 9f595c4ce34728f5d8f943eadea8d85a548b2d41
Author: Max Gekk <ma...@gmail.com>
AuthorDate: Mon Aug 23 13:07:37 2021 +0300
[SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
### What changes were proposed in this pull request?
In this PR, I propose to update the SQL migration guide about the changes introduced by the PRs https://github.com/apache/spark/pull/33709 and https://github.com/apache/spark/pull/33769.
<img width="1011" alt="Screenshot 2021-08-23 at 11 40 35" src="https://user-images.githubusercontent.com/1580697/130419710-640f20b3-6a38-4eb1-a6d6-2e069dc5665c.png">
### Why are the changes needed?
To inform users about the upcoming changes in parsing datetime strings. This should help users migrate to the new release.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
By generating the docs and checking the output visually:
```
$ SKIP_API=1 SKIP_RDOC=1 SKIP_PYTHONDOC=1 SKIP_SCALADOC=1 bundle exec jekyll build
```
Closes #33809 from MaxGekk/datetime-cast-migr-guide.
Authored-by: Max Gekk <ma...@gmail.com>
Signed-off-by: Max Gekk <ma...@gmail.com>
---
docs/sql-migration-guide.md | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 7ad384f..47e7921 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -26,6 +26,26 @@ license: |
- Since Spark 3.3, Spark turns a non-nullable schema into nullable for API `DataFrameReader.schema(schema: StructType).json(jsonDataset: Dataset[String])` and `DataFrameReader.schema(schema: StructType).csv(csvDataset: Dataset[String])` when the schema is specified by the user and contains non-nullable fields.
+ - Since Spark 3.3, when the date or timestamp pattern is not specified, Spark converts an input string to a date/timestamp using the `CAST` expression approach. The changes affect CSV/JSON datasources and parsing of partition values. In Spark 3.2 or earlier, when the date or timestamp pattern is not set, Spark uses the default patterns: `yyyy-MM-dd` for dates and `yyyy-MM-dd HH:mm:ss` for timestamps. After the changes, Spark still recognizes the default patterns together with the following patterns:
+
+ Date patterns:
+ * `[+-]yyyy*`
+ * `[+-]yyyy*-[m]m`
+ * `[+-]yyyy*-[m]m-[d]d`
+ * `[+-]yyyy*-[m]m-[d]d `
+ * `[+-]yyyy*-[m]m-[d]d *`
+ * `[+-]yyyy*-[m]m-[d]dT*`
+
+ Timestamp patterns:
+ * `[+-]yyyy*`
+ * `[+-]yyyy*-[m]m`
+ * `[+-]yyyy*-[m]m-[d]d`
+ * `[+-]yyyy*-[m]m-[d]d `
+ * `[+-]yyyy*-[m]m-[d]d [h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+ * `[+-]yyyy*-[m]m-[d]dT[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+ * `[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+ * `T[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+
## Upgrading from Spark SQL 3.1 to 3.2
- Since Spark 3.2, ADD FILE/JAR/ARCHIVE commands require each path to be enclosed by `"` or `'` if the path contains whitespaces.
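The date patterns added to the guide above can be illustrated with a rough regular-expression sketch. This is an illustrative approximation in Python, not Spark's actual `CAST` parser: `[+-]yyyy*` is read as an optional sign followed by a year of at least four digits, `[m]m`/`[d]d` as one- or two-digit month and day fields, and the trailing ` `, ` *`, `T*` variants as an optional space- or `T`-separated suffix.

```python
import re

# Rough approximation of the date patterns listed in the migration guide,
# e.g. `[+-]yyyy*-[m]m-[d]d` and `[+-]yyyy*-[m]m-[d]dT*`.
# Illustrative only; Spark's real CAST parsing is stricter (it validates
# field ranges such as month 1-12).
DATE_RE = re.compile(
    r"^[+-]?\d{4,}"   # [+-]yyyy*  : optional sign, year of 4+ digits
    r"(-\d{1,2}"      # -[m]m      : optional 1-2 digit month
    r"(-\d{1,2}"      # -[d]d      : optional 1-2 digit day
    r"([ T].*)?"      # trailing ' ', ' *', or 'T*' suffix
    r")?)?$"
)

for s in ["2021-08-23", "2021-8-3", "+10000-01-01",
          "2021-08-23T12:00:00", "08-23"]:
    print(s, bool(DATE_RE.match(s)))
```

The first four strings match (note the variable-width year and single-digit month/day), while `08-23` does not, since the year field requires at least four digits.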