You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by ma...@apache.org on 2021/08/23 10:08:19 UTC

[spark] branch master updated: [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing

This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 9f595c4  [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
9f595c4 is described below

commit 9f595c4ce34728f5d8f943eadea8d85a548b2d41
Author: Max Gekk <ma...@gmail.com>
AuthorDate: Mon Aug 23 13:07:37 2021 +0300

    [SPARK-36418][SPARK-36536][SQL][DOCS][FOLLOWUP] Update the SQL migration guide about using `CAST` in datetime parsing
    
    ### What changes were proposed in this pull request?
    In the PR, I propose the update the SQL migration guide about the changes introduced by the PRs https://github.com/apache/spark/pull/33709 and https://github.com/apache/spark/pull/33769.
    
    <img width="1011" alt="Screenshot 2021-08-23 at 11 40 35" src="https://user-images.githubusercontent.com/1580697/130419710-640f20b3-6a38-4eb1-a6d6-2e069dc5665c.png">
    
    ### Why are the changes needed?
    To inform users about the upcoming changes in parsing datetime strings. This should help users to migrate on the new release.
    
    ### Does this PR introduce _any_ user-facing change?
    No.
    
    ### How was this patch tested?
    By generating the doc, and checking by eyes:
    ```
    $ SKIP_API=1 SKIP_RDOC=1 SKIP_PYTHONDOC=1 SKIP_SCALADOC=1 bundle exec jekyll build
    ```
    
    Closes #33809 from MaxGekk/datetime-cast-migr-guide.
    
    Authored-by: Max Gekk <ma...@gmail.com>
    Signed-off-by: Max Gekk <ma...@gmail.com>
---
 docs/sql-migration-guide.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 7ad384f..47e7921 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -26,6 +26,26 @@ license: |
 
   - Since Spark 3.3, Spark turns a non-nullable schema into nullable for API `DataFrameReader.schema(schema: StructType).json(jsonDataset: Dataset[String])` and `DataFrameReader.schema(schema: StructType).csv(csvDataset: Dataset[String])` when the schema is specified by the user and contains non-nullable fields.
 
+  - Since Spark 3.3, when the date or timestamp pattern is not specified, Spark converts an input string to a date/timestamp using the `CAST` expression approach. The changes affect CSV/JSON datasources and parsing of partition values. In Spark 3.2 or earlier, when the date or timestamp pattern is not set, Spark uses the default patterns: `yyyy-MM-dd` for dates and `yyyy-MM-dd HH:mm:ss` for timestamps. After the changes, Spark still recognizes the pattern together with
+    
+    Date patterns:
+      * `[+-]yyyy*`
+      * `[+-]yyyy*-[m]m`
+      * `[+-]yyyy*-[m]m-[d]d`
+      * `[+-]yyyy*-[m]m-[d]d `
+      * `[+-]yyyy*-[m]m-[d]d *`
+      * `[+-]yyyy*-[m]m-[d]dT*`
+    
+    Timestamp patterns:
+      * `[+-]yyyy*`
+      * `[+-]yyyy*-[m]m`
+      * `[+-]yyyy*-[m]m-[d]d`
+      * `[+-]yyyy*-[m]m-[d]d `
+      * `[+-]yyyy*-[m]m-[d]d [h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+      * `[+-]yyyy*-[m]m-[d]dT[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+      * `[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+      * `T[h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]`
+
 ## Upgrading from Spark SQL 3.1 to 3.2
 
   - Since Spark 3.2, ADD FILE/JAR/ARCHIVE commands require each path to be enclosed by `"` or `'` if the path contains whitespaces.

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org