Posted to issues@spark.apache.org by "Jia Fan (Jira)" <ji...@apache.org> on 2023/10/06 04:36:00 UTC

[jira] [Updated] (SPARK-45433) CSV/JSON schema inference when timestamps do not match specified timestampFormat with only one row on each partition report error

     [ https://issues.apache.org/jira/browse/SPARK-45433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jia Fan updated SPARK-45433:
----------------------------
    Description: 
CSV/JSON schema inference reports an error when a timestamp does not match the specified timestampFormat and there is `only one row on each partition`.
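{code:scala}
// Example (e.g. in spark-shell); toDS() requires spark.implicits._
import spark.implicits._
// The input has fractional seconds (.138) that the timestampFormat does not cover
val csv = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss")
  .option("inferSchema", true).csv(Seq("2884-06-24T02:45:51.138").toDS())
csv.show() {code}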
{code:java}
//error
Caused by: java.time.format.DateTimeParseException: Text '2884-06-24T02:45:51.138' could not be parsed, unparsed text found at index 19 {code}
This bug affects 3.3/3.4/3.5. It is a different bug from https://issues.apache.org/jira/browse/SPARK-45424 , although it produces the same error message.
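
For context, the parse failure itself can be reproduced outside Spark with plain java.time. The sketch below is illustrative only: TimestampFormatCheck and parseErrorIndex are hypothetical names, not Spark code, and it does not exercise the one-row-per-partition inference path. It only shows that the pattern stops consuming input at index 19, where the fractional seconds begin, and that a pattern including the fraction (assumed here to be .SSS) accepts the same value.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper for illustration; not part of Spark.
class TimestampFormatCheck {

    // Returns -1 if `text` is fully consumed by `pattern`, otherwise the
    // error index carried by the DateTimeParseException.
    static int parseErrorIndex(String pattern, String text) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        try {
            formatter.parse(text);
            return -1;
        } catch (DateTimeParseException e) {
            return e.getErrorIndex();
        }
    }

    public static void main(String[] args) {
        String value = "2884-06-24T02:45:51.138";
        // Pattern from the report: ".138" is left unparsed, so this prints 19.
        System.out.println(parseErrorIndex("yyyy-MM-dd'T'HH:mm:ss", value));
        // Assumed alternative pattern that covers the milliseconds: prints -1.
        System.out.println(parseErrorIndex("yyyy-MM-dd'T'HH:mm:ss.SSS", value));
    }
}
```

The index 19 matches the one in the error message above, so the exception text alone does not distinguish this issue from SPARK-45424.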

> CSV/JSON schema inference when timestamps do not match specified timestampFormat with only one row on each partition report error
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-45433
>                 URL: https://issues.apache.org/jira/browse/SPARK-45433
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0, 3.4.0, 3.5.0
>            Reporter: Jia Fan
>            Priority: Major


