You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by MaxGekk <gi...@git.apache.org> on 2018/11/26 23:02:32 UTC
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/23150
[SPARK-26178][SQL] Use java.time API for parsing timestamps and dates from CSV
## What changes were proposed in this pull request?
In the PR, I propose to use **java.time API** for parsing timestamps and dates from CSV content with microseconds precision. The SQL config `spark.sql.legacy.timeParser.enabled` allow to switch back to previous behaviour with using `java.text.SimpleDateFormat`/`FastDateFormat` for parsing/generating timestamps/dates.
## How was this patch tested?
It was tested by `UnivocityParserSuite`, `CsvExpressionsSuite`, `CsvFunctionsSuite` and `CsvSuite`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 time-parser
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23150.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23150
----
commit 74a76c2f78ad139993f3bbe0f2ff8f1c81c3bd84
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T11:37:55Z
New and legacy time parser
commit 63cf6112085029c52e4aee6f9bb2e6b84ce18a96
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T12:00:10Z
Add config spark.sql.legacy.timeParser.enabled
commit 2a2ab83a5ecb251ce81e7f12a8c0d3067f88b2d5
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T13:24:07Z
Fallback legacy parser
commit 667bf9f65a90ac69b8cbbad77a17e21f9dd18733
Author: Maxim Gekk <ma...@...>
Date: 2018-11-24T15:54:19Z
something
commit 227a7bdc53bdd022e9c365b410810c58f56e8bea
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T12:15:15Z
Using instances
commit 73ee56088bf4d2856c454a7bbd4171b61cfe4614
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T13:52:02Z
Added generator
commit f35f6e13270eb994ac97627da79497673b4fe686
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T18:03:17Z
Refactoring of TimeFormatter
commit 1c09b58e6fe3e0fd565c852dcb73dc012fa56819
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T18:06:22Z
Renaming to DateTimeFormatter
commit 7b213d5b2ae404c87f090da622a78d3d19fee6a9
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T18:32:54Z
Added DateFormatter
commit 242ba474dcf112b48bd286811daed86a66366c39
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T19:58:08Z
Default values in parsing
commit db48ee6918eef06e19c3bdf64e3c44f4541cc294
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T21:09:08Z
Parse as date type because format for timestamp is not not matched to values
commit e18841b38050ac411a507a2a2643584f2c8739ce
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T21:53:11Z
Fix tests
commit 8db023834b680f336ff5a0e08253ba2cb3b6e3b7
Author: Maxim Gekk <ma...@...>
Date: 2018-11-25T23:09:20Z
CSVSuite passed
commit 0b9ed92a456d60db0934340f37e0bd428b2f6a42
Author: Maxim Gekk <ma...@...>
Date: 2018-11-26T22:00:10Z
Fix imports
commit 799ebb3432dec7fe1e1099d68a3f1c09e714aa8e
Author: Maxim Gekk <ma...@...>
Date: 2018-11-26T22:03:19Z
Revert test back
commit 5a223919439e2d22814b92c0e1e572b3c318566f
Author: Maxim Gekk <ma...@...>
Date: 2018-11-26T22:17:11Z
Set timeZone
commit f287b77d94de9e9f466c0ff2c2370f22a46b48f7
Author: Maxim Gekk <ma...@...>
Date: 2018-11-26T22:44:42Z
Removing default for micros because it causes conflicts in parsing
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r238075585
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1107,7 +1111,7 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
test("SPARK-18699 put malformed records in a `columnNameOfCorruptRecord` field") {
Seq(false, true).foreach { multiLine =>
- val schema = new StructType().add("a", IntegerType).add("b", TimestampType)
+ val schema = new StructType().add("a", IntegerType).add("b", DateType)
--- End diff --
I changed the type because supposed to valid date `"1983-08-04"` cannot be parsed with default timestamp pattern.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5574/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
@srowen @HyukjinKwon @viirya @mgaido91 May I ask you to look at the PR. The changes are related to another PR which you have reviewed already.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/23150
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99556/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99510/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/23150
Looks like that kind of fallback logic was there for compatibility with Spark 1.x and 2.0; see some comments about 'backwards compatibility' in for example https://github.com/apache/spark/blame/af8250e12490c77f02587275eff9aa225e5dcdba/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala
I think we can remove this fallback and document it as a behavior change.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99507/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5399/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
> Are there any other behavior changes with the new code, besides being able to parse microseconds?
The main one is new parser doesn't have the fallback to `DateTimeUtils.stringToTime` if it cannot parse its input. Though I am not sure about this fallback ... if an user wants to parse input which doesn't fit to specified pattern (like `"yyyy-MM-dd'T'HH:mm:ss.SSSXXX"`) why should guess and try another pattern instead of forcing him/her to set appropriate one. I could guess this fallback was added for backward compatibility to previous versions. If so, should we still keep compatibility to such version?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
jenkins, retest this, please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5577/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99510 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99510/testReport)** for PR 23150 at commit [`f3f46c7`](https://github.com/apache/spark/commit/f3f46c75a225021f7b7c743dd977d8d195a508fc).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5633/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
@HyukjinKwon May I ask you to look at this PR.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99296/testReport)** for PR 23150 at commit [`f287b77`](https://github.com/apache/spark/commit/f287b77d94de9e9f466c0ff2c2370f22a46b48f7).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99296/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99296/testReport)** for PR 23150 at commit [`f287b77`](https://github.com/apache/spark/commit/f287b77d94de9e9f466c0ff2c2370f22a46b48f7).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #4449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4449/testReport)** for PR 23150 at commit [`60c0974`](https://github.com/apache/spark/commit/60c0974261c947c0838457c40f4fe0e64d17ca15).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99316/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r237451758
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala ---
@@ -23,10 +23,16 @@ import scala.util.control.Exception.allCatch
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.analysis.TypeCoercion
-import org.apache.spark.sql.catalyst.util.DateTimeUtils
+import org.apache.spark.sql.catalyst.util.DateTimeFormatter
import org.apache.spark.sql.types._
-object CSVInferSchema {
+class CSVInferSchema(val options: CSVOptions) extends Serializable {
--- End diff --
since we get the `CSVOptions` in the constructor, shall we remove it as a parameter of the several methods? it is pretty confusing which one is used right now...
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
jenkins, retest this, please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99575/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
I have to correct timestamp/date pattern in a few test to follow ISO 8601 (see [Patterns for Formatting and Parsing](https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html) like `hh` -> `HH` and `mm` -> `MM`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99447 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99447/testReport)** for PR 23150 at commit [`4d6c86b`](https://github.com/apache/spark/commit/4d6c86b4d82743b6ca9f7389579646324bca2727).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99608 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99608/testReport)** for PR 23150 at commit [`60c0974`](https://github.com/apache/spark/commit/60c0974261c947c0838457c40f4fe0e64d17ca15).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r238075711
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala ---
@@ -86,62 +85,74 @@ class UnivocityParserSuite extends SparkFunSuite with SQLHelper {
// null.
Seq(true, false).foreach { b =>
val options = new CSVOptions(Map("nullValue" -> "null"), false, "GMT")
- val converter =
- parser.makeConverter("_1", StringType, nullable = b, options = options)
+ val parser = new UnivocityParser(StructType(Seq.empty), options)
+ val converter = parser.makeConverter("_1", StringType, nullable = b)
assert(converter.apply("") == UTF8String.fromString(""))
}
}
test("Throws exception for empty string with non null type") {
- val options = new CSVOptions(Map.empty[String, String], false, "GMT")
+ val options = new CSVOptions(Map.empty[String, String], false, "GMT")
+ val parser = new UnivocityParser(StructType(Seq.empty), options)
val exception = intercept[RuntimeException]{
- parser.makeConverter("_1", IntegerType, nullable = false, options = options).apply("")
+ parser.makeConverter("_1", IntegerType, nullable = false).apply("")
}
assert(exception.getMessage.contains("null value found but field _1 is not nullable."))
}
test("Types are cast correctly") {
val options = new CSVOptions(Map.empty[String, String], false, "GMT")
- assert(parser.makeConverter("_1", ByteType, options = options).apply("10") == 10)
- assert(parser.makeConverter("_1", ShortType, options = options).apply("10") == 10)
- assert(parser.makeConverter("_1", IntegerType, options = options).apply("10") == 10)
- assert(parser.makeConverter("_1", LongType, options = options).apply("10") == 10)
- assert(parser.makeConverter("_1", FloatType, options = options).apply("1.00") == 1.0)
- assert(parser.makeConverter("_1", DoubleType, options = options).apply("1.00") == 1.0)
- assert(parser.makeConverter("_1", BooleanType, options = options).apply("true") == true)
-
- val timestampsOptions =
+ var parser = new UnivocityParser(StructType(Seq.empty), options)
+ assert(parser.makeConverter("_1", ByteType).apply("10") == 10)
+ assert(parser.makeConverter("_1", ShortType).apply("10") == 10)
+ assert(parser.makeConverter("_1", IntegerType).apply("10") == 10)
+ assert(parser.makeConverter("_1", LongType).apply("10") == 10)
+ assert(parser.makeConverter("_1", FloatType).apply("1.00") == 1.0)
+ assert(parser.makeConverter("_1", DoubleType).apply("1.00") == 1.0)
+ assert(parser.makeConverter("_1", BooleanType).apply("true") == true)
+
+ var timestampsOptions =
new CSVOptions(Map("timestampFormat" -> "dd/MM/yyyy hh:mm"), false, "GMT")
+ parser = new UnivocityParser(StructType(Seq.empty), timestampsOptions)
val customTimestamp = "31/01/2015 00:00"
- val expectedTime = timestampsOptions.timestampFormat.parse(customTimestamp).getTime
- val castedTimestamp =
- parser.makeConverter("_1", TimestampType, nullable = true, options = timestampsOptions)
+ var format = FastDateFormat.getInstance(
+ timestampsOptions.timestampFormat, timestampsOptions.timeZone, timestampsOptions.locale)
+ val expectedTime = format.parse(customTimestamp).getTime
+ val castedTimestamp = parser.makeConverter("_1", TimestampType, nullable = true)
.apply(customTimestamp)
assert(castedTimestamp == expectedTime * 1000L)
val customDate = "31/01/2015"
val dateOptions = new CSVOptions(Map("dateFormat" -> "dd/MM/yyyy"), false, "GMT")
- val expectedDate = dateOptions.dateFormat.parse(customDate).getTime
- val castedDate =
- parser.makeConverter("_1", DateType, nullable = true, options = dateOptions)
- .apply(customTimestamp)
- assert(castedDate == DateTimeUtils.millisToDays(expectedDate))
+ parser = new UnivocityParser(StructType(Seq.empty), dateOptions)
+ format = FastDateFormat.getInstance(
+ dateOptions.dateFormat, dateOptions.timeZone, dateOptions.locale)
+ val expectedDate = format.parse(customDate).getTime
+ val castedDate = parser.makeConverter("_1", DateType, nullable = true)
+ .apply(customDate)
+ assert(castedDate == DateTimeUtils.millisToDays(expectedDate, TimeZone.getTimeZone("GMT")))
val timestamp = "2015-01-01 00:00:00"
- assert(parser.makeConverter("_1", TimestampType, options = options).apply(timestamp) ==
- DateTimeUtils.stringToTime(timestamp).getTime * 1000L)
- assert(parser.makeConverter("_1", DateType, options = options).apply("2015-01-01") ==
- DateTimeUtils.millisToDays(DateTimeUtils.stringToTime("2015-01-01").getTime))
+ timestampsOptions = new CSVOptions(Map(
+ "timestampFormat" -> "yyyy-MM-dd HH:mm:ss",
+ "dateFormat" -> "yyyy-MM-dd"), false, "UTC")
+ parser = new UnivocityParser(StructType(Seq.empty), timestampsOptions)
+ val expected = 1420070400 * DateTimeUtils.MICROS_PER_SECOND
--- End diff --
I set number of seconds since epoch in UTC because `DateTimeUtils.stringToTime(timestamp).getTime` depends on current local time zone in jvm.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Can one of the admins verify this patch?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/23150
Merged to master
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99575/testReport)** for PR 23150 at commit [`f8097b4`](https://github.com/apache/spark/commit/f8097b49acf75661de79a06ec0a55f8a66516c3a).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99608 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99608/testReport)** for PR 23150 at commit [`60c0974`](https://github.com/apache/spark/commit/60c0974261c947c0838457c40f4fe0e64d17ca15).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r238195206
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1618,6 +1618,13 @@ object SQLConf {
"a SparkConf entry.")
.booleanConf
.createWithDefault(true)
+
+ val LEGACY_TIME_PARSER_ENABLED = buildConf("spark.sql.legacy.timeParser.enabled")
+ .doc("When set to true, java.text.SimpleDateFormat is using for formatting and parsing " +
+ " dates/timestamps in a locale-sensitive manner. When set to false, classes from " +
+ "java.time.* packages are using for the same purpose.")
--- End diff --
ditto
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99316 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99316/testReport)** for PR 23150 at commit [`52074f7`](https://github.com/apache/spark/commit/52074f77a7ffbf26444e67077a353c0ebd07d832).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r238195153
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -1618,6 +1618,13 @@ object SQLConf {
"a SparkConf entry.")
.booleanConf
.createWithDefault(true)
+
+ val LEGACY_TIME_PARSER_ENABLED = buildConf("spark.sql.legacy.timeParser.enabled")
+ .doc("When set to true, java.text.SimpleDateFormat is using for formatting and parsing " +
--- End diff --
nit `using` -> `used`
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/23150
The code looks good at a glance. So the flag lets people select the old behavior; that keeps it pretty safe. Are there any other behavior changes with the new code, besides being able to parse microseconds?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99510 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99510/testReport)** for PR 23150 at commit [`f3f46c7`](https://github.com/apache/spark/commit/f3f46c75a225021f7b7c743dd977d8d195a508fc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99575 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99575/testReport)** for PR 23150 at commit [`f8097b4`](https://github.com/apache/spark/commit/f8097b49acf75661de79a06ec0a55f8a66516c3a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5664/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5407/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99447/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r238075664
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -622,10 +623,11 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
val options = Map(
"header" -> "true",
"inferSchema" -> "false",
- "dateFormat" -> "dd/MM/yyyy hh:mm")
+ "dateFormat" -> "dd/MM/yyyy HH:mm")
--- End diff --
According to iso 8601:
```
h clock-hour-of-am-pm (1-12) number 12
H hour-of-day (0-23) number 0
```
but real data is not in the allowed range:
```
date
26/08/2015 18:00
27/10/2014 18:30
28/01/2016 20:00
```
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5379/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99327/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
> ... if it's ready from your side @MaxGekk
@srowen I just think how I could reduce number of changes in tests. In some cases, test behavior depends on current time zone on my laptop. Without changes of this PR, such tests passed because expected and produced values are results of the same functions. I would try to fix that in a separate PR.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99327 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99327/testReport)** for PR 23150 at commit [`52074f7`](https://github.com/apache/spark/commit/52074f77a7ffbf26444e67077a353c0ebd07d832).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99507/testReport)** for PR 23150 at commit [`f3f46c7`](https://github.com/apache/spark/commit/f3f46c75a225021f7b7c743dd977d8d195a508fc).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5519/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
> they pass right? is there another test you were unable to add?
For now everything has been passed. I run all test localy on different timezones (set via jvm parameter `-Duser.timezone`).
I updated the migration guide since I enabled new parser by default.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
Some tests didn't pass on new changes till I set time zone explicitly. The tests use the same functions for checking correctness as the code that is supposed to test. I think need more investigation here, I guess old (current) implementation doesn't take timezones into account in some cases.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99507 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99507/testReport)** for PR 23150 at commit [`f3f46c7`](https://github.com/apache/spark/commit/f3f46c75a225021f7b7c743dd977d8d195a508fc).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99316 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99316/testReport)** for PR 23150 at commit [`52074f7`](https://github.com/apache/spark/commit/52074f77a7ffbf26444e67077a353c0ebd07d832).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by MaxGekk <gi...@git.apache.org>.
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23150
@srowen I think this PR is ready.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5615/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99556/testReport)** for PR 23150 at commit [`00509d3`](https://github.com/apache/spark/commit/00509d3a94e0679505cd9fde78e38a3a15d11bde).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99327 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99327/testReport)** for PR 23150 at commit [`52074f7`](https://github.com/apache/spark/commit/52074f77a7ffbf26444e67077a353c0ebd07d832).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/23150
It makes sense that parsing depends on a timezone, though that's set as an option in the parser typically. The tests should generally test "GMT" for this reason. If there's a default code path for when no timezone is specified, then I'd use the test harness mechanisms for temporarily changing the system timezone to GMT (which then automatically changes back).
Your changes look OK here and they pass right? is there another test you were unable to add?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99556/testReport)** for PR 23150 at commit [`00509d3`](https://github.com/apache/spark/commit/00509d3a94e0679505cd9fde78e38a3a15d11bde).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #4449 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4449/testReport)** for PR 23150 at commit [`60c0974`](https://github.com/apache/spark/commit/60c0974261c947c0838457c40f4fe0e64d17ca15).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99608/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23150
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23150
**[Test build #99447 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99447/testReport)** for PR 23150 at commit [`4d6c86b`](https://github.com/apache/spark/commit/4d6c86b4d82743b6ca9f7389579646324bca2727).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org