Posted to commits@spark.apache.org by we...@apache.org on 2020/05/31 13:05:59 UTC
[spark] branch master updated: [SPARK-31874][SQL] Use `FastDateFormat` as the legacy fractional formatter
This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 47dc332 [SPARK-31874][SQL] Use `FastDateFormat` as the legacy fractional formatter
47dc332 is described below
commit 47dc332258bec20c460f666de50d9a8c5c0fbc0a
Author: Max Gekk <ma...@gmail.com>
AuthorDate: Sun May 31 13:05:00 2020 +0000
[SPARK-31874][SQL] Use `FastDateFormat` as the legacy fractional formatter
### What changes were proposed in this pull request?
1. Replace `SimpleDateFormat` with `FastDateFormat` as the legacy formatter of `FractionTimestampFormatter`.
2. Optimize `LegacyFastTimestampFormatter` for `java.sql.Timestamp` values without a fractional part.
### Why are the changes needed?
1. By default, `HiveResult.hiveResultString` retrieves timestamp values as instances of `java.sql.Timestamp` and uses the legacy formatter `SimpleDateFormat` to convert them to strings. After the fix https://github.com/apache/spark/pull/28024, the fractional formatter and its companion legacy formatter `SimpleDateFormat` are created for every value. By switching from `LegacySimpleTimestampFormatter` to `LegacyFastTimestampFormatter`, we can utilize the internal cache of `F [...]
2. The second change in the method `def format(ts: Timestamp): String` of `LegacyFastTimestampFormatter` is needed to optimize the formatter for patterns without the fractional part and avoid conversions to microseconds.
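The fast path described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the Spark implementation: `SketchTimestampFormatter`, `formatWithFraction`, and `slowPathCalls` are hypothetical names, and `java.text.SimpleDateFormat` stands in for `FastDateFormat` only so that the example runs without extra dependencies.

```scala
import java.sql.Timestamp
import java.text.SimpleDateFormat
import java.time.ZoneId
import java.util.TimeZone

// Hypothetical stand-in for LegacyFastTimestampFormatter; not Spark code.
class SketchTimestampFormatter(zoneId: ZoneId) {
  // Created once and reused for every value.
  // Assumption: single-threaded use, because SimpleDateFormat is not thread-safe.
  private val secondsFormat: SimpleDateFormat = {
    val f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
    f.setTimeZone(TimeZone.getTimeZone(zoneId))
    f
  }

  // Counts how often the more expensive branch is taken (for illustration only).
  var slowPathCalls: Int = 0

  def format(ts: Timestamp): String = {
    if (ts.getNanos == 0) {
      // Fast path: no fractional part, so format the timestamp directly
      // and skip the conversion to microseconds.
      secondsFormat.format(ts)
    } else {
      slowPathCalls += 1
      formatWithFraction(ts)
    }
  }

  // Simplified slow path: append the microsecond fraction without trailing zeros.
  private def formatWithFraction(ts: Timestamp): String = {
    val micros = ts.getNanos / 1000
    val fraction = f"$micros%06d".reverse.dropWhile(_ == '0').reverse
    val base = secondsFormat.format(ts)
    if (fraction.isEmpty) base else s"$base.$fraction"
  }
}
```

In this sketch, a timestamp built with `Timestamp.valueOf("2020-05-31 13:05:59")` takes the first branch, while one with a `.123` suffix falls through to the fractional branch. The actual patch delegates the fast path to `fastDateFormat.format(ts)` and the slow path to `format(fromJavaTimestamp(ts))`.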
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
By existing tests in `TimestampFormatter`.
Closes #28678 from MaxGekk/fastdateformat-as-legacy-frac-formatter.
Authored-by: Max Gekk <ma...@gmail.com>
Signed-off-by: Wenchen Fan <we...@databricks.com>
---
.../org/apache/spark/sql/catalyst/util/TimestampFormatter.scala | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala
index 8428964..3e302e2 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala
@@ -121,6 +121,7 @@ class FractionTimestampFormatter(zoneId: ZoneId)
TimestampFormatter.defaultPattern,
zoneId,
TimestampFormatter.defaultLocale,
+ LegacyDateFormats.FAST_DATE_FORMAT,
needVarLengthSecondFraction = false) {
@transient
@@ -224,7 +225,11 @@ class LegacyFastTimestampFormatter(
}
override def format(ts: Timestamp): String = {
- format(fromJavaTimestamp(ts))
+ if (ts.getNanos == 0) {
+ fastDateFormat.format(ts)
+ } else {
+ format(fromJavaTimestamp(ts))
+ }
}
override def format(instant: Instant): String = {