Posted to reviews@spark.apache.org by "mingkangli-db (via GitHub)" <gi...@apache.org> on 2023/07/04 05:52:51 UTC

[GitHub] [spark] mingkangli-db opened a new pull request, #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

mingkangli-db opened a new pull request, #41843:
URL: https://github.com/apache/spark/pull/41843

   
   ### What changes were proposed in this pull request?
   
   This PR fixes an overflow that occurs when reading an "infinity" timestamp from PostgreSQL. It does so by adding a new function, convertJavaTimestampToTimestamp, to the JDBCDialect API and overriding it in PostgresDialect.scala to clamp the special "infinity" timestamps to Long.MaxValue and Long.MinValue.
   
   
   
   ### Why are the changes needed?
   
   The pre-existing default timestamp conversion can overflow on these special values (i.e., the executor crashes when you select a PostgreSQL column that contains infinity timestamps). The new function lets dialects intercept such values, enabling more robust timestamp conversion across JDBC-based connectors.
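   
   As a minimal, self-contained sketch of the failure mode (the sentinel constant below is the PostgreSQL JDBC driver value quoted in the review thread; Spark's DateTimeUtils.millisToMicros performs the same Math.multiplyExact step internally):
   
   ```scala
   import java.sql.Timestamp
   
   // Sentinel the PostgreSQL JDBC driver uses for 'infinity' (millis since epoch).
   val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
   
   val t = new Timestamp(POSTGRESQL_DATE_POSITIVE_INFINITY)
   
   // Converting java.sql.Timestamp millis to Catalyst micros multiplies by 1000;
   // for the sentinel this exceeds Long.MaxValue and throws
   // java.lang.ArithmeticException: long overflow.
   val micros = Math.multiplyExact(t.getTime, 1000L)
   ```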
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. A new function, convertJavaTimestampToTimestamp, is added to the JDBCDialect API so that JDBC dialects can override the default conversion of java.sql.Timestamp values.
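   
   As a hypothetical sketch of how a dialect could use the hook (the dialect name, URL prefix, and clamp bounds are invented for illustration; the Timestamp => Timestamp signature matches the final revision discussed later in this thread):
   
   ```scala
   import java.sql.Timestamp
   import org.apache.spark.sql.jdbc.JdbcDialect
   
   // Hypothetical dialect: everything except convertJavaTimestampToTimestamp
   // is made up for this example.
   case object ExampleDialect extends JdbcDialect {
     override def canHandle(url: String): Boolean = url.startsWith("jdbc:example")
   
     // Clamp out-of-range values so the downstream millis-to-micros
     // conversion cannot overflow.
     override def convertJavaTimestampToTimestamp(t: Timestamp): Timestamp = {
       val minMillis = -62135596800000L // 0001-01-01 00:00:00 UTC, in millis
       val maxMillis = 253402300799000L // 9999-12-31 23:59:59 UTC, in millis
       new Timestamp(math.max(minMillis, math.min(maxMillis, t.getTime)))
     }
   }
   ```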
   
   ### How was this patch tested?
   
   An integration test was added in PostgresIntegrationSuite.scala to verify that Spark can handle the +infinity and -infinity timestamps in PostgreSQL.
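   
   For reference, a hedged sketch of the test's shape (the table name and the read call match the suite; the DDL comment and assertions are illustrative, and jdbcUrl/sqlContext come from DockerJDBCIntegrationSuite):
   
   ```scala
   // dataPreparation elsewhere in the suite would create and populate the table,
   // e.g. CREATE TABLE infinity_timestamp (timestamp_column TIMESTAMP);
   //      INSERT INTO infinity_timestamp VALUES ('infinity'), ('-infinity');
   test("SPARK-44280: infinity timestamp test") {
     val df = sqlContext.read.jdbc(jdbcUrl, "infinity_timestamp", new Properties)
     val rows = df.collect()
     assert(rows.length == 2)
     // With the dialect override, +infinity/-infinity are clamped to values near
     // the maximum/minimum Spark timestamps instead of raising a long overflow.
   }
   ```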
   




Re: [PR] [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API [spark]

Posted by "yaooqinn (via GitHub)" <gi...@apache.org>.
yaooqinn commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1530674531


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +281,34 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * java.sql timestamps are measured with millisecond accuracy (from Long.MinValue
+   * milliseconds to Long.MaxValue milliseconds), while Spark timestamps are measured
+   * at microseconds accuracy. For the "infinity values" in PostgreSQL (represented by
+   * big constants), we need clamp them to avoid overflow. If it is not one of the infinity
+   * values, fall back to default behavior.
+   */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Timestamp = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC)
+    val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC)
+
+    val time = t.getTime

Review Comment:
   Nice catch! We shall fix this...





[GitHub] [spark] mingkangli-db commented on pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1619546065

   cc @cloud-fan @HyukjinKwon 




[GitHub] [spark] mingkangli-db commented on pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1619543727

   rerun checks





[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1254927001


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime
+
+    if (time == POSTGRESQL_DATE_POSITIVE_INFINITY ||
+      time == POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY) {
+      Long.MaxValue

Review Comment:
   and what does the pgsql JDBC client return? Same as here?





[GitHub] [spark] sadikovi commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "sadikovi (via GitHub)" <gi...@apache.org>.
sadikovi commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1255078843


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   To simplify, Long.MaxValue should be the min value in microseconds to not overflow, is that right @mingkangli-db?





Re: [PR] [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API [spark]

Posted by "yruslan (via GitHub)" <gi...@apache.org>.
yruslan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1530575858


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +281,34 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * java.sql timestamps are measured with millisecond accuracy (from Long.MinValue
+   * milliseconds to Long.MaxValue milliseconds), while Spark timestamps are measured
+   * at microseconds accuracy. For the "infinity values" in PostgreSQL (represented by
+   * big constants), we need clamp them to avoid overflow. If it is not one of the infinity
+   * values, fall back to default behavior.
+   */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Timestamp = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC)
+    val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC)
+
+    val time = t.getTime

Review Comment:
   @yaooqinn, @mingkangli-db, @cloud-fan
   
   `ts.getTime()` returns epoch in milliseconds, `new Timestamp(ts)` takes time in **milliseconds**, and `LocalDateTime.toEpochSecond()` returns values in **seconds**.
   
   Shouldn't this be
   ```scala
       val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC) * 1000
       val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC) * 1000
   ```
   
   e.g. `* 1000`
   
   Otherwise:
   ```scala
   val tsLong = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC)
   val ts = new Timestamp(tsLong)
   ```
   
   gives:
   ```
   tsLong: Long = 253402300799
   ts: java.sql.Timestamp = 1978-01-11 22:31:40.799
   ```
   not `9999-12-31`
   
   and `time` is used later to create a new instance of `Timestamp`





[GitHub] [spark] mingkangli-db commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1254918704


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime
+
+    if (time == POSTGRESQL_DATE_POSITIVE_INFINITY ||
+      time == POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY) {
+      Long.MaxValue

Review Comment:
   It will be displayed as "+infinity" or "-infinity". See: [here](https://www.postgresql.org/docs/current/datatype-datetime.html) "The values infinity and -infinity are specially represented inside the system and will be displayed unchanged"





[GitHub] [spark] sadikovi commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "sadikovi (via GitHub)" <gi...@apache.org>.
sadikovi commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1255065980


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   Then it is the same range for java Timestamp.





[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1254909622


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime
+
+    if (time == POSTGRESQL_DATE_POSITIVE_INFINITY ||
+      time == POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY) {
+      Long.MaxValue

Review Comment:
   If we query a infinite timestamp column in pgsql, what does pgsql display?






[GitHub] [spark] sadikovi commented on a diff in pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "sadikovi (via GitHub)" <gi...@apache.org>.
sadikovi commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1253902230


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,29 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+
+    // variable names come from PostgreSQL "constant field docs":

Review Comment:
   nit: Please update to this: `// Variable names come from ...`



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,29 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+

Review Comment:
   nit: no new line is needed.





[GitHub] [spark] mingkangli-db commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1277904958


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala:
##########
@@ -432,4 +437,18 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(row(0).getSeq[String](0) == Seq("1", "fds", "fdsa"))
     assert(row(0).getString(1) == "fdasfasdf")
   }
+
+  test("SPARK-44280: infinity timestamp test") {
+    val df = sqlContext.read.jdbc(jdbcUrl, "infinity_timestamp", new Properties)

Review Comment:
   The goal here is that, upon reading one of the infinity timestamps from PostgreSQL, Spark SQL casts it to a reasonable value instead of throwing an overflow error. In the other direction, since Spark SQL has no built-in "infinity" values, we can't write an infinity timestamp back to PostgreSQL, so I don't think a roundtrip test is possible here.






[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1273019471


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala:
##########
@@ -105,10 +105,19 @@ abstract class JdbcDialect extends Serializable with Logging {
    */
   def getJDBCType(dt: DataType): Option[JdbcType] = None
 
+  /**
+   * Converts an instance of `java.sql.Timestamp` to a custom `java.sql.Timestamp` value.
+   * @param t represents a specific instant in time based on
+   *          the hybrid calendar which combines Julian and
+   *          Gregorian calendars.
+   * @throws IllegalArgumentException if t is null

Review Comment:
   can we add a `@return` doc?





[GitHub] [spark] MaxGekk commented on a diff in pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1251523944


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala:
##########
@@ -105,10 +105,30 @@ abstract class JdbcDialect extends Serializable with Logging {
    */
   def getJDBCType(dt: DataType): Option[JdbcType] = None
 
+  /**
+   * Converts an instance of `java.sql.Timestamp` to the number of microseconds since
+   * 1970-01-01T00:00:00.000000Z. It extracts date-time fields from the input, builds
+   * a local timestamp in Proleptic Gregorian calendar from the fields, and binds
+   * the timestamp to the system time zone. The resulted instant is converted to
+   * microseconds since the epoch.
+   * JDBC dialects can override this function to provide implementations that suit their
+   * JDBC drivers (e.g. if there are special "infinity" values that would overflow)
+   *
+   * @param t represents a specific instant in time based on
+   *          the hybrid calendar which combines Julian and
+   *          Gregorian calendars.
+   * @return The number of micros since epoch from `java.sql.Timestamp`.
+   * @throws IllegalArgumentException if t is null
+   */
+  def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    require(t != null, "timestamp must be non-null")

Review Comment:
   Is this a user-facing error? If so, please add an error class to `error-classes.json`. See for instance:
   https://github.com/apache/spark/blob/7bc28d54f83261b16eaa11201a7987d8d2c8dd1e/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala#L478





[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1254927867


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   So the range of java `Timestamp` is larger than Spark SQL timestamp?





[GitHub] [spark] sadikovi commented on pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "sadikovi (via GitHub)" <gi...@apache.org>.
sadikovi commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1622948319

   @mingkangli-db Please update the title of the PR. It is not [CORE] but [SQL].




[GitHub] [spark] mingkangli-db commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1255069403


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   The problem is that java.sql timestamps are measured with millisecond accuracy (from Long.MinValue milliseconds to Long.MaxValue milliseconds), [see here](https://docs.oracle.com/javase/8/docs/api/java/sql/Timestamp.html), while Spark timestamps are measured with microsecond accuracy. So we get an overflow exception when Spark calls Math.multiplyExact(millis, 1000).
   
   The stack trace looks something like this:
   ```
   at java.lang.Math.multiplyExact(Math.java:892)
   at org.apache.spark.sql.catalyst.util.DateTimeUtils$.millisToMicros(DateTimeUtils.scala:xxx)
   at org.apache.spark.sql.catalyst.util.DateTimeUtils$.fromJavaTimestampNoRebase(DateTimeUtils.scala:xxx)
   at org.apache.spark.sql.catalyst.util.DateTimeUtils$.fromJavaTimestamp(DateTimeUtils.scala:xxx)
   at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$15(JdbcUtils.scala:xxx)
   ```
   






[GitHub] [spark] yaooqinn commented on pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "yaooqinn (via GitHub)" <gi...@apache.org>.
yaooqinn commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1622942615

   Can we provide more information about the overflow issue? IIRC, Spark doesn't accommodate infinite timestamps. It may be more appropriate to fail on overflow or unsupported values instead of converting them to maximum values.






[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1255007503


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need to clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to the default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   From Long.MinValue microseconds before UTC epoch to Long.MaxValue microseconds after UTC epoch.
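   
   In concrete terms (an illustrative sketch, not code from this PR):
   
   ```scala
   import java.time.Instant
   
   // Spark SQL stores a timestamp as a Long count of microseconds from the UTC epoch,
   // so its range is 1000x narrower than java.sql.Timestamp's millisecond range --
   // which is exactly why the Postgres sentinels overflow during conversion.
   val maxInstant = Instant.ofEpochSecond(Long.MaxValue / 1000000, (Long.MaxValue % 1000000) * 1000)
   val minInstant = Instant.ofEpochSecond(Long.MinValue / 1000000, (Long.MinValue % 1000000) * 1000)
   ```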





[GitHub] [spark] cloud-fan commented on pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1661595531

   thanks, merging to master/3.5 (fix pgsql dialect)!




[GitHub] [spark] cloud-fan closed pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API
URL: https://github.com/apache/spark/pull/41843




[GitHub] [spark] yaooqinn commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "yaooqinn (via GitHub)" <gi...@apache.org>.
yaooqinn commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1274304599


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala:
##########
@@ -432,4 +437,18 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(row(0).getSeq[String](0) == Seq("1", "fds", "fdsa"))
     assert(row(0).getString(1) == "fdasfasdf")
   }
+
+  test("SPARK-44280: infinity timestamp test") {
+    val df = sqlContext.read.jdbc(jdbcUrl, "infinity_timestamp", new Properties)

Review Comment:
   Can we also use the write API to do a roundtrip here?
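   
   For example (a sketch only, not code from this PR; the copy-table name is made up, and `Properties` is `java.util.Properties` as in the suite above):
   
   ```scala
   // Write the clamped values back out, read them again, and compare
   // (row order is assumed stable for this two-row table).
   val df = sqlContext.read.jdbc(jdbcUrl, "infinity_timestamp", new Properties)
   df.write.jdbc(jdbcUrl, "infinity_timestamp_copy", new Properties)
   val roundtripped = sqlContext.read.jdbc(jdbcUrl, "infinity_timestamp_copy", new Properties)
   assert(roundtripped.collect().toSeq == df.collect().toSeq)
   ```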






Re: [PR] [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API [spark]

Posted by "yruslan (via GitHub)" <gi...@apache.org>.
yruslan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1530575858


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +281,34 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * java.sql timestamps are measured with millisecond accuracy (from Long.MinValue
+   * milliseconds to Long.MaxValue milliseconds), while Spark timestamps are measured
+   * at microsecond accuracy. For the "infinity values" in PostgreSQL (represented by
+   * big constants), we need to clamp them to avoid overflow. If it is not one of the
+   * infinity values, fall back to the default behavior.
+   */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Timestamp = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC)
+    val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC)
+
+    val time = t.getTime

Review Comment:
   `ts.getTime()` returns the epoch time in milliseconds, and `LocalDateTime.toEpochSecond()` returns values in seconds.
   
   Shouldn't this be
   ```scala
       val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC) * 1000
       val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC) * 1000
   ```
   
   i.e., `* 1000`
   
   Otherwise:
   ```scala
   val tsLong = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC)
   val ts = new Timestamp(tsLong)
   ```
   
   gives:
   ```
   tsLong: Long = 253402300799
   ts: java.sql.Timestamp = 1978-01-11 22:31:40.799
   ```
   not `9999-12-31`





[GitHub] [spark] sadikovi commented on pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "sadikovi (via GitHub)" <gi...@apache.org>.
sadikovi commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1622952789

   @yaooqinn Additional information is provided in the PR description: if a Postgres table contains infinity timestamps, users cannot read such tables in Spark.
   
   I don't think failing such queries is an option; they already fail even without this change, albeit with an overflow error. The Postgres driver handles infinity values as mentioned in the PR; Spark simply fails to convert them. This PR adds the ability to bypass Spark's conversion for such values.
   
   ```scala
   org.apache.spark.sql.catalyst.util.DateTimeUtils.fromJavaTimestamp(new java.sql.Timestamp(-9223372036832400000L))
   ```
   would fail with 
   ```
   java.lang.ArithmeticException: long overflow
     java.lang.Math.multiplyExact(Math.java:892)
     org.apache.spark.sql.catalyst.util.DateTimeUtils$.millisToMicros(DateTimeUtils.scala:257)
     org.apache.spark.sql.catalyst.util.DateTimeUtils$.fromJavaTimestampNoRebase(DateTimeUtils.scala:209)
     org.apache.spark.sql.catalyst.util.DateTimeUtils$.fromJavaTimestamp(DateTimeUtils.scala:199)
   ```
   
   IMHO, it could be beneficial to refactor the code and add a special conversion function, since other JDBC dialects could potentially require similar timestamp handling, as in this case.




[GitHub] [spark] mingkangli-db commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1254474150


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala:
##########
@@ -105,10 +105,30 @@ abstract class JdbcDialect extends Serializable with Logging {
    */
   def getJDBCType(dt: DataType): Option[JdbcType] = None
 
+  /**
+   * Converts an instance of `java.sql.Timestamp` to the number of microseconds since
+   * 1970-01-01T00:00:00.000000Z. It extracts date-time fields from the input, builds
+   * a local timestamp in the Proleptic Gregorian calendar from the fields, and binds
+   * the timestamp to the system time zone. The resulting instant is converted to
+   * microseconds since the epoch.
+   * JDBC dialects can override this function to provide implementations that suit their
+   * JDBC drivers (e.g. if there are special "infinity" values that would overflow).
+   *
+   * @param t represents a specific instant in time based on
+   *          the hybrid calendar which combines Julian and
+   *          Gregorian calendars.
+   * @return The number of micros since epoch from `java.sql.Timestamp`.
+   * @throws IllegalArgumentException if t is null
+   */
+  def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    require(t != null, "timestamp must be non-null")

Review Comment:
   It is not a user-facing error but a precondition: the timestamp passed into this function will never be null. I removed the null check because a similar function, `convertJavaTimestampToTimestampNTZ` in JdbcDialects.scala, does not have the check either.





[GitHub] [spark] sadikovi commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "sadikovi (via GitHub)" <gi...@apache.org>.
sadikovi commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1254932046


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need to clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to the default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   What is the Spark SQL timestamp range?





[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1255067195


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need to clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to the default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   Then how does the overflow happen? Because the calendar is different?





[GitHub] [spark] mingkangli-db commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "mingkangli-db (via GitHub)" <gi...@apache.org>.
mingkangli-db commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1255081499


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala:
##########
@@ -281,4 +282,28 @@ private object PostgresDialect extends JdbcDialect with SQLConfHelper {
     }
     s"ALTER TABLE ${getFullyQualifiedQuotedTableName(oldTable)} RENAME TO ${newTable.name()}"
   }
+
+  /**
+   * PostgreSQL has four special "infinity values" that we need to clamp to avoid overflow.
+   * If it is not one of the infinity values, fall back to the default behavior. */
+  override def convertJavaTimestampToTimestamp(t: Timestamp): Long = {
+    // Variable names come from PostgreSQL "constant field docs":
+    // https://jdbc.postgresql.org/documentation/publicapi/index.html?constant-values.html
+    val POSTGRESQL_DATE_NEGATIVE_INFINITY = -9223372036832400000L
+    val POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY = -185543533774800000L
+    val POSTGRESQL_DATE_POSITIVE_INFINITY = 9223372036825200000L
+    val POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY = 185543533774800000L
+
+    val time = t.getTime

Review Comment:
   Yes, if you mean that Long.MaxValue is the maximum number of microseconds that can be stored without causing overflow. I added some comments in PostgresDialect.scala; hopefully this makes it clearer.
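   
   For context, a minimal sketch of the clamping branch that the diff above truncates (it reuses the constants from the diff; the fall-through to `DateTimeUtils.fromJavaTimestamp` is an assumption based on the default behavior, not the literal merged code):
   
   ```scala
   val time = t.getTime
   if (time == POSTGRESQL_DATE_POSITIVE_INFINITY ||
       time == POSTGRESQL_DATE_DATE_POSITIVE_SMALLER_INFINITY) {
     Long.MaxValue // clamp "+infinity" to the largest representable microsecond count
   } else if (time == POSTGRESQL_DATE_NEGATIVE_INFINITY ||
       time == POSTGRESQL_DATE_NEGATIVE_SMALLER_INFINITY) {
     Long.MinValue // clamp "-infinity" to the smallest representable microsecond count
   } else {
     DateTimeUtils.fromJavaTimestamp(t) // ordinary values keep the default conversion
   }
   ```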







[GitHub] [spark] HyukjinKwon commented on pull request #41843: [SPARK-44280][CORE] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #41843:
URL: https://github.com/apache/spark/pull/41843#issuecomment-1621186262

   cc @sadikovi FYI




[GitHub] [spark] cloud-fan commented on a diff in pull request #41843: [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1273019605


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala:
##########
@@ -105,10 +105,19 @@ abstract class JdbcDialect extends Serializable with Logging {
    */
   def getJDBCType(dt: DataType): Option[JdbcType] = None
 
+  /**
+   * Converts an instance of `java.sql.Timestamp` to a custom `java.sql.Timestamp` value.
+   * @param t represents a specific instant in time based on
+   *          the hybrid calendar which combines Julian and
+   *          Gregorian calendars.
+   * @throws IllegalArgumentException if t is null
+   */

Review Comment:
   add `@Since("3.5.0")`





Re: [PR] [SPARK-44280][SQL] Add convertJavaTimestampToTimestamp in JDBCDialect API [spark]

Posted by "yruslan (via GitHub)" <gi...@apache.org>.
yruslan commented on code in PR #41843:
URL: https://github.com/apache/spark/pull/41843#discussion_r1530575246


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala:
##########
@@ -432,4 +437,18 @@ class PostgresIntegrationSuite extends DockerJDBCIntegrationSuite {
     assert(row(0).getSeq[String](0) == Seq("1", "fds", "fdsa"))
     assert(row(0).getString(1) == "fdasfasdf")
   }
+
+  test("SPARK-44280: infinity timestamp test") {
+    val df = sqlContext.read.jdbc(jdbcUrl, "infinity_timestamp", new Properties)
+    val row = df.collect()
+
+    assert(row.length == 2)
+    val infinity = row(0).getAs[Timestamp]("timestamp_column")
+    val negativeInfinity = row(1).getAs[Timestamp]("timestamp_column")
+    val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC)
+    val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC)
+
+    assert(infinity.getTime == maxTimestamp)
+    assert(negativeInfinity.getTime == minTimeStamp)

Review Comment:
   `ts.getTime()` returns the epoch time in milliseconds, and `LocalDateTime.toEpochSecond()` returns values in seconds.
   
   Shouldn't this be
   ```scala
       val minTimeStamp = LocalDateTime.of(1, 1, 1, 0, 0, 0).toEpochSecond(ZoneOffset.UTC) * 1000
       val maxTimestamp = LocalDateTime.of(9999, 12, 31, 23, 59, 59).toEpochSecond(ZoneOffset.UTC) * 1000
   ```
   
   Otherwise, `convertJavaTimestampToTimestamp()` does not return the infinities as expected.


