You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "michaelzhan-db (via GitHub)" <gi...@apache.org> on 2023/10/16 22:27:21 UTC

[PR] [SPARK-45561] Add proper conversions for SMALLINT and TINYINT in MySQLDialect [spark]

michaelzhan-db opened a new pull request, #43390:
URL: https://github.com/apache/spark/pull/43390

   ### What changes were proposed in this pull request?
   
   Change MySql Dialect to convert catalyst TINYINT and SMALLINT into MySQL TINYINT and SMALLINT rather than BYTE and INTEGER. BYTE does not exist in MySQL.
   
   ### Why are the changes needed?
   
   Since BYTE type does not exist in MySQL, any casts that could be pushed down involving BYTE type would fail.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   UT pass.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1367683819


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -43,7 +43,8 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     override val usesIpc = false
     override val jdbcPort: Int = 3306
     override def getJdbcUrl(ip: String, port: Int): String =
-      s"jdbc:mysql://$ip:$port/mysql?user=root&password=rootpass"
+      s"jdbc:mysql://$ip:$port/" +
+        s"mysql?user=root&password=rootpass"

Review Comment:
   Please restore the two lines.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect and MsSqlServerDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43390:
URL: https://github.com/apache/spark/pull/43390#issuecomment-1765562203

   https://github.com/apache/spark/pull/43232 mapped `TINYINT` to `ShortType` for `MsSqlServerDialect`.
   It looks very interesting. The behavior of these two PRs is somewhat opposite.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect and MsSqlServerDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43390:
URL: https://github.com/apache/spark/pull/43390#issuecomment-1765567144

   cc @gengliangwang @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect and MsSqlServerDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #43390:
URL: https://github.com/apache/spark/pull/43390#issuecomment-1765565748

   @michaelzhan-db Could you add tests? Please refer https://github.com/apache/spark/pull/43232


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561][SQL] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk closed pull request #43390: [SPARK-45561][SQL] Add proper conversions for TINYINT in MySQLDialect
URL: https://github.com/apache/spark/pull/43390


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1364796827


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -54,10 +55,10 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     conn.prepareStatement("INSERT INTO tbl VALUES (42,'fred')").executeUpdate()
     conn.prepareStatement("INSERT INTO tbl VALUES (17,'dave')").executeUpdate()
 
-    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), "
+    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), tiny TINYINT, "

Review Comment:
   We can reduce the change if we add it at the end.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect and MsSqlServerDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1361577350


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala:
##########
@@ -128,6 +128,7 @@ private object MsSqlServerDialect extends JdbcDialect {
     case BinaryType => Some(JdbcType("VARBINARY(MAX)", java.sql.Types.VARBINARY))
     case ShortType if !SQLConf.get.legacyMsSqlServerNumericMappingEnabled =>
       Some(JdbcType("SMALLINT", java.sql.Types.SMALLINT))
+    case ByteType => Option(JdbcType("TINYINT", java.sql.Types.TINYINT))

Review Comment:
   ![image](https://github.com/apache/spark/assets/8486025/0bcb907b-3855-46c7-aecb-e9cdd4ca441f)
   
   @michaelzhan-db The range of MsSqlServer's TINYINT from 0 to 255. But Spark ByteType could represent the negative value.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561][SQL] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #43390:
URL: https://github.com/apache/spark/pull/43390#issuecomment-1776886315

   +1, LGTM. Merging to master/3.5.
   Thank you, @michaelzhan-db and @yaooqinn @beliefer @cloud-fan for review.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1364796827


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -54,10 +55,10 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     conn.prepareStatement("INSERT INTO tbl VALUES (42,'fred')").executeUpdate()
     conn.prepareStatement("INSERT INTO tbl VALUES (17,'dave')").executeUpdate()
 
-    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), "
+    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), tiny TINYINT, "

Review Comment:
   We can reduce the change if we add it at the end. It is easy to review.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1366347361


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -43,7 +43,8 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     override val usesIpc = false
     override val jdbcPort: Int = 3306
     override def getJdbcUrl(ip: String, port: Int): String =
-      s"jdbc:mysql://$ip:$port/mysql?user=root&password=rootpass"
+      s"jdbc:mysql://$ip:$port/" +
+        s"mysql?user=root&password=rootpass&allowPublicKeyRetrieval=true&useSSL=false"

Review Comment:
   Why add `allowPublicKeyRetrieval` and `useSSL`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "michaelzhan-db (via GitHub)" <gi...@apache.org>.
michaelzhan-db commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1364338014


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -54,10 +55,10 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     conn.prepareStatement("INSERT INTO tbl VALUES (42,'fred')").executeUpdate()
     conn.prepareStatement("INSERT INTO tbl VALUES (17,'dave')").executeUpdate()
 
-    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), "
+    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), tiny TINYINT, "

Review Comment:
   I wanted to maintain the rough order of increasing size of numeric type but I am ok with adding it at the end instead. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1363047316


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MySQLDialect.scala:
##########
@@ -93,8 +93,8 @@ private case object MySQLDialect extends JdbcDialect with SQLConfHelper {
       // byte arrays instead of longs.
       md.putLong("binarylong", 1)
       Option(LongType)
-    } else if (sqlType == Types.BIT && typeName.equals("TINYINT")) {
-      Option(BooleanType)

Review Comment:
   Please don't remove the two lines.
   I don't know the original intention of these two lines of code.



##########
sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala:
##########
@@ -912,8 +912,8 @@ class JDBCSuite extends QueryTest with SharedSparkSession {
       Some(LongType))
     assert(metadata.build().contains("binarylong"))
     assert(mySqlDialect.getCatalystType(java.sql.Types.VARBINARY, "BIT", 1, metadata) == None)
-    assert(mySqlDialect.getCatalystType(java.sql.Types.BIT, "TINYINT", 1, metadata) ==
-      Some(BooleanType))

Review Comment:
   Please don't remove the two lines.



##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -54,10 +55,10 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     conn.prepareStatement("INSERT INTO tbl VALUES (42,'fred')").executeUpdate()
     conn.prepareStatement("INSERT INTO tbl VALUES (17,'dave')").executeUpdate()
 
-    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), "
+    conn.prepareStatement("CREATE TABLE numbers (onebit BIT(1), tenbits BIT(10), tiny TINYINT, "

Review Comment:
   Shall we add `tiny TINYINT` at last?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect and MsSqlServerDialect [spark]

Posted by "michaelzhan-db (via GitHub)" <gi...@apache.org>.
michaelzhan-db commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1362518741


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala:
##########
@@ -128,6 +128,7 @@ private object MsSqlServerDialect extends JdbcDialect {
     case BinaryType => Some(JdbcType("VARBINARY(MAX)", java.sql.Types.VARBINARY))
     case ShortType if !SQLConf.get.legacyMsSqlServerNumericMappingEnabled =>
       Some(JdbcType("SMALLINT", java.sql.Types.SMALLINT))
+    case ByteType => Option(JdbcType("TINYINT", java.sql.Types.TINYINT))

Review Comment:
   My bad, I have been too hasty in creating the PR. I will update the PR and JIRA to just MySQL. Thanks for catching that!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect and MsSqlServerDialect [spark]

Posted by "yaooqinn (via GitHub)" <gi...@apache.org>.
yaooqinn commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1361484127


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala:
##########
@@ -128,6 +128,7 @@ private object MsSqlServerDialect extends JdbcDialect {
     case BinaryType => Some(JdbcType("VARBINARY(MAX)", java.sql.Types.VARBINARY))
     case ShortType if !SQLConf.get.legacyMsSqlServerNumericMappingEnabled =>
       Some(JdbcType("SMALLINT", java.sql.Types.SMALLINT))
+    case ByteType => Option(JdbcType("TINYINT", java.sql.Types.TINYINT))

Review Comment:
    Is Ms SQL TINYINT sufficient for negative values of spark's ByteType?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561][SQL] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #43390:
URL: https://github.com/apache/spark/pull/43390#issuecomment-1776615074

   @michaelzhan-db Does Spark 3.4 suffer from the same issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45561] Add proper conversions for TINYINT in MySQLDialect [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #43390:
URL: https://github.com/apache/spark/pull/43390#discussion_r1368202786


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MySQLIntegrationSuite.scala:
##########
@@ -43,7 +43,8 @@ class MySQLIntegrationSuite extends DockerJDBCIntegrationSuite {
     override val usesIpc = false
     override val jdbcPort: Int = 3306
     override def getJdbcUrl(ip: String, port: Int): String =
-      s"jdbc:mysql://$ip:$port/mysql?user=root&password=rootpass"
+      s"jdbc:mysql://$ip:$port/" +
+        s"mysql?user=root&password=rootpass"

Review Comment:
   ```suggestion
         s"jdbc:mysql://$ip:$port/mysql?user=root&password=rootpass"
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org