Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/07 14:12:53 UTC

[GitHub] [spark] srowen commented on a diff in pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

srowen commented on code in PR #36499:
URL: https://github.com/apache/spark/pull/36499#discussion_r891287048


##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala:
##########
@@ -96,4 +97,29 @@ private case object TeradataDialect extends JdbcDialect {
   override def getLimitClause(limit: Integer): String = {
     ""
   }
+
+  override def getCatalystType(
+    sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
+    if (sqlType == Types.NUMERIC) {

Review Comment:
   OK, now down to nits. I would use `sqlType match {` for consistency. Also, return `Some(...)` when the argument is definitely non-null, like a constant, as in all cases here.
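
   One possible reading of this suggestion, as a self-contained sketch (the `DecimalType` stand-in below is hypothetical so the snippet compiles without Spark on the classpath; the real type lives in `org.apache.spark.sql.types`, and the real method also takes `typeName`, `size`, and `md`):

   ```scala
   import java.sql.Types

   // Hypothetical minimal stand-in for Spark's DecimalType, for illustration only.
   case class DecimalType(precision: Int, scale: Int)
   object DecimalType { val SYSTEM_DEFAULT = DecimalType(38, 18) }

   // Shape suggested in the review: match on sqlType for consistency with the
   // rest of the dialect, and return Some(...) (not Option(...)) for values
   // that are constants and therefore definitely non-null.
   def getCatalystType(sqlType: Int, scale: Long): Option[DecimalType] =
     sqlType match {
       case Types.NUMERIC if scale == 0 => Some(DecimalType.SYSTEM_DEFAULT)
       case Types.NUMERIC => Some(DecimalType(38, scale.toInt))
       case _ => None
     }

   println(getCatalystType(Types.NUMERIC, 0)) // Some(DecimalType(38,18))
   ```

   `Option(x)` exists to guard against `x` being `null`; when the value is a constant like `DecimalType.SYSTEM_DEFAULT`, `Some(x)` states the non-null intent directly.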



##########
sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala:
##########
@@ -96,4 +97,29 @@ private case object TeradataDialect extends JdbcDialect {
   override def getLimitClause(limit: Integer): String = {
     ""
   }
+
+  override def getCatalystType(
+    sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
+    if (sqlType == Types.NUMERIC) {
+      if (md == null) {
+        Option(DecimalType.SYSTEM_DEFAULT)
+      } else {
+        val scale = md.build().getLong("scale")
+        // In Teradata, Number or Number(*) means precision and scale is flexible.
+        // However, the scale returned from JDBC is 0, which leads to fractional part loss.
+        // Handle this special case by adding explicit conversion to system default decimal type.
+        // Note, even if the NUMBER is defined with explicit precision and scale like NUMBER(20, 0),
+        // Spark will treat it as DecimalType.SYSTEM_DEFAULT, which is NUMBER(38,18)
+        if (scale == 0) {
+          Option(DecimalType.SYSTEM_DEFAULT)
+        } else {
+          // In Teradata, Number(*, scale) will return size, namely precision, as 40,
+          // which conflicts to DecimalType.MAX_PRECISION
+          Option(DecimalType(Math.min(size, DecimalType.MAX_PRECISION), scale.toInt))

Review Comment:
   What if precision = 40 and scale = 0? Do we need to entertain that possibility, or is scale = 0 always going to mean default precision too?
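
   Tracing the patch's branching through the reviewer's edge case, as a standalone sketch (the constants mirror Spark's `DecimalType.MAX_PRECISION` of 38 and `DecimalType.SYSTEM_DEFAULT` of Decimal(38, 18); `mapTeradataNumber` is a hypothetical name for the logic in the diff above):

   ```scala
   val MaxPrecision = 38          // DecimalType.MAX_PRECISION in Spark
   val SystemDefault = (38, 18)   // DecimalType.SYSTEM_DEFAULT is Decimal(38, 18)

   // Precision/scale mapping as written in the patch under review.
   def mapTeradataNumber(size: Int, scale: Long): (Int, Int) =
     if (scale == 0) SystemDefault // scale == 0 is checked before size
     else (math.min(size, MaxPrecision), scale.toInt)

   // Reviewer's edge case, NUMBER(40, 0): scale is 0, so the patch never
   // looks at the reported precision of 40 and returns the system default.
   println(mapTeradataNumber(40, 0)) // (38,18)

   // NUMBER(*, 5): the reported precision of 40 is clamped to MAX_PRECISION.
   println(mapTeradataNumber(40, 5)) // (38,5)
   ```

   So as the patch stands, scale = 0 does always mean default precision, regardless of the reported size; whether that is the desired mapping for an explicit NUMBER(40, 0) is exactly the question being asked.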



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org