Posted to reviews@spark.apache.org by "rangadi (via GitHub)" <gi...@apache.org> on 2023/05/07 05:42:44 UTC

[GitHub] [spark] rangadi commented on a diff in pull request #41075: [SPARK-43361][CONNECT][PROTOBUF] spark-protobuf: allow serde with enum as ints

rangadi commented on code in PR #41075:
URL: https://github.com/apache/spark/pull/41075#discussion_r1186785845


##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufDeserializer.scala:
##########
@@ -260,6 +260,11 @@ private[sql] class ProtobufDeserializer(
       case (ENUM, StringType) =>
         (updater, ordinal, value) => updater.set(ordinal, UTF8String.fromString(value.toString))
 
+      case (ENUM, IntegerType) =>
+        (updater, ordinal, value) => {
+          updater.set(ordinal, protoType.getEnumType.findValueByName(value.toString).getNumber)

Review Comment:
   `value` itself should be of type `ProtocolMessageEnum`, I think. You can call `value.asInstanceOf[ProtocolMessageEnum].getNumber()` directly; there is no need to convert it to a string and then search by name.
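
   A rough sketch of the direct lookup, assuming protobuf-java is on the classpath. The object and method names (`EnumNumberSketch`, `enumNumber`) are made up for illustration and are not part of the PR. As far as I know, enum fields read from a `DynamicMessage` arrive as `EnumValueDescriptor` rather than `ProtocolMessageEnum`, so the sketch matches on both; either way the number is available without a name lookup.

```scala
import com.google.protobuf.Descriptors.EnumValueDescriptor
import com.google.protobuf.ProtocolMessageEnum

// Hypothetical helper, not part of the PR or of Spark.
object EnumNumberSketch {
  // Reads the wire number straight off the enum value instead of
  // going through toString + findValueByName.
  def enumNumber(value: Any): Int = value match {
    // Descriptor-based values (what DynamicMessage is assumed to hand back).
    case e: EnumValueDescriptor => e.getNumber
    // Generated Java enum constants.
    case e: ProtocolMessageEnum => e.getNumber
    case other =>
      throw new IllegalArgumentException(
        s"Expected a protobuf enum value but got: $other")
  }
}
```

   Matching on the type also avoids a `ClassCastException` if the guess about the runtime type turns out to be wrong.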



##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/ProtobufSerializer.scala:
##########
@@ -110,6 +110,19 @@ private[sql] class ProtobufSerializer(
               enumSymbols.mkString("\"", "\", \"", "\""))
           }
           fieldDescriptor.getEnumType.findValueByName(data)
+      case (IntegerType, ENUM) =>
+        val enumValues: Set[Int] =
+          fieldDescriptor.getEnumType.getValues.asScala.map(e => e.getNumber).toSet
+        (getter, ordinal) =>
+          val data = getter.getInt(ordinal)
+          if (!enumValues.contains(data)) {
+            throw QueryCompilationErrors.cannotConvertCatalystTypeToProtobufEnumTypeError(

Review Comment:
   This is not a type conversion error. It also occurs at runtime, not at compile time. Check the other existing errors to see if there is a better-suited one.
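
   To illustrate the distinction, here is a minimal, dependency-free sketch. `InvalidEnumNumberException` and `enumIntWriter` are hypothetical names, not existing Spark errors or helpers. The set of valid numbers is built once per field, while the check itself runs for every row, which is why a runtime error fits better than a compilation error.

```scala
// Hypothetical sketch; none of these names exist in Spark.
object EnumIntWriterSketch {
  // Stand-in for a runtime (execution-time) error, as opposed to an
  // analysis/compile-time error.
  final class InvalidEnumNumberException(
      fieldName: String, value: Int, allowed: Set[Int])
    extends RuntimeException(
      s"Cannot write value $value to enum field '$fieldName'; " +
        s"allowed numbers: ${allowed.toSeq.sorted.mkString(", ")}")

  // Returns a per-row writer. `validNumbers` is computed once by the caller
  // (for example from the enum descriptor); the membership check runs per row.
  def enumIntWriter(fieldName: String, validNumbers: Set[Int]): Int => Int =
    (data: Int) => {
      if (!validNumbers.contains(data)) {
        throw new InvalidEnumNumberException(fieldName, data, validNumbers)
      }
      data
    }
}
```

   The type pairing (`IntegerType` to `ENUM`) is accepted when the writer is built; only an individual out-of-range value is rejected later, while rows are being processed.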



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

