Posted to reviews@spark.apache.org by "pengzhon-db (via GitHub)" <gi...@apache.org> on 2023/05/22 17:13:16 UTC

[GitHub] [spark] pengzhon-db commented on a diff in pull request #41192: [SPARK-43530][PROTOBUF] Read descriptor file only once

pengzhon-db commented on code in PR #41192:
URL: https://github.com/apache/spark/pull/41192#discussion_r1200776716


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/protobuf/functions.scala:
##########
@@ -199,4 +301,16 @@ object functions {
       options: java.util.Map[String, String]): Column = {
     fnWithOptions("to_protobuf", options.asScala.iterator, data, lit(messageClassName))
   }
+
+  private def emptyOptions: java.util.Map[String, String] = Collections.emptyMap[String, String]()
+
+  private def readDescriptorFileContent(filePath: String): Array[Byte] = {
+    // This method is copied from org.apache.spark.sql.protobuf.util.ProtobufUtils

Review Comment:
   Do you mean it was copied from org.apache.commons.io.FileUtils?
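
For context, a minimal sketch of the commons-io call used in the diff body above (FileUtils.readFileToByteArray is the actual commons-io API; the file path is illustrative):

    import java.io.File
    import org.apache.commons.io.FileUtils

    // Reads the entire file into memory and returns its raw bytes.
    // Throws java.io.IOException if the file is missing or unreadable.
    val bytes: Array[Byte] = FileUtils.readFileToByteArray(new File("/tmp/message.desc"))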



##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala:
##########
@@ -223,28 +226,28 @@ private[sql] object ProtobufUtils extends Logging {
     }
   }
 
-  private def parseFileDescriptorSet(descFilePath: String): List[Descriptors.FileDescriptor] = {
-    var fileDescriptorSet: DescriptorProtos.FileDescriptorSet = null
+  def readDescriptorFileContent(filePath: String): Array[Byte] = {

Review Comment:
   This function is a duplicate of the one in the Connect client. Is there a way to define them in a common place? I think this is a general question for Connect; no need to address it in this PR.
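
One possible shape for this suggestion, as a hedged sketch: hoist the helper into a module visible to both the Connect client and the protobuf connector. The object name and the error type below are hypothetical placeholders, not code from this PR:

    import java.io.File
    import scala.util.control.NonFatal
    import org.apache.commons.io.FileUtils

    // Hypothetical shared location; both current call sites would delegate
    // here instead of keeping private copies of the same method.
    private[spark] object ProtobufFileUtils {
      def readDescriptorFileContent(filePath: String): Array[Byte] = {
        try {
          FileUtils.readFileToByteArray(new File(filePath))
        } catch {
          // Placeholder error; the connector side throws
          // QueryCompilationErrors.cannotFindDescriptorFileError instead.
          case NonFatal(ex) =>
            throw new IllegalArgumentException(s"Cannot read descriptor file: $filePath", ex)
        }
      }
    }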



##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/protobuf/functions.scala:
##########
@@ -199,4 +301,16 @@ object functions {
       options: java.util.Map[String, String]): Column = {
     fnWithOptions("to_protobuf", options.asScala.iterator, data, lit(messageClassName))
   }
+
+  private def emptyOptions: java.util.Map[String, String] = Collections.emptyMap[String, String]()
+
+  private def readDescriptorFileContent(filePath: String): Array[Byte] = {
+    // This method is copied from org.apache.spark.sql.protobuf.util.ProtobufUtils
+    try {
+      FileUtils.readFileToByteArray(new File(filePath))
+    } catch {
+      case NonFatal(ex) =>
+        throw QueryCompilationErrors.cannotFindDescriptorFileError(filePath, ex)

Review Comment:
   How do we handle other types of exceptions, such as IOException?
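
Worth noting: scala.util.control.NonFatal already matches IOException, so the handler in the diff does catch it; only fatal throwables escape. A small illustrative sketch (the classify helper is hypothetical):

    import java.io.IOException
    import scala.util.control.NonFatal

    // NonFatal matches everything except VirtualMachineError (e.g. OutOfMemoryError),
    // ThreadDeath, InterruptedException, LinkageError and ControlThrowable.
    def classify(t: Throwable): String = t match {
      case NonFatal(_) => "caught by the NonFatal handler"
      case _           => "propagates as fatal"
    }

    classify(new IOException("missing file")) // caught by the NonFatal handler
    classify(new OutOfMemoryError("heap"))    // propagates as fatal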



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

