You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/11/16 21:06:45 UTC

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #38384: [SPARK-40657][PROTOBUF] Require shading for Java class jar, improve error handling

HeartSaVioR commented on code in PR #38384:
URL: https://github.com/apache/spark/pull/38384#discussion_r1024506647


##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala:
##########
@@ -155,21 +155,52 @@ private[sql] object ProtobufUtils extends Logging {
    *  Loads the given protobuf class and returns Protobuf descriptor for it.
    */
   def buildDescriptorFromJavaClass(protobufClassName: String): Descriptor = {
+
+    // Default 'Message' class here is shaded while using the package (as in production).
+    // The incoming classes might not be shaded. Check both.
+    val shadedMessageClass = classOf[Message] // Shaded in prod, not in unit tests.
+    val missingShadingErrorMessage = "The jar with Protobuf classes needs to be shaded " +
+      s"(com.google.protobuf.* --> ${shadedMessageClass.getPackage.getName}.*)"
+
     val protobufClass = try {
       Utils.classForName(protobufClassName)
     } catch {
       case e: ClassNotFoundException =>
-        throw QueryCompilationErrors.protobufClassLoadError(protobufClassName, e)
+        val explanation =
+          if (protobufClassName.contains(".")) "Ensure the class include in the jar"
+          else "Ensure the class name includes package prefix"
+        throw QueryCompilationErrors.protobufClassLoadError(protobufClassName, explanation, e)
+
+      case e: NoClassDefFoundError if e.getMessage.matches("com/google/proto.*Generated.*") =>
+        // This indicates the the the Java classes are not shaded.
+        throw QueryCompilationErrors.protobufClassLoadError(
+          protobufClassName, missingShadingErrorMessage, e)
     }
 
-    if (!classOf[Message].isAssignableFrom(protobufClass)) {
-      throw QueryCompilationErrors.protobufMessageTypeError(protobufClassName)
-      // TODO: Need to support V2. This might work with V2 classes too.
+    if (!shadedMessageClass.isAssignableFrom(protobufClass)) {
+      // Check if this extends 2.x Message class included in spark, that does not work.
+      val unshadedMessageClass = Utils.classForName(
+        "com.escape-shading.google.protobuf.Message".replace("escape-shading.", "")

Review Comment:
   Do we have specific reason to not just say "com.google.protobuf.Message"? It seems to me technically the same. If it is just an informal one, shall we consider code comment instead?



##########
connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala:
##########
@@ -155,21 +155,52 @@ private[sql] object ProtobufUtils extends Logging {
    *  Loads the given protobuf class and returns Protobuf descriptor for it.
    */
   def buildDescriptorFromJavaClass(protobufClassName: String): Descriptor = {
+
+    // Default 'Message' class here is shaded while using the package (as in production).
+    // The incoming classes might not be shaded. Check both.
+    val shadedMessageClass = classOf[Message] // Shaded in prod, not in unit tests.
+    val missingShadingErrorMessage = "The jar with Protobuf classes needs to be shaded " +
+      s"(com.google.protobuf.* --> ${shadedMessageClass.getPackage.getName}.*)"
+
     val protobufClass = try {
       Utils.classForName(protobufClassName)
     } catch {
       case e: ClassNotFoundException =>
-        throw QueryCompilationErrors.protobufClassLoadError(protobufClassName, e)
+        val explanation =
+          if (protobufClassName.contains(".")) "Ensure the class include in the jar"
+          else "Ensure the class name includes package prefix"
+        throw QueryCompilationErrors.protobufClassLoadError(protobufClassName, explanation, e)
+
+      case e: NoClassDefFoundError if e.getMessage.matches("com/google/proto.*Generated.*") =>
+        // This indicates the the the Java classes are not shaded.

Review Comment:
   nit: `the the the` doesn't seem to be what we want to say. shall we correct this via follow-up PR?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org