Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/07/09 16:39:18 UTC

[GitHub] [hudi] sbernauer commented on a change in pull request #1760: [HUDI-1040] Update apis for spark3 compatibility

sbernauer commented on a change in pull request #1760:
URL: https://github.com/apache/hudi/pull/1760#discussion_r452348820



##########
File path: hudi-spark/src/main/scala/org/apache/hudi/AvroConversionUtils.scala
##########
@@ -78,4 +79,21 @@ object AvroConversionUtils {
   def convertAvroSchemaToStructType(avroSchema: Schema): StructType = {
     SchemaConverters.toSqlType(avroSchema).dataType.asInstanceOf[StructType]
   }
+
+  private def deserializeRow(encoder: ExpressionEncoder[Row], internalRow: InternalRow): Row = {
+    // First attempt to use spark2 API for deserialization, otherwise attempt with spark3 API
+    try {
+      val spark2method = encoder.getClass.getMethods.filter(method => method.getName.equals("fromRow")).last
+      spark2method.invoke(encoder, internalRow).asInstanceOf[Row]
+    } catch {
+      case e: NoSuchElementException => spark3Deserialize(encoder, internalRow)

Review comment:
       I think `org.apache.spark.SPARK_VERSION` could help us out here. I've implemented it in this commit: https://github.com/sbernauer/hudi/commit/a4f1866f5be56639958479e9a597ae8c4d3d8f4f. However, I noticed that it had a huge performance impact (I assume due to the reflection).
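
       One possible way to limit the reflection cost is to branch on the version string once and keep the resolved `Method` around, so that only `invoke` runs per row. The block below is a minimal sketch of that idea and is not taken from the linked commit. `RowDeserializerSketch`, `isSpark2`, `buildDeserializer` and `findMethod` are hypothetical names, and the sketch assumes that the Spark 2 encoder exposes `fromRow` and that the Spark 3 encoder exposes a zero-argument `createDeserializer()` whose result has an `apply` method. It uses plain JVM reflection only, so it compiles without Spark on the classpath; in real use the caller would pass the `ExpressionEncoder[Row]` and `org.apache.spark.SPARK_VERSION`.

```scala
import java.lang.reflect.Method

object RowDeserializerSketch {

  /** True when the version string denotes Spark 2.x, e.g. "2.4.5". */
  def isSpark2(sparkVersion: String): Boolean = sparkVersion.startsWith("2.")

  /**
   * Builds a row-deserializing function for `encoder`.
   * The reflective method lookup happens once, here, instead of on every row.
   */
  def buildDeserializer(encoder: AnyRef, sparkVersion: String): AnyRef => AnyRef = {
    if (isSpark2(sparkVersion)) {
      // Assumption: Spark 2.x encoders expose fromRow(InternalRow).
      val fromRow: Method = findMethod(encoder, "fromRow", 1)
      row => fromRow.invoke(encoder, row)
    } else {
      // Assumption: Spark 3.x encoders expose createDeserializer(), and the
      // returned object has an apply(InternalRow) method.
      val deserializer = findMethod(encoder, "createDeserializer", 0).invoke(encoder)
      val applyMethod: Method = findMethod(deserializer, "apply", 1)
      row => applyMethod.invoke(deserializer, row)
    }
  }

  /** Finds a public method by name and parameter count, or fails loudly. */
  private def findMethod(target: AnyRef, name: String, arity: Int): Method =
    target.getClass.getMethods
      .find(m => m.getName == name && m.getParameterCount == arity)
      .getOrElse(throw new NoSuchMethodException(s"${target.getClass.getName}.$name/$arity"))
}
```

       A caller would build the function once per encoder, for example `val deserialize = RowDeserializerSketch.buildDeserializer(encoder, SPARK_VERSION)`, and then call `deserialize(internalRow).asInstanceOf[Row]` inside the per-row loop. Whether this removes the slowdown described above would still need to be measured.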



