Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/08 00:08:44 UTC

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31771: [SPARK-34652][AVRO] Support SchemaRegistry in from_avro method

HyukjinKwon commented on a change in pull request #31771:
URL: https://github.com/apache/spark/pull/31771#discussion_r589110588



##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/functions.scala
##########
@@ -65,6 +68,31 @@ object functions {
     new Column(AvroDataToCatalyst(data.expr, jsonFormatSchema, options.asScala.toMap))
   }
 
+  /**
+   * Converts a binary column of Avro format into its corresponding catalyst value.
+   * The specified subject must match actual schema of the read data, otherwise the behavior
+   * is undefined: it may fail or return arbitrary result.
+   * To deserialize the data with a compatible and evolved schema, the expected Avro schema can be
+   * set via the option avroSchema.
+   *
+   * @param data the binary column.
+   * @param subject the subject name in the schema-registry. eg. topic: t, key: t-key value: t-value
+   * @param schemaRegistryUri address of the schema-registry url
+   *
+   * @since 3.0.0
+   */
+  @throws(classOf[java.io.IOException])
+  @Experimental
+  def from_avro(
+      data: Column,
+      subject: String,
+      schemaRegistryUri: String): Column = {

Review comment:
       I would avoid adding this as a new API. Can we combine both `avroSchema` and `avroSchemaUrl`? You could try `Schema.Parser().parse` first and fall back to parsing the value as a URL if that fails. That way, we wouldn't have to add another API.
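
       A rough sketch of the parse-first, fall-back-to-URL idea, only to illustrate the suggestion. `SchemaOptionResolver`, `resolve`, and the `fetch` parameter are hypothetical names, not existing Spark APIs. `fetch` stands in for whatever retrieves schema JSON for a URL (for example a schema-registry client), and the sketch assumes it returns plain Avro schema JSON.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.avro.{AvroRuntimeException, Schema}

// Hypothetical helper, not part of Spark.
object SchemaOptionResolver {
  // `value` is the single option string: either inline Avro schema JSON or a URL.
  // `fetch` is an assumed callback that returns schema JSON for a given URL.
  def resolve(value: String, fetch: String => String): Schema =
    Try(new Schema.Parser().parse(value)) match {
      // The value parsed as an Avro schema, so use it directly.
      case Success(schema) => schema
      // Not valid schema JSON: treat the value as a URL and parse what it returns.
      case Failure(_: AvroRuntimeException) =>
        new Schema.Parser().parse(fetch(value))
      // Anything else is unexpected, so rethrow it unchanged.
      case Failure(other) => throw other
    }
}
```

       With something like this, `from_avro` could keep its existing signature and resolve the schema from the one option. One thing to watch: if the fallback also fails, the caller only sees the second error, so the final message may need to mention both attempts.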



