Posted to issues@spark.apache.org by "David Ahern (JIRA)" <ji...@apache.org> on 2018/12/09 14:08:00 UTC
[jira] [Created] (SPARK-26314) support Confluent encoded Avro in Spark Structured Streaming
David Ahern created SPARK-26314:
-----------------------------------
Summary: support Confluent encoded Avro in Spark Structured Streaming
Key: SPARK-26314
URL: https://issues.apache.org/jira/browse/SPARK-26314
Project: Spark
Issue Type: Improvement
Components: Structured Streaming
Affects Versions: 2.4.0
Reporter: David Ahern
Now that Avro has been added as a first-class citizen,
[https://spark.apache.org/docs/latest/sql-data-sources-avro.html]
please make Confluent-encoded Avro work out of the box with Spark Structured Streaming.
As described in the following link, Avro messages on Kafka that were encoded with the Confluent serializer also need to be decoded with the Confluent deserializer. It would be great if this worked out of the box.
[https://developer.ibm.com/answers/questions/321440/ibm-iidr-cdc-db2-to-kafka.html?smartspace=blockchain]
Here are details on the Confluent encoding:
[https://www.sderosiaux.com/articles/2017/03/02/serializing-data-efficiently-with-apache-avro-and-dealing-with-a-schema-registry/#encodingdecoding-the-messages-with-the-schema-id]
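For context, the Confluent wire format puts a 5-byte header in front of the Avro binary payload: a magic byte of 0, then a 4-byte big-endian schema ID. A plain Avro decoder does not expect that header, which is why these messages do not decode as-is. The following is only a minimal Python sketch of splitting the header from the payload. The function name and constants are hypothetical, and it neither contacts a schema registry nor decodes the Avro body.

```python
import struct

# Confluent wire format: 1 magic byte (0) + 4-byte big-endian schema ID + Avro payload.
MAGIC_BYTE = 0
HEADER_SIZE = 5


def split_confluent_message(message: bytes):
    """Return (schema_id, avro_payload) for a Confluent-framed message.

    Hypothetical helper for illustration; it only strips the framing header.
    """
    if len(message) < HEADER_SIZE:
        raise ValueError("message is shorter than the 5-byte Confluent header")
    # ">BI" = big-endian, unsigned byte followed by unsigned 32-bit int (5 bytes total).
    magic, schema_id = struct.unpack(">BI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]
```

The returned schema ID would then be used to look up the writer schema in the schema registry, and the remaining bytes decoded with an ordinary Avro reader.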
It has been a year since I worked on anything to do with Avro and Spark Structured Streaming, but I had to take an approach like this to get it working. This is what I used as a reference at the time:
[https://github.com/tubular/confluent-spark-avro]
Also, here is another project I found that someone has built in the meantime:
[https://github.com/AbsaOSS/ABRiS]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org