You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by karthikjay <as...@gmail.com> on 2018/03/01 00:46:53 UTC

[Structured Streaming] Handling Kakfa Stream messages with different JSON Schemas.

Hi all,

I have the following code to stream Kafka data and apply a schema called
"hbSchema" on it and then act on the data. 

val df = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "10.102.255.241:9092")
      .option("subscribe", "mabr_avro_json1")
      .option("failOnDataLoss", "false")
      .load()
      .selectExpr("""deserialize("avro_json1", value) AS message""")

    import spark.implicits._

    val df1 = df
      .selectExpr("cast (value as string) as json")
      .select(from_json($"message", schema=hbSchema).as("data"))
      .select("data.*")

But, what if the data in Kafka topic have different schemas ? How do I apply
different schemas based on the data ?




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org