Posted to issues@spark.apache.org by "Chaitanya (Jira)" <ji...@apache.org> on 2020/09/09 17:53:00 UTC
[jira] [Created] (SPARK-32834) from_avro is giving empty result
Chaitanya created SPARK-32834:
---------------------------------
Summary: from_avro is giving empty result
Key: SPARK-32834
URL: https://issues.apache.org/jira/browse/SPARK-32834
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 3.0.0
Environment: Ubuntu 18
Spark 3.0
Kafka 2.0.0
Reporter: Chaitanya
I am trying to read a Kafka topic containing Avro data.
Code:
import time
from pyspark.sql.avro.functions import from_avro

# jsonFormatSchema holds the contents of user.avsc (see Input below)
df = spark \
    .readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "host:6667") \
    .option("subscribe", "utopic1") \
    .option("failOnDataLoss", "false") \
    .option("startingOffsets", "earliest") \
    .option("checkpointLocation", "/home/abc/wspace/spark_test/data/") \
    .load()
outputDF = df \
    .select(from_avro("value", jsonFormatSchema, options={"mode": "FASTFAIL"}).alias("user"))
outputDF.printSchema()
query = outputDF.writeStream.format("console").start()
time.sleep(10)
Input:
Avro schema file: user.avsc (https://github.com/apache/spark/raw/4ad9bfd53b84a6d2497668c73af6899bae14c187/examples/src/main/resources/user.avsc)
Kafka topic: {'favorite_color': 'Red', 'name': 'Alyssa'}
Expected Output:
The console sink should print the decoded values of the name and favorite_color fields.
Actual Output:
+----+
|user|
+----+
| [,]|
+----+
Additional information:
1. I searched online and found that another person faced the same issue: https://stackoverflow.com/questions/59222774/spark-from-avro-function-returning-null-values
2. I am able to print the values to the console if I cast the value column to a string using the line below:
   df.selectExpr("CAST(value AS STRING)")
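
One thing that may be worth checking, and which this report does not confirm, is whether the messages in the topic are Avro binary or plain text. from_avro decodes a binary column in Avro format, so a text payload such as the one shown under Input would not match the schema. The sketch below uses only the Python standard library to hand-encode the sample record following the Avro binary encoding rules and compares it with the text form. The field order (name: string, then favorite_color: ["string", "null"]) is an assumption about user.avsc, and the helper names are hypothetical.

```python
import json


def zigzag_varint(n: int) -> bytes:
    """Encode an integer as an Avro long: zigzag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def avro_string(s: str) -> bytes:
    """Avro string: byte length as a long, followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


def encode_user(name, favorite_color) -> bytes:
    # Assumed schema: name: string; favorite_color: ["string", "null"]
    out = avro_string(name)
    if favorite_color is None:
        out += zigzag_varint(1)  # union branch 1 = null, no payload
    else:
        out += zigzag_varint(0) + avro_string(favorite_color)  # branch 0 = string
    return out


record = {"favorite_color": "Red", "name": "Alyssa"}
avro_bytes = encode_user(record["name"], record["favorite_color"])
text_bytes = json.dumps(record).encode("utf-8")

print(avro_bytes.hex())          # 0c416c797373610006526564
print(avro_bytes == text_bytes)  # False
```

If the producer wrote text like the second form, casting value to a string would display it readably, while decoding it as Avro would not yield the expected fields.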