You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/03/11 06:45:00 UTC

[jira] [Resolved] (SPARK-31042) Error in writing a pyspark streaming dataframe created from Kafka source to a csv file

     [ https://issues.apache.org/jira/browse/SPARK-31042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-31042.
----------------------------------
    Resolution: Invalid

Sounds more like a question. Seems such cases are being tested in Spark side. Please ask questions to the mailing list (see https://spark.apache.org/community.html)

> Error in writing a pyspark streaming dataframe created from Kafka source to a csv file 
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-31042
>                 URL: https://issues.apache.org/jira/browse/SPARK-31042
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Structured Streaming
>    Affects Versions: 2.4.5
>            Reporter: Suchintak Patnaik
>            Priority: Major
>
> While writing a streaming dataframe created from Kafka source to a csv file gives following error in PySpark.
> NOTE : The same streaming dataframe is getting displayed in the console.
> sdf.writeStream.format("console").start().awaitTermination()  // Working
> sdf.writeStream\
>     .format("csv")\
>     .option("path", "C://output")\
>     .option("checkpointLocation", "C://Checkpoint")\
>     .outputMode("append")\
>     .start().awaitTermination()    // Not working
> Error
> ---------
>  *File "C:\Spark\python\pyspark\sql\utils.py", line 63, in deco
>     return f(*a, **kw)
>   File "C:\Spark\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o63.awaitTermination.
> : org.apache.spark.sql.streaming.StreamingQueryException: Expected e.g. {"topicA":{"0":23,"1":-1},"topicB":{"0":-2}}, got {"logOffset":1}
> === Streaming Query ===
> Identifier: [id = 6718625c-489e-44c8-b273-0da3429e97a8, runId = b64887ba-ca32-499e-9ab5-f839fd44ec26]
> Current Committed Offsets: {KafkaV2[Subscribe[test1]]: {"logOffset":1}}
> Current Available Offsets: {KafkaV2[Subscribe[test1]]: {"logOffset":1}}
> Current State: ACTIVE
> Thread State: RUNNABLE*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org