Posted to issues@spark.apache.org by "Sushmita chauhan (Jira)" <ji...@apache.org> on 2022/04/29 06:31:00 UTC
[jira] [Comment Edited] (SPARK-39064) FileNotFoundException while reading from Kafka
[ https://issues.apache.org/jira/browse/SPARK-39064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529806#comment-17529806 ]
Sushmita chauhan edited comment on SPARK-39064 at 4/29/22 6:30 AM:
-------------------------------------------------------------------
Please help me resolve this issue.
was (Author: JIRAUSER288816):
Please Guys help to resolve the issue
> FileNotFoundException while reading from Kafka
> ----------------------------------------------
>
> Key: SPARK-39064
> URL: https://issues.apache.org/jira/browse/SPARK-39064
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.4.8
> Reporter: Sushmita chauhan
> Priority: Critical
> Fix For: 2.4.8
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> We are running a stateful Structured Streaming job that reads from Kafka and writes to HDFS, and we are hitting this exception:
>
>
> 17/12/08 05:20:12 ERROR FileFormatWriter: Aborting job null.
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4, hcube1-1n03.eng.hortonworks.com, executor 1): java.lang.IllegalStateException: Error reading delta file /checkpointDir/state/0/0/1.delta of HDFSStateStoreProvider[id = (op=0, part=0), dir = /checkpointDir/state/0/0]: /checkpointDir/state/0/0/1.delta does not exist
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$updateFromDeltaFile(HDFSBackedStateStoreProvider.scala:410)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1$$anonfun$6.apply(HDFSBackedStateStoreProvider.scala:362)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1$$anonfun$6.apply(HDFSBackedStateStoreProvider.scala:359)
>     at scala.Option.getOrElse(Option.scala:121)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:359)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1.apply(HDFSBackedStateStoreProvider.scala:358)
>     at scala.Option.getOrElse(Option.scala:121)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap(HDFSBackedStateStoreProvider.scala:358)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1$$anonfun$6.apply(HDFSBackedStateStoreProvider.scala:360)
>     at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$$anonfun$org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$loadMap$1$$anonfun$6.apply(HDFSBackedStateStoreProvider.scala:359)
>     at scala.Option.getOrElse(Option.scala:121
>
>
>
> The file indeed does not exist in HDFS, and the {{state/0/0}} directory contains no files at all, although the commits and offsets folders do contain some files. I am not sure what causes this behavior. It seems to happen the second time the job is started,
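
The symptom described above (files present under commits and offsets, but nothing under state/0/0) could be checked mechanically. What follows is only a rough diagnostic sketch, not part of Spark. It assumes the checkpoint directory has been copied to, or mounted on, a local filesystem; that state files are named <version>.delta or <version>.snapshot; and that the state version needed on restart is the last committed batch ID plus one. That last point is an assumption inferred from the error, where 1.delta is requested on the second start. The function names are hypothetical.

```python
import os
import re

# Assumed naming of state store files: "<version>.delta" or "<version>.snapshot".
_STATE_FILE = re.compile(r"^(\d+)\.(delta|snapshot)$")


def committed_batches(checkpoint_dir):
    """Return the sorted batch IDs that have a file under <checkpoint>/commits."""
    commits_dir = os.path.join(checkpoint_dir, "commits")
    if not os.path.isdir(commits_dir):
        return []
    # Purely numeric names only; hidden files such as ".0.crc" are skipped.
    return sorted(int(name) for name in os.listdir(commits_dir) if name.isdigit())


def state_versions(partition_dir):
    """Return the set of versions that have a delta or snapshot file."""
    versions = set()
    for name in os.listdir(partition_dir):
        match = _STATE_FILE.match(name)
        if match:
            versions.add(int(match.group(1)))
    return versions


def find_missing_state(checkpoint_dir):
    """List (relative partition dir, expected version) pairs with no state file.

    Assumption: the version needed on restart is last committed batch ID + 1.
    """
    batches = committed_batches(checkpoint_dir)
    state_root = os.path.join(checkpoint_dir, "state")
    if not batches or not os.path.isdir(state_root):
        return []
    expected = batches[-1] + 1
    missing = []
    for op in sorted(os.listdir(state_root)):
        op_dir = os.path.join(state_root, op)
        if not os.path.isdir(op_dir):
            continue
        for part in sorted(os.listdir(op_dir)):
            part_dir = os.path.join(op_dir, part)
            if not os.path.isdir(part_dir):
                continue
            if expected not in state_versions(part_dir):
                missing.append((os.path.join("state", op, part), expected))
    return missing
```

Calling find_missing_state on a local copy of the checkpoint directory returns one entry per state partition that lacks the expected file; an empty list means every partition has a delta or snapshot for that version.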