Posted to jira@kafka.apache.org by "geekyouth (Jira)" <ji...@apache.org> on 2021/07/12 02:55:00 UTC

[jira] [Updated] (KAFKA-13061) spark kafka offset missed some partition offset describle

     [ https://issues.apache.org/jira/browse/KAFKA-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

geekyouth updated KAFKA-13061:
------------------------------
    Description: 
Caused by: org.apache.spark.sql.streaming.StreamingQueryException: Set(rcf_adidas_wow_room_event-9, rcf_adidas_wow_room_event-8, rcf_adidas_wow_room_event-7, rcf_adidas_wow_room_event-6, rcf_adidas_wow_room_event-5) are gone. Some data may have been missed.
 Some data may have been lost because they are not available in Kafka any more; either the
 data was aged out by Kafka or the topic may have been deleted before all the data in the
 topic was processed. If you don't want your streaming query to fail on such cases, set the
 source option "failOnDataLoss" to "false".
 === Streaming Query ===
 Identifier: [id = b0e0f54e-e638-430e-9912-751234a6ce13, runId = 570e950b-edcb-40c4-99ea-fd4db958b225]
 Current Committed Offsets: {KafkaV2[Subscribe[rcf_adidas_wow_room_event]]: {"rcf_adidas_wow_room_event":{"8":3572,"2":3571,"5":3571,"4":3571,"7":3571,"1":3571,"9":3571,"3":3571,"6":3571,"0":3571}}}
 Current Available Offsets: {KafkaV2[Subscribe[rcf_adidas_wow_room_event]]: {"rcf_adidas_wow_room_event":{"2":3571,"4":3571,"1":3571,"3":3571,"0":3571}}}
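
The error text above names the source option "failOnDataLoss". As a minimal sketch, assuming a PySpark job, this is how that option might be passed to the Kafka source. The helper kafka_source_options and the broker address broker1:9092 are hypothetical, and the SparkSession named spark is assumed to be created elsewhere. Setting the option to "false" only stops the query from failing; it does not explain or repair the missing partition offsets.

```python
def kafka_source_options(bootstrap_servers, topic, fail_on_data_loss=True):
    """Build the option map for a Structured Streaming Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Option values are passed as strings; "false" lets the query continue
        # when offsets are no longer available, at the risk of skipping data.
        "failOnDataLoss": str(fail_on_data_loss).lower(),
    }


# "broker1:9092" is a placeholder, not taken from the report.
opts = kafka_source_options(
    "broker1:9092", "rcf_adidas_wow_room_event", fail_on_data_loss=False
)

# With an existing SparkSession `spark`, the map would be applied like this:
#   df = spark.readStream.format("kafka").options(**opts).load()
print(opts)
```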

 

---

My topic has 10 partitions, but the latest offset directory records only 5 of them.
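
The mismatch can be checked directly from the two offset maps in the error message. The sketch below copies both maps from the log and lists the partitions that appear in the committed offsets but not in the available offsets; the function name missing_partitions is only illustrative.

```python
import json

# Per-partition offsets copied from the error message.
committed = json.loads(
    '{"8":3572,"2":3571,"5":3571,"4":3571,"7":3571,'
    '"1":3571,"9":3571,"3":3571,"6":3571,"0":3571}'
)
available = json.loads('{"2":3571,"4":3571,"1":3571,"3":3571,"0":3571}')


def missing_partitions(committed, available):
    """Return partition ids present in `committed` but absent from `available`."""
    return sorted(int(p) for p in committed.keys() - available.keys())


print(missing_partitions(committed, available))  # [5, 6, 7, 8, 9]
```

The result matches the five partitions (5 through 9) that the exception reports as gone.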

After I deleted the latest, faulty offset directory, the query ran successfully.

But I do not understand what caused this error.

 

 

  was:
Caused by: org.apache.spark.sql.streaming.StreamingQueryException: Set(rcf_adidas_wow_room_event-9, rcf_adidas_wow_room_event-8, rcf_adidas_wow_room_event-7, rcf_adidas_wow_room_event-6, rcf_adidas_wow_room_event-5) are gone. Some data may have been missed.
Some data may have been lost because they are not available in Kafka any more; either the
data was aged out by Kafka or the topic may have been deleted before all the data in the
topic was processed. If you don't want your streaming query to fail on such cases, set the
source option "failOnDataLoss" to "false".
=== Streaming Query ===
Identifier: [id = b0e0f54e-e638-430e-9912-751234a6ce13, runId = 570e950b-edcb-40c4-99ea-fd4db958b225]
Current Committed Offsets: {KafkaV2[Subscribe[rcf_adidas_wow_room_event]]: {"rcf_adidas_wow_room_event":{"8":3572,"2":3571,"5":3571,"4":3571,"7":3571,"1":3571,"9":3571,"3":3571,"6":3571,"0":3571}}}
Current Available Offsets: {KafkaV2[Subscribe[rcf_adidas_wow_room_event]]: {"rcf_adidas_wow_room_event":{"2":3571,"4":3571,"1":3571,"3":3571,"0":3571}}}

 

---

My topic has 10 partitions, but the offset directory records only 5 partitions.

After I deleted the latest, faulty offset directory, the query ran successfully.

But I do not understand what caused this error.

 

 


> spark kafka offset missed some partition offset describle
> ---------------------------------------------------------
>
>                 Key: KAFKA-13061
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13061
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.0.0
>            Reporter: geekyouth
>            Priority: Critical
>
> Caused by: org.apache.spark.sql.streaming.StreamingQueryException: Set(rcf_adidas_wow_room_event-9, rcf_adidas_wow_room_event-8, rcf_adidas_wow_room_event-7, rcf_adidas_wow_room_event-6, rcf_adidas_wow_room_event-5) are gone. Some data may have been missed.
>  Some data may have been lost because they are not available in Kafka any more; either the
>  data was aged out by Kafka or the topic may have been deleted before all the data in the
>  topic was processed. If you don't want your streaming query to fail on such cases, set the
>  source option "failOnDataLoss" to "false".
>  === Streaming Query ===
>  Identifier: [id = b0e0f54e-e638-430e-9912-751234a6ce13, runId = 570e950b-edcb-40c4-99ea-fd4db958b225]
>  Current Committed Offsets: {KafkaV2[Subscribe[rcf_adidas_wow_room_event]]: {"rcf_adidas_wow_room_event":{"8":3572,"2":3571,"5":3571,"4":3571,"7":3571,"1":3571,"9":3571,"3":3571,"6":3571,"0":3571}}}
>  Current Available Offsets: {KafkaV2[Subscribe[rcf_adidas_wow_room_event]]: {"rcf_adidas_wow_room_event":{"2":3571,"4":3571,"1":3571,"3":3571,"0":3571}}}
>  
> ---
> My topic has 10 partitions, but the latest offset directory records only 5 of them.
> After I deleted the latest, faulty offset directory, the query ran successfully.
>  
> But I do not understand what caused this error.
>  
>  


