Posted to jira@kafka.apache.org by "David Arthur (Jira)" <ji...@apache.org> on 2021/07/08 16:39:00 UTC
[jira] [Created] (KAFKA-13050) Race between controller creating snapshot and snapshot cleaning
David Arthur created KAFKA-13050:
------------------------------------
Summary: Race between controller creating snapshot and snapshot cleaning
Key: KAFKA-13050
URL: https://issues.apache.org/jira/browse/KAFKA-13050
Project: Kafka
Issue Type: Bug
Components: controller, kraft
Affects Versions: 3.0.0
Reporter: David Arthur
If the controller attempts to create a snapshot using its cached OffsetAndEpoch while snapshot cleaning is in progress, the cached OffsetAndEpoch can be invalidated by log truncation. Snapshot creation then fails with an IllegalArgumentException:
{code}
[2021-07-08 12:12:41,938] WARN [Controller 1] org.apache.kafka.controller.QuorumController@67e0d836: failed with unknown server exception IllegalArgumentException at epoch -1 in 3207460 us. Reverting to last committed offset 98. (org.apache.kafka.controller.QuorumController)
java.lang.IllegalArgumentException: Snapshot id (OffsetAndEpoch(offset=99, epoch=5)) is not valid according to the log: ValidOffsetAndEpoch(kind=SNAPSHOT, offsetAndEpoch=OffsetAndEpoch(offset=180, epoch=8))
at kafka.raft.KafkaMetadataLog.createNewSnapshot(KafkaMetadataLog.scala:252)
at org.apache.kafka.raft.KafkaRaftClient.lambda$createSnapshot$30(KafkaRaftClient.java:2334)
at org.apache.kafka.snapshot.SnapshotWriter.createWithHeader(SnapshotWriter.java:134)
at org.apache.kafka.raft.KafkaRaftClient.createSnapshot(KafkaRaftClient.java:2333)
at org.apache.kafka.controller.QuorumController$SnapshotGeneratorManager.createSnapshotGenerator(QuorumController.java:351)
at org.apache.kafka.controller.QuorumController.checkSnapshotGeneration(QuorumController.java:904)
at org.apache.kafka.controller.QuorumController.access$3000(QuorumController.java:121)
at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleCommit$0(QuorumController.java:681)
at org.apache.kafka.controller.QuorumController$ControlEvent.run(QuorumController.java:311)
at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
at java.lang.Thread.run(Thread.java:748)
[2021-07-08 12:12:41,941] INFO [BrokerMetadataListener id=1] Loading snapshot 180-8. (kafka.server.metadata.BrokerMetadataListener)
{code}
This was observed while running a broker in combined mode with artificially low thresholds for snapshot generation and log cleaning:
{code}
metadata.log.max.record.bytes.between.snapshots=100
metadata.log.segment.bytes=1024
metadata.max.retention.bytes=4096
{code}
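
The failure in the stack trace above can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions: `SnapshotId` and `ToyLog` are hypothetical stand-ins, not Kafka's `OffsetAndEpoch` or `KafkaMetadataLog`, and the check in `createNewSnapshot` is a simplified guess at the kind of validation that rejects an id below the retained log prefix. The interleaving is forced sequentially to make the stale read deterministic.

```java
public class SnapshotRaceSketch {

    // Hypothetical stand-in for an (offset, epoch) pair; not Kafka's class.
    static final class SnapshotId {
        final long offset;
        final int epoch;

        SnapshotId(long offset, int epoch) {
            this.offset = offset;
            this.epoch = epoch;
        }

        @Override
        public String toString() {
            return "SnapshotId(offset=" + offset + ", epoch=" + epoch + ")";
        }
    }

    // Toy log that only tracks the oldest snapshot id still retained.
    static final class ToyLog {
        private SnapshotId earliestValid = new SnapshotId(0, 0);

        // Models cleaning: everything below the given snapshot is dropped.
        synchronized void cleanUpTo(SnapshotId latestSnapshot) {
            earliestValid = latestSnapshot;
        }

        // Simplified validation (an assumption): reject ids that fall
        // below the retained prefix of the log.
        synchronized void createNewSnapshot(SnapshotId id) {
            if (id.offset < earliestValid.offset) {
                throw new IllegalArgumentException(
                    "Snapshot id (" + id + ") is not valid according to the log: "
                        + earliestValid);
            }
        }
    }

    // Replays the interleaving from the report: the controller caches an id,
    // cleaning advances the log, then the controller uses the stale id.
    static boolean staleIdIsRejected() {
        ToyLog log = new ToyLog();
        SnapshotId cached = new SnapshotId(99, 5);  // cached by the controller
        log.cleanUpTo(new SnapshotId(180, 8));      // cleaner runs in between
        try {
            log.createNewSnapshot(cached);          // stale id is used
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale id rejected: " + staleIdIsRejected());
    }
}
```

In this model the stale id is always rejected once cleaning has advanced past it, which matches the shape of the exception in the log. It says nothing about how the real controller should resolve the race.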