Posted to jira@kafka.apache.org by "iBlackeyes (Jira)" <ji...@apache.org> on 2021/03/16 03:36:00 UTC

[jira] [Comment Edited] (KAFKA-9156) LazyTimeIndex & LazyOffsetIndex may cause niobufferoverflow in concurrent state

    [ https://issues.apache.org/jira/browse/KAFKA-9156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302197#comment-17302197 ] 

iBlackeyes edited comment on KAFKA-9156 at 3/16/21, 3:35 AM:
-------------------------------------------------------------

This issue does not seem to be completely fixed: after we applied the fix for this issue, the problem has still occurred many times in our production environment. The error log looks like this:

```log

2021-03-16 09:50:29,626 | ERROR | [data-plane-kafka-request-handler-4] | [ReplicaManager broker=1] Error processing append operation on partition topic_xxxxxx_50680-4 | kafka.server.ReplicaManager (Logging.scala:76)
java.nio.BufferOverflowException
    at java.nio.Buffer.nextPutIndex(Buffer.java:527)
    at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:797)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply$mcV$sp(TimeIndex.scala:131)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:110)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:110)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
    at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:110)
    at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:445)
    at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$27.apply(Log.scala:1363)
    at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$27.apply(Log.scala:1363)
    at scala.Option.foreach(Option.scala:257)
    at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1363)
    at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1349)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
    at kafka.log.Log.roll(Log.scala:1349)
    at kafka.log.Log.kafka$log$Log$$maybeRoll(Log.scala:1336)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:754)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
    at kafka.log.Log.append(Log.scala:653)
    at kafka.log.Log.appendAsLeader(Log.scala:623)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:614)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:602)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
    at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
    at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:601)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
    at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
    at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
    at java.lang.Thread.run(Thread.java:748)
2021-03-16 09:50:29,625 | ERROR | [data-plane-kafka-request-handler-2] | [ReplicaManager broker=1] Error processing append operation on partition topic_xxxxx_50680-4 | kafka.server.ReplicaManager (Logging.scala:76)
java.nio.BufferOverflowException
    at java.nio.Buffer.nextPutIndex(Buffer.java:527)
    at java.nio.DirectByteBuffer.putLong(DirectByteBuffer.java:797)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply$mcV$sp(TimeIndex.scala:131)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:110)
    at kafka.log.TimeIndex$$anonfun$maybeAppend$1.apply(TimeIndex.scala:110)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
    at kafka.log.TimeIndex.maybeAppend(TimeIndex.scala:110)
    at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:445)
    at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$27.apply(Log.scala:1363)
    at kafka.log.Log$$anonfun$roll$2$$anonfun$apply$27.apply(Log.scala:1363)
    at scala.Option.foreach(Option.scala:257)
    at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1363)
    at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1349)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
    at kafka.log.Log.roll(Log.scala:1349)
    at kafka.log.Log.kafka$log$Log$$maybeRoll(Log.scala:1336)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:754)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
    at kafka.log.Log.append(Log.scala:653)
    at kafka.log.Log.appendAsLeader(Log.scala:623)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:614)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:602)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
    at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
    at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:601)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
    at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
    at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
    at java.lang.Thread.run(Thread.java:748)

```
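
The BufferOverflowException above comes from DirectByteBuffer.putLong failing because the memory-mapped index buffer has no room left at its current position. Below is a minimal NIO sketch of that failure mode, in plain Scala; this is not Kafka code, and the 12-byte size is only a stand-in for one time-index entry.

```scala
import java.nio.ByteBuffer

object BufferOverflowRepro {
  def main(args: Array[String]): Unit = {
    // Stand-in for the memory-mapped time index; 12 bytes is roughly one entry
    // (8-byte timestamp plus 4-byte relative offset).
    val mmapLike = ByteBuffer.allocateDirect(12)

    // Simulate a buffer whose position was already pushed to its limit,
    // as described for the concurrently created index.
    mmapLike.position(mmapLike.limit())

    // This corresponds to the putLong in the TimeIndex.maybeAppend frame above:
    // no space remaining, so java.nio.BufferOverflowException is thrown.
    mmapLike.putLong(System.currentTimeMillis())
  }
}
```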



> LazyTimeIndex & LazyOffsetIndex may cause niobufferoverflow in concurrent state
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-9156
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9156
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.3.0, 2.3.1
>            Reporter: shilin Lu
>            Assignee: Alex Mironov
>            Priority: Blocker
>              Labels: regression
>             Fix For: 2.4.0, 2.3.2
>
>         Attachments: image-2019-11-07-17-42-13-852.png, image-2019-11-07-17-44-05-357.png, image-2019-11-07-17-46-53-650.png
>
>
> !image-2019-11-07-17-42-13-852.png!
> This timeindex get function is not thread safe, so concurrent calls may create the timeindex more than once.
> !image-2019-11-07-17-44-05-357.png!
> When the timeindex is created more than once, the mappedbytebuffer position can be left at the end of the buffer; writing an index entry into that mmap file then throws java.nio.BufferOverflowException (see the sketch after this quoted description).
>  
> !image-2019-11-07-17-46-53-650.png!
>  
>  
>  
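
To make the race in the quoted description concrete, here is a simplified sketch of an unsynchronized lazy getter next to a locked, double-checked one. This is illustrative only: LazyIndexSketch and its members are hypothetical names, not the actual kafka.log.LazyTimeIndex / LazyOffsetIndex code.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical sketch of the lazy-initialization race described above;
// not the real LazyTimeIndex/LazyOffsetIndex implementation.
class LazyIndexSketch[T <: AnyRef](load: () => T) {

  @volatile private var index: T = null.asInstanceOf[T]
  private val lock = new ReentrantLock()

  // Racy variant: two request-handler threads can both observe null and both
  // call load(), creating the index twice; that duplicated creation is what
  // can leave the shared mmap positioned at the end of the file.
  def getUnsafe: T = {
    if (index == null) index = load()
    index
  }

  // Locked variant (double-checked): load() runs at most once even under
  // concurrent access, so the index cannot be created twice.
  def get: T = {
    if (index == null) {
      lock.lock()
      try {
        if (index == null) index = load()
      } finally {
        lock.unlock()
      }
    }
    index
  }
}
```

With the locked getter, load() runs at most once, which is the property needed to avoid the duplicated index creation described above.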



--
This message was sent by Atlassian Jira
(v8.3.4#803005)