Posted to dev@kafka.apache.org by "iBlackeyes (Jira)" <ji...@apache.org> on 2021/03/18 02:48:00 UTC

[jira] [Created] (KAFKA-12494) Broker raises InternalError after disk sector medium error without marking dir offline

iBlackeyes created KAFKA-12494:
----------------------------------

             Summary: Broker raises InternalError after disk sector medium error without marking dir offline
                 Key: KAFKA-12494
                 URL: https://issues.apache.org/jira/browse/KAFKA-12494
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 2.7.0, 2.5.1, 2.6.0, 2.4.0, 1.1.2
         Environment: Kafka Version: 1.1.0
JDK Version: jdk1.8
            Reporter: iBlackeyes


In my production environment, we hit a case where a Kafka broker only raises errors like

 `_*2021-02-16 23:24:24,965 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on partition xxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)*_ 
_*java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code*_` 

when it appends to a failed disk sector, and never marks the log dir on that disk offline.

As a result, many partitions with replicas assigned on this disk stay in an under-replicated state.
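
For context, here is a minimal sketch of the append error handling, paraphrased from kafka.log.Log#maybeHandleIOException in the affected versions (the wrapper class is just scaffolding for illustration); it shows why only IOException reaches the offline-dir path:
{code:scala}
import java.io.{File, IOException}

import kafka.server.LogDirFailureChannel
import org.apache.kafka.common.errors.KafkaStorageException

// Paraphrased sketch of kafka.log.Log#maybeHandleIOException.
class LogSketch(dir: File, logDirFailureChannel: LogDirFailureChannel) {

  private def maybeHandleIOException[T](msg: => String)(fun: => T): T = {
    try fun
    catch {
      case e: IOException =>
        // The only path that marks the log dir offline.
        logDirFailureChannel.maybeAddOfflineLogDir(dir.getParent, msg, e)
        throw new KafkaStorageException(msg, e)
      // There is no case for java.lang.InternalError: it extends Error, not
      // IOException, so a fault surfaced by the JVM during a memory access
      // on a bad sector propagates out and the dir stays "online".
    }
  }
}
{code}
So every produce request that touches the bad sector fails with InternalError, but the LogDirFailureChannel is never notified and the replicas on that disk stay under-replicated, which matches what we observe.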

Here are the logs:

*os messages log:*
{code:java}
Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current] 
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current] 
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current] 
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Sense Key : Medium Error [current] 
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] Add. Sense: Unrecovered read error
Feb 16 23:24:24 hd-node109 kernel: sd 14:1:0:18: [sds] CDB: Read(10) 28 00 89 91 71 a8 00 00 08 00
Feb 16 23:24:24 hd-node109 kernel: blk_update_request: critical medium error, dev sds, sector 2308010408
{code}
*broker server.log:*
{code:java}
2021-02-16 23:24:24,965 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on xxxxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
    at java.util.zip.Inflater.<init>(Inflater.java:102)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:77)
    at org.apache.kafka.common.record.CompressionType$2.wrapForInput(CompressionType.java:69)
    at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:265)
    at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:332)
    at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:54)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:267)
    at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:259)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:259)
    at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:70)
    at kafka.log.Log$$anonfun$append$2.liftedTree1$1(Log.scala:672)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:671)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
    at kafka.log.Log.append(Log.scala:653)
    at kafka.log.Log.appendAsLeader(Log.scala:623)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:609)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:597)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
    at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
    at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:596)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
    at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
    at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
    at java.lang.Thread.run(Thread.java:748)
2021-02-16 23:24:24,999 | ERROR | [data-plane-kafka-request-handler-19] | [ReplicaManager broker=7] Error processing append operation on partition xxxxxxx-0 | kafka.server.ReplicaManager (Logging.scala:76)
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
    at java.util.zip.Inflater.<init>(Inflater.java:102)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:77)
    at org.apache.kafka.common.record.CompressionType$2.wrapForInput(CompressionType.java:69)
    at org.apache.kafka.common.record.DefaultRecordBatch.compressedIterator(DefaultRecordBatch.java:265)
    at org.apache.kafka.common.record.DefaultRecordBatch.iterator(DefaultRecordBatch.java:332)
    at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:54)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:267)
    at kafka.log.LogValidator$$anonfun$validateMessagesAndAssignOffsetsCompressed$1.apply(LogValidator.scala:259)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at kafka.log.LogValidator$.validateMessagesAndAssignOffsetsCompressed(LogValidator.scala:259)
    at kafka.log.LogValidator$.validateMessagesAndAssignOffsets(LogValidator.scala:70)
    at kafka.log.Log$$anonfun$append$2.liftedTree1$1(Log.scala:672)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:671)
    at kafka.log.Log$$anonfun$append$2.apply(Log.scala:653)
    at kafka.log.Log.maybeHandleIOException(Log.scala:1711)
    at kafka.log.Log.append(Log.scala:653)
    at kafka.log.Log.appendAsLeader(Log.scala:623)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:609)
    at kafka.cluster.Partition$$anonfun$13.apply(Partition.scala:597)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:250)
    at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:256)
    at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:596)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:739)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:723)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:723)
    at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:464)
    at kafka.server.KafkaApis.handleProduceRequest(KafkaApis.scala:471)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:104)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
    at java.lang.Thread.run(Thread.java:748)
{code}
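
Not from the logs above, but for anyone trying to reproduce this failure mode without a failing disk: truncating a file out from under a MappedByteBuffer makes the next access fault (SIGBUS), which HotSpot typically surfaces as the same InternalError as in server.log. This is only a sketch; the path below is hypothetical, and the exact behavior varies by OS and JIT state (the VM can also crash instead of throwing).
{code:scala}
import java.nio.channels.FileChannel
import java.nio.file.{Files, Paths, StandardOpenOption}

// Hypothetical stand-in for a bad sector: map a file, truncate it, then
// touch the now-unbacked page. On Linux + HotSpot this usually raises
// java.lang.InternalError ("a fault occurred in a recent unsafe memory
// access operation ..."), i.e. the same error seen in server.log.
object MmapFaultDemo extends App {
  val path = Paths.get("/tmp/mmap-fault-demo.bin") // hypothetical path
  Files.write(path, new Array[Byte](4096))
  val ch = FileChannel.open(path, StandardOpenOption.READ, StandardOpenOption.WRITE)
  val buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096)
  ch.truncate(0) // mapped page loses its backing storage -> SIGBUS on access

  try println(buf.get(1024))
  catch {
    case e: InternalError =>
      // An Error, not an IOException: Kafka's LogDirFailureChannel never sees it.
      println(s"caught: $e")
  } finally ch.close()
}
{code}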



--
This message was sent by Atlassian Jira
(v8.3.4#803005)