Posted to users@kafka.apache.org by "Dong, Peter A. (NSB - CN/Qingdao)" <pe...@nokia-sbell.com> on 2022/04/28 03:47:28 UTC

How to deal with the Error: KAFKA_STORAGE_ERROR

Greetings, Kafka specialists

A strange issue in my Kafka instance has blocked me for a couple of days.

1.       I cannot produce events to [topic-test]; the produce request fails with KAFKA_STORAGE_ERROR.

2.       The log segment files do not appear to be damaged.

I can dump the log, index, and timeindex files with kafka-dump-log without seeing any error.

3.       Producing events to other topics works without error. All Kafka log files are on the same disk partition.

4.       Restarting the Kafka instance and the ZooKeeper instance did not help.

5.       I cannot find any useful information about the error in server.log, even at TRACE level.



Could you please let me know whether a similar issue has been seen before?
Where should I look to continue my investigation?

Thanks a lot!

Peter




The Kafka client log:
kafka-console-producer --bootstrap-server 135.251.236.162:9092 --topic topic-test

>[2022-04-28 11:12:10,925] WARN [Producer clientId=console-producer] Got error produce response with correlation id 5 on topic-partition topic-test-0, retrying (2 attempts left). Error: KAFKA_STORAGE_ERROR (org.apache.kafka.clients.producer.internals.Sender)
[2022-04-28 11:12:10,925] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition topic-test-0 due to org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2022-04-28 11:12:11,024] WARN [Producer clientId=console-producer] Got error produce response with correlation id 7 on topic-partition topic-test-0, retrying (1 attempts left). Error: KAFKA_STORAGE_ERROR (org.apache.kafka.clients.producer.internals.Sender)
[2022-04-28 11:12:11,024] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition topic-test-0 due to org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2022-04-28 11:12:11,127] WARN [Producer clientId=console-producer] Got error produce response with correlation id 9 on topic-partition topic-test-0, retrying (0 attempts left). Error: KAFKA_STORAGE_ERROR (org.apache.kafka.clients.producer.internals.Sender)
[2022-04-28 11:12:11,127] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition topic-test-0 due to org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)
[2022-04-28 11:12:11,231] ERROR Error when sending message to topic topic-test with key: null, value: 0 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.
[2022-04-28 11:12:11,233] WARN [Producer clientId=console-producer] Received invalid metadata error in produce request on partition topic-test-0 due to org.apache.kafka.common.errors.KafkaStorageException: Disk error when trying to access log file on the disk.. Going to request metadata update now (org.apache.kafka.clients.producer.internals.Sender)


The server.log
[2022-04-27 07:41:51,203] TRACE [KafkaApi-0] Handling request:RequestHeader(apiKey=METADATA, apiVersion=11, clientId=console-producer, correlationId=9) -- MetadataRequestData(topics=[MetadataRequestTopic(topicId=AAAAAAAAAAAAAAAAAAAAAA, name='topic-test')], allowAutoTopicCreation=true, includeClusterAuthorizedOperations=false, includeTopicAuthorizedOperations=false) from connection 135.251.236.162:9092-135.251.236.162:44194-2;securityProtocol:PLAINTEXT,principal:User:ANONYMOUS (kafka.server.KafkaApis)
[2022-04-27 07:41:51,203] TRACE [KafkaApi-0] Sending topic metadata MetadataResponseTopic(errorCode=0, name='topic-tst', topicId=mgS7D7-9RZSgeEUJ3XXErw, isInternal=false, partitions=[MetadataResponsePartition(errorCode=0, partitionIndex=0, leaderId=0, leaderEpoch=0, replicaNodes=[0], isrNodes=[0], offlineReplicas=[])], topicAuthorizedOperations=-2147483648) and brokers baijin162-vnfprov:9092 (id: 0 rack: null) for correlation id 9 to client console-producer (kafka.server.KafkaApis)
[2022-04-27 07:41:51,297] TRACE [KafkaApi-0] Handling request:RequestHeader(apiKey=PRODUCE, apiVersion=9, clientId=console-producer, correlationId=10) -- {acks=1,timeout=1500,partitionSizes=[topic-test-0=81]} from connection 135.251.236.162:9092-135.251.236.162:44194-2;securityProtocol:PLAINTEXT,principal:User:ANONYMOUS (kafka.server.KafkaApis)
[2022-04-27 07:41:51,297] TRACE [ReplicaManager broker=0] Append [HashMap(topic-test-0 -> MemoryRecords(size=81, buffer=java.nio.HeapByteBuffer[pos=0 lim=81 cap=84]))] to local log (kafka.server.ReplicaManager)
[2022-04-27 07:41:51,297] DEBUG [KafkaApi-0] Produce request with correlation id 10 from client console-producer on partition topic-test-0 failed due to org.apache.kafka.common.errors.KafkaStorageException (kafka.server.KafkaApis)




RE: How to deal with the Error: KAFKA_STORAGE_ERROR

Posted by "Dong, Peter A. (NSB - CN/Qingdao)" <pe...@nokia-sbell.com>.
Liam,

Thanks a lot for your help.

I have not set up metrics yet.

I have just checked the file ownership and permissions again, and they are all correct. I then noticed the following error in the state-change log:

[2022-04-26 19:07:17,727] ERROR [Broker id=1] Topic ID in memory: 7LRTCKIRT5W5qZUrLQ_0wA does not match the topic ID for partition topic-test-0 received: o-iJg3N9Qk61rzKfdor5tA. (state.change.logger)

I don't know whether this error is the root cause.
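
The topic ID mismatch reported in the state-change log can also be checked on disk. The following is a minimal Python sketch, not a confirmed diagnosis: it assumes the broker keeps the topic ID in a `partition.metadata` file inside the partition directory, as "key: value" lines including a `topic_id:` line (verify this against your Kafka version). The function names and the directory path are hypothetical; the topic ID is the "received" one from the error above. Whether the mismatch is what causes KAFKA_STORAGE_ERROR is not established in this thread.

```python
from pathlib import Path


def read_partition_topic_id(partition_dir):
    """Return the topic ID stored in partition.metadata, or None if absent.

    Assumes the file holds "key: value" lines, one of them "topic_id: <id>".
    """
    meta_file = Path(partition_dir) / "partition.metadata"
    if not meta_file.is_file():
        return None
    for line in meta_file.read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "topic_id":
            return value.strip()
    return None


def topic_id_matches(partition_dir, expected_topic_id):
    """True only if a topic ID is on disk and equals the expected one."""
    return read_partition_topic_id(partition_dir) == expected_topic_id


if __name__ == "__main__":
    # Hypothetical path: substitute the partition directory under your
    # own log.dirs setting. The ID is the "received" one from the log.
    partition_dir = "/var/lib/kafka/data/topic-test-0"
    expected = "o-iJg3N9Qk61rzKfdor5tA"
    print("on disk :", read_partition_topic_id(partition_dir))
    print("matches :", topic_id_matches(partition_dir, expected))
```

The expected ID has to come from the cluster side, for example from the "received" value in the state-change log line, or from the topic description if your version of the tooling prints a topic ID.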

Regards,

Peter


-----Original Message-----
From: Liam Clarke-Hutchinson <lc...@redhat.com> 
Sent: April 28, 2022 16:45
To: users@kafka.apache.org
Subject: Re: How to deal with the Error: KAFKA_STORAGE_ERROR

Hi Peter,

Firstly, I'd check disk health, then I'd check owners and permissions on files in your log dir, eliminate those as issues.
Secondly, are you tracking metrics on offline log dirs?

Cheers,

Liam


Re: How to deal with the Error: KAFKA_STORAGE_ERROR

Posted by Liam Clarke-Hutchinson <lc...@redhat.com>.
Hi Peter,

Firstly, I'd check disk health, then I'd check owners and permissions on
files in your log dir, eliminate those as issues.
Secondly, are you tracking metrics on offline log dirs?

Cheers,

Liam
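
The ownership and permission check suggested above can be scripted. This is a minimal Python sketch for POSIX systems, under stated assumptions: the function name `check_log_dir` and the directory path are hypothetical, and it only inspects the owner UID and the owner permission bits, so it will not detect ACLs, SELinux denials, read-only mounts, or a failing disk. For the offline log directory question, Kafka exposes the JMX metric `kafka.log:type=LogManager,name=OfflineLogDirectoryCount`, which this script does not read.

```python
import os
import stat


def check_log_dir(log_dir, expected_uid):
    """Return (path, reason) pairs for entries with a suspicious owner or mode.

    An entry is reported if it is not owned by expected_uid, or if the owner
    lacks read/write permission (plus execute permission for directories).
    """
    problems = []
    for root, dirs, files in os.walk(log_dir):
        for name in dirs + files:
            path = os.path.join(root, name)
            st = os.stat(path)
            if st.st_uid != expected_uid:
                problems.append((path, "owner uid %d" % st.st_uid))
            needed = stat.S_IRUSR | stat.S_IWUSR
            if stat.S_ISDIR(st.st_mode):
                needed |= stat.S_IXUSR
            if (st.st_mode & needed) != needed:
                problems.append((path, "mode %o" % stat.S_IMODE(st.st_mode)))
    return problems


if __name__ == "__main__":
    # Hypothetical path: use the value of log.dirs from server.properties,
    # and run this as the user the broker runs as.
    for path, reason in check_log_dir("/var/lib/kafka/data", os.getuid()):
        print(path, reason)
```

Running it as the broker's own user makes `os.getuid()` the natural expected owner; an empty result only rules out the simplest ownership and mode problems.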
