Posted to jira@kafka.apache.org by "Alexander (Jira)" <ji...@apache.org> on 2021/10/19 13:41:00 UTC

[jira] [Updated] (KAFKA-13384) FailedPartitionsCount metric is not updated if a partition log file was corrupted

     [ https://issues.apache.org/jira/browse/KAFKA-13384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander updated KAFKA-13384:
------------------------------
    Description: 
We found a misbehavior on our Kafka cluster (version: 2.6.2 (Commit:da65af02e5856e34)): the `FailedPartitionsCount` metric is not updated when a partition log file is corrupted.

Steps to reproduce the problem:
 # Corrupt a partition log file.
 # Restart the Kafka process.

After the restart, the broker log correctly reports that Kafka marked the corrupted partition as failed:

{code:java}
2021-10-19T14:49:31+02:00 [2021-10-19 12:49:30,924] WARN [ReplicaFetcher replicaId=11, leaderId=10, fetcherId=0] Partition test_topic-1 marked as failed
 (kafka.server.ReplicaFetcherThread){code}

However, the `FailedPartitionsCount` metric still reports 0 (see the attached screenshot).

  was:
We found a misbehavior on our Kafka cluster (version: 2.6.2 (Commit:da65af02e5856e34)): the `FailedPartitionsCount` metric is not updated when a partition log file is corrupted.

Steps to reproduce the problem:
1. Corrupt a partition log file.
2. Restart the Kafka process.

After the restart, the broker log correctly reports that Kafka marked the corrupted partition as failed:

```
2021-10-19T14:49:31+02:00 [2021-10-19 12:49:30,924] WARN [ReplicaFetcher replicaId=11, leaderId=10, fetcherId=0] Partition test_topic-1 marked as failed
 (kafka.server.ReplicaFetcherThread)
```

However, the `FailedPartitionsCount` metric still reports 0 (see the attached screenshot).


> FailedPartitionsCount metric is not updated if a partition log file was corrupted
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-13384
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13384
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.6.2
>         Environment: OS:
> NAME="Amazon Linux AMI"
> VERSION="2018.03"
> ID="amzn"
> ID_LIKE="rhel fedora"
> VERSION_ID="2018.03"
> PRETTY_NAME="Amazon Linux AMI 2018.03"
> CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
> HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
> Kafka version:
> 2.6.2 (Commit:da65af02e5856e34)
>            Reporter: Alexander
>            Priority: Major
>         Attachments: Screenshot 2021-10-19 at 15.28.33.png
>
>
> We found a misbehavior on our Kafka cluster (version: 2.6.2 (Commit:da65af02e5856e34)): the `FailedPartitionsCount` metric is not updated when a partition log file is corrupted.
> Steps to reproduce the problem:
>  # Corrupt a partition log file.
>  # Restart the Kafka process.
> After the restart, the broker log correctly reports that Kafka marked the corrupted partition as failed:
>
> {code:java}
> 2021-10-19T14:49:31+02:00 [2021-10-19 12:49:30,924] WARN [ReplicaFetcher replicaId=11, leaderId=10, fetcherId=0] Partition test_topic-1 marked as failed
>  (kafka.server.ReplicaFetcherThread){code}
>
> However, the `FailedPartitionsCount` metric still reports 0 (see the attached screenshot).
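
Until the metric can be trusted, one way to cross-check it is to count the partitions the broker log itself reports as failed. The following is only a minimal sketch and not part of the original report: the names `FAILED_RE`, `failed_partitions`, and `sample` are hypothetical, and the pattern assumes the WARN message has exactly the wording shown in the log excerpt above. It counts partitions that were ever logged as failed in the given lines and does not track whether a partition later recovered.

```python
import re

# Assumption: the message wording matches the WARN line quoted in the report.
FAILED_RE = re.compile(r"Partition (?P<partition>\S+) marked as failed")


def failed_partitions(log_lines):
    """Return the set of partitions reported as failed in the given log lines."""
    found = set()
    for line in log_lines:
        match = FAILED_RE.search(line)
        if match:
            found.add(match.group("partition"))
    return found


# The log excerpt from the report, split across two lines as it appears there.
sample = [
    "2021-10-19T14:49:31+02:00 [2021-10-19 12:49:30,924] WARN "
    "[ReplicaFetcher replicaId=11, leaderId=10, fetcherId=0] "
    "Partition test_topic-1 marked as failed",
    " (kafka.server.ReplicaFetcherThread)",
]

print(sorted(failed_partitions(sample)))  # prints ['test_topic-1']
```

If the size of the returned set is non-zero while `FailedPartitionsCount` reads 0 on the same broker, that matches the discrepancy described in this issue.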



--
This message was sent by Atlassian Jira
(v8.3.4#803005)