Posted to commits@cassandra.apache.org by "Caleb Rackliffe (Jira)" <ji...@apache.org> on 2021/07/07 21:38:00 UTC

[jira] [Updated] (CASSANDRA-16775) Reduce the log level on "expected" repair exceptions

     [ https://issues.apache.org/jira/browse/CASSANDRA-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caleb Rackliffe updated CASSANDRA-16775:
----------------------------------------
    Test and Documentation Plan: 
- new unit tests around abort functionality in ValidationTask
- new in-JVM dtests around log message severity through repair/validation errors
                         Status: Patch Available  (was: In Progress)

[patch|https://github.com/apache/cassandra/pull/1104]
[Circle J8|https://app.circleci.com/pipelines/github/maedhroz/cassandra/288/workflows/98e0b78d-ffbb-48df-832f-5acb4897f362]
[Circle J11|https://app.circleci.com/pipelines/github/maedhroz/cassandra/288/workflows/5053fe0b-9aec-4167-ba08-3656cd6129d8]

The primary aim of this patch is to reduce the logging level from ERROR to WARN for a number of exceptions that distract from the root causes of repair/validation failures. While writing and debugging the new {{RepairErrorsTest}}, I discovered that we don't properly release off-heap Merkle trees when remote validation errors occur. To make sure that the new tests aren't flaky out of the box, I went ahead and fixed that as well.
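
To illustrate the kind of leak described above, here is a minimal sketch (not the code from the patch) of releasing an off-heap resource on a failure path. {{OffHeapTree}}, {{handle}} and the {{remoteValidationFailed}} flag are hypothetical names invented for this example; they are not Cassandra classes or methods.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class TreeReleaseSketch {
    /** Hypothetical stand-in for an off-heap Merkle tree; not Cassandra's class. */
    static class OffHeapTree implements AutoCloseable {
        private final AtomicBoolean released = new AtomicBoolean(false);

        boolean isReleased() { return released.get(); }

        /** Frees the (simulated) off-heap memory; safe to call more than once. */
        @Override
        public void close() { released.set(true); }
    }

    /**
     * Processes a tree received for validation. On success, ownership passes
     * to the consumer and the tree stays open. On failure, the tree is
     * released here so the error path cannot leak it.
     */
    static boolean handle(OffHeapTree tree, boolean remoteValidationFailed) {
        boolean handedOff = false;
        try {
            if (remoteValidationFailed)
                throw new IllegalStateException("remote validation failed");
            handedOff = true; // the consumer is now responsible for closing the tree
            return true;
        } catch (IllegalStateException e) {
            return false;
        } finally {
            if (!handedOff)
                tree.close(); // release on every path that did not hand the tree off
        }
    }

    public static void main(String[] args) {
        OffHeapTree failed = new OffHeapTree();
        handle(failed, true);
        System.out.println("released after failure: " + failed.isReleased());

        OffHeapTree ok = new OffHeapTree();
        handle(ok, false);
        System.out.println("released after success: " + ok.isReleased());
    }
}
```

The point of the {{finally}} block is that the release decision depends only on whether ownership was handed off, not on which exception was thrown.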

> Reduce the log level on "expected" repair exceptions
> ----------------------------------------------------
>
>                 Key: CASSANDRA-16775
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16775
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Repair, Observability/Logging
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 4.x
>
>
> Many of the repair errors we typically see in the logs are redundant. For example, if one node has an unreadable SSTable, we should log that fact at ERROR, but the repairs that then fail because of that unreadable SSTable should be logged at WARN, making it easier to find the actual problem.
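
The idea in the description can be sketched roughly as follows. This is an illustration only, not the patch itself: {{ExpectedRepairException}}, {{UnreadableSSTableException}} and {{levelFor}} are hypothetical names, and {{java.util.logging}} levels (SEVERE, WARNING) stand in for the ERROR and WARN levels of the project's logging framework.

```java
import java.util.logging.Level;

public class RepairLogLevels {
    /** Hypothetical marker for failures that merely follow from a problem reported elsewhere. */
    static class ExpectedRepairException extends RuntimeException {
        ExpectedRepairException(String message, Throwable cause) { super(message, cause); }
    }

    /** Hypothetical stand-in for a root cause, e.g. an unreadable SSTable. */
    static class UnreadableSSTableException extends RuntimeException {
        UnreadableSSTableException(String message) { super(message); }
    }

    /** Root causes get the highest severity; "expected" follow-on failures are demoted. */
    static Level levelFor(Throwable t) {
        return (t instanceof ExpectedRepairException) ? Level.WARNING : Level.SEVERE;
    }

    public static void main(String[] args) {
        Throwable root = new UnreadableSSTableException("cannot read SSTable");
        Throwable derived = new ExpectedRepairException("validation failed", root);

        // The root cause is reported once at the higher severity...
        System.out.println(levelFor(root) + ": " + root.getMessage());
        // ...and the repair failure it triggers is reported at the lower one.
        System.out.println(levelFor(derived) + ": " + derived.getMessage());
    }
}
```

With this split, a search of the logs for the highest severity surfaces only the unreadable SSTable, not every repair session that failed because of it.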



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
