Posted to issues@ozone.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/01/10 22:57:00 UTC

[jira] [Updated] (HDDS-7097) Container scanner log output lacks useful information

     [ https://issues.apache.org/jira/browse/HDDS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDDS-7097:
---------------------------------
    Labels: pull-request-available  (was: )

> Container scanner log output lacks useful information
> -----------------------------------------------------
>
>                 Key: HDDS-7097
>                 URL: https://issues.apache.org/jira/browse/HDDS-7097
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Assignee: Dave Teng
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, the output from the container scanner may look like this:
> {code}
> 2022-08-04 14:16:37,702 WARN org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer: Moving container /hadoop-ozone/datanode/data/hdds/CID-5612c780-06f8-4ac5-9eae-498159abd009/current/containerDir1/1008 to state UNHEALTHY from state:UNHEALTHY Trace:java.base/java.lang.Thread.getStackTrace(Thread.java:1606)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
> org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.markContainerUnhealthy(KeyValueContainer.java:335)
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.markContainerUnhealthy(KeyValueHandler.java:1017)
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.markContainerUnhealthy(ContainerController.java:116)
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner.runIteration(ContainerDataScanner.java:108)
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerDataScanner.run(ContainerDataScanner.java:81)
> ...
> 2022-08-04 14:30:19,407 ERROR org.apache.hadoop.ozone.container.keyvalue.KeyValueContainerCheck: Corruption detected in container: [2] Exception: [null]
> {code}
> There are several problems with this:
> - The previous container state is not logged. The new unhealthy state is incorrectly logged as the previous state.
> - The exception identifying the corruption only has its message printed. The exception object itself should be logged to better identify the failure and to catch cases like the one above, where there is no exception message (probably caused by a bug).
> - The stack trace of the call to {{KeyValueContainer#markContainerUnhealthy}} is logged, which is both verbose and not useful.
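
The three points above can be illustrated with a minimal Java sketch. It is not the actual Ozone code: the class `ContainerSketch`, its `State` enum, and the methods `markUnhealthy` and `reportCorruption` are hypothetical names invented for illustration, and `java.util.logging` is used only to keep the example self-contained, not because the project uses it. The sketch records the previous state before overwriting it, logs the state change without a stack trace, and passes the exception object itself to the logger so that a null message does not hide the failure type.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a container; not the real KeyValueContainer.
class ContainerSketch {
  enum State { OPEN, CLOSED, UNHEALTHY }

  private static final Logger LOG =
      Logger.getLogger(ContainerSketch.class.getName());

  private final long id;
  private State state;

  ContainerSketch(long id, State state) {
    this.id = id;
    this.state = state;
  }

  State getState() {
    return state;
  }

  // Capture the previous state BEFORE changing it, so the log line
  // reports the real transition. No stack trace is attached here.
  String markUnhealthy() {
    State previous = state;
    state = State.UNHEALTHY;
    String msg = "Moving container " + id + " to state " + state
        + " from state " + previous;
    LOG.warning(msg);
    return msg;
  }

  // Pass the Throwable to the logger instead of only cause.getMessage(),
  // so the exception class and its stack trace are recorded even when
  // the message is null.
  static void reportCorruption(long containerId, Throwable cause) {
    LOG.log(Level.SEVERE,
        "Corruption detected in container " + containerId, cause);
  }
}
```

With this shape, calling `markUnhealthy()` on a container in state `CLOSED` reports a transition from `CLOSED` to `UNHEALTHY`, and `reportCorruption(2, new java.io.IOException())` still records the exception class even though `getMessage()` returns null.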



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
