Posted to issues@ozone.apache.org by "Aravindan Vijayan (Jira)" <ji...@apache.org> on 2020/10/28 17:56:00 UTC

[jira] [Updated] (HDDS-4404) Datanode can go OOM when a Recon or SCM Server is very slow in processing reports.

     [ https://issues.apache.org/jira/browse/HDDS-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aravindan Vijayan updated HDDS-4404:
------------------------------------
    Description: 
From [~nanda619]'s analysis.

The ContainerReportPublisher thread runs periodically in the Datanode (default interval 60s) and adds a ContainerReport to the StateContext (a queue).
The Heartbeat thread runs periodically (default interval 30s) and picks up the ContainerReport (if any) from the StateContext.
For a short time, the ContainerReport will be held in the Datanode StateContext.
For Recon, a change was made in the Datanode such that the ContainerReport is cached in the Datanode StateContext separately for each endpoint (i.e. SCM and Recon). As I see it, if Recon is configured in the Datanode, all the reports that are to be sent to Recon will be pending in the StateContext queue (LinkedList).
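
The interaction described above can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual Ozone code: the class ReportQueueSketch and its methods publish, heartbeat and pendingCount are invented for illustration, and an endpoint that is slow in processing reports is modelled (as an assumption) as one whose heartbeat drains nothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of per-endpoint report caching.
// None of these names are real Ozone classes or methods.
class ReportQueueSketch {
    // One unbounded pending-report queue per endpoint (e.g. "SCM", "Recon").
    private final Map<String, LinkedList<String>> pending = new HashMap<>();

    ReportQueueSketch(List<String> endpoints) {
        for (String endpoint : endpoints) {
            pending.put(endpoint, new LinkedList<>());
        }
    }

    // Publisher side: each new report is added to every endpoint's queue.
    void publish(String report) {
        for (LinkedList<String> queue : pending.values()) {
            queue.add(report);
        }
    }

    // Heartbeat side: drain the endpoint's queue only if the endpoint
    // is keeping up (assumption: a slow endpoint drains nothing).
    List<String> heartbeat(String endpoint, boolean responsive) {
        LinkedList<String> queue = pending.get(endpoint);
        List<String> sent = new ArrayList<>();
        if (responsive) {
            sent.addAll(queue);
            queue.clear();
        }
        return sent;
    }

    int pendingCount(String endpoint) {
        return pending.get(endpoint).size();
    }

    public static void main(String[] args) {
        ReportQueueSketch ctx = new ReportQueueSketch(List.of("SCM", "Recon"));
        for (int interval = 1; interval <= 5; interval++) {
            // One report per 60s interval ...
            ctx.publish("ContainerReport-" + interval);
            // ... and two heartbeats (every 30s) per interval.
            for (int hb = 0; hb < 2; hb++) {
                ctx.heartbeat("SCM", true);    // SCM keeps up
                ctx.heartbeat("Recon", false); // Recon is slow
            }
        }
        System.out.println("SCM pending: " + ctx.pendingCount("SCM"));     // 0
        System.out.println("Recon pending: " + ctx.pendingCount("Recon")); // 5
    }
}
```

In this sketch the queue for the slow endpoint grows by one report per interval with no upper bound, which is the kind of growth that could eventually exhaust the heap.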

> Datanode can go OOM when a Recon or SCM Server is very slow in processing reports.
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-4404
>                 URL: https://issues.apache.org/jira/browse/HDDS-4404
>             Project: Hadoop Distributed Data Store
>          Issue Type: Task
>          Components: Ozone Datanode
>    Affects Versions: 1.0.0
>            Reporter: Aravindan Vijayan
>            Priority: Critical
>         Attachments: Screen Shot 2020-10-26 at 11.24.09 PM.png
>
>
> From [~nanda619]'s analysis.
> The ContainerReportPublisher thread runs periodically in the Datanode (default interval 60s) and adds a ContainerReport to the StateContext (a queue).
> The Heartbeat thread runs periodically (default interval 30s) and picks up the ContainerReport (if any) from the StateContext.
> For a short time, the ContainerReport will be held in the Datanode StateContext.
> For Recon, a change was made in the Datanode such that the ContainerReport is cached in the Datanode StateContext separately for each endpoint (i.e. SCM and Recon). As I see it, if Recon is configured in the Datanode, all the reports that are to be sent to Recon will be pending in the StateContext queue (LinkedList).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)