Posted to issues@storm.apache.org by "Ethan Li (Jira)" <ji...@apache.org> on 2020/07/01 13:58:00 UTC

[jira] [Updated] (STORM-3649) Logic error regarding storm.supervisor.medium.memory.grace.period.ms

     [ https://issues.apache.org/jira/browse/STORM-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Li updated STORM-3649:
----------------------------
    Affects Version/s: 2.2.0
                       2.0.0
                       2.1.0

> Logic error regarding storm.supervisor.medium.memory.grace.period.ms
> --------------------------------------------------------------------
>
>                 Key: STORM-3649
>                 URL: https://issues.apache.org/jira/browse/STORM-3649
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-server
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0
>            Reporter: Ethan Li
>            Assignee: Ethan Li
>            Priority: Major
>             Fix For: 2.3.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The problem is in this chunk of code:
> https://github.com/apache/storm/blob/2.2.x-branch/storm-server/src/main/java/org/apache/storm/daemon/supervisor/BasicContainer.java#L758
> {code:java}
> if (systemFreeMemoryMb < mediumMemoryThresholdMb) {
>     if (memoryLimitExceededStart < 0) {
>         memoryLimitExceededStart = Time.currentTimeMillis();
>     } else {
>         long timeInViolation = Time.currentTimeMillis() - memoryLimitExceededStart;
>         if (timeInViolation > mediumMemoryGracePeriodMs) {
>             LOG.warn(
>                 "{} is using {} MB > memory limit {} MB for {} seconds",
>                 typeOfCheck,
>                 usageMb,
>                 memoryLimitMb,
>                 timeInViolation / 1000);
>             return true;
>         }
>     }
> }
> {code}
> At the very beginning, memoryLimitExceededStart in BasicContainer is initialized to 0, the Java default for a long field without an initializer:
> https://github.com/apache/storm/blob/2.2.x-branch/storm-server/src/main/java/org/apache/storm/daemon/supervisor/BasicContainer.java#L80
> {code:java}
> protected volatile long memoryLimitExceededStart;
> {code}
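> So the first time this branch is reached, the memoryLimitExceededStart < 0 check is false, the start time is never recorded, and timeInViolation equals the current epoch time in milliseconds. That is always far larger than mediumMemoryGracePeriodMs, so the grace period has no effect.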
> The logs from a test:
> {code:java}
> 2020-06-08 20:39:18.277 o.a.s.d.s.BasicContainer SLOT_6707 [WARN] WORKER 9c16e81e-4936-4029-bcda-ceb5b74b8f42 is using 167 MB > memory limit 158 MB for 1591648758 seconds
> {code}
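
The behaviour described above can be reproduced in isolation. The following is only a minimal sketch, not the Storm code and not necessarily the fix that was committed: the class GracePeriodSketch, the nested MemoryCheck, the shouldKill method, the 20-second grace period, and the reset to -1 when memory recovers are all hypothetical, chosen for illustration. It compares a start field left at 0 with one assumed to start at the -1 sentinel, and checks that the 1591648758 seconds in the log line matches the epoch time of the log timestamp (read as UTC).

```java
import java.time.Instant;

public class GracePeriodSketch {

    /** Hypothetical stand-in for the quoted check; not Storm's BasicContainer. */
    static final class MemoryCheck {
        private final long gracePeriodMs;
        private long exceededStart; // initial value is supplied by the caller

        MemoryCheck(long initialStart, long gracePeriodMs) {
            this.exceededStart = initialStart;
            this.gracePeriodMs = gracePeriodMs;
        }

        /** Returns true once a violation has lasted longer than the grace period. */
        boolean shouldKill(boolean lowOnMemory, long nowMs) {
            if (!lowOnMemory) {
                exceededStart = -1; // assumption: clear the timer when memory recovers
                return false;
            }
            if (exceededStart < 0) {
                exceededStart = nowMs; // first violation: start timing
                return false;
            }
            return nowMs - exceededStart > gracePeriodMs;
        }
    }

    public static void main(String[] args) {
        // Timestamp of the log line in the report, interpreted as UTC.
        long now = Instant.parse("2020-06-08T20:39:18.277Z").toEpochMilli();
        long graceMs = 20_000L; // hypothetical grace period

        // Field left at the Java default of 0: the very first check already "expires".
        MemoryCheck zeroInit = new MemoryCheck(0L, graceMs);
        System.out.println(zeroInit.shouldKill(true, now)); // true
        System.out.println(now / 1000);                     // 1591648758, as in the log

        // Field starting at -1: the grace period is honoured.
        MemoryCheck negInit = new MemoryCheck(-1L, graceMs);
        System.out.println(negInit.shouldKill(true, now));           // false
        System.out.println(negInit.shouldKill(true, now + 10_000L)); // false
        System.out.println(negInit.shouldKill(true, now + 20_001L)); // true
    }
}
```

With the field at 0 the very first check reports a violation lasting roughly 50 years; with -1 the first check only records the start time, and the violation is reported after the grace period has elapsed.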



--
This message was sent by Atlassian Jira
(v8.3.4#803005)