Posted to commits@samza.apache.org by "Hai Lu (Jira)" <ji...@apache.org> on 2020/11/22 01:07:00 UTC

[jira] [Resolved] (SAMZA-2601) Continuous heavy logging when YARN does not allocate resources for a container

     [ https://issues.apache.org/jira/browse/SAMZA-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hai Lu resolved SAMZA-2601.
---------------------------
    Resolution: Fixed

> Continuous heavy logging when YARN does not allocate resources for a container
> ------------------------------------------------------------------------------
>
>                 Key: SAMZA-2601
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2601
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Hai Lu
>            Assignee: Hai Lu
>            Priority: Minor
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If YARN does not immediately allocate resources for a container and host affinity is disabled for the job, the container allocator spams the logs with messages about the request, even if the request is considered expired. The allocator loops over the pending requests to check whether they have been allocated, but there is no delay between loop iterations, so it can keep checking the same pending request over and over as fast as possible. A metric is also incremented in this flow, so that metric climbs to an extremely high value, which is probably not the intention.
> One case we saw is that if YARN does not have enough capacity, then it will not return resources, so the AM gets into this logging loop:
> Handling assignment for Processor ID:
> Did not find any allocated containers for running Processor ID:
> ...
> Functionally, I did not observe any impact: YARN is out of capacity anyway, so the job isn't going to be deployed in any case. However, the AM effectively goes into an infinite polling loop while YARN is out of capacity, and this fills up the disk with logs very quickly (3GB/s).
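
The polling behaviour described in the issue can be illustrated with a small standalone sketch. This is not Samza code and not necessarily how the issue was fixed: the class `AllocatorLoopSketch`, the method `pollPendingRequests`, and the request name `processor-0` are all hypothetical. It only shows why a loop that re-checks a never-satisfied pending request with no delay performs vastly more checks (and so would emit vastly more log lines and metric increments) than the same loop with a short sleep between iterations.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class AllocatorLoopSketch {

    /**
     * Polls a single pending request that is never satisfied, for roughly
     * budgetMs milliseconds, sleeping sleepMs between iterations (0 = no delay).
     * Returns how many times the request was checked. In the scenario from the
     * issue, each check would also write log lines and increment a metric.
     */
    static long pollPendingRequests(long budgetMs, long sleepMs) throws InterruptedException {
        Queue<String> pending = new ArrayDeque<>();
        pending.add("processor-0"); // hypothetical request that is never allocated

        long checks = 0;
        long deadline = System.nanoTime() + budgetMs * 1_000_000L;
        while (System.nanoTime() < deadline) {
            String request = pending.peek(); // same request every iteration
            if (request != null) {
                checks++; // stands in for the log message and metric increment
            }
            if (sleepMs > 0) {
                Thread.sleep(sleepMs); // the delay that a busy loop lacks
            }
        }
        return checks;
    }

    public static void main(String[] args) throws InterruptedException {
        long busy = pollPendingRequests(100, 0);
        long throttled = pollPendingRequests(100, 50);
        System.out.println("checks without delay:    " + busy);
        System.out.println("checks with 50 ms delay: " + throttled);
    }
}
```

With a delay, the number of checks is bounded by roughly the time window divided by the sleep interval; without one, it is bounded only by how fast the CPU can spin.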



--
This message was sent by Atlassian Jira
(v8.3.4#803005)