Posted to yarn-issues@hadoop.apache.org by "Omkar Vinit Joshi (JIRA)" <ji...@apache.org> on 2013/04/05 01:44:16 UTC

[jira] [Commented] (YARN-539) LocalizedResources are leaked in memory in case resource localization fails

    [ https://issues.apache.org/jira/browse/YARN-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623082#comment-13623082 ] 

Omkar Vinit Joshi commented on YARN-539:
----------------------------------------

At present, the flow of events when resource localization fails is as follows:
* When resource localization fails, the localizer (public localizer or private LocalizerRunner) sends a ContainerResourceFailedEvent to the waiting containers, each of which then sends a ResourceReleaseEvent to the failed resource. Finally, when the LocalizedResource's ref count drops to 0, its state changes from DOWNLOADING to INIT.

As a result, the resource may remain in memory (a memory leak in ResourceLocalizationTracker), or it may introduce a race condition (see [YARN-544|https://issues.apache.org/jira/browse/YARN-544]).
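
To illustrate the leak, here is a minimal sketch of the current failure path. It is a simplified model and not the actual NodeManager code: the class names CurrentFailureFlow, Resource and Tracker, the methods request and release, and the example path are all hypothetical stand-ins for LocalizedResource and the tracker cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the current failure path. */
public class CurrentFailureFlow {
    enum State { INIT, DOWNLOADING }

    /** Stand-in for LocalizedResource: a state plus a reference count. */
    static final class Resource {
        State state = State.INIT;
        int refCount = 0;
    }

    /** Stand-in for the tracker and its cache, keyed by resource path. */
    static final class Tracker {
        final Map<String, Resource> cache = new HashMap<>();

        /** A container requests the resource; the download starts. */
        void request(String path) {
            Resource r = cache.computeIfAbsent(path, p -> new Resource());
            r.refCount++;
            r.state = State.DOWNLOADING;
        }

        /** A container releases the resource after being told it failed. */
        void release(String path) {
            Resource r = cache.get(path);
            if (r == null) {
                return;
            }
            if (r.refCount > 0) {
                r.refCount--;
            }
            if (r.refCount == 0) {
                r.state = State.INIT;
            }
            // Nothing removes the entry from the cache here: this is the leak.
        }
    }

    public static void main(String[] args) {
        String path = "hdfs://example/job.jar"; // made-up path
        Tracker tracker = new Tracker();
        tracker.request(path);
        tracker.request(path);

        // Localization fails; both waiting containers release the resource.
        tracker.release(path);
        tracker.release(path);

        Resource leaked = tracker.cache.get(path);
        // Prints "1 INIT 0": the entry is still cached after the failure.
        System.out.println(tracker.cache.size() + " " + leaked.state + " " + leaked.refCount);
    }
}
```

In this model the failed resource goes back to INIT with a ref count of 0 but stays in the cache until something else evicts it.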

The proposed solution is:
* When resource localization fails, a resource localization failed event (ResourceFailedEvent) is sent to LocalResourcesTrackerImpl. The tracker removes the localized resource from its cache and then passes the event to the LocalizedResource. The LocalizedResource then notifies all the containers that were waiting for this resource. The containers no longer send an additional ResourceReleaseEvent.
* To keep the flow the same for success and failure, the localization-successful event will also be sent to LocalizedResource via LocalResourcesTrackerImpl.
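
The proposed flow could be sketched roughly as follows. This is only an illustration under assumptions, not the patch itself: the class names ProposedFailureFlow, Resource and Tracker, the methods request, localized, failed and handle, the FAILED state, and the string-based notifications are hypothetical simplifications of LocalResourcesTrackerImpl and LocalizedResource.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the proposed event flow. */
public class ProposedFailureFlow {
    // FAILED is an assumed state name used only for this sketch.
    enum State { DOWNLOADING, LOCALIZED, FAILED }

    /** Stand-in for LocalizedResource. */
    static final class Resource {
        State state = State.DOWNLOADING;
        final List<String> waitingContainers = new ArrayList<>();
        final List<String> notifications = new ArrayList<>();

        /** Record the outcome and notify every waiting container once. */
        void handle(State outcome) {
            state = outcome;
            for (String containerId : waitingContainers) {
                notifications.add(containerId + ":" + outcome);
            }
            waitingContainers.clear();
        }
    }

    /** Stand-in for LocalResourcesTrackerImpl: all events pass through it. */
    static final class Tracker {
        final Map<String, Resource> cache = new HashMap<>();

        Resource request(String path, String containerId) {
            Resource r = cache.computeIfAbsent(path, p -> new Resource());
            r.waitingContainers.add(containerId);
            return r;
        }

        /** Success: the resource stays cached and is told it is localized. */
        void localized(String path) {
            Resource r = cache.get(path);
            if (r != null) {
                r.handle(State.LOCALIZED);
            }
        }

        /** Failure: drop the cache entry first, then forward the event. */
        void failed(String path) {
            Resource r = cache.remove(path);
            if (r != null) {
                r.handle(State.FAILED);
            }
        }
    }

    public static void main(String[] args) {
        String path = "hdfs://example/job.jar"; // made-up path
        Tracker tracker = new Tracker();
        Resource r = tracker.request(path, "container_1");
        tracker.request(path, "container_2");

        tracker.failed(path);

        // Prints "0 [container_1:FAILED, container_2:FAILED]".
        System.out.println(tracker.cache.size() + " " + r.notifications);
    }
}
```

Because the tracker removes the entry before forwarding the event, no release from the containers is needed to clean up, and a later request for the same path starts from a fresh entry.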
                
> LocalizedResources are leaked in memory in case resource localization fails
> ---------------------------------------------------------------------------
>
>                 Key: YARN-539
>                 URL: https://issues.apache.org/jira/browse/YARN-539
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>
> If resource localization fails, the resource remains in memory and is
> 1) either cleaned up the next time cache cleanup runs and there is a space crunch (if sufficient space is available in the cache, it will remain in memory), or
> 2) reused if a LocalizationRequest comes again for the same resource.
> I think that when resource localization fails, that event should be sent to LocalResourceTracker, which will then remove the resource from its cache.
