Posted to commits@samza.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/01/06 01:53:00 UTC

[jira] [Commented] (SAMZA-1552) Host affinity improvements - Improve matching of hosts to allocated resources

    [ https://issues.apache.org/jira/browse/SAMZA-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16314287#comment-16314287 ] 

ASF GitHub Bot commented on SAMZA-1552:
---------------------------------------

GitHub user vjagadish1989 opened a pull request:

    https://github.com/apache/samza/pull/401

    SAMZA-1552: Host affinity improvements - Improve matching of hosts to allocated resources

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vjagadish1989/samza host-affinity-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/401.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #401
    
----
commit 73c230989f16f9e2ffcaefbe5b9610ef6bc818a6
Author: Jagadish <jv...@...>
Date:   2018-01-05T08:46:00Z

    Host affinity fixes

----


> Host affinity improvements - Improve matching of hosts to allocated resources
> -----------------------------------------------------------------------------
>
>                 Key: SAMZA-1552
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1552
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Abhishek Shivanna
>            Assignee: Jagadish
>
> Kudos to [~abkshvn] for observing this!
> We have observed host affinity not being honored for some containers in very large jobs. When YARN allocates more resources on a specific host than Samza requested, the extra resources are added to a spare pool called the "ANY_HOST buffer". Later, when Samza requests a resource on that same host and YARN does not return one, we do not use the resources for that host that are already sitting in the spare pool.
> This problem is especially pronounced in clusters that are heavily loaded in both CPU and memory, where allocations need to satisfy both the CPU and memory requirements on available hosts (often, hosts have CPU but not memory, or vice versa). A lot of container failures on a particular host in the midst of allocation further aggravates the problem.
> The fix is as follows:
> Check if there are available containers in the buffer corresponding to our preferred host. If not, we should also scan the ANY_HOST buffer for containers that match the preferred host.
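
The lookup order described in the fix can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request: the class name HostAffinityBufferSketch, the Resource type, and the methods addToBuffer and takeForPreferredHost are all hypothetical, and the per-host buffers are modeled as a plain in-memory map.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the two-step lookup; these are not Samza's actual classes. */
class HostAffinityBufferSketch {
    /** Key of the spare pool that holds resources nobody asked for (assumed name). */
    static final String ANY_HOST = "ANY_HOST";

    /** An allocated resource: an id plus the host it physically lives on. */
    static final class Resource {
        final String id;
        final String host;

        Resource(String id, String host) {
            this.id = id;
            this.host = host;
        }
    }

    // Buffer key (a host name, or ANY_HOST) -> resources waiting to be assigned.
    private final Map<String, Deque<Resource>> buffers = new HashMap<>();

    /** Stores an allocated resource under the given buffer key. */
    void addToBuffer(String bufferKey, Resource resource) {
        buffers.computeIfAbsent(bufferKey, k -> new ArrayDeque<>()).add(resource);
    }

    /**
     * Returns a resource on preferredHost if one is buffered anywhere:
     * first from the host's own buffer, then from the ANY_HOST spare pool.
     */
    Optional<Resource> takeForPreferredHost(String preferredHost) {
        // Step 1: the buffer that corresponds to the preferred host.
        Deque<Resource> hostBuffer = buffers.get(preferredHost);
        if (hostBuffer != null && !hostBuffer.isEmpty()) {
            return Optional.of(hostBuffer.poll());
        }

        // Step 2: scan the spare pool for a resource that happens to be on that host.
        Deque<Resource> spare = buffers.get(ANY_HOST);
        if (spare != null) {
            for (Iterator<Resource> it = spare.iterator(); it.hasNext();) {
                Resource candidate = it.next();
                if (candidate.host.equals(preferredHost)) {
                    it.remove(); // claim it so it cannot be handed out twice
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }
}
```

In this sketch a caller would fall back to issuing a new request (or to any-host placement) only when takeForPreferredHost returns an empty Optional. The linear scan of the spare pool is chosen for simplicity; an index keyed by host would avoid the scan at the cost of extra bookkeeping.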



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)