Posted to dev@samza.apache.org by Yi Pan <ni...@gmail.com> on 2018/02/02 01:29:55 UTC
Re: SEP-11. Host Affinity in standalone discussion.
Linking Boris' earlier comment in another email to this thread:
http://mail-archives.apache.org/mod_mbox/samza-dev/201801.mbox/%3CCAPAaT%2BtH2H5TEvFQUn9jw6iR%3DyvVEu46rDLJsqexpwKz0CAH1g%40mail.gmail.com%3E
On Fri, Jan 26, 2018 at 4:17 PM, Boris S <bo...@gmail.com> wrote:
> Shanthoosh,
> Thank you for suggesting and submitting this SEP:
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957309
>
> A couple of things I would like to point out so far:
>
> 1. Kudos on cleaning up the interfaces and introducing new ones
> (LocalityInfo and LocalityManager). I think we also need a MetadataStorage
> interface (the details can be worked out later) to hide the locality
> storage implementation details.
> 2. Instead of using the physical hostname, we should stick to the
> LocationId, since multiple processors (e.g. in separate VMs) may be
> running on a single physical host.
> 3. Thank you for adding the diagrams. I think we can improve them a
> little:
>    - The first diagram describes how local storage works. Please label it
>    as such.
>    - The second diagram describes the flow of JobModel generation. I am
>    not sure a picture helps here; consider writing it as a list.
>    - The third diagram shows the host affinity implementation flow. This
>    is very helpful. I think, though, that using function names alone does
>    not make it clear what is going on. Maybe we should add more
>    explanation. For example:
>      group(InputSSP) -> generate the list of SSPs from the list of input
>      streams/partitions.
>      readTaskLocalityInfo() -> read the locality mapping from the
>      MetadataStorage.
>    We should also add another step there: each processor will update its
>    locality information based on its mapping in the current JobModel.
> 4. Sometimes a perfect mapping to the same locality is not possible
> (especially when a processor dies and its tasks are redistributed among
> the remaining processors). What should we do in this case?
>
>
> Thanks again. I will keep reading the document.
>
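To make point 1 concrete, here is a minimal sketch of what such a MetadataStorage abstraction could look like. The interface name, method signatures, and the in-memory implementation are all hypothetical illustrations, not the actual SEP-11 API; the point is only that callers read and write task locality without knowing where it is stored (ZooKeeper, a Kafka topic, etc.).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: hides where locality data actually lives.
interface MetadataStorage {
    void writeTaskLocality(String taskName, String locationId);
    String readTaskLocality(String taskName); // null if unknown
}

// In-memory stand-in used purely for illustration.
class InMemoryMetadataStorage implements MetadataStorage {
    private final Map<String, String> taskToLocation = new HashMap<>();

    @Override
    public void writeTaskLocality(String taskName, String locationId) {
        taskToLocation.put(taskName, locationId);
    }

    @Override
    public String readTaskLocality(String taskName) {
        return taskToLocation.get(taskName);
    }
}
```

A real implementation would swap the HashMap for the chosen backing store without changing any caller.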
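Points 3 and 4 together can be sketched as a small assignment routine: prefer each task's previous LocationId when that location is still live, and fall back (e.g. round-robin) for tasks whose preferred location is gone. All names below are hypothetical illustrations for this discussion, not SEP-11 code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: build a task -> LocationId assignment that keeps
// host affinity where possible and degrades gracefully where not.
class LocalityAwareAssigner {
    static Map<String, String> assign(List<String> taskNames,
                                      Map<String, String> previousLocality,
                                      List<String> liveLocations) {
        Map<String, String> assignment = new LinkedHashMap<>();
        int cursor = 0; // round-robin cursor for tasks without a live preference
        for (String task : taskNames) {
            String preferred = previousLocality.get(task);
            if (preferred != null && liveLocations.contains(preferred)) {
                // Host-affine placement: same location as the previous run.
                assignment.put(task, preferred);
            } else {
                // Fallback (point 4): spread the remaining tasks round-robin.
                assignment.put(task, liveLocations.get(cursor++ % liveLocations.size()));
            }
        }
        return assignment;
    }
}
```

This also shows why point 4 needs an explicit answer in the SEP: the fallback policy (round-robin here) is a design choice, not a given.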