Posted to commits@samza.apache.org by "Shanthoosh Venkataraman (JIRA)" <ji...@apache.org> on 2018/01/11 01:48:00 UTC

[jira] [Created] (SAMZA-1554) Host affinity in standalone.

Shanthoosh Venkataraman created SAMZA-1554:
----------------------------------------------

             Summary: Host affinity in standalone.
                 Key: SAMZA-1554
                 URL: https://issues.apache.org/jira/browse/SAMZA-1554
             Project: Samza
          Issue Type: New Feature
            Reporter: Shanthoosh Venkataraman
            Assignee: Shanthoosh Venkataraman



The Samza framework enables its users to build stateful stream processing applications, that is, applications that remember information about past events in a local state (store), which is then used to influence the processing of future events from the stream. Local state is a fundamental enabling concept in stream processing and is required to support the majority of common use cases, such as stream-stream joins, stream-table joins, and windowing.

The local store of a task instance is backed up by a log-compacted Kafka topic referred to as the change-log. When a task instance commits, incremental updates to its local store are flushed to the Kafka topic. When a task instance runs on a host that doesn't have the latest local store, the store is restored by replaying messages from the change-log stream. For large stateful jobs, this restoration phase can take a long time, which delays the application from starting up and processing events from the input streams. Host affinity is a feature that maintains stickiness between a task and a physical host and offers a best-effort guarantee that a task instance will be assigned to run on the same physical host it ran on before.
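
To illustrate the best-effort stickiness described above, here is a minimal sketch in Python. It is not Samza's implementation; the function name `assign_tasks`, the `previous_host` mapping, and the least-loaded fallback are all assumptions made for illustration. A task stays on its previous host if that host is still alive (so its local store can be reused); otherwise it is placed on another live host, where its store would have to be restored from the change-log.

```python
from typing import Dict, List


def assign_tasks(
    tasks: List[str],
    live_hosts: List[str],
    previous_host: Dict[str, str],
) -> Dict[str, str]:
    """Best-effort sticky assignment of tasks to hosts (hypothetical helper)."""
    if not live_hosts:
        raise ValueError("no live hosts to assign tasks to")

    assignment: Dict[str, str] = {}
    load = {host: 0 for host in live_hosts}
    displaced: List[str] = []

    # First pass: keep each task on its previous host if that host is still alive.
    for task in tasks:
        host = previous_host.get(task)
        if host in load:
            assignment[task] = host
            load[host] += 1
        else:
            displaced.append(task)

    # Second pass: place the remaining tasks on the least-loaded live host.
    # These tasks would need to restore their store from the change-log.
    for task in displaced:
        host = min(live_hosts, key=lambda h: load[h])
        assignment[task] = host
        load[host] += 1

    return assignment


# Example: host-c is gone, so task-2 cannot keep its previous placement.
previous = {"task-0": "host-a", "task-1": "host-b", "task-2": "host-c"}
result = assign_tasks(["task-0", "task-1", "task-2"], ["host-a", "host-b"], previous)
```

In this sketch stickiness takes priority over load balancing: the first pass never moves a task off a live host, even if that host ends up with more tasks than the others. A real implementation would have to decide how to trade the two off.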

This ticket tracks the work required to implement this feature.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)