Posted to commits@samza.apache.org by "Xinyu Liu (JIRA)" <ji...@apache.org> on 2017/05/04 16:47:04 UTC

[jira] [Created] (SAMZA-1260) Support End-of-Stream Across Intermediate Streams

Xinyu Liu created SAMZA-1260:
--------------------------------

             Summary: Support End-of-Stream Across Intermediate Streams
                 Key: SAMZA-1260
                 URL: https://issues.apache.org/jira/browse/SAMZA-1260
             Project: Samza
          Issue Type: New Feature
    Affects Versions: 0.14.0
            Reporter: Xinyu Liu
            Assignee: Xinyu Liu


In SAMZA-974, we built a mechanism to support batch jobs with bounded data sources. The feature allows Samza jobs to shut down once all the input topic partitions reach the end of stream.

This works for applications that do not produce to and consume from the same streams. With the introduction of partitionBy operators, an application can write to an intermediate stream and then consume that same stream for further processing. Since the end-of-stream tokens are not carried over from the original input streams to the intermediate streams, the job won’t be able to shut down even after all the input streams reach the end. To address this problem, we need to extend the existing end-of-stream feature to support applications with intermediate streams.
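
One possible way to carry end-of-stream across an intermediate stream is sketched below. It is an illustration only, not the design chosen for this issue. It assumes that each upstream task, once its own inputs are exhausted, writes an end-of-stream token tagged with its task name to every partition of the intermediate stream, and that the consumer treats a partition as finished only after seeing a token from every upstream task. The class EndOfStreamTracker and its methods are hypothetical names and are not part of the Samza API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, NOT part of the Samza API: tracks end-of-stream
// tokens received on the intermediate-stream partitions assigned to one task.
class EndOfStreamTracker {
    // Assumption: the number of upstream tasks writing to the stream is known.
    private final int upstreamTaskCount;
    // For each assigned partition, the upstream tasks whose token was seen.
    private final Map<Integer, Set<String>> seenByPartition = new HashMap<>();

    EndOfStreamTracker(int upstreamTaskCount, Set<Integer> assignedPartitions) {
        if (upstreamTaskCount <= 0) {
            throw new IllegalArgumentException("upstreamTaskCount must be positive");
        }
        this.upstreamTaskCount = upstreamTaskCount;
        for (Integer partition : assignedPartitions) {
            seenByPartition.put(partition, new HashSet<String>());
        }
    }

    // Records a token from one upstream task; returns true once the
    // partition has received a token from every upstream task.
    boolean onEndOfStream(int partition, String upstreamTask) {
        Set<String> seen = seenByPartition.get(partition);
        if (seen == null) {
            throw new IllegalArgumentException("Partition not assigned: " + partition);
        }
        seen.add(upstreamTask); // a set, so duplicate tokens are counted once
        return seen.size() == upstreamTaskCount;
    }

    // True when every assigned partition is finished, i.e. the consuming
    // task could safely shut down (or forward its own token downstream).
    boolean allPartitionsDone() {
        for (Set<String> seen : seenByPartition.values()) {
            if (seen.size() < upstreamTaskCount) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        EndOfStreamTracker tracker =
            new EndOfStreamTracker(2, new HashSet<>(Arrays.asList(0, 1)));
        tracker.onEndOfStream(0, "task-a");
        tracker.onEndOfStream(0, "task-b");
        tracker.onEndOfStream(1, "task-a");
        System.out.println("done before last token: " + tracker.allPartitionsDone());
        tracker.onEndOfStream(1, "task-b");
        System.out.println("done after last token: " + tracker.allPartitionsDone());
    }
}
```

Counting distinct upstream task names, rather than raw tokens, keeps the check correct if a token is delivered more than once. A single token is not enough because each partition of the intermediate stream may receive data from several upstream tasks.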



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)