Posted to dev@apex.apache.org by "Sunil (JIRA)" <ji...@apache.org> on 2016/08/12 20:40:20 UTC

[jira] [Created] (APEXMALHAR-2187) Kafka Input Operator supports retry for loading initial offset

Sunil created APEXMALHAR-2187:
---------------------------------

             Summary: Kafka Input Operator supports retry for loading initial offset
                 Key: APEXMALHAR-2187
                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2187
             Project: Apache Apex Malhar
          Issue Type: New Feature
            Reporter: Sunil
            Assignee: Sandesh


Goal : 2 Operators for Kafka Output

      1. Simple Kafka Output Operator 
            - Supports at-least-once delivery
            - Expose most used producer properties as class properties

      2. Exactly Once Kafka Output (not possible in all cases; the limitations will be documented later)
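
To illustrate what "expose most used producer properties as class properties" could look like, here is a minimal sketch in plain Java. The class name SimpleKafkaOutputConfigSketch, the choice of properties, and the default values are assumptions made for illustration only; they are not taken from the Malhar code base. The property keys themselves (bootstrap.servers, acks, retries, key.serializer, value.serializer) are standard Kafka producer settings. Setting acks=all together with a non-zero retry count is one common way to aim for at-least-once delivery.

```java
import java.util.Properties;

// Hypothetical holder for commonly used producer settings (illustrative only).
final class SimpleKafkaOutputConfigSketch {
    // Assumed defaults; a real operator may choose different ones.
    private String bootstrapServers = "localhost:9092";
    private String acks = "all";   // wait for all in-sync replicas
    private int retries = 3;       // resend on transient failures
    private String keySerializer =
        "org.apache.kafka.common.serialization.StringSerializer";
    private String valueSerializer =
        "org.apache.kafka.common.serialization.StringSerializer";

    public void setBootstrapServers(String v) { this.bootstrapServers = v; }
    public void setAcks(String v) { this.acks = v; }
    public void setRetries(int v) { this.retries = v; }
    public void setKeySerializer(String v) { this.keySerializer = v; }
    public void setValueSerializer(String v) { this.valueSerializer = v; }

    // Translate the class properties into the Properties object a producer expects.
    public Properties toProducerProperties() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", bootstrapServers);
        p.setProperty("acks", acks);
        p.setProperty("retries", Integer.toString(retries));
        p.setProperty("key.serializer", keySerializer);
        p.setProperty("value.serializer", valueSerializer);
        return p;
    }

    public static void main(String[] args) {
        SimpleKafkaOutputConfigSketch cfg = new SimpleKafkaOutputConfigSketch();
        cfg.setBootstrapServers("broker1:9092");
        System.out.println(cfg.toProducerProperties().getProperty("bootstrap.servers"));
    }
}
```

The resulting Properties object could then be handed to a Kafka producer; any setting not exposed as a class property would still need a generic pass-through.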
            

Design for Exactly Once

Window Data Manager - Stores the Kafka partition offsets.
Kafka Key - The key used by the operator = AppID#OperatorId

During recovery, the partially written window is re-created using the following approach:

Tuples between the largest recovery offsets and the current offset are checked. Based on the key, tuples written by other entities are discarded.

Only tuples which are not in the recovered set are emitted.

Tuples need to be unique within the window.
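
The recovery steps above can be sketched in plain Java without a Kafka dependency. This is only an illustration under stated assumptions: the Rec class stands in for a Kafka record, the list named tail stands in for the records read between the largest recovery offsets and the current offset, and all class and method names are hypothetical rather than part of the proposed operator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ExactlyOnceRecoverySketch {

    // Minimal stand-in for a Kafka record: only key and value matter here.
    static final class Rec {
        final String key;
        final String value;
        Rec(String key, String value) { this.key = key; this.value = value; }
    }

    // Key that marks records written by this operator: AppID#OperatorId.
    static String operatorKey(String appId, int operatorId) {
        return appId + "#" + operatorId;
    }

    // Collect the values this operator already wrote in the partial window;
    // records carrying any other key were written by other entities and are discarded.
    static Set<String> recoveredSet(List<Rec> tail, String ownKey) {
        Set<String> recovered = new HashSet<>();
        for (Rec r : tail) {
            if (ownKey.equals(r.key)) {
                recovered.add(r.value);
            }
        }
        return recovered;
    }

    // While the window is replayed, emit only tuples not in the recovered set.
    static List<String> tuplesToEmit(List<String> replayed, Set<String> recovered) {
        List<String> out = new ArrayList<>();
        for (String t : replayed) {
            if (!recovered.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String own = operatorKey("app_1", 3);
        List<Rec> tail = Arrays.asList(
            new Rec(own, "a"), new Rec("app_2#1", "x"), new Rec(own, "b"));
        Set<String> recovered = recoveredSet(tail, own);
        System.out.println(tuplesToEmit(Arrays.asList("a", "b", "c", "d"), recovered));
    }
}
```

With the sample data in main, "a" and "b" are found in the recovered set, "x" is ignored because it carries another writer's key, and the program prints [c, d]. The set-based comparison also shows why tuples need to be unique within the window: two equal tuples in the same window could not be told apart, so a duplicate that was never written would be wrongly suppressed.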
      



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)