Posted to commits@samza.apache.org by "Jagadish (JIRA)" <ji...@apache.org> on 2017/11/01 19:58:00 UTC

[jira] [Created] (SAMZA-1479) Kafka checkpoint manager improvements

Jagadish created SAMZA-1479:
-------------------------------

             Summary: Kafka checkpoint manager improvements
                 Key: SAMZA-1479
                 URL: https://issues.apache.org/jira/browse/SAMZA-1479
             Project: Samza
          Issue Type: Bug
            Reporter: Jagadish


This proposal adds the following improvements to KafkaCheckpointManager for better testability:

* Rewrite `KafkaCheckpointLogKey` as two classes: an immutable key class and a SerDe
* Remove dependency on static setters in the `KafkaCheckpointLogKey`
* Change the lifecycle of components in KafkaCheckpointManager
  - Initialize systemProducer and systemConsumer during construction
  - Start producers and consumers in `start` instead of lazily loading them on the first write or read; starting them eagerly is safe
* Simplify logic for ignoring checkpoint validations
* Rewrite checkpointManager#readLog() to use a simpler API
* Remove complexity that is no longer needed after the migration from 0.8
* Remove unnecessary locking during startup and shutdown
* Remove dependencies on SimpleConsumer configs like bufferSize, fetchSize, socketTimeout
* Refactor KafkaCheckpointManagerFactory and remove static getCheckpointSystemNameAndFactory
* Bug fix: Register the taskName correctly (instead of using a dummy string for the taskName)
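
As an illustration of the first two items, here is a minimal sketch of a key split into an immutable value class and a separate, stateless SerDe. The class names `CheckpointLogKey` and `CheckpointLogKeySerde`, the three fields, and the tab-separated encoding are assumptions made to keep the example free of dependencies; they are not the actual Samza classes or wire format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

/** Hypothetical immutable key: every field is final and set once in the constructor. */
final class CheckpointLogKey {
  private final String type;
  private final String taskName;
  private final String grouperFactoryClassName;

  CheckpointLogKey(String type, String taskName, String grouperFactoryClassName) {
    this.type = Objects.requireNonNull(type, "type");
    this.taskName = Objects.requireNonNull(taskName, "taskName");
    this.grouperFactoryClassName =
        Objects.requireNonNull(grouperFactoryClassName, "grouperFactoryClassName");
  }

  String getType() { return type; }
  String getTaskName() { return taskName; }
  String getGrouperFactoryClassName() { return grouperFactoryClassName; }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof CheckpointLogKey)) return false;
    CheckpointLogKey other = (CheckpointLogKey) o;
    return type.equals(other.type)
        && taskName.equals(other.taskName)
        && grouperFactoryClassName.equals(other.grouperFactoryClassName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(type, taskName, grouperFactoryClassName);
  }
}

/** Hypothetical stateless SerDe: nothing is configured through static setters. */
final class CheckpointLogKeySerde {
  // Assumed encoding for this sketch only: three tab-separated UTF-8 fields.
  private static final String SEPARATOR = "\t";

  byte[] toBytes(CheckpointLogKey key) {
    String[] fields = {key.getType(), key.getTaskName(), key.getGrouperFactoryClassName()};
    for (String field : fields) {
      // Reject values that would make the encoding ambiguous.
      if (field.contains(SEPARATOR)) {
        throw new IllegalArgumentException("Field must not contain a tab: " + field);
      }
    }
    return String.join(SEPARATOR, fields).getBytes(StandardCharsets.UTF_8);
  }

  CheckpointLogKey fromBytes(byte[] bytes) {
    // A limit of -1 keeps trailing empty fields so the count check stays accurate.
    String[] fields = new String(bytes, StandardCharsets.UTF_8).split(SEPARATOR, -1);
    if (fields.length != 3) {
      throw new IllegalArgumentException("Expected 3 fields but found " + fields.length);
    }
    return new CheckpointLogKey(fields[0], fields[1], fields[2]);
  }
}
```

Because the key carries no mutable or static state, a unit test can construct one directly and round-trip it through the SerDe without any global setup, which is the testability benefit this proposal is after.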

Testing improvements:

* Add unit tests to verify more checkpoint scenarios
* Consolidate the unit-test code for creating producer, consumer and admin instances into shared utils
* Convert/consolidate most long-running integration tests into unit tests




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)