Posted to dev@storm.apache.org by "Thomas Becker (JIRA)" <ji...@apache.org> on 2015/03/03 23:01:05 UTC
[jira] [Commented] (STORM-650) Storm-Kafka Refactoring and Improvements

    [ https://issues.apache.org/jira/browse/STORM-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345822#comment-14345822 ] 

Thomas Becker commented on STORM-650:
-------------------------------------

I’ve tried to put the current suggestions into a story format that should help us create/update the appropriate tickets.
I've grouped the suggestions into things that affect the internals and things that affect users of the library. 

As a Storm developer I’d like to use the Kafka Metadata API for broker management to avoid having to know Kafka-internal ZooKeeper details (STORM-590/STORM-631)
As a Storm developer I’d like to use the ITuple interface consistently to avoid duplication (STORM-631)
As a Storm developer I’d like all loggers to be private (STORM-650)
As a Storm developer I’d like consistent, structured exception handling (STORM-650)
As a Storm developer I’d like to use the new Kafka consumer API (0.8.3) to reduce dependencies and rely on Kafka APIs with long-term support (STORM-650)
As a Storm developer I’d like to use the new Kafka producer API to reduce dependencies and rely on Kafka APIs with long-term support (STORM-650)
As a Storm developer I'd like to avoid having to use unnecessary marker interfaces (STORM-631)


As an API client developer I’d like to be able to distinguish between internal and public APIs to avoid confusion (STORM-650)
As an API client developer I’d like to be able to select the starting point in Kafka in an unambiguous way (STORM-563)
As an API client developer I’d like all public APIs to be documented (STORM-650)
As an API client developer I’d like to be able to use a pluggable failure handler (https://github.com/apache/storm/pull/406)
As an API client developer I’d like to be able to use a single way of configuring Storm and Trident Kafka topologies (STORM-631)
As an API client developer I'd like the Kafka-related configuration to be immutable (STORM-650)
As an API client developer I’d like to be able to whitelist topics in the Kafka spout (STORM-650)
As an API client developer I’d like to know the offset and partition for a message so I can audit and replay messages (STORM-697)
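
To make a few of the client-facing stories above more concrete (unambiguous starting point, pluggable failure handler, immutable configuration, offset and partition visible to the client), here is a rough Java sketch of what such an API could look like. Every type and method name in it (StartPosition, FailureAction, FailureHandler, KafkaSpoutSettings) is hypothetical and does not exist in storm-kafka; the default values are assumptions too. It is only meant as a starting point for discussion in the individual tickets.

```java
// Hypothetical sketch only: none of these types exist in storm-kafka.

/** Explicit starting point, instead of a boolean flag such as forceFromStart. */
enum StartPosition {
    EARLIEST,        // begin at the oldest offset still available in Kafka
    LATEST,          // begin at the newest offset, skipping existing messages
    LAST_COMMITTED   // resume from the offset the spout last stored
}

/** What the spout should do with a message that failed downstream. */
enum FailureAction { RETRY, SKIP }

/** Pluggable failure handler; it sees partition and offset so failures can be audited. */
interface FailureHandler {
    FailureAction onFailure(int partition, long offset, int attempt);
}

/** Immutable settings: all fields are final and there are no setters. */
final class KafkaSpoutSettings {
    private final String topic;
    private final StartPosition startPosition;
    private final FailureHandler failureHandler;

    private KafkaSpoutSettings(Builder b) {
        this.topic = b.topic;
        this.startPosition = b.startPosition;
        this.failureHandler = b.failureHandler;
    }

    static Builder builder(String topic) {
        return new Builder(topic);
    }

    String topic() { return topic; }
    StartPosition startPosition() { return startPosition; }
    FailureHandler failureHandler() { return failureHandler; }

    /** Mutable builder; the object returned by build() can no longer be changed. */
    static final class Builder {
        private final String topic;
        // Assumed defaults: resume from the last stored offset, always retry.
        private StartPosition startPosition = StartPosition.LAST_COMMITTED;
        private FailureHandler failureHandler =
                (partition, offset, attempt) -> FailureAction.RETRY;

        private Builder(String topic) {
            if (topic == null || topic.isEmpty()) {
                throw new IllegalArgumentException("topic is required");
            }
            this.topic = topic;
        }

        Builder startFrom(StartPosition position) {
            if (position == null) {
                throw new IllegalArgumentException("position must not be null");
            }
            this.startPosition = position;
            return this;
        }

        Builder onFailure(FailureHandler handler) {
            if (handler == null) {
                throw new IllegalArgumentException("handler must not be null");
            }
            this.failureHandler = handler;
            return this;
        }

        KafkaSpoutSettings build() {
            return new KafkaSpoutSettings(this);
        }
    }
}
```

With something like this, a client would write KafkaSpoutSettings.builder("events").startFrom(StartPosition.EARLIEST).build() and the intent would be visible at the call site. Whether the same settings object could be shared between the core Storm and Trident spouts (STORM-631) is left open here.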

If we agree on this list, I can go ahead and create/update the tickets so that we can move forward and discuss further details in the individual tickets.

Thanks 

> Storm-Kafka Refactoring and Improvements
> ----------------------------------------
>
>                 Key: STORM-650
>                 URL: https://issues.apache.org/jira/browse/STORM-650
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka
>            Reporter: P. Taylor Goetz
>
> This is intended to be a parent/umbrella JIRA covering a number of efforts/suggestions aimed at improving the storm-kafka module. The goal is to facilitate communication and collaboration by providing a central point for discussion and coordination.
> The first phase should be to identify and agree upon a list of high-level points we would like to address. Once that is complete, we can move on to implementation/design discussions, followed by an implementation plan, division of labor, etc.
> A non-exhaustive, initial list of items follows. New/additional thoughts can be proposed in the comments.
> * Improve API for Specifying the Kafka Starting point
> Configuring the kafka spout's starting position (e.g. forceFromStart=true) is a common source of confusion. This should be refactored to provide an easy to understand, unambiguous API for configuring this property.
> * Use Kafka APIs Instead of Internal ZK Metadata (STORM-590)
> Currently the Kafka spout relies on reading Kafka's internal metadata from zookeeper. This should be refactored to use the Kafka Consumer API to protect against changes to the internal metadata format stored in ZK.
> * Improve Error Handling
> There are a number of failure scenarios with the kafka spout that users may want to react to differently based on their use case. Add a failure handler API that allows users to implement and/or plug in alternative failure handling implementations. It is assumed that default (sane) implementations would be included and configured by default.
> * Configuration/General Refactoring (BrokerHosts, etc.) (STORM-631)
> (need to flesh this out better) Reduce unnecessary marker interfaces/"instance of" checks. Unify configuration of core storm/trident spout implementations.
> * Kafka Spout doesn't pick up from the beginning of the queue unless forceFromStart specified (STORM-563)
> Discussion Items:
> * How important is backward compatibility?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)