Posted to dev@storm.apache.org by "Andrew Montalenti (JIRA)" <ji...@apache.org> on 2014/12/02 04:44:12 UTC

[jira] [Commented] (STORM-561) Add ability to create topologies dynamically

    [ https://issues.apache.org/jira/browse/STORM-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14230924#comment-14230924 ] 

Andrew Montalenti commented on STORM-561:
-----------------------------------------

+1. I am a co-author of streamparse, a Python interop library for Storm. Because of the limitation described in this issue, we currently rely on lein and the Clojure DSL to build topologies. As described in the Twitter conversation below, this is an impediment for new users who want to use Storm and Python together:

https://mobile.twitter.com/sarthakdev/status/539390816339247104

I had planned to build a Python DSL that generates the appropriate Clojure DSL code, to simplify pure-Python use cases. I could imagine the same Python DSL also generating the JSON format that this issue might add to Storm core. That could help make Python a first-class citizen in the Storm ecosystem. I'd therefore love to volunteer my team's time for this and discuss it further.

> Add ability to create topologies dynamically
> --------------------------------------------
>
>                 Key: STORM-561
>                 URL: https://issues.apache.org/jira/browse/STORM-561
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: Nathan Leung
>            Assignee: Nathan Leung
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> It would be nice if a Storm topology could be built dynamically, instead of requiring a recompile to change parameters (e.g. number of workers, number of tasks, layout, etc.).
> I would propose the following data structures for building core Storm topologies.  I haven't done a design for Trident yet, but the intention would be to add Trident support when core Storm support is complete (or in parallel if other people are working on it):
> {code}
> // fields value and arguments are mutually exclusive
> class Argument {
>     String argumentType;  // Class used to lookup arguments in method/constructor
>     String implementationType; // Class used to create this argument
>     String value; // String used to construct this argument
>     List<Argument> arguments; // arguments used to build this argument
> }
> class Dependency {
>     String upstreamComponent; // name of upstream component
>     String grouping;
>     List<Argument> arguments; // arguments for the grouping
> }
> class StormSpout {
>     String name;
>     String klazz;  // Class of this spout
>     List<Argument> arguments;
>     int numTasks;
>     int numExecutors;
> }
> class StormBolt {
>     String name;
>     String klazz; // Class of this bolt
>     List<Argument> arguments;
>     int numTasks;
>     int numExecutors;
>     List<Dependency> dependencies;
> }
> class StormTopologyRepresentation {
>     String name;
>     List<StormSpout> spouts;
>     List<StormBolt> bolts;
>     Map config;
>     int numWorkers;
> }
> {code}
> Topology creation will be built on top of the data structures above.  The benefits:
> * Dependency free.  Code to unmarshal from json, xml, etc, can be kept in extensions, or as examples, and users can write a different unmarshaller if they want to use a different text representation.
> * support for arbitrary spout and bolt types
> * support for all groupings and streams, via reflection
> * ability to specify configuration map via config file
> * reification of spout / bolt / dependency arguments
> ** recursive argument reification for complex objects
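
To make "recursive argument reification" concrete, here is a minimal sketch of how the proposed Argument structure might be turned into live objects with reflection. It is not taken from any patch on this issue. The class name ArgumentReifier, the factory methods ofValue and ofArguments, and the rule that a value-based argument is built through a public single-String constructor are all assumptions made for illustration. Primitive types such as int would need a lookup table, because Class.forName does not resolve them.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Storm: reifies the Argument structure from the issue.
final class ArgumentReifier {

    // Mirrors the Argument class proposed in the issue description.
    static final class Argument {
        String argumentType;       // class used to look up the constructor parameter
        String implementationType; // class actually instantiated
        String value;              // mutually exclusive with arguments
        List<Argument> arguments;  // nested arguments, reified recursively

        // Hypothetical factory for a leaf argument built from a string.
        static Argument ofValue(String argumentType, String implementationType, String value) {
            Argument a = new Argument();
            a.argumentType = argumentType;
            a.implementationType = implementationType;
            a.value = value;
            return a;
        }

        // Hypothetical factory for an argument built from nested arguments.
        static Argument ofArguments(String argumentType, String implementationType,
                                    List<Argument> arguments) {
            Argument a = new Argument();
            a.argumentType = argumentType;
            a.implementationType = implementationType;
            a.arguments = arguments;
            return a;
        }
    }

    static Object reify(Argument arg) {
        if (arg.value != null && arg.arguments != null) {
            throw new IllegalArgumentException("value and arguments are mutually exclusive");
        }
        try {
            Class<?> impl = Class.forName(arg.implementationType);
            if (arg.value != null) {
                // Assumption: leaf values are built via a public (String) constructor.
                return impl.getConstructor(String.class).newInstance(arg.value);
            }
            List<Argument> nested =
                    arg.arguments == null ? Collections.<Argument>emptyList() : arg.arguments;
            Class<?>[] paramTypes = new Class<?>[nested.size()];
            Object[] params = new Object[nested.size()];
            for (int i = 0; i < nested.size(); i++) {
                // argumentType selects the constructor; the recursive call builds the value.
                paramTypes[i] = Class.forName(nested.get(i).argumentType);
                params[i] = reify(nested.get(i));
            }
            return impl.getConstructor(paramTypes).newInstance(params);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot reify " + arg.implementationType, e);
        }
    }

    public static void main(String[] args) {
        // Builds new StringBuilder((CharSequence) new String("storm")).
        Argument text = Argument.ofValue("java.lang.CharSequence", "java.lang.String", "storm");
        Argument builder = Argument.ofArguments(
                "java.lang.StringBuilder", "java.lang.StringBuilder", Arrays.asList(text));
        System.out.println(reify(builder));
    }
}
```

The example keeps argumentType and implementationType separate on purpose: the constructor is looked up by the declared parameter type (CharSequence), while the object passed in is an instance of the implementation type (String). A real implementation would presumably apply the same idea to spout and bolt classes and to grouping methods.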


