Posted to dev@storm.apache.org by Longda Feng <zh...@alibaba-inc.com> on 2015/11/19 07:13:11 UTC

Re: [DISCUSS] Storm 2.0 plan

Sorry for changing the Subject.

I am +1 for releasing Storm 2.0 with java core, which is merged with JStorm.

I think this release will be the biggest change in Storm's history, and it will probably take a long time to develop. At the same time, Heron is going to be open sourced, and the latest release of Flink provides compatibility with Storm’s API. These might be threats to Storm, so I suggest we start the development of Storm 2.0 as quickly as possible. To accelerate the development cycle, I propose taking the JStorm 2.1.0 core and UI as the base version, since this version is stable and compatible with the API of Storm 1.0. Please refer to the phases below for the detailed merging plan.

Note: We provide a demo of JStorm’s web UI at storm.taobao.org. I think JStorm will give you a totally different view.

I would like to share our experience from the initial development of JStorm (migrating from the clojure core to a java core).
Our team (4 developers) spent almost one year finishing the migration. We took 4 months to release the first JStorm version, and 6 months to make JStorm stable. During this period, we tried to switch more than 100 online applications with different scenarios from Storm to JStorm, and many bugs were fixed. After that, more and more applications were switched to JStorm in Alibaba.
Currently, there are 7000+ nodes of JStorm clusters in Alibaba, with 2000+ applications running on them. These JStorm clusters can handle 1.5 PB / 2 trillion messages per day. The use cases are not only in the big data field but also in many other online scenarios.
Besides that, we have been through Alibaba's November 11th Shopping Festival for the last three years. On that day, the computation in our cluster increases to several times the usual level. All applications worked well during the peak time. I can say there is no doubt about the stability of JStorm today. Actually, besides Alibaba, the most powerful Chinese IT companies are also using JStorm.


Phase 1:
 
Define the target of Storm 2.0. List the requirements of Storm 2.0.
1. Open a new umbrella JIRA (https://issues.apache.org/jira/browse/STORM-717)
2. Create a 2.0 branch
2.1 Copy modules from JStorm, one module at a time
2.2 The sequence is external modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
3. Create a JIRA for each difference between JStorm 2.1.0 and Storm 1.0
3.1 Discuss the solution for each difference (JIRA)
3.2 Once the solution is finalized, we can start the merging. (Some issues could be started concurrently. It depends on the discussion.)

This phase mainly tries to define the target and finalize the solution. Hopefully this phase can be finished in 2 months (before 2016/1/31).


Phase 2:
Release Storm 2.0 beta
1. Based on phase 1's discussion, finish all features of Storm 2.0.
2. Integrate all modules so that the simplest Storm example can run on the system.
3. Test with all examples and modules in the Storm code base.
4. All daily tests pass.
 
Hopefully this phase can be finished in 2 months (before 2016/3/31).


Phase 3:
Persuade some users to give it a try.
Alibaba will try to run some online applications on the beta version.

Hopefully this phase can be finished in 1 month (before 2016/4/30).


Any comments are welcome.


Thanks
Longda

------------------------------------------------------------------
From: P. Taylor Goetz <pt...@gmail.com>
Send Time: Thursday, November 19, 2015 06:23
To: dev <de...@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
Subject: Re: [DISCUSS] Plan for Merging JStorm Code

All I have at this point is a placeholder wiki entry [1], and a lot of local notes that likely would only make sense to me.

Let me know your wiki username and I’ll give you permissions. The same goes for anyone else who wants to help.

-Taylor

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109

> On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> 
> Taylor and others, I was hoping to get started filing JIRAs and planning how we are going to do the java migration + JStorm merger.  Is anyone else starting to do this?  If not, would anyone object to me starting on it? - Bobby
> 
> 
>    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <pt...@gmail.com> wrote:
> 
> 
> Thanks for putting this together Basti, that comparison helps a lot.
> 
> And thanks Bobby for converting it into markdown. I was going to just attach the spreadsheet to JIRA, but markdown is a much better solution.
> 
> -Taylor
> 
>> On Nov 12, 2015, at 12:03 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
>> 
>> I translated the excel spreadsheet into a markdown file and put up a pull request for it.
>> https://github.com/apache/storm/pull/877
>> I did a few edits to it to make it work with Markdown, and to add in a few of my own comments.  I also put in a field for JIRAs to be able to track the migration.
>> Overall I think your evaluation was very good.  We have a fair amount of work ahead of us to decide what version of various features we want to go forward with.
>>   - Bobby
>> 
>> 
>>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
>> 
>> 
>> Hi Bobby & Jungtaek,
>> 
>> Thanks for your reply.
>> I totally agree that compatibility is the most important thing. Actually, JStorm is already compatible with the user API of Storm.
>> As you mentioned below, there are indeed still some features that differ between Storm and JStorm. I have tried to list them (minor updates and improvements are not included).
>> Please refer to the attachment for details. If anything is missing, please point it out. (Features currently being worked on are probably missing here.)
>> Just have a look at these differences. For the features missing in JStorm, I did not see any obstacle that would block merging them into JStorm.
>> For the features that have different solutions in Storm and JStorm, we can evaluate the solutions one by one to decide which one is appropriate.
>> After the evaluation is finalized, I think the JStorm team can take on the merging job and publish a stable release in 2 months.
>> But in any case, the detailed implementation of these features with different solutions is transparent to the user. So, from the user's point of view, there is no compatibility problem.
>> 
>> Besides compatibility, in our experience, stability is also important and is not an easy job. 4 people in the JStorm team took almost one year to finish the porting from "clojure core" to "java core", and to make it stable. Of course, we have many devs in the community to make the porting job faster. But it still takes a long time running many complex online topologies to find bugs and fix them. That is why I proposed to do the merging and build on a stable "java core".
>> 
>> -----Original Message-----
>> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
>> Sent: Wednesday, November 11, 2015 10:51 PM
>> To: dev@storm.apache.org
>> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>> 
>> +1 for doing a 1.0 release based off of the clojure 0.11.x code.  Migrating the APIs to org.apache.storm is a big non-backwards compatible move, and a major version bump to 2.x seems like a good move there.
>> +1 for the release plan
>> 
>> I would like the move for user facing APIs to org.apache to be one of the last things we do.  Translating clojure code into java and moving it to org.apache I am not too concerned about.
>> 
>> Basti,
>> We have two code bases that have diverged significantly from one another in terms of functionality.  The storm code now has, or soon will have, a heartbeat server, Nimbus HA (different implementation), resource aware scheduling, a distributed-cache-like API, log searching, security, massive performance improvements, shading for almost all of our dependencies, a REST API for programmatically accessing everything on the UI, and I am sure I am missing a few other things.  JStorm also has many changes including cgroup isolation, restructured zookeeper layout, classpath isolation, and more too.
>> No matter what we do it will be a large effort to port changes from one code base to another, and from clojure to java.  I proposed this initially because it can be broken up into incremental changes.  It may take a little longer, but we will always have a working codebase that is testable and compatible with the current storm release, at least until we move the user facing APIs to be under org.apache.  This lets the community continue to build and test the master branch and report problems that they find, which is incredibly valuable.  I personally don't think it will be much easier, especially if we are intent on always maintaining compatibility with storm. - Bobby
>> 
>> 
>>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
>> 
>> 
>> Hi Taylor,
>> 
>> 
>> 
>> Thanks for the merge plan. I have a question about “Phase 2.2”.
>> 
>> Do you mean the community plans to first create a fresh new “java core” based on the current “clojure core”, and then migrate the features from JStorm?
>> 
>> If so, that confuses me.  It is a really huge job that might require a long development time to become stable, while JStorm is already a stable version.
>> 
>> The release planned for after Nov 11th has already been running stably online for several months in Alibaba.
>> 
>> Besides this, there are many valuable internal requirements in Alibaba, so fast evolution of JStorm is foreseeable in the next few months.
>> 
>> If the “java core” is totally new, it might bring many problems for the coming merge.
>> 
>> So, from this point of view, I think it is much better and easier to migrate the features of the “clojure core” onto JStorm as the “java core”.
>> 
>> Please correct me if I have misunderstood anything.
>> 
>> 
>> 
>> Regards
>> 
>> Basti
>> 
>> 
>> 
>> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
>> Sent: November 11, 2015 5:32
>> To: dev@storm.apache.org
>> Subject: [DISCUSS] Plan for Merging JStorm Code
>> 
>> 
>> 
>> Based on a number of discussions regarding merging the JStorm code, I’ve tried to distill the ideas presented and inserted some of my own. The result is below.
>> 
>> 
>> 
>> I’ve divided the plan into three phases, though they are not necessarily sequential — obviously some tasks can take place in parallel.
>> 
>> 
>> 
>> None of this is set in stone, just presented for discussion. Any and all comments are welcome.
>> 
>> 
>> 
>> -------
>> 
>> 
>> 
>> Phase 1 - Plan for 0.11.x Release
>> 
>> 1. Determine feature set for 0.11.x and publish to wiki [1].
>> 
>> 2. Announce feature-freeze for 0.11.x
>> 
>> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
>> 
>> 4. Release 0.11.0 (or whatever version # we want to use)
>> 
>> 5. Bug fixes and subsequent releases from 0.11.x-branch
>> 
>> 
>> 
>> 
>> 
>> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
>> 
>> 1. Determine/document unique features in JStorm (e.g. classpath isolation, cgroups, etc.) and create JIRA for migrating the feature.
>> 
>> 2. Create JIRA for migrating each clojure component (or logical group of components) to Java. Assumes tests will be ported as well.
>> 
>> 3. Discuss/establish style guide for Java coding conventions. Consider using Oracle’s or Google’s Java conventions as a base — they are both pretty solid.
>> 
>> 4. Align package names (e.g. backtype.storm --> org.apache.storm / com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
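
The package alignment in step 4 amounts to a prefix rewrite on fully qualified names. A minimal sketch in Python, assuming only the two mappings named in that step; the function name `align_package` and the example class names are illustrative and not taken from this thread:

```python
# Prefix mappings taken from the plan above; everything else is illustrative.
PACKAGE_MAP = {
    "backtype.storm": "org.apache.storm",
    "com.alibaba.jstorm": "org.apache.storm",
}


def align_package(qualified_name: str) -> str:
    """Rewrite a fully qualified name to the aligned package prefix.

    Names that do not start with a known prefix are returned unchanged.
    """
    for old, new in PACKAGE_MAP.items():
        # Match whole package segments only, so "backtype.stormy" is left alone.
        if qualified_name == old or qualified_name.startswith(old + "."):
            return new + qualified_name[len(old):]
    return qualified_name


if __name__ == "__main__":
    for name in ("backtype.storm.topology.TopologyBuilder",
                 "com.alibaba.jstorm.utils.JStormUtils",
                 "java.util.List"):
        print(name, "->", align_package(name))
```

Because both prefixes collapse into one namespace, two classes with the same relative name would end up at the same target name, so a real migration would likely need to check for such collisions first.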
>> 
>> 
>> 
>> 
>> 
>> Phase 3 - Migrate Clojure --> Java
>> 
>> 1. Port code/tests to Java, leveraging existing JStorm code wherever possible (core functionality only, features distinct to JStorm migrated separately).
>> 
>> 2. Port JStorm-specific features.
>> 
>> 3. Begin releasing preview/beta versions.
>> 
>> 4. Code cleanup (across the board) and refactoring using established coding conventions, and leveraging PMD/Checkstyle reports for reference. (Note: good opportunity for new contributors.)
>> 
>> 5. Release 0.12.0 (or whatever version # we want to use) and lift feature freeze.
>> 
>> 
>> 
>> 
>> 
>> Notes:
>> 
>> We should consider bumping up to version 1.0 sometime soon and then switching to semantic versioning [3] from then on.
>> 
>> 
>> 
>> 
>> 
>> With the exception of package name alignment, the "jstorm-import" branch will largely be read-only throughout the process.
>> 
>> 
>> 
>> During migration, it's probably easiest to operate with two local clones of the Apache Storm repo: one for working (i.e. checked out to working branch) and one for reference/copying (i.e. checked out to "jstorm-import").
>> 
>> 
>> 
>> Feature-freeze probably only needs to be enforced against core functionality. Components under "external" can likely be exempt, but we should figure out a process for accepting and releasing new features during the migration.
>> 
>> 
>> 
>> Performance testing should be continuous throughout the process. Since we don't really have ASF infrastructure for performance testing, we will need a volunteer(s) to host and run the performance tests. Performance test results can be posted to the wiki [2]. It would probably be a good idea to establish a baseline with the 0.10.0 release.
>> 
>> 
>> 
>> I’ve attached an analysis document Sean Zhong put together a while back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3 release but is still relevant and has a lot of good information.
>> 
>> 
>> 
>> 
>> 
>> [1] https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
>> 
>> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
>> 
>> [3] http://semver.org
>> 
>> [4] https://issues.apache.org/jira/browse/STORM-717
>> 
>> 
>> 
>> 
>> 
>> -Taylor
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 



Re: [DISCUSS] Storm 2.0 plan

Posted by "Boyang(Jerry) Peng" <je...@yahoo-inc.com.INVALID>.
I am not much of a fan of simply substituting the Apache Storm clojure core with the JStorm java core either.  Don't get me wrong, I would very much like to merge the two projects together and build a stronger community, but we cannot just hastily rush this process. Like others have said, the projects have diverged quite a bit in the last couple of years.  I am just nervous about taking the JStorm core wholesale without carefully examining the implementation.  I am sure JStorm has a lot of good features that Apache Storm doesn't have, but compatibility is a key issue that determines the future of the project.  We can use the JStorm code base as a reference, but we really need to examine and potentially rewrite every line of clojure code to do this right and to make sure we minimize compatibility issues, even for the features that are small.  This may sound like it can take a while, but with the potential number of full time developers working on this, I see it taking only a month or so.
Apache Storm has also received numerous improvements (heartbeat server, backpressure, disruptor queue optimizations, etc...) over the last year, so it would be nice if we could run some sort of performance tests between JStorm and Apache Storm to see the strengths and weaknesses of both implementations.  Perhaps after that we can have a better gauge of what pieces need to be taken from where for Storm 2.0.
As for being scared of the other projects: I think each project has its own fortes and areas it does well in, but Storm definitely still does pretty well in the area it was designed to thrive in, thus I am not that concerned.
Just some of my thoughts.  In the end, I am really excited for the future of this project, especially with the addition of developers from the JStorm project.
Best,
Boyang Jerry Peng 


    On Thursday, November 19, 2015 9:59 PM, Sean Zhong <cl...@gmail.com> wrote:
 

 Hi All,

I think there may be some improper use of words or some misunderstanding.

Here are the goals I can see both sides agree on:
1. We want to migrate clojure to java.
2. We want to merge important features together.
3. We want to do this in a step by step, transparent, reviewable way,
especially with close examination and reviews for code that has
architecture changes.
4. We want the final version to remain compatible.

The only difference is the process we use to achieve these goals.
Longda's view:
1. Do a parallel migration from clojure core to java, part by part. Parallel
means equivalent: no new features are added in this step. He suggests following
the order "modules/client/utils/nimbus/supervisor/drpc/worker & task/
web ui/others".
He uses the word "copy", which I think is the wrong word. It is more like a
merge.
Quoting his words:

> 2.1 Copy modules from JStorm, one module from one module
> 2.2 The sequence is extern modules/client/utils/nimbus/
> supervisor/drpc/worker & task/web ui/others

2. On top of the java core code base, incrementally add new feature blocks.
Quoting his words:

> 3.1 Discuss solution for each difference(jira)
> 3.2 Once the solution is finalized, we can start the
> merging. (Some issues could be start concurrently. It
> depends on the discussion.)

3. His goal is to remain compatible. "this version is stable and
compatible with API of Storm 1.0." is not an accurate statement from my point
of view, at least not for the security feature.
4. He shares his concern about other streaming engines.


Bobby and Jungtaek's view:
1. "Copy" is not acceptable; it will impact the security features. (Copy is
the wrong phrase to use; I think Longda means something more like a merge.)
2. With the JStorm team, we start with the clojure -> java translation first.
3. Optimistically, with the JStorm team, one month should be enough for
the above stage.
4. Add new features after the whole code base is migrated to java.
5. No need to worry that much about other engines.

If my understanding of both parties is correct, I think we agree on most
things about the process:
first: clojure -> java
second: merge features.

There is a slight difference in how aggressively we want to do "clojure ->
java", and how long it takes.


@Longda, can you clarify whether my understanding of your opinion is right?


Sean


On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com> wrote:

> Very well stated, Jungtaek.
>
> I should also point out that there's nothing stopping the JStorm team from
> releasing new versions of JStorm, or adding new features. But you would
> have to be careful to note that any such release is "JStorm" and not
> "Apache Storm." And any such release cannot be hosted on Apache
> infrastructure.
>
> We also shouldn't be too worried about competition with other stream
> processing frameworks. Competition is healthy and leads to improvements
> across the board. Spark Streaming borrowed ideas from Storm for its Kafka
> integration. It also borrowed memory management ideas from Flink. I don't
> see that as a problem. This is open source. We can, and should, do the same
> where applicable.
>
> Did we learn anything from the Heron paper? Nothing we didn't already
> know. And a lot of the points have been addressed. We dealt with security first,
> which is more important for adoption, especially in the enterprise. Now
> we've addressed many performance, scaling, and usability issues. Most of
> the production deployments I've seen are nowhere near the magnitude of what
> Twitter requires. But I've seen many deployments that only exist because
> we offer security. I doubt Heron has that.
>
> We've also seen an uptick in community and developer involvement, which
> means a likely increase in committers, which likely means a faster
> turnaround for patch reviews, which means a tighter release cycle for new
> features, which means we will be moving faster. This is healthy for an
> Apache project.
>
> And the inclusion of the JStorm team will only make that more so.
>
> I feel we are headed in the right direction, and there are good things to
> come.
>
> -Taylor
>
>
> > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> >
> > Sorry Longda, but I can't help saying that I also disagree about
> changing the codebase.
> >
> > The feature matrix shows us how far Apache Storm and JStorm have diverged,
> just from the point of view of features. It would not be safe to change even
> if the feature matrices were identical, because the feature matrix doesn't
> contain the details.
> >
> > I mean, users could be scared when expected behaviors are not in place,
> even if the differences are small. User experience is one of the most important
> parts of the project, and if the UX changes are huge, upgrading
> their Storm cluster to 2.0 would not be much easier than migrating to Heron. That
> would be the worst scenario I can imagine after merging.
> >
> > The safest way to merge is applying JStorm's great features to Apache
> Storm.
> > I think porting Apache Storm to Java is not tightly related
> to merging JStorm. I agree that the merge is a trigger, but Apache Storm
> itself could be ported to other languages like Java, Scala, or something else
> that is more popular than Clojure.
> >
> > And I'm also not scared of Flink, Heron, Spark, etc.
> > That doesn't mean other projects are not greater than Storm. I'm just
> saying each project has its own strengths.
> > For example, all the conferences are talking about Spark, and as a user
> of Spark, I find it really great. If you are a little bit familiar with
> Scala, you can just apply Scala-like functional methods to an RDD. Really easy
> to use.
> > But that doesn't mean Spark can replace Storm in all kinds of use
> cases. Recently I've seen some articles on why Storm is preferred for
> realtime stream processing.
> >
> > Competition should give us positive motivation. I hope that our
> roadmap isn't focused on defeating competitors, but on presenting
> great features, better performance, and better UX to the Storm community. It's
> not a commercial product, it's an open source project!
> >
> > tl;dr. Please don't change the codebase unless we plan to release a brand
> new project. It breaks UX completely, which could make users leave.
> >
> > I'm also open to other opinions as well.
> >
> > Best,
> > Jungtaek Lim (HeartSaVioR)
> >
> >
> > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
> >> I disagree completely.  You claim that JStorm is compatible with storm
> 1.0.  I don't believe that it is 100% compatible.  There has been more than
> 2 years of software development happening on both sides.  Security was not
> done in a day, and porting it over to JStorm is not going to happen
> quickly, and because of the major architectural changes between storm and
> JStorm I believe we would have to make some serious enhancements to fully
> support a secure TopologyMaster, but I need to look into it more.  The blob
> store is another piece of code that has taken a very long time to develop.
> There are numerous others.  The big features are not the ones that make me
> nervous because we can plan for them, it is the hundreds of small JIRA and
> features that will result in minor incompatibilities.  If we start with
> storm itself, and follow the same process that we have been doing up until
> now, if there is a reason to stop the port add in an important feature and
> do a release, we can.  We will know that we have compatibility vs starting
> with JStorm we know from the start that we do not without adding feature X,
> Y, Z, ....
> >>
> >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> etc...  We just did some major performance enhancements that will be
> released with STORM 1.0.  We now have up to 6x the throughput that we had
> before with minimal changes to the latency (20 ms vs 5 ms).  We have
> automatic back-pressure so if someone was running with acking enabled just
> for flow control they can now process close to 16x the throughput they
> could before with the same hardware.  This puts our throughput very much on
> par with flink and Spark, but with a much lower latency compared to either
> of them.  Plus from what I have heard Flink is still calling the streaming
> API beta, and their storm API compatibility is very rudimentary.  They are
> also going to have more and more problems maintaining compatibility as we
> add in new features and functionality.
> >>
> >> Spark only really works well when it is running with several seconds of
> latency. Not everyone needs sub-second processing, but when your platform
> is completely unable to handle it, it locks you out of a lot of use cases.
> Their throughput is decent and can scale very high when you are willing to
> tolerate similarly very high latencies.
> >> Who knows about Heron until they actually release their code, but it is
> missing lots of critical features, and the one they touted, better
> performance, is a moot point with storm 1.0.  The only thing we really are
> lacking is advertising, we don't have a big company really pushing storm
> and getting it in the news all the time (Sorry Hortonworks, but I really
> have not seen much about it in the news).  I am trying to do more, but
> there is only so much I can do.
> >> Longda I very much agree with you about moving quickly to make the
> transition, but I do not believe in any way that starting with JStorm is
> going to reduce that transition time.
> >> My proposal is to give everyone about 2 weeks to finish merging new
> features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> for a release.  At the same time development work to port storm to java
> begins.  You said it took 4 developers 1 year to port storm to java the
> first time for JStorm.  We have 14+ active developers and over one hundred
> contributors not including those from the JStorm community.  If numbers
> scale linearly, I know they don't completely, we should be able to do a
> complete port with no JStorm reference in around 100 days.  With a copy and
> paste for a lot of this from the JStorm codebase, I would expect to be able
> to do it in 1 month of development, possibly less if the JStorm community
> can really help out too.  So by January we should be ready to begin pulling
> in features from JStorm that make sense.  Looking at the feature matrix in
> https://github.com/apache/storm/pull/877 there are a few potentially big
> improvements that we would want to pull in, but they require architectural
> changes in some cases that I don't want to just do lightly.  I would
> propose that once the code has been ported to java we reopen for all new
> features in parallel with the JStorm feature migration, but I am open to
> others opinions as well.
> >>  - Bobby
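
Bobby's "around 100 days" figure can be reproduced as simple person-day arithmetic. A rough sketch, assuming perfectly linear scaling (which he notes does not fully hold), a 365-day year, and exactly 14 developers; the function name is hypothetical:

```python
def estimated_port_days(ref_devs: int, ref_days: float, new_devs: int) -> float:
    """Scale a reference effort linearly to a different team size."""
    person_days = ref_devs * ref_days  # total effort of the reference port
    return person_days / new_devs


if __name__ == "__main__":
    # JStorm: 4 developers for about one year; Storm: 14+ active developers.
    days = estimated_port_days(ref_devs=4, ref_days=365, new_devs=14)
    print(f"{days:.0f} days")
```

With these inputs the estimate comes out at roughly 104 days, which matches the figure quoted for a port with no JStorm reference. The one-month figure additionally assumes reuse of JStorm code and is not derived from this calculation.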
> >>
> >>
> >>    On Thursday, November 19, 2015 12:14 AM, Longda Feng <
> zhongyan.feng@alibaba-inc.com> wrote:
> >>
> >>
> >>  Sorry for changing the Subject.
> >>
> >> I am +1 for releasing Storm 2.0 with java core, which is merged with
> JStorm.
> >>
> >> I think the change of this release will be the biggest one in history.
> It will probably take a long time to develop. At the same time, Heron is
> going to open source, and the latest release of Flink provides the
> compatibility to Storm’s API. These might be the threat to Storm. So I
> suggest we start the development of Storm 2.0 as quickly as possible. In
> order to accelerate the development cycle, I proposed to take JStorm 2.1.0
> core and UI as the base version since this version is stable and compatible
> with API of Storm 1.0. Please refer to the phases below for the detailed
> merging plan.
> >>
> >> Note: We provide a demo of JStorm’s web UI. Please refer to
> storm.taobao.org . I think JStorm will give a totally different view to
> you.
> >>
> >> I would like to share the experience of initial development of JStorm
> (Migrate from clojure core to java core).
> >> Our team(4 developers) have spent almost one year to finish the
> migration. We took 4 months to release the first JStorm version, and 6
> months to make JStorm stable. During this period, we tried to switch more
> than online 100 applications with different scenarios from Storm to JStorm,
> and many bugs were fixed. Then more and more applications were switched to
> JStorm in Alibaba.
> >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
> 2000+ applications are running on them. The JStorm Clusters here can handle
> 1.5 PB/2 Trillion messages per day. The use cases are not only in BigData
> field but also in many other online scenarios.
> >> Besides it, we have experienced the November 11th Shopping Festival of
> Alibaba for last three years. At that day, the computation in our cluster
> increased several times than usual. All applications worked well during the
> peak time. I can say the stability of JStorm is no doubt today. Actually,
> besides Alibaba, the most powerful Chinese IT company are also using JStorm.
> >>
> >>
> >> Phase 1:
> >>
> >> Define the target of Storm 2.0. List the requirement of Storm 2.0
> >> 1. Open a new Umbrella Jira (
> https://issues.apache.org/jira/browse/STORM-717)
> >> 2. Create one 2.0 branch,
> >> 2.1 Copy modules from JStorm, one module from one module
> >> 2.2 The sequence is extern
> modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> >> 3.1 Discuss solution for each difference(jira)
> >> 3.2 Once the solution is finalized, we can start the merging. (Some
> issues could be start concurrently. It depends on the discussion.)
> >>
> >> The phase mainly try to define target and finalize the solution.
> Hopefully this phase could be finished in 2 month(before 2016/1/31). .
> >>
> >>
> >> Phase 2:
> >> Release Storm 2.0 beta
> >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
> >> 2. Integrate all modules, make the simplest storm example can run on
> the system.
> >> 3. Test with all example and modules in Storm code base.
> >> 4. All daily test can be passed.
> >>
> >> Hopefully this phase could be finished in 2 month(before 2016/3/31)
> >>
> >>
> >> Phase 3:
> >> Persuade some user to have a try.
> >> Alibaba will try to run some online applications on the beta version
> >>
> >> Hopefully this phase could be finished in 1 month (before 2016/4/30).
> >>
> >>
> >> Any comments are welcome.
> >>
> >>
> >> Thanks
> >>
> >> Longda
> >>
> >> ------------------------------------------------------------------
> >> From: P. Taylor Goetz <pt...@gmail.com>
> >> Send Time: Thursday, November 19, 2015 06:23
> >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >>
> >> All I have at this point is a placeholder wiki entry [1], and a lot of
> local notes that likely would only make sense to me.
> >>
> >> Let me know your wiki username and I’ll give you permissions. The same
> goes for anyone else who wants to help.
> >>
> >> -Taylor
> >>
> >> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> >>
> >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
> wrote:
> >> >
> >> > Taylor and others I was hoping to get started filing JIRA and
> planning on how we are going to do the java migration + JStorm merger.  Is
> anyone else starting to do this?  If not would anyone object to me starting
> on it? - Bobby
> >> >
> >> >
> >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> ptgoetz@gmail.com> wrote:
> >> >
> >> >
> >> > Thanks for putting this together Basti, that comparison helps a lot.
> >> >
> >> > And thanks Bobby for converting it into markdown. I was going to just
> attach the spreadsheet to JIRA, but markdown is a much better solution.
> >> >
> >> > -Taylor
> >> >
> >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
> <ev...@yahoo-inc.com.INVALID> wrote:
> >> >>
> >> >> I translated the excel spreadsheet into a markdown file and put up a
> pull request for it.
> >> >> https://github.com/apache/storm/pull/877
> >> >> I did a few edits to it to make it work with Markdown, and to add in
> a few of my own comments.  I also put in a field for JIRAs to be able to
> track the migration.
> >> >> Overall I think your evaluation was very good.  We have a fair
> amount of work ahead of us to decide what version of various features we
> want to go forward with.
> >> >>  - Bobby
> >> >>
> >> >>
> >> >>    On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> basti.lj@alibaba-inc.com> wrote:
> >> >>
> >> >>
> >> >> Hi Bobby & Jungtaek,
> >> >>
> >> >> Thanks for your reply.
> >> >> I totally agree that compatibility is the most important thing.
> Actually, JStorm has been compatible with the user API of Storm.
> >> >> As you mentioned below, we indeed still have some features different
> between Storm and JStorm. I have tried to list them (minor update or
> improvements are not included).
> >> >> Please refer to attachment for details. If any missing, please help
> to point out. (The current working features are probably missing here.)
> >> >> Just have a look at these differences. For the missing features in
> JStorm, I did not see any obstacle which will block the merge to JStorm.
> >> >> For the features which have different solutions in Storm and
> JStorm, we can evaluate the solutions one by one to decide which one is
> appropriate.
> >> >> After the finalization of evaluation, I think JStorm team can take
> the merging job and publish a stable release in 2 months.
> >> >> But anyway, the detailed implementation for these features with
> different solution is transparent to user. So, from user's point of view,
> there is not any compatibility problem.
> >> >>
> >> >> Besides compatibility, by our experience, stability is also
> important and is not an easy job. 4 people in JStorm team took almost one
> year to finish the porting from "clojure core"
> >> >> to "java core", and to make it stable. Of course, we have many devs
> in community to make the porting job faster. But it still needs a long time
> to run many online complex topologies to find bugs and fix them. So, that is
> the reason why I proposed to do merging and build on a stable "java core".
> >> >>
> >> >> -----Original Message-----
> >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> >> >> Sent: Wednesday, November 11, 2015 10:51 PM
> >> >> To: dev@storm.apache.org
> >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >> >>
> >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> Migrating the APIs to org.apache.storm is a big non-backwards compatible
> move, and a major version bump to 2.x seems like a good move there.
> >> >> +1 for the release plan
> >> >>
> >> >> I would like the move for user facing APIs to org.apache to be one
> of the last things we do.  Translating clojure code into java and moving it
> to org.apache I am not too concerned about.
> >> >>
> >> >> Basti,
> >> >> We have two code bases that have diverged significantly from one
> another in terms of functionality.  The storm code now or soon will have A
> Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> Scheduling, a distributed cache like API, log searching, security, massive
> performance improvements, shaded almost all of our dependencies, a REST API
> for programmatically accessing everything on the UI, and I am sure I am
> missing a few other things.  JStorm also has many changes including cgroup
> isolation, restructured zookeeper layout, classpath isolation, and more too.
> >> >> No matter what we do it will be a large effort to port changes from
> one code base to another, and from clojure to java.  I proposed this
> initially because it can be broken up into incremental changes.  It may
> take a little longer, but we will always have a working codebase that is
> testable and compatible with the current storm release, at least until we
> move the user facing APIs to be under org.apache.  This lets the community
> continue to build and test the master branch and report problems that they
> find, which is incredibly valuable.  I personally don't think it will be
> much easier, especially if we are intent on always maintaining
> compatibility with storm. - Bobby
> >> >>
> >> >>
> >> >>    On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
> basti.lj@alibaba-inc.com> wrote:
> >> >>
> >> >>
> >> >> Hi Taylor,
> >> >>
> >> >>
> >> >>
> >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> >> >>
> >> >> Do you mean the community plans to first create a brand-new “java core” based
> on the current “clojure core”, and then migrate the features from
> JStorm?
> >> >>
> >> >> If so, it confused me.  It is really a huge job which might require
> a long developing time to make it stable, while JStorm is already a stable
> version.
> >> >>
> >> >> The version planned for release after Nov 11th has already been running
> stably online for several months in Alibaba.
> >> >>
> >> >> Besides this, there are many valuable internal requirements in
> Alibaba, the fast evolution of JStorm is foreseeable in the next few months.
> >> >>
> >> >> If the “java core” is totally fresh new, it might bring many
> problems for the coming merge.
> >> >>
> >> >> So, from the point of this view,  I think it is much better and
> easier to migrate the features of “clojure core” basing on JStorm for the
> “java core”.
> >> >>
> >> >> Please correct me, if any misunderstanding.
> >> >>
> >> >>
> >> >>
> >> >> Regards
> >> >>
> >> >> Basti
> >> >>
> >> >>
> >> >>
> >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> >> >> Sent: November 11, 2015 5:32
> >> >> To: dev@storm.apache.org
> >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> >> >>
> >> >>
> >> >>
> >> >> Based on a number of discussions regarding merging the JStorm code,
> I’ve tried to distill the ideas presented and inserted some of my own. The
> result is below.
> >> >>
> >> >>
> >> >>
> >> >> I’ve divided the plan into three phases, though they are not
> necessarily sequential — obviously some tasks can take place in parallel.
> >> >>
> >> >>
> >> >>
> >> >> None of this is set in stone, just presented for discussion. Any and
> all comments are welcome.
> >> >>
> >> >>
> >> >>
> >> >> -------
> >> >>
> >> >>
> >> >>
> >> >> Phase 1 - Plan for 0.11.x Release
> >> >>
> >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> >> >>
> >> >> 2. Announce feature-freeze for 0.11.x
> >> >>
> >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> >> >>
> >> >> 4. Release 0.11.0 (or whatever version # we want to use)
> >> >>
> >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> >> >>
> >> >> 1. Determine/document unique features in JStorm (e.g. classpath
> isolation, cgroups, etc.) and create JIRA for migrating the feature.
> >> >>
> >> >> 2. Create JIRA for migrating each clojure component (or logical
> group of components) to Java. Assumes tests will be ported as well.
> >> >>
> >> >> 3. Discuss/establish style guide for Java coding conventions.
> Consider using Oracle’s or Google’s Java conventions as a base — they are
> both pretty solid.
> >> >>
> >> >> 4. Align package names (e.g. backtype.storm --> org.apache.storm /
> com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Phase 3 - Migrate Clojure --> Java
> >> >>
> >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> possible (core functionality only, features distinct to JStorm migrated
> separately).
> >> >>
> >> >> 2. Port JStorm-specific features.
> >> >>
> >> >> 3. Begin releasing preview/beta versions.
> >> >>
> >> >> 4. Code cleanup (across the board) and refactoring using established
> coding conventions, and leveraging PMD/Checkstyle reports for reference.
> (Note: good opportunity for new contributors.)
> >> >>
> >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> feature freeze.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Notes:
> >> >>
> >> >> We should consider bumping up to version 1.0 sometime soon and then
> switching to semantic versioning [3] from then on.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> With the exception of package name alignment, the "jstorm-import"
> branch will largely be read-only throughout the process.
> >> >>
> >> >>
> >> >>
> >> >> During migration, it's probably easiest to operate with two local
> clones of the Apache Storm repo: one for working (i.e. checked out to
> working branch) and one for reference/copying (i.e. checked out to
> "jstorm-import").
> >> >>
> >> >>
> >> >>
> >> >> Feature-freeze probably only needs to be enforced against core
> functionality. Components under "external" can likely be exempt, but we
> should figure out a process for accepting and releasing new features during
> the migration.
> >> >>
> >> >>
> >> >>
> >> >> Performance testing should be continuous throughout the process.
> Since we don't really have ASF infrastructure for performance testing, we
> will need a volunteer(s) to host and run the performance tests. Performance
> test results can be posted to the wiki [2]. It would probably be a good
> idea to establish a baseline with the 0.10.0 release.
> >> >>
> >> >>
> >> >>
> >> >> I’ve attached an analysis document Sean Zhong put together a while
> back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
> release but is still relevant and has a lot of good information.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> [1]
> https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> >> >>
> >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> >> >>
> >> >> [3] http://semver.org
> >> >>
> >> >> [4] https://issues.apache.org/jira/browse/STORM-717
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> -Taylor
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >
> >
> >
> > --
> > Name : 임 정택
> > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> > Twitter : http://twitter.com/heartsavior
> > LinkedIn : http://www.linkedin.com/in/heartsavior
>

  

Re: [DISCUSS] Storm 2.0 plan

Posted by "P. Taylor Goetz" <pt...@gmail.com>.
Thanks Bobby.

Just an FYI to all: I'm on vacation until Tuesday, but would be happy to review/help out then.

-Taylor

> On Nov 20, 2015, at 5:13 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> 
> I just finished updating the wiki with a rough outline of what I see as a plan.  It is mostly covered by two large tables, one covering clojure migration, and the other covering feature porting from JStorm to Storm.
> 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> 
> Any feedback is welcome.  I hope to start filing some JIRAs for this work over the weekend or on Monday.  I will make sure to tag everything with something that should let us all see and track our progress. - Bobby
> 
> 
>    On Friday, November 20, 2015 10:35 AM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> 
> 
> Longda,
> Thanks for the clarification.  I think you stated things very well.  I am 100% behind you on the merger part and my team is very close to being ready to work on the merge 100% as well.  The important points for me are that we do it in the open, following the storm bylaws, that we never break the storm build, and that we move quickly.  This gives us the ability to constantly test and verify that we are not breaking compatibility, not regressing performance, and that we don't disrupt the community too much for too long.  I don't care about the order that we do things or the ultimate language that we end up using.  If you have already started on some of this work that is great, but please file JIRAs for what you are currently doing so we are all aware of it, and don't end up with duplicated efforts.  Hopefully soon we will do the 1.0 branch and we can start reviewing/merging the code changes you have been working on for the migration.
> 
> - Bobby 
> 
> 
>     On Friday, November 20, 2015 9:57 AM, Harsha <ma...@harsha.io> wrote:
> 
> 
> Hi All,
>           If possible can we have bi-weekly or monthly video hangouts to
>           discuss the plan. I think it will make it easier to discuss
>           the next steps. We can post the details of the discussion on
>           the mailing list so that everyone is involved in what's going on.
> 
> Thanks,
> Harsha
>> On Fri, Nov 20, 2015, at 01:03 AM, Longda Feng wrote:
>> 
>> @Sean, thanks for clarifying.
>> @Taylor, @Bobby, @Sean, @Jungtaek, @Harsha, @dev,
>> Sorry for causing a misunderstanding.
>> The biggest point: We would like to merge the two communities into one;
>> one community is stronger than two separate communities. My team
>> hopes that Alibaba can directly use the Apache Storm version in the next
>> few years, so that my team no longer needs to maintain JStorm. This is the
>> reason why Alibaba donated JStorm.
>> Second point: Sean's point is right. The migration is not just "copy". It
>> should be "merge". I mean that a module will not simply be the JStorm
>> module. It should be the result of our discussion. I think the final
>> solution after merging can make Storm better.
>> Third point: In fact, I am not scared of other stream processing systems,
>> especially Heron. I have worked on Storm for 4 years, and I am a big fan of
>> Storm. I know what Storm can do and what Storm cannot do. But I want to say
>> that we need to accelerate our pace of evolution. This field is so active. We
>> should start learning from other frameworks' advantages as soon as possible.
>> Especially, we need more application-level programming frameworks like
>> Trident. This will attract more users to Storm.
>> Fourth point: We don't need to do everything from scratch; we can use
>> JStorm as much as possible. JStorm is here, so why not use it?
>> Last point: My team is already working full time on this merge, and we will
>> try our best to contribute and make Storm better.
>> 
>> Thanks
>> Longda
>> 
>> ------------------------------------------------------------------
>> From: Sean Zhong <cl...@gmail.com>
>> Send Time: 2015-11-20 (Friday) 11:58
>> To: dev <de...@storm.apache.org>
>> Subject: Re: [DISCUSS] Storm 2.0 plan
>> Hi All,
>> 
>> I think there may be some improper use of words or misunderstanding.
>> 
>> Here are the goals that I see both sides agreeing on:
>> 1. We want to migrate clojure to java
>> 2. We want to merge important features together.
>> 3. We want to do this in a step by step, transparent, reviewable way,
>> especially with close examination and reviews for code that has
>> architecture change.
>> 4. We want the final version to remain compatible.
>> 
>> The only difference is the process we use to achieve these goals.
>> Longda's view:
>> 1. Do a parallel migration from clojure core to java, part by part.
>> "Parallel" means equivalent: no new features are added in this step. He
>> suggests following the order "modules/client/utils/nimbus/supervisor/drpc/
>> worker & task/web ui/others".
>> He uses the word "copy", which I think is improper. It is more like a
>> merge.
>> Quoting his words:
>> 
>>>  2.1 Copy modules from JStorm, one module from one module
>>>  2.2 The sequence is extern modules/client/utils/nimbus/
>>>  supervisor/drpc/worker & task/web ui/others
>> 
>> 2. On top of the java core code base, incrementally add new feature blocks.
>> Quoting his words:
>> 
>>>  3.1 Discuss solution for each difference(jira)
>>>  3.2 Once the solution is finalized, we can start the
>>>  merging. (Some issues could be start concurrently. It
>>>  depends on the discussion.)
>> 
>> 3. His goal is to remain compatible. "this version is stable and
>> compatible with API of Storm 1.0." is not an accurate statement from my
>> point of view, at least not for the security feature.
>> 4. He shares his concern about other streaming engines.
>> 
>> 
>> Bobby and Jungtaek's view:
>> 1. "Copy" is not acceptable; it will impact the security features. ("Copy"
>> is the wrong phrase to use; I think Longda means something more like merging.)
>> 2. With the JStorm team, we start with the clojure -> java translation first.
>> 3. Optimistically, with the JStorm team, one month should be enough for
>> the above stage.
>> 4. Add new features after the whole code base is migrated to java.
>> 5. No need to worry that much about other engines.
>> 
>> If my understanding of both parties is correct, I think we agree on most
>> things about the process.
>> first: clojure -> java
>> second: merge features.
>> 
>> There is a slight difference about how aggressively we want to do "clojure ->
>> java", and how long it takes.
>> 
>> 
>> @Longda, can you clarify whether my understanding of your opinion is
>> right?
>> 
>> 
>> Sean
>> 
>> 
>> On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com>
>> wrote:
>> 
>>>  Very well stated Jungtaek.
>>> 
>>>  I should also point out that there's nothing stopping the JStorm team from
>>>  releasing new versions of JStorm, or adding new features. But you would
>>>  have to be careful to note that any such release is "JStorm" and not
>>>  "Apache Storm." And any such release cannot be hosted on Apache
>>>  infrastructure.
>>> 
>>>  We also shouldn't be too worried about competition with other stream
>>>  processing frameworks. Competition is healthy and leads to improvements
>>>  across the board. Spark Streaming borrowed ideas from Storm for its Kafka
>>>  integration. It also borrowed memory management ideas from Flink. I don't
>>>  see that as a problem. This is open source. We can, and should, do the same
>>>  where applicable.
>>> 
>>>  Did we learn anything from the Heron paper? Nothing we didn't already
>>>  know. And a lot of the points have been addressed. We dealt with security first,
>>>  which is more important for adoption, especially in the enterprise. Now
>>>  we've addressed many performance, scaling, and usability issues. Most of
>>>  the production deployments I've seen are nowhere near the magnitude of what
>>>  Twitter requires. But I've seen many deployments that only exist because
>>>  we offer security. I doubt Heron has that.
>>> 
>>>  We've also seen an uptick in community and developer involvement, which
>>>  means a likely increase in committers, which likely means a faster
>>>  turnaround for patch reviews, which means a tighter release cycle for new
>>>  features, which means we will be moving faster. This is healthy for an
>>>  Apache project.
>>> 
>>>  And the inclusion of the JStorm team will only make that more so.
>>> 
>>>  I feel we are headed in the right direction, and there are good things to
>>>  come.
>>> 
>>>  -Taylor
>>> 
>>> 
>>>  > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
>>>  >
>>>  > Sorry Longda, but I can't help saying that I also disagree with
>>>  changing the codebase.
>>>  >
>>>  > The feature matrix shows us how far Apache Storm and JStorm have diverged,
>>>  just from the feature point of view. It would not be safe to change even if
>>>  the feature matrices were identical, because a feature matrix doesn't contain
>>>  the details.
>>>  >
>>>  > I mean, users could be scared when expected behaviors are not in place,
>>>  even if the differences are small. User experience is one of the most important
>>>  parts of the project, and if the UX changes are huge, upgrading
>>>  their Storm cluster to 2.0 would not be much easier than migrating to Heron.
>>>  That would be the worst scenario I can imagine after merging.
>>>  >
>>>  > The safest way to merge is applying JStorm's great features to Apache
>>>  Storm.
>>>  > I think porting Apache Storm to Java is not tightly related
>>>  to merging JStorm. I agree that the merge is a trigger, but Apache Storm
>>>  itself could be ported to other languages like Java, Scala, or something else
>>>  that is more popular than Clojure.
>>>  >
>>>  > And I'm also not scared of Flink, Heron, Spark, etc.
>>>  > That doesn't mean other projects are not greater than Storm. I'm just
>>>  saying each project has its own strengths.
>>>  > For example, all conferences are talking about Spark, and as a user
>>>  of Spark, I find it really great. If you are a little bit familiar with
>>>  Scala, you can just apply Scala-like functional methods to RDD. Really easy
>>>  to use.
>>>  > But that doesn't mean that Spark can replace Storm in all kinds of use
>>>  cases. Recently I've seen some articles on why Storm is preferred for
>>>  realtime stream processing.
>>>  >
>>>  > Competition should give us a positive motivation. I hope that our
>>>  roadmap isn't focused on defeating competitors, but is focused on presenting
>>>  great features, better performance, and better UX to the Storm community. It's
>>>  not a commercial product, it's an open source project!
>>>  >
>>>  > tl;dr. Please don't change codebase unless we plan to release a brand
>>>  new project. It breaks UX completely which could make users leave.
>>>  >
>>>  > I'm also open to other opinions as well.
>>>  >
>>>  > Best,
>>>  > Jungtaek Lim (HeartSaVioR)
>>>  >
>>>  >
>>>  > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
>>>  >> I disagree completely.  You claim that JStorm is compatible with storm
>>>  1.0.  I don't believe that it is 100% compatible.  There has been more than
>>>  2 years of software development happening on both sides.  Security was not
>>>  done in a day, and porting it over to JStorm is not going to happen
>>>  quickly, and because of the major architectural changes between storm and
>>>  JStorm I believe we would have to make some serious enhancements to fully
>>>  support a secure TopologyMaster, but I need to look into it more.  The blob
>>>  store is another piece of code that has taken a very long time to develop.
>>>  There are numerous others.  The big features are not the ones that make me
>>>  nervous because we can plan for them, it is the hundreds of small JIRA and
>>>  features that will result in minor incompatibilities.  If we start with
>>>  storm itself, and follow the same process that we have been doing up until
>>>  now, if there is a reason to stop the port, add in an important feature, and
>>>  do a release, we can.  We will know that we have compatibility, whereas starting
>>>  with JStorm we know from the start that we do not, without adding features X,
>>>  Y, Z, ....
>>>  >>
>>>  >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
>>>  etc...  We just did some major performance enhancements that will be
>>>  released with STORM 1.0.  We now have up to 6x the throughput that we had
>>>  before with minimal changes to the latency (20 ms vs 5 ms).  We have
>>>  automatic back-pressure so if someone was running with acking enabled just
>>>  for flow control they can now process close to 16x the throughput they
>>>  could before with the same hardware.  This puts our throughput very much on
>>>  par with flink and Spark, but with a much lower latency compared to either
>>>  of them.  Plus from what I have heard Flink is still calling the streaming
>>>  API beta, and their storm API compatibility is very rudimentary.  They are
>>>  also going to have more and more problems maintaining compatibility as we
>>>  add in new features and functionality.
>>>  >>
>>>  >> Spark only really works well when it is running with several seconds of
>>>  latency. Not everyone needs sub-second processing, but when your platform
>>>  is completely unable to handle it, it locks you out of a lot of use cases.
>>>  Their throughput is decent and can scale very high when you are willing to
>>>  tolerate similarly very high latencies.
>>>  >> Who knows about Heron until they actually release their code, but it is
>>>  missing lots of critical features, and the one they touted, better
>>>  performance, is a moot point with storm 1.0.  The only thing we really are
>>>  lacking is advertising, we don't have a big company really pushing storm
>>>  and getting it in the news all the time (Sorry Hortonworks, but I really
>>>  have not seen much about it in the news).  I am trying to do more, but
>>>  there is only so much I can do.
>>>  >> Longda I very much agree with you about moving quickly to make the
>>>  transition, but I do not believe in any way that starting with JStorm is
>>>  going to reduce that transition time.
>>>  >> My proposal is to give everyone about 2 weeks to finish merging new
>>>  features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
>>>  for a release.  At the same time development work to port storm to java
>>>  begins.  You said it took 4 developers 1 year to port storm to java the
>>>  first time for JStorm.  We have 14+ active developers and over one hundred
>>>  contributors not including those from the JStorm community.  If numbers
>>>  scale linearly, I know they don't completely, we should be able to do a
>>>  complete port with no JStorm reference in around 100 days.  With a copy and
>>>  paste for a lot of this from the JStorm codebase, I would expect to be able
>>>  to do it in 1 month of development, possibly less if the JStorm community
>>>  can really help out too.  So by January we should be ready to begin pulling
>>>  in features from JStorm that make sense.  Looking at the feature matrix in
>>>  https://github.com/apache/storm/pull/877 there are a few potentially big
>>>  improvements that we would want to pull in, but they require architectural
>>>  changes in some cases that I don't want to just do lightly.  I would
>>>  propose that once the code has been ported to java we reopen for all new
>>>  features in parallel with the JStorm feature migration, but I am open to
>>>  others' opinions as well.
>>>  >>  - Bobby
>>>  >>
>>>  >>
>>>  >>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <
>>>  zhongyan.feng@alibaba-inc.com> wrote:
>>>  >>
>>>  >>
>>>  >>  Sorry for changing the Subject.
>>>  >>
>>>  >> I am +1 for releasing Storm 2.0 with java core, which is merged with
>>>  JStorm.
>>>  >>
>>>  >> I think the change of this release will be the biggest one in history.
>>>  It will probably take a long time to develop. At the same time, Heron is
>>>  going to open source, and the latest release of Flink provides the
>>>  compatibility to Storm’s API. These might be the threat to Storm. So I
>>>  suggest we start the development of Storm 2.0 as quickly as possible. In
>>>  order to accelerate the development cycle, I proposed to take JStorm 2.1.0
>>>  core and UI as the base version since this version is stable and compatible
>>>  with API of Storm 1.0. Please refer to the phases below for the detailed
>>>  merging plan.
>>>  >>
>>>  >> Note: We provide a demo of JStorm’s web UI. Please refer to
>>>  storm.taobao.org . I think JStorm will give a totally different view to
>>>  you.
>>>  >>
>>>  >> I would like to share the experience of initial development of JStorm
>>>  (Migrate from clojure core to java core).
>>>  >> Our team(4 developers) have spent almost one year to finish the
>>>  migration. We took 4 months to release the first JStorm version, and 6
>>>  months to make JStorm stable. During this period, we tried to switch more
>>>  than 100 online applications with different scenarios from Storm to JStorm,
>>>  and many bugs were fixed. Then more and more applications were switched to
>>>  JStorm in Alibaba.
>>>  >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
>>>  2000+ applications are running on them. The JStorm Clusters here can handle
>>>  1.5 PB/2 Trillion messages per day. The use cases are not only in BigData
>>>  field but also in many other online scenarios.
>>>  >> Besides that, we have gone through the November 11th Shopping Festival of
>>>  Alibaba for the last three years. On that day, the computation in our cluster
>>>  increased to several times the usual level. All applications worked well during the
>>>  peak time. I can say the stability of JStorm is beyond doubt today. Actually,
>>>  besides Alibaba, the most powerful Chinese IT companies are also using JStorm.
>>>  >>
>>>  >>
>>>  >> Phase 1:
>>>  >>
>>>  >> Define the target of Storm 2.0. List the requirement of Storm 2.0
>>>  >> 1. Open a new Umbrella Jira (
>>>  https://issues.apache.org/jira/browse/STORM-717)
>>>  >> 2. Create one 2.0 branch,
>>>  >> 2.1 Copy modules from JStorm, one module from one module
>>>  >> 2.2 The sequence is extern
>>>  modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
>>>  >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
>>>  >> 3.1 Discuss solution for each difference(jira)
>>>  >> 3.2 Once the solution is finalized, we can start the merging. (Some
>>>  issues could be start concurrently. It depends on the discussion.)
>>>  >>
>>>  >> This phase mainly tries to define the target and finalize the solution.
>>>  Hopefully this phase could be finished in 2 months (before 2016/1/31).
>>>  >>
>>>  >>
>>>  >> Phase 2:
>>>  >> Release Storm 2.0 beta
>>>  >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
>>>  >> 2. Integrate all modules, make the simplest storm example can run on
>>>  the system.
>>>  >> 3. Test with all example and modules in Storm code base.
>>>  >> 4. All daily test can be passed.
>>>  >>
>>>  >> Hopefully this phase could be finished in 2 months (before 2016/3/31)
>>>  >>
>>>  >>
>>>  >> Phase 3:
>>>  >> Persuade some user to have a try.
>>>  >> Alibaba will try to run some online applications on the beta version
>>>  >>
>>>  >> Hopefully this phase could be finished in 1 month (before 2016/4/30).
>>>  >>
>>>  >>
>>>  >> Any comments are welcome.
>>>  >>
>>>  >>
>>>  >> Thanks
>>>  >>
>>>  >> Longda
>>>  >> ------------------------------------------------------------------
>>>  >> From: P. Taylor Goetz <pt...@gmail.com>
>>>  >> Send Time: 2015-11-19 (Thursday) 06:23
>>>  >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
>>>  >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>>>  >> All I have at this point is a placeholder wiki entry [1], and a lot of
>>>  local notes that likely would only make sense to me.
>>>  >>
>>>  >> Let me know your wiki username and I’ll give you permissions. The same
>>>  goes for anyone else who wants to help.
>>>  >>
>>>  >> -Taylor
>>>  >>
>>>  >> [1]
>>>  https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
>>>  >>
>>>  >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
>>>  wrote:
>>>  >> >
>>>  >> > Taylor and others I was hoping to get started filing JIRA and
>>>  planning on how we are going to do the java migration + JStorm merger.  Is
>>>  anyone else starting to do this?  If not would anyone object to me starting
>>>  on it? - Bobby
>>>  >> >
>>>  >> >
>>>  >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
>>>  ptgoetz@gmail.com> wrote:
>>>  >> >
>>>  >> >
>>>  >> > Thanks for putting this together Basti, that comparison helps a lot.
>>>  >> >
>>>  >> > And thanks Bobby for converting it into markdown. I was going to just
>>>  attach the spreadsheet to JIRA, but markdown is a much better solution.
>>>  >> >
>>>  >> > -Taylor
>>>  >> >
>>>  >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
>>>  <ev...@yahoo-inc.com.INVALID> wrote:
>>>  >> >>
>>>  >> >> I translated the excel spreadsheet into a markdown file and put up a
>>>  pull request for it.
>>>  >> >> https://github.com/apache/storm/pull/877
>>>  >> >> I did a few edits to it to make it work with Markdown, and to add in
>>>  a few of my own comments.  I also put in a field for JIRAs to be able to
>>>  track the migration.
>>>  >> >> Overall I think your evaluation was very good.  We have a fair
>>>  amount of work ahead of us to decide what version of various features we
>>>  want to go forward with.
>>>  >> >>   - Bobby
>>>  >> >>
>>>  >> >>
>>>  >> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
>>>  basti.lj@alibaba-inc.com> wrote:
>>>  >> >>
>>>  >> >>
>>>  >> >> Hi Bobby & Jungtaek,
>>>  >> >>
>>>  >> >> Thanks for your reply.
>>>  >> >> I totally agree that compatibility is the most important thing.
>>>  Actually, JStorm has been compatible with the user API of Storm.
>>>  >> >> As you mentioned below, we indeed still have some features different
>>>  between Storm and JStorm. I have tried to list them (minor update or
>>>  improvements are not included).
>>>  >> >> Please refer to attachment for details. If any missing, please help
>>>  to point out. (The current working features are probably missing here.)
>>>  >> >> Just have a look at these differences. For the missing features in
>>>  JStorm, I did not see any obstacle which will block the merge to JStorm.
>>>  >> >> For the features which have different solutions in Storm and
>>>  JStorm, we can evaluate the solutions one by one to decide which one is
>>>  appropriate.
>>>  >> >> After the finalization of evaluation, I think JStorm team can take
>>>  the merging job and publish a stable release in 2 months.
>>>  >> >> But anyway, the detailed implementation for these features with
>>>  different solution is transparent to user. So, from user's point of view,
>>>  there is not any compatibility problem.
>>>  >> >>
>>>  >> >> Besides compatibility, by our experience, stability is also
>>>  important and is not an easy job. 4 people in JStorm team took almost one
>>>  year to finish the porting from "clojure core"
>>>  >> >> to "java core", and to make it stable. Of course, we have many devs
>>>  in community to make the porting job faster. But it still needs a long time
>>>  to run many online complex topologies to find bugs and fix them. So, that is
>>>  the reason why I proposed to do merging and build on a stable "java core".
>>>  >> >>
>>>  >> >> -----Original Message-----
>>>  >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
>>>  >> >> Sent: Wednesday, November 11, 2015 10:51 PM
>>>  >> >> To: dev@storm.apache.org
>>>  >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>>>  >> >>
>>>  >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
>>>  Migrating the APIs to org.apache.storm is a big non-backwards compatible
>>>  move, and a major version bump to 2.x seems like a good move there.
>>>  >> >> +1 for the release plan
>>>  >> >>
>>>  >> >> I would like the move for user facing APIs to org.apache to be one
>>>  of the last things we do.  Translating clojure code into java and moving it
>>>  to org.apache I am not too concerned about.
>>>  >> >>
>>>  >> >> Basti,
>>>  >> >> We have two code bases that have diverged significantly from one
>>>  another in terms of functionality.  The storm code now or soon will have A
>>>  Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
>>>  Scheduling, a distributed cache like API, log searching, security, massive
>>>  performance improvements, shaded almost all of our dependencies, a REST API
>>>  for programmatically accessing everything on the UI, and I am sure I am
>>>  missing a few other things.  JStorm also has many changes including cgroup
>>>  isolation, restructured zookeeper layout, classpath isolation, and more too.
>>>  >> >> No matter what we do it will be a large effort to port changes from
>>>  one code base to another, and from clojure to java.  I proposed this
>>>  initially because it can be broken up into incremental changes.  It may
>>>  take a little longer, but we will always have a working codebase that is
>>>  testable and compatible with the current storm release, at least until we
>>>  move the user facing APIs to be under org.apache.  This lets the community
>>>  continue to build and test the master branch and report problems that they
>>>  find, which is incredibly valuable.  I personally don't think it will be
>>>  much easier, especially if we are intent on always maintaining
>>>  compatibility with storm. - Bobby
>>>  >> >>
>>>  >> >>
>>>  >> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
>>>  basti.lj@alibaba-inc.com> wrote:
>>>  >> >>
>>>  >> >>
>>>  >> >> Hi Taylor,
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
>>>  >> >>
>>>  >> >> Do you mean community plan to create a fresh new “java core” based
>>>  on current “clojure core” firstly, and then migrate the features from
>>>  JStorm?
>>>  >> >>
>>>  >> >> If so, it confused me.  It is really a huge job which might require
>>>  a long developing time to make it stable, while JStorm is already a stable
>>>  version.
>>>  >> >>
>>>  >> >> The release planned for after Nov 11th has already been running
>>>  stably online for several months in Alibaba.
>>>  >> >>
>>>  >> >> Besides this, there are many valuable internal requirements in
>>>  Alibaba, so the fast evolution of JStorm is foreseeable in the next few months.
>>>  >> >>
>>>  >> >> If the “java core” is totally fresh new, it might bring many
>>>  problems for the coming merge.
>>>  >> >>
>>>  >> >> So, from the point of this view,  I think it is much better and
>>>  easier to migrate the features of “clojure core” basing on JStorm for the
>>>  “java core”.
>>>  >> >>
>>>  >> >> Please correct me, if any misunderstanding.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Regards
>>>  >> >>
>>>  >> >> Basti
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
>>>  >> >> Sent: November 11, 2015 5:32
>>>  >> >> To: dev@storm.apache.org
>>>  >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Based on a number of discussions regarding merging the JStorm code,
>>>  I’ve tried to distill the ideas presented and inserted some of my own. The
>>>  result is below.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> I’ve divided the plan into three phases, though they are not
>>>  necessarily sequential — obviously some tasks can take place in parallel.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> None of this is set in stone, just presented for discussion. Any and
>>>  all comments are welcome.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> -------
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Phase 1 - Plan for 0.11.x Release
>>>  >> >>
>>>  >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
>>>  >> >>
>>>  >> >> 2. Announce feature-freeze for 0.11.x
>>>  >> >>
>>>  >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
>>>  >> >>
>>>  >> >> 4. Release 0.11.0 (or whatever version # we want to use)
>>>  >> >>
>>>  >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
>>>  >> >>
>>>  >> >> 1. Determine/document unique features in JStorm (e.g. classpath
>>>  isolation, cgroups, etc.) and create JIRA for migrating the feature.
>>>  >> >>
>>>  >> >> 2. Create JIRA for migrating each clojure component (or logical
>>>  group of components) to Java. Assumes tests will be ported as well.
>>>  >> >>
>>>  >> >> 3. Discuss/establish style guide for Java coding conventions.
>>>  Consider using Oracle’s or Google’s Java conventions as a base — they are
>>>  both pretty solid.
>>>  >> >>
>>>  >> >> 4. Align package names (e.g. backtype.storm --> org.apache.storm /
>>>  com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Phase 3 - Migrate Clojure --> Java
>>>  >> >>
>>>  >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
>>>  possible (core functionality only, features distinct to JStorm migrated
>>>  separately).
>>>  >> >>
>>>  >> >> 2. Port JStorm-specific features.
>>>  >> >>
>>>  >> >> 3. Begin releasing preview/beta versions.
>>>  >> >>
>>>  >> >> 4. Code cleanup (across the board) and refactoring using established
>>>  coding conventions, and leveraging PMD/Checkstyle reports for reference.
>>>  (Note: good opportunity for new contributors.)
>>>  >> >>
>>>  >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
>>>  feature freeze.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Notes:
>>>  >> >>
>>>  >> >> We should consider bumping up to version 1.0 sometime soon and then
>>>  switching to semantic versioning [3] from then on.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> With the exception of package name alignment, the "jstorm-import"
>>>  branch will largely be read-only throughout the process.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> During migration, it's probably easiest to operate with two local
>>>  clones of the Apache Storm repo: one for working (i.e. checked out to
>>>  working branch) and one for reference/copying (i.e. checked out to
>>>  "jstorm-import").
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Feature-freeze probably only needs to be enforced against core
>>>  functionality. Components under "external" can likely be exempt, but we
>>>  should figure out a process for accepting and releasing new features during
>>>  the migration.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> Performance testing should be continuous throughout the process.
>>>  Since we don't really have ASF infrastructure for performance testing, we
>>>  will need a volunteer(s) to host and run the performance tests. Performance
>>>  test results can be posted to the wiki [2]. It would probably be a good
>>>  idea to establish a baseline with the 0.10.0 release.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> I’ve attached an analysis document Sean Zhong put together a while
>>>  back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
>>>  release but is still relevant and has a lot of good information.
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> [1]
>>>  https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
>>>  >> >>
>>>  >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
>>>  >> >>
>>>  >> >> [3] http://semver.org
>>>  >> >>
>>>  >> >> [4] https://issues.apache.org/jira/browse/STORM-717
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >> -Taylor
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >>
>>>  >> >
>>>  >> >
>>>  >
>>>  >
>>>  >
>>>  > --
>>>  > Name : 임 정택
>>>  > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
>>>  > Twitter : http://twitter.com/heartsavior
>>>  > LinkedIn : http://www.linkedin.com/in/heartsavior
>>> 
>> 
> 
> 
> 

Re: [DISCUSS] Storm 2.0 plan

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
I just finished updating the wiki with a rough outline of what I see as a plan.  It is mostly covered by two large tables, one covering clojure migration, and the other covering feature porting from JStorm to storm. 

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109

Any feedback is welcome.  I hope to start filing some JIRAs for this work over the weekend or on Monday.  I will make sure to tag everything with something that should let us all see and track our progress. - Bobby 


    On Friday, November 20, 2015 10:35 AM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
 

 Longda,
Thanks for the clarification.  I think you stated things very well.  I am 100% behind you on the merger part, and my team is very close to being ready to work on the merge 100% as well.  The important points for me are that we do it in the open, following the storm bylaws, that we never break the storm build, and that we move quickly.  This gives us the ability to constantly test and verify that we are not breaking compatibility or regressing performance, and that we don't disrupt the community too much for too long.  I don't care about the order that we do things in or the ultimate language that we end up using.  If you have already started on some of this work that is great, but please file JIRAs for what you are currently doing so we are all aware of it and don't end up with duplicated efforts.  Hopefully soon we will do the 1.0 branch and we can start reviewing/merging the code changes you have been working on for the migration.

- Bobby 


    On Friday, November 20, 2015 9:57 AM, Harsha <ma...@harsha.io> wrote:
 

 Hi All,
          If possible can we have bi-weekly or monthly video hangouts to
          discuss the plan. I think it will make it easier to discuss
          the next steps. We can post the details of the discussion on
          the mailing list so that everyone is involved in what's going on.

Thanks,
Harsha
On Fri, Nov 20, 2015, at 01:03 AM, Longda Feng wrote:
> 
> @Sean, thanks for clarifying.
> @Taylor, @Bobby, @Sean, @Jungtaek, @Harsha, @dev,
> Sorry for causing a misunderstanding.
> The biggest point: We would like to merge the two communities into one.
> One community is stronger than two separate communities. My team
> hopes that Alibaba can directly use the Apache Storm version in the next
> few years, so that my team no longer needs to maintain JStorm. This is the
> reason why Alibaba donated JStorm.
> Second point: Sean's point is right. The migration is not just a "copy". It
> should be a "merge". I mean that a module will not simply be the JStorm
> module. It should be the result of our discussion. I think the final
> solution after merging can make Storm better.
> Third point: In fact, I am not scared of other stream processing systems,
> especially Heron. I have worked on Storm for 4 years, and I am a big fan of
> Storm. I know what Storm can do and what Storm cannot do. But I want to
> stress that we need to accelerate our pace of evolution. This field is so
> active. We should start to learn from other frameworks' advantages as soon
> as possible. In particular, we need more application-level programming
> frameworks like Trident. This will attract more users to Storm.
> Fourth point: We don't need to do everything from scratch; we can use
> JStorm as much as possible. JStorm is here, so why not use it?
> Last point: My team is already working full time on this merge. We will try
> our best to contribute and make Storm better.
> 
> Thanks
> Longda
> 
> ------------------------------------------------------------------
> From: Sean Zhong <cl...@gmail.com>
> Send Time: Friday, November 20, 2015 11:58
> To: dev <de...@storm.apache.org>
> Subject: Re: [DISCUSS] Storm 2.0 plan
> Hi All,
> 
> I think there may be some improper use of words or some misunderstanding.
> 
> Here are the goals that I see both sides agreeing on:
> 1. We want to migrate clojure to java
> 2. We want to merge important features together.
> 3. We want to do this in a step by step, transparent, reviewable way,
> especially with close examination and reviews for code that has
> architecture change.
> 4. We want the final version to remain compatible.
> 
> The only difference is the process we use to achieve these goals.
> Longda's view:
> 1. Do a parallel migration from clojure core to java, part by part.
> "Parallel" means equivalent: no new features are added in this step. He
> suggests following the order "modules/client/utils/nimbus/supervisor/drpc/
> worker & task/web ui/others".
> He uses the word "copy", which is improper in my opinion. It is more like a
> merge.
> Quoting his words:
> 
> > 2.1 Copy modules from JStorm, one module from one module
> > 2.2 The sequence is extern modules/client/utils/nimbus/
> > supervisor/drpc/worker & task/web ui/others
> 
> 2. On top of the java core code base, incrementally add new feature blocks.
> Quoting his words:
> 
> > 3.1 Discuss solution for each difference(jira)
> > 3.2 Once the solution is finalized, we can start the
> > merging. (Some issues could be start concurrently. It
> > depends on the discussion.)
> 
> 3. His goal is to remain compatible. "this version is stable and
> compatible with API of Storm 1.0." is not an accurate statement in my
> view, at least not for the security feature.
> 4. He shares his concern about other streaming engines.
> 
> 
> Bobby and Jungtaek's view:
> 1. "Copy" is not acceptable; it will impact the security features. ("Copy"
> is the wrong phrase to use; I think Longda means something more like a merge.)
> 2. With the JStorm team, we start with the clojure -> java translation first.
> 3. Optimistically, with the JStorm team, one month should be enough for
> the above stage.
> 4. Add new features after the whole code base is migrated to java.
> 5. No need to worry that much about other engines.
> 
> If my understanding of both parties is correct, I think we agree on most
> things about the process:
> first: clojure -> java
> second: merge features.
> 
> There is a slight difference about how aggressively we want to do "clojure ->
> java", and how long it takes.
> 
> 
> @Longda, can you clarify whether my understanding of your opinion is
> right?
> 
> 
> Sean
> 
> 
> On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com>
> wrote:
> 
> > Very well stated, Jungtaek.
> >
> > I should also point out that there's nothing stopping the JStorm team from
> > releasing new versions of JStorm, or adding new features. But you would
> > have to be careful to note that any such release is "JStorm" and not
> > "Apache Storm." And any such release cannot be hosted on Apache
> > infrastructure.
> >
> > We also shouldn't be too worried about competition with other stream
> > processing frameworks. Competition is healthy and leads to improvements
> > across the board. Spark Streaming borrowed ideas from Storm for its Kafka
> > integration. It also borrowed memory management ideas from Flink. I don't
> > see that as a problem. This is open source. We can, and should, do the same
> > where applicable.
> >
> > Did we learn anything from the Heron paper? Nothing we didn't already
> > know. And a lot of the points have been addressed. We dealt with security first,
> > which is more important for adoption, especially in the enterprise. Now
> > we've addressed many performance, scaling, and usability issues. Most of
> > the production deployments I've seen are nowhere near the magnitude of what
> > Twitter requires. But I've seen many deployments that only exist because
> > we offer security. I doubt Heron has that.
> >
> > We've also seen an uptick in community and developer involvement, which
> > means a likely increase in committers, which likely means a faster
> > turnaround for patch reviews, which means a tighter release cycle for new
> > features, which means we will be moving faster. This is healthy for an
> > Apache project.
> >
> > And the inclusion of the JStorm team will only make that more so.
> >
> > I feel we are headed in the right direction, and there are good things to
> > come.
> >
> > -Taylor
> >
> >
> > > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> > >
> > > Sorry Longda, but I have to say that I also disagree with
> > changing the codebase.
> > >
> > > The feature matrix shows us how far Apache Storm and JStorm have diverged,
> > just from a feature point of view. Changing would not be safe even if the
> > feature matrices were identical, because a feature matrix doesn't contain the
> > details.
> > >
> > > I mean, users could be scared when expected behaviors are not in place,
> > even if the differences are small. User experience is one of the most important
> > parts of the project, and if the UX changes are huge, the barrier to upgrading
> > their Storm cluster to 2.0 is not much lower than that of migrating to Heron. That
> > would be the worst scenario I can imagine after merging.
> > >
> > > The safest way to merge is applying JStorm's great features to Apache
> > Storm.
> > > I think porting Apache Storm to Java is not tightly related
> > to merging JStorm. I agree that the merge is a trigger, but Apache Storm
> > itself could be ported to other languages like Java, Scala, or something else
> > that is more popular than Clojure.
> > >
> > > And I'm also not scared of Flink, Heron, Spark, etc.
> > > That doesn't mean other projects are not greater than Storm. I'm just
> > saying each project has its own strengths.
> > > For example, all conferences are talking about Spark, and as a user
> > of Spark, I find Spark really great. If you are a little bit familiar with
> > Scala, you can just apply Scala-like functional methods to RDD. Really easy
> > to use.
> > > But it doesn't mean that Spark can replace Storm in all kinds of use
> > cases. Recently I've seen some articles on why Storm is preferred for
> > realtime stream processing.
> > >
> > > Competition should give us positive motivation. I hope that our
> > roadmap isn't focused on defeating competitors, but on presenting
> > great features, better performance, and better UX to the Storm community. It's
> > not a commercial product, it's an open source project!
> > >
> > > tl;dr. Please don't change codebase unless we plan to release a brand
> > new project. It breaks UX completely which could make users leave.
> > >
> > > I'm also open to other opinions as well.
> > >
> > > Best,
> > > Jungtaek Lim (HeartSaVioR)
> > >
> > >
> > > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
> > >> I disagree completely.  You claim that JStorm is compatible with storm
> > 1.0.  I don't believe that it is 100% compatible.  There has been more than
> > 2 years of software development happening on both sides.  Security was not
> > done in a day, and porting it over to JStorm is not going to happen
> > quickly, and because of the major architectural changes between storm and
> > JStorm I believe we would have to make some serious enhancements to fully
> > support a secure TopologyMaster, but I need to look into it more.  The blob
> > store is another piece of code that has taken a very long time to develop.
> > There are numerous others.  The big features are not the ones that make me
> > nervous because we can plan for them, it is the hundreds of small JIRA and
> > features that will result in minor incompatibilities.  If we start with
> > storm itself, and follow the same process that we have been doing up until
> > now, then if there is a reason to stop the port, add in an important feature, and
> > do a release, we can.  We will know that we have compatibility, whereas starting
> > with JStorm we know from the start that we do not without adding feature X,
> > Y, Z, ....
> > >>
> > >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> > etc...  We just did some major performance enhancements that will be
> > released with STORM 1.0.  We now have up to 6x the throughput that we had
> > before with minimal changes to the latency (20 ms vs 5 ms).  We have
> > automatic back-pressure so if someone was running with acking enabled just
> > for flow control they can now process close to 16x the throughput they
> > could before with the same hardware.  This puts our throughput very much on
> > par with Flink and Spark, but with a much lower latency compared to either
> > of them.  Plus from what I have heard Flink is still calling the streaming
> > API beta, and their storm API compatibility is very rudimentary.  They are
> > also going to have more and more problems maintaining compatibility as we
> > add in new features and functionality.
> > >>
> > >> Spark only really works well when it is running with several seconds of
> > latency. Not everyone needs sub-second processing, but when your platform
> > is completely unable to handle it, that locks you out of a lot of use cases.
> > Their throughput is decent and can scale very high when you are willing to
> > tolerate similarly very high latencies.
> > >> Who knows about Heron until they actually release their code, but it is
> > missing lots of critical features, and the one they touted, better
> > performance, is a moot point with storm 1.0.  The only thing we really are
> > lacking is advertising, we don't have a big company really pushing storm
> > and getting it in the news all the time (Sorry Hortonworks, but I really
> > have not seen much about it in the news).  I am trying to do more, but
> > there is only so much I can do.
> > >> Longda I very much agree with you about moving quickly to make the
> > transition, but I do not believe in any way that starting with JStorm is
> > going to reduce that transition time.
> > >> My proposal is to give everyone about 2 weeks to finish merging new
> > features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> > for a release.  At the same time development work to port storm to java
> > begins.  You said it took 4 developers 1 year to port storm to java the
> > first time for JStorm.  We have 14+ active developers and over one hundred
> > contributors not including those from the JStorm community.  If numbers
> > scale linearly, I know they don't completely, we should be able to do a
> > complete port with no JStorm reference in around 100 days.  With a copy and
> > paste for a lot of this from the JStorm codebase, I would expect to be able
> > to do it in 1 month of development, possibly less if the JStorm community
> > can really help out too.  So by January we should be ready to begin pulling
> > in features from JStorm that make sense.  Looking at the feature matrix in
> > https://github.com/apache/storm/pull/877 there are a few potentially big
> > improvements that we would want to pull in, but they require architectural
> > changes in some cases that I don't want to just do lightly.  I would
> > propose that once the code has been ported to java we reopen for all new
> > features in parallel with the JStorm feature migration, but I am open to
> > others' opinions as well.
> > >>  - Bobby
> > >>
> > >>
> > >>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <
> > zhongyan.feng@alibaba-inc.com> wrote:
> > >>
> > >>
> > >>  Sorry for changing the Subject.
> > >>
> > >> I am +1 for releasing Storm 2.0 with java core, which is merged with
> > JStorm.
> > >>
> > >> I think the change of this release will be the biggest one in history.
> > It will probably take a long time to develop. At the same time, Heron is
> > going to open source, and the latest release of Flink provides the
> > compatibility to Storm’s API. These might be the threat to Storm. So I
> > suggest we start the development of Storm 2.0 as quickly as possible. In
> > order to accelerate the development cycle, I proposed to take JStorm 2.1.0
> > core and UI as the base version since this version is stable and compatible
> > with API of Storm 1.0. Please refer to the phases below for the detailed
> > merging plan.
> > >>
> > >> Note: We provide a demo of JStorm’s web UI. Please refer to
> > storm.taobao.org . I think JStorm will give a totally different view to
> > you.
> > >>
> > >> I would like to share the experience of initial development of JStorm
> > (Migrate from clojure core to java core).
> > >> Our team (4 developers) spent almost one year finishing the
> > migration. We took 4 months to release the first JStorm version, and 6
> > months to make JStorm stable. During this period, we tried to switch more
> > than 100 online applications with different scenarios from Storm to JStorm,
> > and many bugs were fixed. Then more and more applications were switched to
> > JStorm in Alibaba.
> > >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
> > 2000+ applications are running on them. The JStorm Clusters here can handle
> > 1.5 PB/2 Trillion messages per day. The use cases are not only in BigData
> > field but also in many other online scenarios.
> > >> Besides that, we have been through the November 11th Shopping Festival of
> > Alibaba for the last three years. On that day, the computation in our cluster
> > increased to several times the usual load. All applications worked well during the
> > peak time. I can say there is no doubt about the stability of JStorm today. Actually,
> > besides Alibaba, the most powerful Chinese IT companies are also using JStorm.
> > >>
> > >>
> > >> Phase 1:
> > >>
> > >> Define the target of Storm 2.0. List the requirement of Storm 2.0
> > >> 1. Open a new Umbrella Jira (
> > https://issues.apache.org/jira/browse/STORM-717)
> > >> 2. Create one 2.0 branch,
> > >> 2.1 Copy modules from JStorm, one module from one module
> > >> 2.2 The sequence is extern
> > modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> > >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> > >> 3.1 Discuss solution for each difference(jira)
> > >> 3.2 Once the solution is finalized, we can start the merging. (Some
> > issues could be start concurrently. It depends on the discussion.)
> > >>
> > >> This phase mainly tries to define the target and finalize the solution.
> > Hopefully this phase could be finished in 2 months (before 2016/1/31).
> > >>
> > >>
> > >> Phase 2:
> > >> Release Storm 2.0 beta
> > >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
> > >> 2. Integrate all modules, so that the simplest storm example can run on
> > the system.
> > >> 3. Test with all examples and modules in the Storm code base.
> > >> 4. All daily tests pass.
> > >>
> > >> Hopefully this phase could be finished in 2 months (before 2016/3/31)
> > >>
> > >>
> > >> Phase 3:
> > >> Persuade some users to give it a try.
> > >> Alibaba will try to run some online applications on the beta version
> > >>
> > >> Hopefully this phase could be finished in 1 month (before 2016/4/30).
> > >>
> > >>
> > >> Any comments are welcome.
> > >>
> > >>
> > >> Thanks
> > >>
> > >> Longda
> > >> ------------------------------------------------------------------
> > >> From: P. Taylor Goetz <pt...@gmail.com>
> > >> Send Time: Thursday, November 19, 2015 06:23
> > >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> > >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> > >> All I have at this point is a placeholder wiki entry [1], and a lot of
> > local notes that likely would only make sense to me.
> > >>
> > >> Let me know your wiki username and I’ll give you permissions. The same
> > goes for anyone else who wants to help.
> > >>
> > >> -Taylor
> > >>
> > >> [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> > >>
> > >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
> > wrote:
> > >> >
> > >> > Taylor and others I was hoping to get started filing JIRA and
> > planning on how we are going to do the java migration + JStorm merger.  Is
> > anyone else starting to do this?  If not would anyone object to me starting
> > on it? - Bobby
> > >> >
> > >> >
> > >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> > ptgoetz@gmail.com> wrote:
> > >> >
> > >> >
> > >> > Thanks for putting this together Basti, that comparison helps a lot.
> > >> >
> > >> > And thanks Bobby for converting it into markdown. I was going to just
> > attach the spreadsheet to JIRA, but markdown is a much better solution.
> > >> >
> > >> > -Taylor
> > >> >
> > >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
> > <ev...@yahoo-inc.com.INVALID> wrote:
> > >> >>
> > >> >> I translated the excel spreadsheet into a markdown file and put up a
> > pull request for it.
> > >> >> https://github.com/apache/storm/pull/877
> > >> >> I did a few edits to it to make it work with Markdown, and to add in
> > a few of my own comments.  I also put in a field for JIRAs to be able to
> > track the migration.
> > >> >> Overall I think your evaluation was very good.  We have a fair
> > amount of work ahead of us to decide what version of various features we
> > want to go forward with.
> > >> >>   - Bobby
> > >> >>
> > >> >>
> > >> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> > basti.lj@alibaba-inc.com> wrote:
> > >> >>
> > >> >>
> > >> >> Hi Bobby & Jungtaek,
> > >> >>
> > >> >> Thanks for your reply.
> > >> >> I totally agree that compatibility is the most important thing.
> > Actually, JStorm has been compatible with the user API of Storm.
> > >> >> As you mentioned below, we indeed still have some features different
> > between Storm and JStorm. I have tried to list them (minor update or
> > improvements are not included).
> > >> >> Please refer to the attachment for details. If anything is missing, please help
> > to point it out. (Features currently being worked on are probably missing here.)
> > >> >> Just have a look at these differences. For the features missing in
> > JStorm, I did not see any obstacle which will block merging them into JStorm.
> > >> >> For the features which have different solutions in Storm and
> > JStorm, we can evaluate the solutions one by one to decide which one is
> > appropriate.
> > >> >> After the finalization of evaluation, I think JStorm team can take
> > the merging job and publish a stable release in 2 months.
> > >> >> But anyway, the detailed implementation for these features with
> > different solution is transparent to user. So, from user's point of view,
> > there is not any compatibility problem.
> > >> >>
> > >> >> Besides compatibility, in our experience, stability is also
> > important and is not an easy job. 4 people in the JStorm team took almost one
> > year to finish the porting from "clojure core"
> > >> >> to "java core", and to make it stable. Of course, we have many devs
> > in the community to make the porting job faster. But it still needs a long time
> > to run many complex online topologies to find bugs and fix them. So, that is
> > the reason why I proposed to do the merge and build on a stable "java core".
> > >> >>
> > >> >> -----Original Message-----
> > >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> > >> >> Sent: Wednesday, November 11, 2015 10:51 PM
> > >> >> To: dev@storm.apache.org
> > >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> > >> >>
> > >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> > Migrating the APIs to org.apache.storm is a big non-backwards compatible
> > move, and a major version bump to 2.x seems like a good move there.
> > >> >> +1 for the release plan
> > >> >>
> > >> >> I would like the move for user facing APIs to org.apache to be one
> > of the last things we do.  Translating clojure code into java and moving it
> > to org.apache I am not too concerned about.
> > >> >>
> > >> >> Basti,
> > >> >> We have two code bases that have diverged significantly from one
> > another in terms of functionality.  The storm code now or soon will have A
> > Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> > Scheduling, a distributed cache like API, log searching, security, massive
> > performance improvements, shaded almost all of our dependencies, a REST API
> > for programmatically accessing everything on the UI, and I am sure I am
> > missing a few other things.  JStorm also has many changes including cgroup
> > isolation, restructured zookeeper layout, classpath isolation, and more too.
> > >> >> No matter what we do it will be a large effort to port changes from
> > one code base to another, and from clojure to java.  I proposed this
> > initially because it can be broken up into incremental changes.  It may
> > take a little longer, but we will always have a working codebase that is
> > testable and compatible with the current storm release, at least until we
> > move the user facing APIs to be under org.apache.  This lets the community
> > continue to build and test the master branch and report problems that they
> > find, which is incredibly valuable.  I personally don't think it will be
> > much easier, especially if we are intent on always maintaining
> > compatibility with storm. - Bobby
> > >> >>
> > >> >>
> > >> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
> > basti.lj@alibaba-inc.com> wrote:
> > >> >>
> > >> >>
> > >> >> Hi Taylor,
> > >> >>
> > >> >>
> > >> >>
> > >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> > >> >>
> > >> >> Do you mean the community plans to first create a brand-new “java core” based
> > on the current “clojure core”, and then migrate the features from
> > JStorm?
> > >> >>
> > >> >> If so, that confuses me.  It is really a huge job which might require
> > a long development time to make it stable, while JStorm is already a stable
> > version.
> > >> >>
> > >> >> The release planned for after Nov 11th has already been running
> > stably online for several months in Alibaba.
> > >> >>
> > >> >> Besides this, there are many valuable internal requirements in
> > Alibaba, so the fast evolution of JStorm is foreseeable in the next few months.
> > >> >>
> > >> >> If the “java core” is brand new, it might bring many
> > problems for the coming merge.
> > >> >>
> > >> >> So, from this point of view, I think it is much better and
> > easier to migrate the features of the “clojure core” onto JStorm as the
> > “java core”.
> > >> >>
> > >> >> Please correct me if I have misunderstood anything.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Regards
> > >> >>
> > >> >> Basti
> > >> >>
> > >> >>
> > >> >>
> > >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> > >> >> Sent: November 11, 2015 5:32
> > >> >> To: dev@storm.apache.org
> > >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> > >> >>
> > >> >>
> > >> >>
> > >> >> Based on a number of discussions regarding merging the JStorm code,
> > I’ve tried to distill the ideas presented and inserted some of my own. The
> > result is below.
> > >> >>
> > >> >>
> > >> >>
> > >> >> I’ve divided the plan into three phases, though they are not
> > necessarily sequential — obviously some tasks can take place in parallel.
> > >> >>
> > >> >>
> > >> >>
> > >> >> None of this is set in stone, just presented for discussion. Any and
> > all comments are welcome.
> > >> >>
> > >> >>
> > >> >>
> > >> >> -------
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 1 - Plan for 0.11.x Release
> > >> >>
> > >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> > >> >>
> > >> >> 2. Announce feature-freeze for 0.11.x
> > >> >>
> > >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> > >> >>
> > >> >> 4. Release 0.11.0 (or whatever version # we want to use)
> > >> >>
> > >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> > >> >>
> > >> >> 1. Determine/document unique features in JStorm (e.g. classpath
> > isolation, cgroups, etc.) and create JIRA for migrating the feature.
> > >> >>
> > >> >> 2. Create JIRA for migrating each clojure component (or logical
> > group of components) to Java. Assumes tests will be ported as well.
> > >> >>
> > >> >> 3. Discuss/establish style guide for Java coding conventions.
> > Consider using Oracle’s or Google’s Java conventions as a base — they are
> > both pretty solid.
> > >> >>
> > >> >> 4. Align package names (e.g. backtype.storm --> org.apache.storm /
> > com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
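The package alignment in step 4 can be pictured as a mechanical prefix rewrite over Java sources. The following is only a rough sketch under stated assumptions: the helper name `align_packages` and the sample lines are hypothetical, and a real migration would also have to reconcile classes that exist in both code bases rather than blindly mapping prefixes.

```python
import re

# Old package prefixes named in the plan, both mapped to the Apache namespace.
# Assumption: a plain prefix swap; sub-package layout is left untouched.
PACKAGE_MAP = {
    "backtype.storm": "org.apache.storm",
    "com.alibaba.jstorm": "org.apache.storm",
}

# Match any of the old prefixes as a whole dotted name segment.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(prefix) for prefix in PACKAGE_MAP) + r")\b"
)


def align_packages(java_source: str) -> str:
    """Rewrite old package prefixes in Java source text (hypothetical helper)."""
    return _PATTERN.sub(lambda match: PACKAGE_MAP[match.group(1)], java_source)


# Illustrative input lines only.
sample = (
    "import backtype.storm.topology.TopologyBuilder;\n"
    "package com.alibaba.jstorm.utils;\n"
)
print(align_packages(sample))
```

Because both old prefixes collapse into one namespace, name collisions are the main risk, which is one reason the plan treats the user-facing move as a non-backwards-compatible step.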
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 3 - Migrate Clojure --> Java
> > >> >>
> > >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> > possible (core functionality only, features distinct to JStorm migrated
> > separately).
> > >> >>
> > >> >> 2. Port JStorm-specific features.
> > >> >>
> > >> >> 3. Begin releasing preview/beta versions.
> > >> >>
> > >> >> 4. Code cleanup (across the board) and refactoring using established
> > coding conventions, and leveraging PMD/Checkstyle reports for reference.
> > (Note: good opportunity for new contributors.)
> > >> >>
> > >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> > feature freeze.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Notes:
> > >> >>
> > >> >> We should consider bumping up to version 1.0 sometime soon and then
> > switching to semantic versioning [3] from then on.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> With the exception of package name alignment, the "jstorm-import"
> > branch will largely be read-only throughout the process.
> > >> >>
> > >> >>
> > >> >>
> > >> >> During migration, it's probably easiest to operate with two local
> > clones of the Apache Storm repo: one for working (i.e. checked out to
> > working branch) and one for reference/copying (i.e. checked out to
> > "jstorm-import").
> > >> >>
> > >> >>
> > >> >>
> > >> >> Feature-freeze probably only needs to be enforced against core
> > functionality. Components under "external" can likely be exempt, but we
> > should figure out a process for accepting and releasing new features during
> > the migration.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Performance testing should be continuous throughout the process.
> > Since we don't really have ASF infrastructure for performance testing, we
> > will need a volunteer(s) to host and run the performance tests. Performance
> > test results can be posted to the wiki [2]. It would probably be a good
> > idea to establish a baseline with the 0.10.0 release.
> > >> >>
> > >> >>
> > >> >>
> > >> >> I’ve attached an analysis document Sean Zhong put together a while
> > back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
> > release but is still relevant and has a lot of good information.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> [1]
> > https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> > >> >>
> > >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> > >> >>
> > >> >> [3] http://semver.org
> > >> >>
> > >> >> [4] https://issues.apache.org/jira/browse/STORM-717
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> -Taylor
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >
> > >
> > >
> > > --
> > > Name : 임 정택
> > > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> > > Twitter : http://twitter.com/heartsavior
> > > LinkedIn : http://www.linkedin.com/in/heartsavior
> >
> 



  

Re: [DISCUSS] Storm 2.0 plan

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
Longda,
Thanks for the clarification.  I think you stated things very well.  I am 100% behind you on the merger part, and my team is very close to being ready to work on the merge 100% as well.  The important points for me are that we do it in the open, following the storm bylaws, that we never break the storm build, and that we move quickly.  This gives us the ability to constantly test and verify that we are not breaking compatibility or regressing performance, and that we don't disrupt the community too much for too long.  I don't care about the order that we do things or the ultimate language that we end up using.  If you have already started on some of this work that is great, but please file JIRAs for what you are currently doing so we are all aware of it and don't end up with duplicated efforts.  Hopefully soon we will do the 1.0 branch and we can start reviewing/merging the code changes you have been working on for the migration.

- Bobby 


    On Friday, November 20, 2015 9:57 AM, Harsha <ma...@harsha.io> wrote:
 

 Hi All,
          If possible, can we have bi-weekly or monthly video hangouts to
          discuss the plan? I think it will make it easier to discuss
          the next steps. We can post the details of the discussion on
          the mailing list so that everyone is involved in what's going on.

Thanks,
Harsha
On Fri, Nov 20, 2015, at 01:03 AM, Longda Feng wrote:
> 
> @Sean, thanks for clarifying.
> @Taylor, @Bobby, @Sean, @Jungtaek, @Harsha, @dev,
> Sorry for causing the misunderstanding.
> The biggest point: We would like to merge the two communities into one.
> One community is stronger than two separate communities. My team
> hopes that Alibaba can directly use the Apache Storm version in the next
> few years, so that my team no longer needs to maintain JStorm. This is the
> reason why Alibaba donated JStorm.
> Second point: Sean's point is right. The migration is not just a "copy". It
> should be a "merge". I mean that a module will not simply be the JStorm
> module. It should be the result of our discussion. I think the final
> solution after merging can make Storm better.
> Third point: In fact, I am not scared of other stream processing systems,
> especially Heron. I have worked on Storm for 4 years and I am a big fan of Storm. I
> know what Storm can do and what Storm cannot do. But I want to stress that
> we need to accelerate our pace of evolution. This field is very active. We should
> start learning from other frameworks' advantages as soon as possible.
> In particular, we need more application-level programming frameworks like
> Trident. This will attract more users to Storm.
> Fourth point: We don't need to do everything from scratch; we can use
> JStorm as much as possible. JStorm is here, so why not use it?
> Last point: My team is already working full time on this merge. We will do our
> best to contribute and make Storm better.
> 
> Thanks
> Longda
> 
> ------------------------------------------------------------------
> From: Sean Zhong <cl...@gmail.com>
> Send Time: Friday, November 20, 2015 11:58
> To: dev <de...@storm.apache.org>
> Subject: Re: [DISCUSS] Storm 2.0 plan
> Hi All,
> 
> I think there may be some improper use of words or some misunderstanding.
> 
> Here are the goals that I can see both sides agree on:
> 1. We want to migrate clojure to java
> 2. We want to merge important features together.
> 3. We want to do this in a step by step, transparent, reviewable way,
> especially with close examination and reviews for code that has
> architecture change.
> 4. We want the final version to remain compatible.
> 
> The only difference is the process we use to achieve these goals.
> Longda's view:
> 1. Do a parallel migration from clojure core to java, part by part.
> "Parallel" means equivalent: no new features are added in this step. He suggests
> following the order "modules/client/utils/nimbus/supervisor/drpc/worker & task/
> web ui/others".
> He uses the word "copy", which is improper in my view. It is more like a
> merge.
> To quote his words:
> 
> > 2.1 Copy modules from JStorm, one module from one module
> > 2.2 The sequence is extern modules/client/utils/nimbus/
> > supervisor/drpc/worker & task/web ui/others
> 
> 2. On top of the java core code base, incrementally add new feature blocks.
> To quote his words:
> 
> > 3.1 Discuss solution for each difference(jira)
> > 3.2 Once the solution is finalized, we can start the
> > merging. (Some issues could be start concurrently. It
> > depends on the discussion.)
> 
> 3. His goal is to remain compatible. "this version is stable and
> compatible with API of Storm 1.0." is not an accurate statement from my
> point of view, at least not for the security feature.
> 4. He shares his concern about other streaming engines.
> 
> 
> Bobby and Jungtaek's view:
> 1. "Copy" is not acceptable; it will impact the security features. ("Copy"
> is the wrong phrase to use; I think Longda means more of a merge.)
> 2. With the JStorm team, we start with the clojure -> java translation first.
> 3. On an optimistic view, with the JStorm team, one month should be enough for
> the above stage.
> 4. Add new features after the whole code base is migrated to java.
> 5. No need to worry that much about other engines.
> 
> If my understanding of both parties is correct, I think we agree on most
> things about the process.
> first: clojure -> java
> second: merge features.
> 
> With a slight difference about how aggressively we want to do "clojure ->
> java", and how long it takes.
> 
> 
> @Longda, can you clarify whether my understanding of your opinion is
> right?
> 
> 
> Sean
> 
> 
> On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com>
> wrote:
> 
> > Very well stated, Jungtaek.
> >
> > I should also point out that there's nothing stopping the JStorm team from
> > releasing new versions of JStorm, or adding new features. But you would
> > have to be careful to note that any such release is "JStorm" and not
> > "Apache Storm." And any such release cannot be hosted on Apache
> > infrastructure.
> >
> > We also shouldn't be too worried about competition with other stream
> > processing frameworks. Competition is healthy and leads to improvements
> > across the board. Spark Streaming borrowed ideas from Storm for its Kafka
> > integration. It also borrowed memory management ideas from Flink. I don't
> > see that as a problem. This is open source. We can, and should, do the same
> > where applicable.
> >
> > Did we learn anything from the Heron paper? Nothing we didn't already
> > know. And a lot of the points have been addressed. We dealt with security first,
> > which is more important for adoption, especially in the enterprise. Now
> > we've addressed many performance, scaling, and usability issues. Most of
> > the production deployments I've seen are nowhere near the magnitude of what
> > Twitter requires. But I've seen many deployments that only exist because
> > we offer security. I doubt Heron has that.
> >
> > We've also seen an uptick in community and developer involvement, which
> > means a likely increase in committers, which likely means a faster
> > turnaround for patch reviews, which means a tighter release cycle for new
> > features, which means we will be moving faster. This is healthy for an
> > Apache project.
> >
> > And the inclusion of the JStorm team will only make that more so.
> >
> > I feel we are headed in the right direction, and there are good things to
> > come.
> >
> > -Taylor
> >
> >
> > > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> > >
> > > Sorry Longda, but I have to say that I also disagree with
> > changing the codebase.
> > >
> > > The feature matrix shows us how far Apache Storm and JStorm have diverged,
> > just from the point of view of features. The change would not be safe even if
> > the feature matrices were identical, because the feature matrix doesn't contain the
> > details.
> > >
> > > I mean, users could be scared when expected behaviors are not in place,
> > even if the differences are small. User experience is one of the most important
> > parts of the project, and if the UX changes are huge, the barrier to upgrading
> > their Storm cluster to 2.0 is not much lower than migrating to Heron. That
> > would be the worst scenario I can imagine after merging.
> > >
> > > The safest way to merge is to apply JStorm's great features to Apache
> > Storm.
> > > I think porting Apache Storm to Java is not tightly related
> > to merging JStorm. I agree that the merge is a trigger, but Apache Storm
> > itself can be ported to other languages like Java, Scala, or something else
> > that is more popular than Clojure.
> > >
> > > And I'm also not scared of Flink, Heron, Spark, etc.
> > > That doesn't mean other projects are not greater than Storm. I'm just
> > saying each project has its own strengths.
> > > For example, all conferences are talking about Spark, and as a user
> > of Spark, I find Spark really great. If you are a little bit familiar with
> > Scala, you can just apply Scala-like functional methods to RDDs. Really easy
> > to use.
> > > But it doesn't mean that Spark can replace Storm in all kinds of use
> > cases. Recently I've seen some articles on why Storm is preferred for
> > realtime stream processing.
> > >
> > > Competition should give us positive motivation. I hope that our
> > roadmap isn't focused on defeating competitors, but on presenting
> > great features, better performance, and better UX to the Storm community. It's
> > not a commercial product, it's an open source project!
> > >
> > > tl;dr. Please don't change the codebase unless we plan to release a brand
> > new project. It breaks UX completely, which could make users leave.
> > >
> > > I'm also open to other opinions as well.
> > >
> > > Best,
> > > Jungtaek Lim (HeartSaVioR)
> > >
> > >
> > > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
> > >> I disagree completely.  You claim that JStorm is compatible with storm
> > 1.0.  I don't believe that it is 100% compatible.  There has been more than
> > 2 years of software development happening on both sides.  Security was not
> > done in a day, and porting it over to JStorm is not going to happen
> > quickly, and because of the major architectural changes between storm and
> > JStorm I believe we would have to make some serious enhancements to fully
> > support a secure TopologyMaster, but I need to look into it more.  The blob
> > store is another piece of code that has taken a very long time to develop.
> > There are numerous others.  The big features are not the ones that make me
> > nervous because we can plan for them; it is the hundreds of small JIRAs and
> > features that will result in minor incompatibilities.  If we start with
> > storm itself, and follow the same process that we have been doing up until
> > now, then if there is a reason to stop the port, add in an important feature, and
> > do a release, we can.  We will know that we have compatibility, whereas starting
> > with JStorm we know from the start that we do not without adding features X,
> > Y, Z, ....
> > >>
> > >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> > etc...  We just did some major performance enhancements that will be
> > released with STORM 1.0.  We now have up to 6x the throughput that we had
> > before with minimal changes to the latency (20 ms vs 5 ms).  We have
> > automatic back-pressure so if someone was running with acking enabled just
> > for flow control they can now process close to 16x the throughput they
> > could before with the same hardware.  This puts our throughput very much on
> > par with flink and Spark, but with a much lower latency compared to either
> > of them.  Plus from what I have heard Flink is still calling the streaming
> > API beta, and their storm API compatibility is very rudimentary.  They are
> > also going to have more and more problems maintaining compatibility as we
> > add in new features and functionality.
> > >>
> > >> Spark only really works well when it is running with several seconds of
> > latency. Not everyone needs sub-second processing, but a platform that
> > is completely unable to handle it locks you out of a lot of use cases.
> > Their throughput is decent and can scale very high when you are willing to
> > tolerate similarly very high latencies.
> > >> Who knows about Heron until they actually release their code, but it is
> > missing lots of critical features, and the one they touted, better
> > performance, is a moot point with storm 1.0.  The only thing we really are
> > lacking is advertising, we don't have a big company really pushing storm
> > and getting it in the news all the time (Sorry Hortonworks, but I really
> > have not seen much about it in the news).  I am trying to do more, but
> > there is only so much I can do.
> > >> Longda I very much agree with you about moving quickly to make the
> > transition, but I do not believe in any way that starting with JStorm is
> > going to reduce that transition time.
> > >> My proposal is to give everyone about 2 weeks to finish merging new
> > features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> > for a release.  At the same time development work to port storm to java
> > begins.  You said it took 4 developers 1 year to port storm to java the
> > first time for JStorm.  We have 14+ active developers and over one hundred
> > contributors not including those from the JStorm community.  If numbers
> > scale linearly, I know they don't completely, we should be able to do a
> > complete port with no JStorm reference in around 100 days.  With a copy and
> > paste for a lot of this from the JStorm codebase, I would expect to be able
> > to do it in 1 month of development, possibly less if the JStorm community
> > can really help out too.  So by January we should be ready to begin pulling
> > in features from JStorm that make sense.  Looking at the feature matrix in
> > https://github.com/apache/storm/pull/877 there are a few potentially big
> > improvements that we would want to pull in, but they require architectural
> > changes in some cases that I don't want to just do lightly.  I would
> > propose that once the code has been ported to java we reopen for all new
> > features in parallel with the JStorm feature migration, but I am open to
> > others' opinions as well.
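The "around 100 days" figure in the estimate above can be reproduced with a quick back-of-the-envelope calculation. This is a sketch only: it assumes effort scales perfectly linearly with head count (which the message itself notes is not completely true), takes "1 year" as 365 calendar days, and uses a hypothetical function name.

```python
def estimated_port_days(original_devs, original_days, available_devs):
    """Scale a known effort to a different team size, assuming linear scaling."""
    total_dev_days = original_devs * original_days  # effort of the first port
    return total_dev_days / available_devs


# 4 developers for 1 year, redistributed over 14 active developers.
days = estimated_port_days(original_devs=4, original_days=365, available_devs=14)
print(round(days))  # 104
```

The one-month figure for the assisted port is a separate judgment call in the message and does not follow from this arithmetic.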
> > >>  - Bobby
> > >>
> > >>
> > >>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <
> > zhongyan.feng@alibaba-inc.com> wrote:
> > >>
> > >>
> > >>  Sorry for changing the Subject.
> > >>
> > >> I am +1 for releasing Storm 2.0 with java core, which is merged with
> > JStorm.
> > >>
> > >> I think this release will be the biggest change in Storm's history.
> > It will probably take a long time to develop. At the same time, Heron is
> > going to be open sourced, and the latest release of Flink provides
> > compatibility with Storm’s API. These might be a threat to Storm. So I
> > suggest we start the development of Storm 2.0 as quickly as possible. In
> > order to accelerate the development cycle, I propose to take JStorm 2.1.0
> > core and UI as the base version, since this version is stable and compatible
> > with the API of Storm 1.0. Please refer to the phases below for the detailed
> > merging plan.
> > >>
> > >> Note: We provide a demo of JStorm’s web UI. Please refer to
> > storm.taobao.org . I think JStorm will give a totally different view to
> > you.
> > >>
> > >> I would like to share our experience from the initial development of JStorm
> > (migrating from clojure core to java core).
> > >> Our team (4 developers) spent almost one year finishing the
> > migration. We took 4 months to release the first JStorm version, and 6
> > months to make JStorm stable. During this period, we tried to switch more
> > than 100 online applications with different scenarios from Storm to JStorm,
> > and many bugs were fixed. Then more and more applications were switched to
> > JStorm in Alibaba.
> > >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
> > 2000+ applications running on them. The JStorm clusters here can handle
> > 1.5 PB/2 trillion messages per day. The use cases are not only in the Big Data
> > field but also in many other online scenarios.
> > >> Besides that, we have been through the November 11th Shopping Festival of
> > Alibaba for the last three years. On that day, the computation in our cluster
> > increased to several times the usual level. All applications worked well during the
> > peak time. I can say the stability of JStorm is beyond doubt today. Actually,
> > besides Alibaba, the most powerful Chinese IT companies are also using JStorm.
> > >>
> > >>
> > >> Phase 1:
> > >>
> > >> Define the target of Storm 2.0. List the requirement of Storm 2.0
> > >> 1. Open a new Umbrella Jira (
> > https://issues.apache.org/jira/browse/STORM-717)
> > >> 2. Create one 2.0 branch,
> > >> 2.1 Copy modules from JStorm, one module from one module
> > >> 2.2 The sequence is extern
> > modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> > >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> > >> 3.1 Discuss solution for each difference(jira)
> > >> 3.2 Once the solution is finalized, we can start the merging. (Some
> > issues could be start concurrently. It depends on the discussion.)
> > >>
> > >> This phase mainly aims to define the target and finalize the solution.
> > Hopefully this phase can be finished in 2 months (before 2016/1/31).
> > >>
> > >>
> > >> Phase 2:
> > >> Release Storm 2.0 beta
> > >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
> > >> 2. Integrate all modules so that the simplest storm example can run on
> > the system.
> > >> 3. Test with all examples and modules in the Storm code base.
> > >> 4. All daily tests pass.
> > >>
> > >> Hopefully this phase can be finished in 2 months (before 2016/3/31)
> > >>
> > >>
> > >> Phase 3:
> > >> Persuade some users to try it.
> > >> Alibaba will try to run some online applications on the beta version
> > >>
> > >> Hopefully this phase can be finished in 1 month (before 2016/4/30).
> > >>
> > >>
> > >> Any comments are welcome.
> > >>
> > >>
> > >> Thanks
> > >>
> > >> Longda
> > >>
> > >> ------------------------------------------------------------------
> > >> From: P. Taylor Goetz <pt...@gmail.com>
> > >> Send Time: Thursday, November 19, 2015 06:23
> > >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> > >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> > >> All I have at this point is a placeholder wiki entry [1], and a lot of
> > local notes that likely would only make sense to me.
> > >>
> > >> Let me know your wiki username and I’ll give you permissions. The same
> > goes for anyone else who wants to help.
> > >>
> > >> -Taylor
> > >>
> > >> [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> > >>
> > >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
> > wrote:
> > >> >
> > >> > Taylor and others I was hoping to get started filing JIRA and
> > planning on how we are going to do the java migration + JStorm merger.  Is
> > anyone else starting to do this?  If not would anyone object to me starting
> > on it? - Bobby
> > >> >
> > >> >
> > >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> > ptgoetz@gmail.com> wrote:
> > >> >
> > >> >
> > >> > Thanks for putting this together Basti, that comparison helps a lot.
> > >> >
> > >> > And thanks Bobby for converting it into markdown. I was going to just
> > attach the spreadsheet to JIRA, but markdown is a much better solution.
> > >> >
> > >> > -Taylor
> > >> >
> > >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
> > <ev...@yahoo-inc.com.INVALID> wrote:
> > >> >>
> > >> >> I translated the excel spreadsheet into a markdown file and put up a
> > pull request for it.
> > >> >> https://github.com/apache/storm/pull/877
> > >> >> I did a few edits to it to make it work with Markdown, and to add in
> > a few of my own comments.  I also put in a field for JIRAs to be able to
> > track the migration.
> > >> >> Overall I think your evaluation was very good.  We have a fair
> > amount of work ahead of us to decide what version of various features we
> > want to go forward with.
> > >> >>   - Bobby
> > >> >>
> > >> >>
> > >> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> > basti.lj@alibaba-inc.com> wrote:
> > >> >>
> > >> >>
> > >> >> Hi Bobby & Jungtaek,
> > >> >>
> > >> >> Thanks for your reply.
> > >> >> I totally agree that compatibility is the most important thing.
> > Actually, JStorm has been compatible with the user API of Storm.
> > >> >> As you mentioned below, we indeed still have some features different
> > between Storm and JStorm. I have tried to list them (minor update or
> > improvements are not included).
> > >> >> Please refer to the attachment for details. If anything is missing, please help
> > to point it out. (Features currently being worked on are probably missing here.)
> > >> >> Just have a look at these differences. For the features missing in
> > JStorm, I did not see any obstacle which will block merging them into JStorm.
> > >> >> For the features which have different solutions in Storm and
> > JStorm, we can evaluate the solutions one by one to decide which one is
> > appropriate.
> > >> >> After the finalization of evaluation, I think JStorm team can take
> > the merging job and publish a stable release in 2 months.
> > >> >> But anyway, the detailed implementation for these features with
> > different solution is transparent to user. So, from user's point of view,
> > there is not any compatibility problem.
> > >> >>
> > >> >> Besides compatibility, in our experience, stability is also
> > important and is not an easy job. 4 people in the JStorm team took almost one
> > year to finish the porting from "clojure core"
> > >> >> to "java core", and to make it stable. Of course, we have many devs
> > in the community to make the porting job faster. But it still needs a long time
> > to run many complex online topologies to find bugs and fix them. So, that is
> > the reason why I proposed to do the merge and build on a stable "java core".
> > >> >>
> > >> >> -----Original Message-----
> > >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> > >> >> Sent: Wednesday, November 11, 2015 10:51 PM
> > >> >> To: dev@storm.apache.org
> > >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> > >> >>
> > >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> > Migrating the APIs to org.apache.storm is a big non-backwards compatible
> > move, and a major version bump to 2.x seems like a good move there.
> > >> >> +1 for the release plan
> > >> >>
> > >> >> I would like the move for user facing APIs to org.apache to be one
> > of the last things we do.  Translating clojure code into java and moving it
> > to org.apache I am not too concerned about.
> > >> >>
> > >> >> Basti,
> > >> >> We have two code bases that have diverged significantly from one
> > another in terms of functionality.  The storm code now or soon will have A
> > Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> > Scheduling, a distributed cache like API, log searching, security, massive
> > performance improvements, shaded almost all of our dependencies, a REST API
> > for programmatically accessing everything on the UI, and I am sure I am
> > missing a few other things.  JStorm also has many changes including cgroup
> > isolation, restructured zookeeper layout, classpath isolation, and more too.
> > >> >> No matter what we do it will be a large effort to port changes from
> > one code base to another, and from clojure to java.  I proposed this
> > initially because it can be broken up into incremental changes.  It may
> > take a little longer, but we will always have a working codebase that is
> > testable and compatible with the current storm release, at least until we
> > move the user facing APIs to be under org.apache.  This lets the community
> > continue to build and test the master branch and report problems that they
> > find, which is incredibly valuable.  I personally don't think it will be
> > much easier, especially if we are intent on always maintaining
> > compatibility with storm. - Bobby
> > >> >>
> > >> >>
> > >> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
> > basti.lj@alibaba-inc.com> wrote:
> > >> >>
> > >> >>
> > >> >> Hi Taylor,
> > >> >>
> > >> >>
> > >> >>
> > >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> > >> >>
> > >> >> Do you mean community plan to create a fresh new “java core” based
> > on current “clojure core” firstly, and then migrate the features from
> > JStorm?
> > >> >>
> > >> >> If so, it confused me.  It is really a huge job which might require
> > a long developing time to make it stable, while JStorm is already a stable
> > version.
> > >> >>
> > >> >> The release planned for after Nov 11th has already been running
> > online stably for several months at Alibaba.
> > >> >>
> > >> >> Besides this, there are many valuable internal requirements in
> > Alibaba, so the fast evolution of JStorm is foreseeable in the next few months.
> > >> >>
> > >> >> If the “java core” is totally fresh new, it might bring many
> > problems for the coming merge.
> > >> >>
> > >> >> So, from the point of this view,  I think it is much better and
> > easier to migrate the features of “clojure core” basing on JStorm for the
> > “java core”.
> > >> >>
> > >> >> Please correct me, if any misunderstanding.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Regards
> > >> >>
> > >> >> Basti
> > >> >>
> > >> >>
> > >> >>
> > >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> > >> >> Sent: November 11, 2015 5:32
> > >> >> To: dev@storm.apache.org
> > >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> > >> >>
> > >> >>
> > >> >>
> > >> >> Based on a number of discussions regarding merging the JStorm code,
> > I’ve tried to distill the ideas presented and inserted some of my own. The
> > result is below.
> > >> >>
> > >> >>
> > >> >>
> > >> >> I’ve divided the plan into three phases, though they are not
> > necessarily sequential — obviously some tasks can take place in parallel.
> > >> >>
> > >> >>
> > >> >>
> > >> >> None of this is set in stone, just presented for discussion. Any and
> > all comments are welcome.
> > >> >>
> > >> >>
> > >> >>
> > >> >> -------
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 1 - Plan for 0.11.x Release
> > >> >>
> > >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> > >> >>
> > >> >> 2. Announce feature-freeze for 0.11.x
> > >> >>
> > >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> > >> >>
> > >> >> 4. Release 0.11.0 (or whatever version # we want to use)
> > >> >>
> > >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> > >> >>
> > >> >> 1. Determine/document unique features in JStorm (e.g. classpath
> > isolation, cgroups, etc.) and create JIRA for migrating the feature.
> > >> >>
> > >> >> 2. Create JIRA for migrating each clojure component (or logical
> > group of components) to Java. Assumes tests will be ported as well.
> > >> >>
> > >> >> 3. Discuss/establish style guide for Java coding conventions.
> > Consider using Oracle’s or Google’s Java conventions as a base — they are
> > both pretty solid.
> > >> >>
> > >> >> 4. align package names (e.g backtype.storm --> org.apache.storm /
> > com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 3 - Migrate Clojure --> Java
> > >> >>
> > >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> > possible (core functionality only, features distinct to JStorm migrated
> > separately).
> > >> >>
> > >> >> 2. Port JStorm-specific features.
> > >> >>
> > >> >> 3. Begin releasing preview/beta versions.
> > >> >>
> > >> >> 4. Code cleanup (across the board) and refactoring using established
> > coding conventions, and leveraging PMD/Checkstyle reports for reference.
> > (Note: good opportunity for new contributors.)
> > >> >>
> > >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> > feature freeze.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Notes:
> > >> >>
> > >> >> We should consider bumping up to version 1.0 sometime soon and then
> > switching to semantic versioning [3] from then on.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> With the exception of package name alignment, the "jstorm-import"
> > branch will largely be read-only throughout the process.
> > >> >>
> > >> >>
> > >> >>
> > >> >> During migration, it's probably easiest to operate with two local
> > clones of the Apache Storm repo: one for working (i.e. checked out to
> > working branch) and one for reference/copying (i.e. checked out to
> > "jstorm-import").
> > >> >>
> > >> >>
> > >> >>
> > >> >> Feature-freeze probably only needs to be enforced against core
> > functionality. Components under "external" can likely be exempt, but we
> > should figure out a process for accepting and releasing new features during
> > the migration.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Performance testing should be continuous throughout the process.
> > Since we don't really have ASF infrastructure for performance testing, we
> > will need a volunteer(s) to host and run the performance tests. Performance
> > test results can be posted to the wiki [2]. It would probably be a good
> > idea to establish a baseline with the 0.10.0 release.
> > >> >>
> > >> >>
> > >> >>
> > >> >> I’ve attached an analysis document Sean Zhong put together a while
> > back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
> > release but is still relevant and has a lot of good information.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> [1]
> > https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> > >> >>
> > >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> > >> >>
> > >> >> [3] http://semver.org
> > >> >>
> > >> >> [4] https://issues.apache.org/jira/browse/STORM-717
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> -Taylor
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >
> > >
> > >
> > > --
> > > Name : 임 정택
> > > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> > > Twitter : http://twitter.com/heartsavior
> > > LinkedIn : http://www.linkedin.com/in/heartsavior
> >
> 

  

Re: [DISCUSS] Storm 2.0 plan

Posted by Harsha <ma...@harsha.io>.
Hi All,
If possible, can we have bi-weekly or monthly video hangouts to
discuss the plan? I think it will make it easier to discuss
the next steps. We can post the details of the discussion on
the mailing list so that everyone is involved in what's going on.

Thanks,
Harsha
On Fri, Nov 20, 2015, at 01:03 AM, Longda Feng wrote:
> 
> @Sean, thanks for clarifying.
> @Taylor, @Bobby, @Sean, @Jungtaek, @Harsha, @dev,
> Sorry for causing the misunderstanding.
> The biggest point: We would like to merge the two communities into one.
> One community is stronger than two separate communities. My team
> hopes that Alibaba can directly use the Apache Storm version in the next
> few years. My team would then no longer need to maintain JStorm; this is the
> reason why Alibaba donated JStorm.
> Second point: Sean's point is right. The migration is not just a "copy". It
> should be a "merge". I mean that a module will not simply be the JStorm
> module. It should be the result of our discussion. I think the final
> solution after merging can make Storm better.
> Third point: In fact, I am not scared of other stream processing systems,
> especially Heron. I have worked on Storm for 4 years, and I am a big fan of
> Storm. I know what Storm can do and what Storm cannot do. But I want to say
> that we need to accelerate our pace of evolution. This field is so active.
> We should start learning from other frameworks' advantages as soon as possible.
> In particular, we need more application-level programming frameworks like
> Trident. This will attract more users to Storm.
> Fourth point: We don't need to do everything from scratch; we can use
> JStorm as much as possible. JStorm is here, so why not use it?
> Last point: My team is already working full time on this merge. We will try
> our best to contribute and make Storm better.
> 
> Thanks
> Longda
> 
> ------------------------------------------------------------------
> From: Sean Zhong <cl...@gmail.com>
> Send Time: Friday, November 20, 2015 11:58
> To: dev <de...@storm.apache.org>
> Subject: Re: [DISCUSS] Storm 2.0 plan
> Hi All,
> 
> I think there may be some improper use of words or some misunderstanding.
> 
> Here are the goals that, as far as I can see, both sides agree on:
> 1. We want to migrate clojure to java.
> 2. We want to merge important features together.
> 3. We want to do this in a step-by-step, transparent, reviewable way,
> especially with close examination and review of code that involves
> architecture changes.
> 4. We want the final version to remain compatible.
> 
> The only difference is the process we use to achieve these goals.
> Longda's view:
> 1. Do a parallel migration from the clojure core to java, part by part.
> "Parallel" means equivalent: no new features are added in this step. He
> suggests following the order "modules/client/utils/nimbus/supervisor/drpc/
> worker & task/web ui/others".
> He uses the word "copy", which is improper in my view. It is more like a
> merge.
> To quote his words:
> 
> > 2.1 Copy modules from JStorm, one module from one module
> > 2.2 The sequence is extern modules/client/utils/nimbus/
> > supervisor/drpc/worker & task/web ui/others
> 
> 2. On top of the java core code base, incrementally add new feature blocks.
> To quote his words:
> 
> > 3.1 Discuss solution for each difference(jira)
> > 3.2 Once the solution is finalized, we can start the
> > merging. (Some issues could be start concurrently. It
> > depends on the discussion.)
> 
> 3. His goal is to remain compatible. "this version is stable and
> compatible with API of Storm 1.0." is not an accurate statement from my
> point of view, at least not for the security features.
> 4. He shares his concern about other streaming engines.
> 
> 
> Bobby and Jungtaek's view:
> 1. "Copy" is not acceptable; it will impact the security features. ("Copy"
> is the wrong phrase to use; I think Longda means something more like a merge.)
> 2. With the JStorm team, we start with the clojure -> java translation first.
> 3. Optimistically, with the JStorm team, one month should be enough for
> the above stage.
> 4. Add new features after the whole code base is migrated to java.
> 5. No need to worry that much about other engines.
> 
> If my understanding of both parties is correct, I think we agree on most
> things about the process:
> first: clojure -> java
> second: merge features.
> 
> There is a slight difference about how aggressively we want to do "clojure ->
> java", and how long it takes.
> 
> 
> @Longda, can you clarify whether my understanding of your opinion is
> right?
> 
> 
> Sean
> 
> 
> On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com>
> wrote:
> 
> > Very well stated, Jungtaek.
> >
> > I should also point out that there's nothing stopping the JStorm team from
> > releasing new versions of JStorm, or adding new features. But you would
> > have to be careful to note that any such release is "JStorm" and not
> > "Apache Storm." And any such release cannot be hosted on Apache
> > infrastructure.
> >
> > We also shouldn't be too worried about competition with other stream
> > processing frameworks. Competition is healthy and leads to improvements
> > across the board. Spark Streaming borrowed ideas from Storm for its Kafka
> > integration. It also borrowed memory management ideas from Flink. I don't
> > see that as a problem. This is open source. We can, and should, do the same
> > where applicable.
> >
> > Did we learn anything from the Heron paper? Nothing we didn't already
> > know. And a lot of the points have been addressed. We dealt with security first,
> > which is more important for adoption, especially in the enterprise. Now
> > we've addressed many performance, scaling, and usability issues. Most of
> > the production deployments I've seen are nowhere near the magnitude of what
> > Twitter requires. But I've seen many deployments that only exist because
> > we offer security. I doubt Heron has that.
> >
> > We've also seen an uptick in community and developer involvement, which
> > means a likely increase in committers, which likely means a faster
> > turnaround for patch reviews, which means a tighter release cycle for new
> > features, which means we will be moving faster. This is healthy for an
> > Apache project.
> >
> > And the inclusion of the JStorm team will only make that more so.
> >
> > I feel we are headed in the right direction, and there are good things to
> > come.
> >
> > -Taylor
> >
> >
> > > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> > >
> > > Sorry Longda, but I have to say that I also disagree with
> > changing the codebase.
> > >
> > > The feature matrix shows us how far Apache Storm and JStorm have diverged,
> > and that is only from the feature point of view. The change would not be safe
> > even if the feature matrices were identical, because a feature matrix doesn't
> > contain the details.
> > >
> > > I mean, users could be scared when expected behaviors are not in place,
> > even if the differences are small. User experience is one of the most important
> > parts of the project, and if the UX changes are huge, the barrier to upgrading
> > their Storm cluster to 2.0 is not much lower than that of migrating to Heron.
> > That would be the worst scenario I can imagine after merging.
> > >
> > > The safest way to merge is to apply JStorm's great features to Apache
> > Storm.
> > > I think porting Apache Storm to Java is not tightly related
> > to merging JStorm. I agree that the merge is a trigger, but Apache Storm
> > itself could be ported to other languages like Java, Scala, or something else
> > which is more popular than Clojure.
> > >
> > > And I'm also not scared of Flink, Heron, Spark, etc.
> > > That doesn't mean other projects are not greater than Storm. I'm just
> > saying each project has its own strengths.
> > > For example, all conferences are talking about Spark, and as a user
> > of Spark, I think Spark is really great. If you are a little bit familiar with
> > Scala, you can just apply Scala-like functional methods to RDDs. Really easy
> > to use.
> > > But that doesn't mean Spark can replace Storm in all kinds of use
> > cases. Recently I've seen some articles on why Storm is preferred for
> > realtime stream processing.
> > >
> > > Competition should give us positive motivation. I hope that our
> > roadmap isn't focused on defeating competitors, but on presenting
> > great features, better performance, and better UX to the Storm community. It's
> > not a commercial product, it's an open source project!
> > >
> > > tl;dr. Please don't change the codebase unless we plan to release a brand
> > new project. It breaks UX completely, which could make users leave.
> > >
> > > I'm also open to other opinions as well.
> > >
> > > Best,
> > > Jungtaek Lim (HeartSaVioR)
> > >
> > >
> > > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
> > >> I disagree completely.  You claim that JStorm is compatible with storm
> > 1.0.  I don't believe that it is 100% compatible.  There has been more than
> > 2 years of software development happening on both sides.  Security was not
> > done in a day, and porting it over to JStorm is not going to happen
> > quickly, and because of the major architectural changes between storm and
> > JStorm I believe we would have to make some serious enhancements to fully
> > support a secure TopologyMaster, but I need to look into it more.  The blob
> > store is another piece of code that has taken a very long time to develop.
> > There are numerous others.  The big features are not the ones that make me
> > nervous because we can plan for them, it is the hundreds of small JIRA and
> > features that will result in minor incompatibilities.  If we start with
> > storm itself, and follow the same process that we have been doing up until
> > now, then if there is a reason to stop the port, add in an important feature,
> > and do a release, we can.  We will know that we have compatibility, whereas
> > starting with JStorm we know from the start that we do not without adding
> > feature X, Y, Z, ....
> > >>
> > >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> > etc...  We just did some major performance enhancements that will be
> > released with STORM 1.0.  We now have up to 6x the throughput that we had
> > before with minimal changes to the latency (20 ms vs 5 ms).  We have
> > automatic back-pressure so if someone was running with acking enabled just
> > for flow control they can now process close to 16x the throughput they
> > could before with the same hardware.  This puts our throughput very much on
> > par with flink and Spark, but with a much lower latency compared to either
> > of them.  Plus from what I have heard Flink is still calling the streaming
> > API beta, and their storm API compatibility is very rudimentary.  They are
> > also going to have more and more problems maintaining compatibility as we
> > add in new features and functionality.
> > >>
> > >> Spark only really works well when it is running with several seconds of
> > latency. Not everyone needs sub-second processing, but when your platform
> > is completely unable to handle it, it locks you out of a lot of use cases.
> > Their throughput is decent and can scale very high when you are willing to
> > tolerate similarly very high latencies.
> > >> Who knows about Heron until they actually release their code, but it is
> > missing lots of critical features, and the one they touted, better
> > performance, is a moot point with storm 1.0.  The only thing we really are
> > lacking is advertising, we don't have a big company really pushing storm
> > and getting it in the news all the time (Sorry Hortonworks, but I really
> > have not seen much about it in the news).  I am trying to do more, but
> > there is only so much I can do.
> > >> Longda I very much agree with you about moving quickly to make the
> > transition, but I do not believe in any way that starting with JStorm is
> > going to reduce that transition time.
> > >> My proposal is to give everyone about 2 weeks to finish merging new
> > features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> > for a release.  At the same time development work to port storm to java
> > begins.  You said it took 4 developers 1 year to port storm to java the
> > first time for JStorm.  We have 14+ active developers and over one hundred
> > contributors not including those from the JStorm community.  If numbers
> > scale linearly, I know they don't completely, we should be able to do a
> > complete port with no JStorm reference in around 100 days.  With a copy and
> > paste for a lot of this from the JStorm codebase, I would expect to be able
> > to do it in 1 month of development, possibly less if the JStorm community
> > can really help out too.  So by January we should be ready to begin pulling
> > in features from JStorm that make sense.  Looking at the feature matrix in
> > https://github.com/apache/storm/pull/877 there are a few potentially big
> > improvements that we would want to pull in, but they require architectural
> > changes in some cases that I don't want to just do lightly.  I would
> > propose that once the code has been ported to java we reopen for all new
> > features in parallel with the JStorm feature migration, but I am open to
> > others' opinions as well.
> > >>  - Bobby
> > >>
> > >>
> > >>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <
> > zhongyan.feng@alibaba-inc.com> wrote:
> > >>
> > >>
> > >>  Sorry for changing the Subject.
> > >>
> > >> I am +1 for releasing Storm 2.0 with java core, which is merged with
> > JStorm.
> > >>
> > >> I think the change in this release will be the biggest one in history.
> > It will probably take a long time to develop. At the same time, Heron is
> > going to be open sourced, and the latest release of Flink provides
> > compatibility with Storm’s API. These might be a threat to Storm. So I
> > suggest we start the development of Storm 2.0 as quickly as possible. In
> > order to accelerate the development cycle, I propose to take JStorm 2.1.0
> > core and UI as the base version since this version is stable and compatible
> > with API of Storm 1.0. Please refer to the phases below for the detailed
> > merging plan.
> > >>
> > >> Note: We provide a demo of JStorm’s web UI. Please refer to
> > storm.taobao.org . I think JStorm will give a totally different view to
> > you.
> > >>
> > >> I would like to share the experience of the initial development of JStorm
> > (migrating from the clojure core to a java core).
> > >> Our team (4 developers) spent almost one year finishing the
> > migration. We took 4 months to release the first JStorm version, and 6
> > months to make JStorm stable. During this period, we tried to switch more
> > than 100 online applications with different scenarios from Storm to JStorm,
> > and many bugs were fixed. Then more and more applications were switched to
> > JStorm in Alibaba.
> > >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
> > 2000+ applications are running on them. The JStorm clusters here can handle
> > 1.5 PB/2 trillion messages per day. The use cases are not only in the big data
> > field but also in many other online scenarios.
> > >> Besides that, we have been through the November 11th Shopping Festival of
> > Alibaba for the last three years. On that day, the computation in our cluster
> > increased to several times the usual level. All applications worked well during
> > the peak time. I can say the stability of JStorm is beyond doubt today. Actually,
> > besides Alibaba, the most powerful Chinese IT companies are also using JStorm.
> > >>
> > >>
> > >> Phase 1:
> > >>
> > >> Define the target of Storm 2.0. List the requirement of Storm 2.0
> > >> 1. Open a new Umbrella Jira (
> > https://issues.apache.org/jira/browse/STORM-717)
> > >> 2. Create one 2.0 branch,
> > >> 2.1 Copy modules from JStorm, one module from one module
> > >> 2.2 The sequence is extern
> > modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> > >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> > >> 3.1 Discuss solution for each difference(jira)
> > >> 3.2 Once the solution is finalized, we can start the merging. (Some
> > issues could be start concurrently. It depends on the discussion.)
> > >>
> > >> This phase mainly tries to define the target and finalize the solution.
> > Hopefully this phase can be finished in 2 months (before 2016/1/31).
> > >>
> > >>
> > >> Phase 2:
> > >> Release Storm 2.0 beta
> > >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
> > >> 2. Integrate all modules, so that the simplest storm example can run on
> > the system.
> > >> 3. Test with all examples and modules in the Storm code base.
> > >> 4. All daily tests pass.
> > >>
> > >> Hopefully this phase can be finished in 2 months (before 2016/3/31)
> > >>
> > >>
> > >> Phase 3:
> > >> Persuade some users to give it a try.
> > >> Alibaba will try to run some online applications on the beta version
> > >>
> > >> Hopefully this phase can be finished in 1 month (before 2016/4/30).
> > >>
> > >>
> > >> Any comments are welcome.
> > >>
> > >>
> > >> Thanks
> > >>
> > >> Longda
> > >>
> > >> ------------------------------------------------------------------
> > >> From: P. Taylor Goetz <pt...@gmail.com>
> > >> Send Time: Thursday, November 19, 2015 06:23
> > >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> > >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> > >> All I have at this point is a placeholder wiki entry [1], and a lot of
> > local notes that likely would only make sense to me.
> > >>
> > >> Let me know your wiki username and I’ll give you permissions. The same
> > goes for anyone else who wants to help.
> > >>
> > >> -Taylor
> > >>
> > >> [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> > >>
> > >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
> > wrote:
> > >> >
> > >> > Taylor and others I was hoping to get started filing JIRA and
> > planning on how we are going to do the java migration + JStorm merger.  Is
> > anyone else starting to do this?  If not would anyone object to me starting
> > on it? - Bobby
> > >> >
> > >> >
> > >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> > ptgoetz@gmail.com> wrote:
> > >> >
> > >> >
> > >> > Thanks for putting this together Basti, that comparison helps a lot.
> > >> >
> > >> > And thanks Bobby for converting it into markdown. I was going to just
> > attach the spreadsheet to JIRA, but markdown is a much better solution.
> > >> >
> > >> > -Taylor
> > >> >
> > >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
> > <ev...@yahoo-inc.com.INVALID> wrote:
> > >> >>
> > >> >> I translated the excel spreadsheet into a markdown file and put up a
> > pull request for it.
> > >> >> https://github.com/apache/storm/pull/877
> > >> >> I did a few edits to it to make it work with Markdown, and to add in
> > a few of my own comments.  I also put in a field for JIRAs to be able to
> > track the migration.
> > >> >> Overall I think your evaluation was very good.  We have a fair
> > amount of work ahead of us to decide what version of various features we
> > want to go forward with.
> > >> >>   - Bobby
> > >> >>
> > >> >>
> > >> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> > basti.lj@alibaba-inc.com> wrote:
> > >> >>
> > >> >>
> > >> >> Hi Bobby & Jungtaek,
> > >> >>
> > >> >> Thanks for your reply.
> > >> >> I totally agree that compatibility is the most important thing.
> > Actually, JStorm has been compatible with the user API of Storm.
> > >> >> As you mentioned below, we indeed still have some features different
> > between Storm and JStorm. I have tried to list them (minor update or
> > improvements are not included).
> > >> >> Please refer to attachment for details. If any missing, please help
> > to point out. (The current working features are probably missing here.)
> > >> >> Just have a look at these differences. For the missing features in
> > JStorm, I did not see any obstacle which will block the merge to JStorm.
> > >> >> For the features which have different solutions in Storm and
> > JStorm, we can evaluate the solutions one by one to decide which one is
> > appropriate.
> > >> >> After the finalization of evaluation, I think JStorm team can take
> > the merging job and publish a stable release in 2 months.
> > >> >> But anyway, the detailed implementation for these features with
> > different solution is transparent to user. So, from user's point of view,
> > there is not any compatibility problem.
> > >> >>
> > >> >> Besides compatibility, by our experience, stability is also
> > important and is not an easy job. 4 people in JStorm team took almost one
> > year to finish the porting from "clojure core"
> > >> >> to "java core", and to make it stable. Of course, we have many devs
> > in community to make the porting job faster. But it still needs a long time
> > to run many complex online topologies to find bugs and fix them. So, that is
> > the reason why I proposed to do merging and build on a stable "java core".
> > >> >>
> > >> >> -----Original Message-----
> > >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> > >> >> Sent: Wednesday, November 11, 2015 10:51 PM
> > >> >> To: dev@storm.apache.org
> > >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> > >> >>
> > >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> > Migrating the APIs to org.apache.storm is a big non-backwards compatible
> > move, and a major version bump to 2.x seems like a good move there.
> > >> >> +1 for the release plan
> > >> >>
> > >> >> I would like the move for user facing APIs to org.apache to be one
> > of the last things we do.  Translating clojure code into java and moving it
> > to org.apache I am not too concerned about.
> > >> >>
> > >> >> Basti,
> > >> >> We have two code bases that have diverged significantly from one
> > another in terms of functionality.  The storm code now or soon will have A
> > Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> > Scheduling, a distributed cache like API, log searching, security, massive
> > performance improvements, shaded almost all of our dependencies, a REST API
> > for programmatically accessing everything on the UI, and I am sure I am
> > missing a few other things.  JStorm also has many changes including cgroup
> > isolation, restructured zookeeper layout, classpath isolation, and more too.
> > >> >> No matter what we do it will be a large effort to port changes from
> > one code base to another, and from clojure to java.  I proposed this
> > initially because it can be broken up into incremental changes.  It may
> > take a little longer, but we will always have a working codebase that is
> > testable and compatible with the current storm release, at least until we
> > move the user facing APIs to be under org.apache.  This lets the community
> > continue to build and test the master branch and report problems that they
> > find, which is incredibly valuable.  I personally don't think it will be
> > much easier, especially if we are intent on always maintaining
> > compatibility with storm. - Bobby
> > >> >>
> > >> >>
> > >> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
> > basti.lj@alibaba-inc.com> wrote:
> > >> >>
> > >> >>
> > >> >> Hi Taylor,
> > >> >>
> > >> >>
> > >> >>
> > >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> > >> >>
> > >> >> Do you mean community plan to create a fresh new “java core” based
> > on current “clojure core” firstly, and then migrate the features from
> > JStorm?
> > >> >>
> > >> >> If so, it confused me.  It is really a huge job which might require
> > a long developing time to make it stable, while JStorm is already a stable
> > version.
> > >> >>
> > >> >> The release planned to be released after Nov 11th has already run
> > online stably for several months in Alibaba.
> > >> >>
> > >> >> Besides this, there are many valuable internal requirements in
> > Alibaba, the fast evolution of JStorm is foreseeable in the next few months.
> > >> >>
> > >> >> If the “java core” is totally fresh new, it might bring many
> > problems for the coming merge.
> > >> >>
> > >> >> So, from the point of this view,  I think it is much better and
> > easier to migrate the features of “clojure core” basing on JStorm for the
> > “java core”.
> > >> >>
> > >> >> Please correct me, if any misunderstanding.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Regards
> > >> >>
> > >> >> Basti
> > >> >>
> > >> >>
> > >> >>
> > >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> > >> >> Sent: November 11, 2015 5:32
> > >> >> To: dev@storm.apache.org
> > >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> > >> >>
> > >> >>
> > >> >>
> > >> >> Based on a number of discussions regarding merging the JStorm code,
> > I’ve tried to distill the ideas presented and inserted some of my own. The
> > result is below.
> > >> >>
> > >> >>
> > >> >>
> > >> >> I’ve divided the plan into three phases, though they are not
> > necessarily sequential — obviously some tasks can take place in parallel.
> > >> >>
> > >> >>
> > >> >>
> > >> >> None of this is set in stone, just presented for discussion. Any and
> > all comments are welcome.
> > >> >>
> > >> >>
> > >> >>
> > >> >> -------
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 1 - Plan for 0.11.x Release
> > >> >>
> > >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> > >> >>
> > >> >> 2. Announce feature-freeze for 0.11.x
> > >> >>
> > >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> > >> >>
> > >> >> 4. Release 0.11.0 (or whatever version # we want to use)
> > >> >>
> > >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> > >> >>
> > >> >> 1. Determine/document unique features in JStorm (e.g. classpath
> > isolation, cgroups, etc.) and create JIRA for migrating the feature.
> > >> >>
> > >> >> 2. Create JIRA for migrating each clojure component (or logical
> > group of components) to Java. Assumes tests will be ported as well.
> > >> >>
> > >> >> 3. Discuss/establish style guide for Java coding conventions.
> > Consider using Oracle’s or Google’s Java conventions as a base — they are
> > both pretty solid.
> > >> >>
> > >> >> 4. align package names (e.g backtype.storm --> org.apache.storm /
> > com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Phase 3 - Migrate Clojure --> Java
> > >> >>
> > >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> > possible (core functionality only, features distinct to JStorm migrated
> > separately).
> > >> >>
> > >> >> 2. Port JStorm-specific features.
> > >> >>
> > >> >> 3. Begin releasing preview/beta versions.
> > >> >>
> > >> >> 4. Code cleanup (across the board) and refactoring using established
> > coding conventions, and leveraging PMD/Checkstyle reports for reference.
> > (Note: good opportunity for new contributors.)
> > >> >>
> > >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> > feature freeze.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> Notes:
> > >> >>
> > >> >> We should consider bumping up to version 1.0 sometime soon and then
> > switching to semantic versioning [3] from then on.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> With the exception of package name alignment, the "jstorm-import"
> > branch will largely be read-only throughout the process.
> > >> >>
> > >> >>
> > >> >>
> > >> >> During migration, it's probably easiest to operate with two local
> > clones of the Apache Storm repo: one for working (i.e. checked out to
> > working branch) and one for reference/copying (i.e. checked out to
> > "jstorm-import").
> > >> >>
> > >> >>
> > >> >>
> > >> >> Feature-freeze probably only needs to be enforced against core
> > functionality. Components under "external" can likely be exempt, but we
> > should figure out a process for accepting and releasing new features during
> > the migration.
> > >> >>
> > >> >>
> > >> >>
> > >> >> Performance testing should be continuous throughout the process.
> > Since we don't really have ASF infrastructure for performance testing, we
> > will need a volunteer(s) to host and run the performance tests. Performance
> > test results can be posted to the wiki [2]. It would probably be a good
> > idea to establish a baseline with the 0.10.0 release.
> > >> >>
> > >> >>
> > >> >>
> > >> >> I’ve attached an analysis document Sean Zhong put together a while
> > back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
> > release but is still relevant and has a lot of good information.
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> [1]
> > https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> > >> >>
> > >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> > >> >>
> > >> >> [3] http://semver.org
> > >> >>
> > >> >> [4] https://issues.apache.org/jira/browse/STORM-717
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >> -Taylor
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >>
> > >> >
> > >> >
> > >
> > >
> > >
> > > --
> > > Name : 임 정택
> > > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> > > Twitter : http://twitter.com/heartsavior
> > > LinkedIn : http://www.linkedin.com/in/heartsavior
> >
> 

Re: [DISCUSS] Storm 2.0 plan

Posted by Longda Feng <zh...@alibaba-inc.com>.
@Sean, thanks for clarifying.
@Taylor, @Bobby, @Sean, @Jungtaek, @Harsha, @dev,
Sorry for causing the misunderstanding.
The biggest point: We would like to merge the two communities into one. One community is stronger than two separate communities. My team hopes that Alibaba can directly use the Apache Storm version in the next few years, so that my team no longer needs to maintain JStorm. This is the reason why Alibaba donated JStorm.
Second point: Sean's point is right. The migration is not just a "copy"; it should be a "merge". I mean that each module will not simply be the JStorm module; it should be the result of our discussion. I think the final solution after merging can make Storm better.
Third point: In fact, I am not scared of other stream processing frameworks, especially Heron. I have worked on Storm for 4 years and I am a big fan of Storm. I know what Storm can do and what Storm cannot do. But I want to express that we need to accelerate our speed of evolution. This field is so active. We should start learning from the advantages of other frameworks as soon as possible. In particular, we need more application-level programming frameworks like Trident. This will attract more users to Storm.
Fourth point: We don't need to do everything from scratch; we can use JStorm as much as possible. JStorm is here, so why not use it?
Last point: My team is already working full time on this merge. We will try our best to contribute and make Storm better.

Thanks
Longda

------------------------------------------------------------------
From: Sean Zhong <cl...@gmail.com>
Send Time: Friday, November 20, 2015 11:58
To: dev <de...@storm.apache.org>
Subject: Re: [DISCUSS] Storm 2.0 plan

Hi All,

I think there may be some improper use of words or a misunderstanding.

Here are the goals that I can see both sides agree on:
1. We want to migrate clojure to java
2. We want to merge important features together.
3. We want to do this in a step by step, transparent, reviewable way,
especially with close examination and reviews for code that has
architecture change.
4. We want the final version to remain compatible.

The only difference is the process we use to achieve these goals.
Longda's view:
1. Do a parallel migration from the clojure core to java, part by part. Parallel
means equivalent: no new features are added in this step. He suggests following
the order "modules/client/utils/nimbus/supervisor/drpc/worker & task/
web ui/others".
He uses the word "copy", which is improper in my opinion. It is more like a
merge.
Quoting his words:

> 2.1 Copy modules from JStorm, one module from one module
> 2.2 The sequence is extern modules/client/utils/nimbus/
> supervisor/drpc/worker & task/web ui/others

2. On top of the java core code base, incrementally add new feature blocks.
Quoting his words:

> 3.1 Discuss solution for each difference(jira)
> 3.2 Once the solution is finalized, we can start the
> merging. (Some issues could be start concurrently. It
> depends on the discussion.)

3. His goal is to remain compatible. "this version is stable and
compatible with API of Storm 1.0." is not an accurate statement from my point
of view, at least not for the security feature.
4. He shares his concern about other streaming engines.


Bobby and Jungtaek's view:
1. "Copy" is not acceptable; it will impact the security features. (Copy is
the wrong phrase to use; I think Longda means more of a merge.)
2. With the JStorm team, we start with the clojure -> java translation first.
3. Optimistically, with the JStorm team, one month should be enough for
the above stage.
4. Add new features after the whole code base is migrated to java.
5. No need to worry that much about other engines.

If my understanding of both parties is correct, I think we agree on most
things about the process.
First: clojure -> java.
Second: merge features.

There is a slight difference about how aggressively we want to do "clojure ->
java", and how long it takes.


@Longda, can you clarify whether my understanding of your opinion is right?


Sean


On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com> wrote:

> Very well stated, Jungtaek.
>
> I should also point out that there's nothing stopping the JStorm team from
> releasing new versions of JStorm, or adding new features. But you would
> have to be careful to note that any such release is "JStorm" and not
> "Apache Storm." And any such release cannot be hosted on Apache
> infrastructure.
>
> We also shouldn't be too worried about competition with other stream
> processing frameworks. Competition is healthy and leads to improvements
> across the board. Spark Streaming borrowed ideas from Storm for its Kafka
> integration. It also borrowed memory management ideas from Flink. I don't
> see that as a problem. This is open source. We can, and should, do the same
> where applicable.
>
> Did we learn anything from the Heron paper? Nothing we didn't already
> know. And a lot of the points have been addressed. We dealt with security first,
> which is more important for adoption, especially in the enterprise. Now
> we've addressed many performance, scaling, and usability issues. Most of
> the production deployments I've seen are nowhere near the magnitude of what
> Twitter requires. But I've seen many deployments that only exist because
> we offer security. I doubt Heron has that.
>
> We've also seen an uptick in community and developer involvement, which
> means a likely increase in committers, which likely means a faster
> turnaround for patch reviews, which means a tighter release cycle for new
> features, which means we will be moving faster. This is healthy for an
> Apache project.
>
> And the inclusion of the JStorm team will only make that more so.
>
> I feel we are headed in the right direction, and there are good things to
> come.
>
> -Taylor
>
>
> > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> >
> > Sorry Longda, but I can't help telling that I also disagree about
> changing codebase.
> >
> > Feature matrix shows us how far Apache Storm and JStorm are diverged,
> just in point of feature's view. We can't be safe to change although
> feature matrixes are identical, because feature matrix doesn't contain the
> details.
> >
> > I mean, users could be scared when expected behaviors are not in place
> although they're small. User experience is the one of the most important
> part of the project, and if UX changes are huge, barrier for upgrading
> their Storm cluster to 2.0 is not far easier than migrating to Heron. It
> should be the worst scenario I can imagine after merging.
> >
> > The safest way to merge is applying JStorm's great features to Apache
> Storm.
> > I think porting language of Apache Storm to Java is not tightly related
> to merge JStorm. I agree that merging becomes a trigger, but Apache Storm
> itself can port to other languages like Java, Scala, or something else
> which are more popular than Clojure.
> >
> > And I'm also not scared of Flink, Heron, Spark, etc.
> > It doesn't mean other projects are not greater than Storm. I'm just
> saying each project has its own strengths.
> > For example, all conferences are saying about Spark, and as one of users
> of Spark, Spark is really great. If you are a little bit familiar with
> Scala, you can just apply Scala-like functional methods to RDD. Really easy
> to use.
> > But it doesn't mean that Spark can replace Storm in all kind of use
> cases. Recently I've seen some articles that why Storm is more preferred in
> realtime streaming processing.
> >
> > Competition should give us a positive motivation. I hope that our
> roadmap isn't focused to defeat competitors, but is focused to present
> great features, better performance, and better UX to Storm community. It's
> not commercial product, it's open source project!
> >
> > tl;dr. Please don't change codebase unless we plan to release a brand
> new project. It breaks UX completely which could make users leave.
> >
> > I'm also open to other opinions as well.
> >
> > Best,
> > Jungtaek Lim (HeartSaVioR)
> >
> >
> > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
> >> I disagree completely.  You claim that JStorm is compatible with storm
> 1.0.  I don't believe that it is 100% compatible.  There has been more than
> 2 years of software development happening on both sides.  Security was not
> done in a day, and porting it over to JStorm is not going to happen
> quickly, and because of the major architectural changes between storm and
> JStorm I believe we would have to make some serious enhancements to fully
> support a secure TopologyMaster, but I need to look into it more.  The blob
> store is another piece of code that has taken a very long time to develop.
> There are numerous others.  The big features are not the ones that make me
> nervous because we can plan for them, it is the hundreds of small JIRA and
> features that will result in minor incompatibilities.  If we start with
> storm itself, and follow the same process that we have been doing up until
> now, if there is a reason to stop the port add in an important feature and
> do a release, we can.  We will know that we have compatibility vs starting
> with JStorm we know from the start that we do not without adding feature X,
> Y, Z, ....
> >>
> >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> etc...  We just did some major performance enhancements that will be
> released with STORM 1.0.  We now have up to 6x the throughput that we had
> before with minimal changes to the latency (20 ms vs 5 ms).  We have
> automatic back-pressure so if someone was running with acking enabled just
> for flow control they can now process close to 16x the throughput they
> could before with the same hardware.  This puts our throughput very much on
> par with flink and Spark, but with a much lower latency compared to either
> of them.  Plus from what I have heard Flink is still calling the streaming
> API beta, and their storm API compatibility is very rudimentary.  They are
> also going to have more and more problems maintaining compatibility as we
> add in new features and functionality.
> >>
> >> Spark only really works well when it is running with several seconds of
> latency. Not every one needs sub-second processing, but when your platform
> is completely unable to handle it, locks you out of a lot of use cases.
> Their throughput is decent and can scale very high when you are willing to
> tolerate similarly very high latencies.
> >> Who knows about Heron until they actually release their code, but it is
> missing lots of critical features, and the one they touted, better
> performance, is a moot point with storm 1.0.  The only thing we really are
> lacking is advertising, we don't have a big company really pushing storm
> and getting it in the news all the time (Sorry Hortonworks, but I really
> have not seen much about it in the news).  I am trying to do more, but
> there is only so much I can do.
> >> Longda I very much agree with you about moving quickly to make the
> transition, but I do not believe in any way that starting with JStorm is
> going to reduce that transition time.
> >> My proposal is to give everyone about 2 weeks to finish merging new
> features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> for a release.  At the same time development work to port storm to java
> begins.  You said it took 4 developers 1 year to port storm to java the
> first time for JStorm.  We have 14+ active developers and over one hundred
> contributors not including those from the JStorm community.  If numbers
> scale linearly, I know they don't completely, we should be able to do a
> complete port with no JStorm reference in around 100 days.  With a copy and
> paste for a lot of this from the JStorm codebase, I would expect to be able
> to do it in 1 month of development, possibly less if the JStorm community
> can really help out too.  So by January we should be ready to begin pulling
> in features from JStorm that make sense.  Looking at the feature matrix in
> https://github.com/apache/storm/pull/877 there are a few potentially big
> improvements that we would want to pull in, but they require architectural
> changes in some cases that I don't want to just do lightly.  I would
> propose that once the code has been ported to java we reopen for all new
> features in parallel with the JStorm feature migration, but I am open to
> others' opinions as well.
> >>  - Bobby
> >>
> >>
> >>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <
> zhongyan.feng@alibaba-inc.com> wrote:
> >>
> >>
> >>  Sorry for changing the Subject.
> >>
> >> I am +1 for releasing Storm 2.0 with java core, which is merged with
> JStorm.
> >>
> >> I think the change of this release will be the biggest one in history.
> It will probably take a long time to develop. At the same time, Heron is
> going to open source, and the latest release of Flink provides the
> compatibility to Storm’s API. These might be the threat to Storm. So I
> suggest we start the development of Storm 2.0 as quickly as possible. In
> order to accelerate the development cycle, I proposed to take JStorm 2.1.0
> core and UI as the base version since this version is stable and compatible
> with API of Storm 1.0. Please refer to the phases below for the detailed
> merging plan.
> >>
> >> Note: We provide a demo of JStorm’s web UI. Please refer to
> storm.taobao.org . I think JStorm will give a totally different view to
> you.
> >>
> >> I would like to share the experience of initial development of JStorm
> (Migrate from clojure core to java core).
> >> Our team(4 developers) have spent almost one year to finish the
> migration. We took 4 months to release the first JStorm version, and 6
> months to make JStorm stable. During this period, we tried to switch more
> than online 100 applications with different scenarios from Storm to JStorm,
> and many bugs were fixed. Then more and more applications were switched to
> JStorm in Alibaba.
> >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
> 2000+ applications are running on them. The JStorm Clusters here can handle
> 1.5 PB/2 Trillion messages per day. The use cases are not only in BigData
> field but also in many other online scenarios.
> >> Besides it, we have experienced the November 11th Shopping Festival of
> Alibaba for last three years. At that day, the computation in our cluster
> increased several times than usual. All applications worked well during the
> peak time. I can say the stability of JStorm is no doubt today. Actually,
> besides Alibaba, the most powerful Chinese IT company are also using JStorm.
> >>
> >>
> >> Phase 1:
> >>
> >> Define the target of Storm 2.0. List the requirement of Storm 2.0
> >> 1. Open a new Umbrella Jira (
> https://issues.apache.org/jira/browse/STORM-717)
> >> 2. Create one 2.0 branch,
> >> 2.1 Copy modules from JStorm, one module from one module
> >> 2.2 The sequence is extern
> modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> >> 3.1 Discuss solution for each difference(jira)
> >> 3.2 Once the solution is finalized, we can start the merging. (Some
> issues could be start concurrently. It depends on the discussion.)
> >>
> >> The phase mainly try to define target and finalize the solution.
> Hopefully this phase could be finished in 2 month(before 2016/1/31). .
> >>
> >>
> >> Phase 2:
> >> Release Storm 2.0 beta
> >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
> >> 2. Integrate all modules, make the simplest storm example can run on
> the system.
> >> 3. Test with all example and modules in Storm code base.
> >> 4. All daily test can be passed.
> >>
> >> Hopefully this phase could be finished in 2 month(before 2016/3/31)
> >>
> >>
> >> Phase 3:
> >> Persuade some user to have a try.
> >> Alibaba will try to run some online applications on the beta version
> >>
> >> Hopefully this phase could be finished in 1 month(before 2016/4/31).
> >>
> >>
> >> Any comments are welcome.
> >>
> >>
> >> Thanks
> >>
> >> Longda
> >> ------------------------------------------------------------------
> >> From: P. Taylor Goetz <pt...@gmail.com>
> >> Send Time: Thursday, November 19, 2015 06:23
> >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >> All I have at this point is a placeholder wiki entry [1], and a lot of
> local notes that likely would only make sense to me.
> >>
> >> Let me know your wiki username and I’ll give you permissions. The same
> goes for anyone else who wants to help.
> >>
> >> -Taylor
> >>
> >> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> >>
> >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
> wrote:
> >> >
> >> > Taylor and others I was hoping to get started filing JIRA and
> planning on how we are going to do the java migration + JStorm merger.  Is
> anyone else starting to do this?  If not would anyone object to me starting
> on it? - Bobby
> >> >
> >> >
> >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> ptgoetz@gmail.com> wrote:
> >> >
> >> >
> >> > Thanks for putting this together Basti, that comparison helps a lot.
> >> >
> >> > And thanks Bobby for converting it into markdown. I was going to just
> attach the spreadsheet to JIRA, but markdown is a much better solution.
> >> >
> >> > -Taylor
> >> >
> >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
> <ev...@yahoo-inc.com.INVALID> wrote:
> >> >>
> >> >> I translated the excel spreadsheet into a markdown file and put up a
> pull request for it.
> >> >> https://github.com/apache/storm/pull/877
> >> >> I did a few edits to it to make it work with Markdown, and to add in
> a few of my own comments.  I also put in a field for JIRAs to be able to
> track the migration.
> >> >> Overall I think your evaluation was very good.  We have a fair
> amount of work ahead of us to decide what version of various features we
> want to go forward with.
> >> >>   - Bobby
> >> >>
> >> >>
> >> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> basti.lj@alibaba-inc.com> wrote:
> >> >>
> >> >>
> >> >> Hi Bobby & Jungtaek,
> >> >>
> >> >> Thanks for your reply.
> >> >> I totally agree that compatibility is the most important thing.
> Actually, JStorm has been compatible with the user API of Storm.
> >> >> As you mentioned below, we indeed still have some features different
> between Storm and JStorm. I have tried to list them (minor update or
> improvements are not included).
> >> >> Please refer to attachment for details. If any missing, please help
> to point out. (The current working features are probably missing here.)
> >> >> Just have a look at these differences. For the missing features in
> JStorm, I did not see any obstacle which will block the merge to JStorm.
> >> >> For the features which have different solutions in Storm and
> JStorm, we can evaluate the solutions one by one to decide which one is
> appropriate.
> >> >> After the finalization of evaluation, I think JStorm team can take
> the merging job and publish a stable release in 2 months.
> >> >> But anyway, the detailed implementation for these features with
> different solution is transparent to user. So, from user's point of view,
> there is not any compatibility problem.
> >> >>
> >> >> Besides compatibility, by our experience, stability is also
> important and is not an easy job. 4 people in JStorm team took almost one
> year to finish the porting from "clojure core"
> >> >> to "java core", and to make it stable. Of course, we have many devs
> in community to make the porting job faster. But it still needs a long time
> to run many online complex topologies to find bugs and fix them. So, that is
> the reason why I proposed to do merging and build on a stable "java core".
> >> >>
> >> >> -----Original Message-----
> >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> >> >> Sent: Wednesday, November 11, 2015 10:51 PM
> >> >> To: dev@storm.apache.org
> >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >> >>
> >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> Migrating the APIs to org.apache.storm is a big non-backwards compatible
> move, and a major version bump to 2.x seems like a good move there.
> >> >> +1 for the release plan
> >> >>
> >> >> I would like the move for user facing APIs to org.apache to be one
> of the last things we do.  Translating clojure code into java and moving it
> to org.apache I am not too concerned about.
> >> >>
> >> >> Basti,
> >> >> We have two code bases that have diverged significantly from one
> another in terms of functionality.  The storm code now or soon will have A
> Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> Scheduling, a distributed cache like API, log searching, security, massive
> performance improvements, shaded almost all of our dependencies, a REST API
> for programmatically accessing everything on the UI, and I am sure I am
> missing a few other things.  JStorm also has many changes including cgroup
> isolation, restructured zookeeper layout, classpath isolation, and more too.
> >> >> No matter what we do it will be a large effort to port changes from
> one code base to another, and from clojure to java.  I proposed this
> initially because it can be broken up into incremental changes.  It may
> take a little longer, but we will always have a working codebase that is
> testable and compatible with the current storm release, at least until we
> move the user facing APIs to be under org.apache.  This lets the community
> continue to build and test the master branch and report problems that they
> find, which is incredibly valuable.  I personally don't think it will be
> much easier, especially if we are intent on always maintaining
> compatibility with storm. - Bobby
> >> >>
> >> >>
> >> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
> basti.lj@alibaba-inc.com> wrote:
> >> >>
> >> >>
> >> >> Hi Taylor,
> >> >>
> >> >>
> >> >>
> >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> >> >>
> >> >> Do you mean community plan to create a fresh new “java core” based
> on current “clojure core” firstly, and then migrate the features from
> JStorm?
> >> >>
> >> >> If so, it confused me.  It is really a huge job which might require
> a long developing time to make it stable, while JStorm is already a stable
> version.
> >> >>
> >> >> The release planned to be released after Nov 11th has already run
> online stably for several months in Alibaba.
> >> >>
> >> >> Besides this, there are many valuable internal requirements in
> Alibaba, the fast evolution of JStorm is foreseeable in the next few months.
> >> >>
> >> >> If the “java core” is totally fresh new, it might bring many
> problems for the coming merge.
> >> >>
> >> >> So, from the point of this view,  I think it is much better and
> easier to migrate the features of “clojure core” basing on JStorm for the
> “java core”.
> >> >>
> >> >> Please correct me, if any misunderstanding.
> >> >>
> >> >>
> >> >>
> >> >> Regards
> >> >>
> >> >> Basti
> >> >>
> >> >>
> >> >>
> >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> >> >> Sent: November 11, 2015 5:32
> >> >> To: dev@storm.apache.org
> >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> >> >>
> >> >>
> >> >>
> >> >> Based on a number of discussions regarding merging the JStorm code,
> I’ve tried to distill the ideas presented and inserted some of my own. The
> result is below.
> >> >>
> >> >>
> >> >>
> >> >> I’ve divided the plan into three phases, though they are not
> necessarily sequential — obviously some tasks can take place in parallel.
> >> >>
> >> >>
> >> >>
> >> >> None of this is set in stone, just presented for discussion. Any and
> all comments are welcome.
> >> >>
> >> >>
> >> >>
> >> >> -------
> >> >>
> >> >>
> >> >>
> >> >> Phase 1 - Plan for 0.11.x Release
> >> >>
> >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> >> >>
> >> >> 2. Announce feature-freeze for 0.11.x
> >> >>
> >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> >> >>
> >> >> 4. Release 0.11.0 (or whatever version # we want to use)
> >> >>
> >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> >> >>
> >> >> 1. Determine/document unique features in JStorm (e.g. classpath
> isolation, cgroups, etc.) and create JIRA for migrating the feature.
> >> >>
> >> >> 2. Create JIRA for migrating each clojure component (or logical
> group of components) to Java. Assumes tests will be ported as well.
> >> >>
> >> >> 3. Discuss/establish style guide for Java coding conventions.
> Consider using Oracle’s or Google’s Java conventions as a base — they are
> both pretty solid.
> >> >>
> >> >> 4. align package names (e.g backtype.storm --> org.apache.storm /
> com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Phase 3 - Migrate Clojure --> Java
> >> >>
> >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> possible (core functionality only, features distinct to JStorm migrated
> separately).
> >> >>
> >> >> 2. Port JStorm-specific features.
> >> >>
> >> >> 3. Begin releasing preview/beta versions.
> >> >>
> >> >> 4. Code cleanup (across the board) and refactoring using established
> coding conventions, and leveraging PMD/Checkstyle reports for reference.
> (Note: good opportunity for new contributors.)
> >> >>
> >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> feature freeze.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Notes:
> >> >>
> >> >> We should consider bumping up to version 1.0 sometime soon and then
> switching to semantic versioning [3] from then on.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> With the exception of package name alignment, the "jstorm-import"
> branch will largely be read-only throughout the process.
> >> >>
> >> >>
> >> >>
> >> >> During migration, it's probably easiest to operate with two local
> clones of the Apache Storm repo: one for working (i.e. checked out to
> working branch) and one for reference/copying (i.e. checked out to
> "jstorm-import").
> >> >>
> >> >>
> >> >>
> >> >> Feature-freeze probably only needs to be enforced against core
> functionality. Components under "external" can likely be exempt, but we
> should figure out a process for accepting and releasing new features during
> the migration.
> >> >>
> >> >>
> >> >>
> >> >> Performance testing should be continuous throughout the process.
> Since we don't really have ASF infrastructure for performance testing, we
> will need a volunteer(s) to host and run the performance tests. Performance
> test results can be posted to the wiki [2]. It would probably be a good
> idea to establish a baseline with the 0.10.0 release.
> >> >>
> >> >>
> >> >>
> >> >> I’ve attached an analysis document Sean Zhong put together a while
> back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
> release but is still relevant and has a lot of good information.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> [1]
> https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> >> >>
> >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> >> >>
> >> >> [3] http://semver.org
> >> >>
> >> >> [4] https://issues.apache.org/jira/browse/STORM-717
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> -Taylor
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >
> >
> >
> > --
> > Name : 임 정택
> > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> > Twitter : http://twitter.com/heartsavior
> > LinkedIn : http://www.linkedin.com/in/heartsavior
>


Re: [DISCUSS] Storm 2.0 plan

Posted by Sean Zhong <cl...@gmail.com>.
Hi All,

I think there may be some improper use of words or a misunderstanding.

Here are the goals that, as far as I can see, both sides agree on:
1. We want to migrate clojure to java
2. We want to merge important features together.
3. We want to do this in a step by step, transparent, reviewable way,
especially with close examination and reviews for code that has
architecture change.
4. We want the final version to remain compatible.

The only difference is the process we use to achieve these goals.
Longda's view:
1. Do a parallel migration from the clojure core to java, part by part. Parallel
means equivalent: no new features are added in this step. He suggests following
the order "modules/client/utils/nimbus/supervisor/drpc/worker & task/
web ui/others".
He uses the word "copy", which I think is improper. It is more like a
merge.
Quoting his words:

> 2.1 Copy modules from JStorm, one module from one module
> 2.2 The sequence is extern modules/client/utils/nimbus/
> supervisor/drpc/worker & task/web ui/others

2. On top of the java core code base, incrementally add new feature blocks.
Quoting his words:

> 3.1 Discuss solution for each difference(jira)
> 3.2 Once the solution is finalized, we can start the
> merging. (Some issues could be start concurrently. It
> depends on the discussion.)

3. His goal is to remain compatible. "this version is stable and
compatible with API of Storm 1.0." is not an accurate statement from my point of view,
at least not for the security feature.
4. He shares his concern about other streaming engines.


Bobby and Jungtaek's view:
1. "Copy" is not acceptable; it will impact the security features. ("Copy" is
the wrong phrase to use; I think Longda means something more like a merge.)
2. Together with the JStorm team, we start with the clojure -> java translation first.
3. On an optimistic view, with the JStorm team's help, one month should be enough for
the above stage.
4. New features are added after the whole code base is migrated to java.
5. No need to worry that much about other engines.
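
For reference, the timing estimates in this view rest on Bobby's linear-scaling argument quoted below: the JStorm port took 4 developers about one year, and Apache Storm has 14+ active developers. A rough sketch of that arithmetic follows, assuming a 365-day year and perfectly linear scaling (Bobby himself notes it does not scale completely linearly); the function name is purely illustrative.

```python
def estimated_port_days(team_size, ref_team=4, ref_days=365):
    """Days for team_size developers to repeat work that took
    ref_team developers ref_days, assuming effort scales linearly."""
    # Total effort in developer-days, spread evenly over the new team.
    return ref_team * ref_days / team_size


# 4 developers * 365 days / 14 developers, i.e. "around 100 days"
print(round(estimated_port_days(14)))
```

The one-month figure is a separate, more optimistic guess that assumes much of the code can be taken from the JStorm codebase; it does not follow from this calculation.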

If my understanding of both parties is correct, I think we agree on most
things about the process:
first: clojure -> java
second: merge features.

The slight difference is how aggressively we want to do "clojure ->
java", and how long it takes.


@Longda, can you clarify whether my understanding of your opinion is right?


Sean


On Fri, Nov 20, 2015 at 11:40 AM, P. Taylor Goetz <pt...@gmail.com> wrote:

> Very well stated, Jungtaek.
>
> I should also point out that there's nothing stopping the JStorm team from
> releasing new versions of JStorm, or adding new features. But you would
> have to be careful to note that any such release is "JStorm" and not
> "Apache Storm." And any such release cannot be hosted on Apache
> infrastructure.
>
> We also shouldn't be too worried about competition with other stream
> processing frameworks. Competition is healthy and leads to improvements
> across the board. Spark Streaming borrowed ideas from Storm for its Kafka
> integration. It also borrowed memory management ideas from Flink. I don't
> see that as a problem. This is open source. We can, and should, do the same
> where applicable.
>
> Did we learn anything from the Heron paper? Nothing we didn't already
> know. And a lot of the points have been addressed. We dealt with security first,
> which is more important for adoption, especially in the enterprise. Now
> we've addressed many performance, scaling, and usability issues. Most of
> the production deployments I've seen are nowhere near the magnitude of what
> Twitter requires. But I've seen many deployments that only exist because
> we offer security. I doubt Heron has that.
>
> We've also seen an uptick in community and developer involvement, which
> means a likely increase in committers, which likely means a faster
> turnaround for patch reviews, which means a tighter release cycle for new
> features, which means we will be moving faster. This is healthy for an
> Apache project.
>
> And the inclusion of the JStorm team will only make that more so.
>
> I feel we are headed in the right direction, and there are good things to
> come.
>
> -Taylor
>
>
> > On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> >
> > Sorry Longda, but I can't help telling that I also disagree about
> changing codebase.
> >
> > Feature matrix shows us how far Apache Storm and JStorm are diverged,
> just in point of feature's view. We can't be safe to change although
> feature matrixes are identical, because feature matrix doesn't contain the
> details.
> >
> > I mean, users could be scared when expected behaviors are not in place
> although they're small. User experience is the one of the most important
> part of the project, and if UX changes are huge, barrier for upgrading
> their Storm cluster to 2.0 is not far easier than migrating to Heron. It
> should be the worst scenario I can imagine after merging.
> >
> > The safest way to merge is applying JStorm's great features to Apache
> Storm.
> > I think porting language of Apache Storm to Java is not tightly related
> to merge JStorm. I agree that merging becomes a trigger, but Apache Storm
> itself can port to other languages like Java, Scala, or something else
> which are more popular than Clojure.
> >
> > And I'm also not scared of Flink, Heron, Spark, etc.
> > It doesn't mean other projects are not greater than Storm. I'm just
> saying each project has its own strengths.
> > For example, all conferences are saying about Spark, and as one of users
> of Spark, Spark is really great. If you are a little bit familiar with
> Scala, you can just apply Scala-like functional methods to RDD. Really easy
> to use.
> > But it doesn't mean that Spark can replace Storm in all kind of use
> cases. Recently I've seen some articles that why Storm is more preferred in
> realtime streaming processing.
> >
> > Competition should give us a positive motivation. I hope that our
> roadmap isn't focused to defeat competitors, but is focused to present
> great features, better performance, and better UX to Storm community. It's
> not commercial product, it's open source project!
> >
> > tl;dr. Please don't change codebase unless we plan to release a brand
> new project. It breaks UX completely which could make users leave.
> >
> > I'm also open to other opinions as well.
> >
> > Best,
> > Jungtaek Lim (HeartSaVioR)
> >
> >
> > 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
> >> I disagree completely.  You claim that JStorm is compatible with storm
> 1.0.  I don't believe that it is 100% compatible.  There has been more than
> 2 years of software development happening on both sides.  Security was not
> done in a day, and porting it over to JStorm is not going to happen
> quickly, and because of the major architectural changes between storm and
> JStorm I believe we would have to make some serious enhancements to fully
> support a secure TopologyMaster, but I need to look into it more.  The blob
> store is another piece of code that has taken a very long time to develop.
> There are numerous others.  The big features are not the ones that make me
> nervous because we can plan for them, it is the hundreds of small JIRA and
> features that will result in minor incompatibilities.  If we start with
> storm itself, and follow the same process that we have been doing up until
> now, if there is a reason to stop the port add in an important feature and
> do a release, we can.  We will know that we have compatibility vs starting
> with JStorm we know from the start that we do not without adding feature X,
> Y, Z, ....
> >>
> >> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> etc...  We just did some major performance enhancements that will be
> released with STORM 1.0.  We now have up to 6x the throughput that we had
> before with minimal changes to the latency (20 ms vs 5 ms).  We have
> automatic back-pressure so if someone was running with acking enabled just
> for flow control they can now process close to 16x the throughput they
> could before with the same hardware.  This puts our throughput very much on
> par with flink and Spark, but with a much lower latency compared to either
> of them.  Plus from what I have heard Flink is still calling the streaming
> API beta, and their storm API compatibility is very rudimentary.  They are
> also going to have more and more problems maintaining compatibility as we
> add in new features and functionality.
> >>
> >> Spark only really works well when it is running with several seconds of
> latency. Not everyone needs sub-second processing, but when your platform
> is completely unable to handle it, it locks you out of a lot of use cases.
> Their throughput is decent and can scale very high when you are willing to
> tolerate similarly very high latencies.
> >> Who knows about Heron until they actually release their code, but it is
> missing lots of critical features, and the one they touted, better
> performance, is a moot point with storm 1.0.  The only thing we really are
> lacking is advertising, we don't have a big company really pushing storm
> and getting it in the news all the time (Sorry Hortonworks, but I really
> have not seen much about it in the news).  I am trying to do more, but
> there is only so much I can do.
> >> Longda I very much agree with you about moving quickly to make the
> transition, but I do not believe in any way that starting with JStorm is
> going to reduce that transition time.
> >> My proposal is to give everyone about 2 weeks to finish merging new
> features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> for a release.  At the same time development work to port storm to java
> begins.  You said it took 4 developers 1 year to port storm to java the
> first time for JStorm.  We have 14+ active developers and over one hundred
> contributors not including those from the JStorm community.  If numbers
> scale linearly, I know they don't completely, we should be able to do a
> complete port with no JStorm reference in around 100 days.  With a copy and
> paste for a lot of this from the JStorm codebase, I would expect to be able
> to do it in 1 month of development, possibly less if the JStorm community
> can really help out too.  So by January we should be ready to begin pulling
> in features from JStorm that make sense.  Looking at the feature matrix in
> https://github.com/apache/storm/pull/877 there are a few potentially big
> improvements that we would want to pull in, but they require architectural
> changes in some cases that I don't want to just do lightly.  I would
> propose that once the code has been ported to java we reopen for all new
> features in parallel with the JStorm feature migration, but I am open to
> others' opinions as well.
> >>  - Bobby
> >>
> >>
> >>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <
> zhongyan.feng@alibaba-inc.com> wrote:
> >>
> >>
> >>  Sorry for changing the Subject.
> >>
> >> I am +1 for releasing Storm 2.0 with java core, which is merged with
> JStorm.
> >>
> >> I think the change of this release will be the biggest one in history.
> It will probably take a long time to develop. At the same time, Heron is
> going to open source, and the latest release of Flink provides the
> compatibility to Storm’s API. These might be the threat to Storm. So I
> suggest we start the development of Storm 2.0 as quickly as possible. In
> order to accelerate the development cycle, I proposed to take JStorm 2.1.0
> core and UI as the base version since this version is stable and compatible
> with API of Storm 1.0. Please refer to the phases below for the detailed
> merging plan.
> >>
> >> Note: We provide a demo of JStorm’s web UI. Please refer to
> storm.taobao.org . I think JStorm will give a totally different view to
> you.
> >>
> >> I would like to share the experience of initial development of JStorm
> (Migrate from clojure core to java core).
> >> Our team(4 developers) have spent almost one year to finish the
> migration. We took 4 months to release the first JStorm version, and 6
> months to make JStorm stable. During this period, we tried to switch more
> than 100 online applications with different scenarios from Storm to JStorm,
> and many bugs were fixed. Then more and more applications were switched to
> JStorm in Alibaba.
> >> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and
> 2000+ applications are running on them. The JStorm Clusters here can handle
> 1.5 PB/2 Trillion messages per day. The use cases are not only in BigData
> field but also in many other online scenarios.
> >> Besides it, we have experienced the November 11th Shopping Festival of
> Alibaba for last three years. At that day, the computation in our cluster
> increased several times than usual. All applications worked well during the
> peak time. I can say the stability of JStorm is no doubt today. Actually,
> besides Alibaba, the most powerful Chinese IT company are also using JStorm.
> >>
> >>
> >> Phase 1:
> >>
> >> Define the target of Storm 2.0. List the requirement of Storm 2.0
> >> 1. Open a new Umbrella Jira (
> https://issues.apache.org/jira/browse/STORM-717)
> >> 2. Create one 2.0 branch,
> >> 2.1 Copy modules from JStorm, one module from one module
> >> 2.2 The sequence is extern
> modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> >> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> >> 3.1 Discuss solution for each difference(jira)
> >> 3.2 Once the solution is finalized, we can start the merging. (Some
> issues could be start concurrently. It depends on the discussion.)
> >>
> >> The phase mainly try to define target and finalize the solution.
> Hopefully this phase could be finished in 2 month(before 2016/1/31). .
> >>
> >>
> >> Phase 2:
> >> Release Storm 2.0 beta
> >> 1. Based on phase 1's discussion, finish all features of Storm 2.0
> >> 2. Integrate all modules, make the simplest storm example can run on
> the system.
> >> 3. Test with all example and modules in Storm code base.
> >> 4. All daily test can be passed.
> >>
> >> Hopefully this phase could be finished in 2 month(before 2016/3/31)
> >>
> >>
> >> Phase 3:
> >> Persuade some user to have a try.
> >> Alibaba will try to run some online applications on the beta version
> >>
> >> Hopefully this phase could be finished in 1 month(before 2016/4/31).
> >>
> >>
> >> Any comments are welcome.
> >>
> >>
> >> Thanks
> >>
> >> Longda
> >>
> >> ------------------------------------------------------------------
> >> From: P. Taylor Goetz <pt...@gmail.com>
> >> Send Time: November 19, 2015 (Thursday) 06:23
> >> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >>
> >> All I have at this point is a placeholder wiki entry [1], and a lot of
> local notes that likely would only make sense to me.
> >>
> >> Let me know your wiki username and I’ll give you permissions. The same
> goes for anyone else who wants to help.
> >>
> >> -Taylor
> >>
> >> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> >>
> >> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID>
> wrote:
> >> >
> >> > Taylor and others I was hoping to get started filing JIRA and
> planning on how we are going to do the java migration + JStorm merger.  Is
> anyone else starting to do this?  If not would anyone object to me starting
> on it? - Bobby
> >> >
> >> >
> >> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <
> ptgoetz@gmail.com> wrote:
> >> >
> >> >
> >> > Thanks for putting this together Basti, that comparison helps a lot.
> >> >
> >> > And thanks Bobby for converting it into markdown. I was going to just
> attach the spreadsheet to JIRA, but markdown is a much better solution.
> >> >
> >> > -Taylor
> >> >
> >> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans
> <ev...@yahoo-inc.com.INVALID> wrote:
> >> >>
> >> >> I translated the excel spreadsheet into a markdown file and put up a
> pull request for it.
> >> >> https://github.com/apache/storm/pull/877
> >> >> I did a few edits to it to make it work with Markdown, and to add in
> a few of my own comments.  I also put in a field for JIRAs to be able to
> track the migration.
> >> >> Overall I think your evaluation was very good.  We have a fair
> amount of work ahead of us to decide what version of various features we
> want to go forward with.
> >> >>   - Bobby
> >> >>
> >> >>
> >> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <
> basti.lj@alibaba-inc.com> wrote:
> >> >>
> >> >>
> >> >> Hi Bobby & Jungtaek,
> >> >>
> >> >> Thanks for your reply.
> >> >> I totally agree that compatibility is the most important thing.
> Actually, JStorm has been compatible with the user API of Storm.
> >> >> As you mentioned below, we indeed still have some features different
> between Storm and JStorm. I have tried to list them (minor update or
> improvements are not included).
> >> >> Please refer to attachment for details. If any missing, please help
> to point out. (The current working features are probably missing here.)
> >> >> Just have a look at these differences. For the missing features in
> JStorm, I did not see any obstacle which will block the merge to JStorm.
> >> >> For the features which has different solution between Storm and
> JStorm, we can evaluate the solution one by one to decision which one is
> appropriate.
> >> >> After the finalization of evaluation, I think JStorm team can take
> the merging job and publish a stable release in 2 months.
> >> >> But anyway, the detailed implementation for these features with
> different solution is transparent to user. So, from user's point of view,
> there is not any compatibility problem.
> >> >>
> >> >> Besides compatibility, by our experience, stability is also
> important and is not an easy job. 4 people in JStorm team took almost one
> year to finish the porting from "clojure core"
> >> >> to "java core", and to make it stable. Of course, we have many devs
> in community to make the porting job faster. But it still needs a long time
> to run many online complex topologies to find bugs and fix them. So, that is
> the reason why I proposed to do merging and build on a stable "java core".
> >> >>
> >> >> -----Original Message-----
> >> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> >> >> Sent: Wednesday, November 11, 2015 10:51 PM
> >> >> To: dev@storm.apache.org
> >> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >> >>
> >> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.
> Migrating the APIs to org.apache.storm is a big non-backwards compatible
> move, and a major version bump to 2.x seems like a good move there.
> >> >> +1 for the release plan
> >> >>
> >> >> I would like the move for user facing APIs to org.apache to be one
> of the last things we do.  Translating clojure code into java and moving it
> to org.apache I am not too concerned about.
> >> >>
> >> >> Basti,
> >> >> We have two code bases that have diverged significantly from one
> another in terms of functionality.  The storm code now or soon will have A
> Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware
> Scheduling, a distributed cache like API, log searching, security, massive
> performance improvements, shaded almost all of our dependencies, a REST API
> for programmatically accessing everything on the UI, and I am sure I am
> missing a few other things.  JStorm also has many changes including cgroup
> isolation, restructured zookeeper layout, classpath isolation, and more too.
> >> >> No matter what we do it will be a large effort to port changes from
> one code base to another, and from clojure to java.  I proposed this
> initially because it can be broken up into incremental changes.  It may
> take a little longer, but we will always have a working codebase that is
> testable and compatible with the current storm release, at least until we
> move the user facing APIs to be under org.apache.  This lets the community
> continue to build and test the master branch and report problems that they
> find, which is incredibly valuable.  I personally don't think it will be
> much easier, especially if we are intent on always maintaining
> compatibility with storm. - Bobby
> >> >>
> >> >>
> >> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <
> basti.lj@alibaba-inc.com> wrote:
> >> >>
> >> >>
> >> >> Hi Taylor,
> >> >>
> >> >>
> >> >>
> >> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> >> >>
> >> >> Do you mean community plan to create a fresh new “java core” based
> on current “clojure core” firstly, and then migrate the features from
> JStorm?
> >> >>
> >> >> If so, it confused me.  It is really a huge job which might require
> a long developing time to make it stable, while JStorm is already a stable
> version.
> >> >>
> >> >> The release planned to be release after Nov 11th has already run
> online stably several month in Alibaba.
> >> >>
> >> >> Besides this, there are many valuable internal requirements in
> Alibaba, the fast evolution of JStorm is foreseeable in the next few months.
> >> >>
> >> >> If the “java core” is totally fresh new, it might bring many
> problems for the coming merge.
> >> >>
> >> >> So, from the point of this view,  I think it is much better and
> easier to migrate the features of “clojure core” basing on JStorm for the
> “java core”.
> >> >>
> >> >> Please correct me, if any misunderstanding.
> >> >>
> >> >>
> >> >>
> >> >> Regards
> >> >>
> >> >> Basti
> >> >>
> >> >>
> >> >>
> >> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> >> >> Sent: November 11, 2015 5:32
> >> >> To: dev@storm.apache.org
> >> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> >> >>
> >> >>
> >> >>
> >> >> Based on a number of discussions regarding merging the JStorm code,
> I’ve tried to distill the ideas presented and inserted some of my own. The
> result is below.
> >> >>
> >> >>
> >> >>
> >> >> I’ve divided the plan into three phases, though they are not
> necessarily sequential — obviously some tasks can take place in parallel.
> >> >>
> >> >>
> >> >>
> >> >> None of this is set in stone, just presented for discussion. Any and
> all comments are welcome.
> >> >>
> >> >>
> >> >>
> >> >> -------
> >> >>
> >> >>
> >> >>
> >> >> Phase 1 - Plan for 0.11.x Release
> >> >>
> >> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> >> >>
> >> >> 2. Announce feature-freeze for 0.11.x
> >> >>
> >> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> >> >>
> >> >> 4. Release 0.11.0 (or whatever version # we want to use)
> >> >>
> >> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> >> >>
> >> >> 1. Determine/document unique features in JStorm (e.g. classpath
> isolation, cgroups, etc.) and create JIRA for migrating the feature.
> >> >>
> >> >> 2. Create JIRA for migrating each clojure component (or logical
> group of components) to Java. Assumes tests will be ported as well.
> >> >>
> >> >> 3. Discuss/establish style guide for Java coding conventions.
> Consider using Oracle’s or Google’s Java conventions as a base — they are
> both pretty solid.
> >> >>
> >> >> 4. align package names (e.g backtype.storm --> org.apache.storm /
> com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Phase 3 - Migrate Clojure --> Java
> >> >>
> >> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever
> possible (core functionality only, features distinct to JStorm migrated
> separately).
> >> >>
> >> >> 2. Port JStorm-specific features.
> >> >>
> >> >> 3. Begin releasing preview/beta versions.
> >> >>
> >> >> 4. Code cleanup (across the board) and refactoring using established
> coding conventions, and leveraging PMD/Checkstyle reports for reference.
> (Note: good opportunity for new contributors.)
> >> >>
> >> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift
> feature freeze.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Notes:
> >> >>
> >> >> We should consider bumping up to version 1.0 sometime soon and then
> switching to semantic versioning [3] from then on.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> With the exception of package name alignment, the "jstorm-import"
> branch will largely be read-only throughout the process.
> >> >>
> >> >>
> >> >>
> >> >> During migration, it's probably easiest to operate with two local
> clones of the Apache Storm repo: one for working (i.e. checked out to
> working branch) and one for reference/copying (i.e. checked out to
> "jstorm-import").
> >> >>
> >> >>
> >> >>
> >> >> Feature-freeze probably only needs to be enforced against core
> functionality. Components under "external" can likely be exempt, but we
> should figure out a process for accepting and releasing new features during
> the migration.
> >> >>
> >> >>
> >> >>
> >> >> Performance testing should be continuous throughout the process.
> Since we don't really have ASF infrastructure for performance testing, we
> will need a volunteer(s) to host and run the performance tests. Performance
> test results can be posted to the wiki [2]. It would probably be a good
> idea to establish a baseline with the 0.10.0 release.
> >> >>
> >> >>
> >> >>
> >> >> I’ve attached an analysis document Sean Zhong put together a while
> back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3
> release but is still relevant and has a lot of good information.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> [1]
> https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> >> >>
> >> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> >> >>
> >> >> [3] http://semver.org
> >> >>
> >> >> [4] https://issues.apache.org/jira/browse/STORM-717
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> -Taylor
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >
> >
> >
> > --
> > Name : 임 정택
> > Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> > Twitter : http://twitter.com/heartsavior
> > LinkedIn : http://www.linkedin.com/in/heartsavior
>

Re: [DISCUSS] Storm 2.0 plan

Posted by "P. Taylor Goetz" <pt...@gmail.com>.
Very well stated, Jungtaek.

I should also point out that there's nothing stopping the JStorm team from releasing new versions of JStorm, or adding new features. But you would have to be careful to note that any such release is "JStorm" and not "Apache Storm." And any such release cannot be hosted on Apache infrastructure.

We also shouldn't be too worried about competition with other stream processing frameworks. Competition is healthy and leads to improvements across the board. Spark Streaming borrowed ideas from Storm for its Kafka integration. It also borrowed memory management ideas from Flink. I don't see that as a problem. This is open source. We can, and should, do the same where applicable.

Did we learn anything from the Heron paper? Nothing we didn't already know. And a lot of the points have been addressed. We dealt with security first, which is more important for adoption, especially in the enterprise. Now we've addressed many performance, scaling, and usability issues. Most of the production deployments I've seen are nowhere near the magnitude of what Twitter requires. But I've seen many deployments that only exist because we offer security. I doubt Heron has that.

We've also seen an uptick in community and developer involvement, which means a likely increase in committers, which likely means a faster turnaround for patch reviews, which means a tighter release cycle for new features, which means we will be moving faster. This is healthy for an Apache project.

And the inclusion of the JStorm team will only make that more so.

I feel we are headed in the right direction, and there are good things to come.

-Taylor


> On Nov 19, 2015, at 6:38 PM, 임정택 <ka...@gmail.com> wrote:
> 
> Sorry Longda, but I can't help telling that I also disagree about changing codebase.
> 
> Feature matrix shows us how far Apache Storm and JStorm are diverged, just in point of feature's view. We can't be safe to change although feature matrixes are identical, because feature matrix doesn't contain the details.
> 
> I mean, users could be scared when expected behaviors are not in place although they're small. User experience is the one of the most important part of the project, and if UX changes are huge, barrier for upgrading their Storm cluster to 2.0 is not far easier than migrating to Heron. It should be the worst scenario I can imagine after merging.
> 
> The safest way to merge is applying JStorm's great features to Apache Storm.
> I think porting language of Apache Storm to Java is not tightly related to merge JStorm. I agree that merging becomes a trigger, but Apache Storm itself can port to other languages like Java, Scala, or something else which are more popular than Clojure.
> 
> And I'm also not scared of Flink, Heron, Spark, etc.
> It doesn't mean other projects are not greater than Storm. I'm just saying each project has its own strengths.
> For example, all conferences are saying about Spark, and as one of users of Spark, Spark is really great. If you are a little bit familiar with Scala, you can just apply Scala-like functional methods to RDD. Really easy to use.
> But it doesn't mean that Spark can replace Storm in all kind of use cases. Recently I've seen some articles that why Storm is more preferred in realtime streaming processing.
> 
> Competition should give us a positive motivation. I hope that our roadmap isn't focused to defeat competitors, but is focused to present great features, better performance, and better UX to Storm community. It's not commercial product, it's open source project!
> 
> tl;dr. Please don't change codebase unless we plan to release a brand new project. It breaks UX completely which could make users leave.
> 
> I'm also open to other opinions as well.
> 
> Best,
> Jungtaek Lim (HeartSaVioR)
> 
> 
> 2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:
>> I disagree completely.  You claim that JStorm is compatible with storm 1.0.  I don't believe that it is 100% compatible.  There has been more than 2 years of software development happening on both sides.  Security was not done in a day, and porting it over to JStorm is not going to happen quickly, and because of the major architectural changes between storm and JStorm I believe we would have to make some serious enhancements to fully support a secure TopologyMaster, but I need to look into it more.  The blob store is another piece of code that has taken a very long time to develop.  There are numerous others.  The big features are not the ones that make me nervous, because we can plan for them; it is the hundreds of small JIRA and features that will result in minor incompatibilities.  If we start with storm itself, and follow the same process that we have been doing up until now, if there is a reason to stop the port, add in an important feature, and do a release, we can.  We will know that we have compatibility, whereas starting with JStorm we know from the start that we do not without adding features X, Y, Z, ....
>> 
>> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams, etc...  We just did some major performance enhancements that will be released with STORM 1.0.  We now have up to 6x the throughput that we had before with minimal changes to the latency (20 ms vs 5 ms).  We have automatic back-pressure so if someone was running with acking enabled just for flow control they can now process close to 16x the throughput they could before with the same hardware.  This puts our throughput very much on par with flink and Spark, but with a much lower latency compared to either of them.  Plus from what I have heard Flink is still calling the streaming API beta, and their storm API compatibility is very rudimentary.  They are also going to have more and more problems maintaining compatibility as we add in new features and functionality. 
>> 
>> Spark only really works well when it is running with several seconds of latency. Not everyone needs sub-second processing, but when your platform is completely unable to handle it, it locks you out of a lot of use cases.  Their throughput is decent and can scale very high when you are willing to tolerate similarly very high latencies.
>> Who knows about Heron until they actually release their code, but it is missing lots of critical features, and the one they touted, better performance, is a moot point with storm 1.0.  The only thing we really are lacking is advertising, we don't have a big company really pushing storm and getting it in the news all the time (Sorry Hortonworks, but I really have not seen much about it in the news).  I am trying to do more, but there is only so much I can do.
>> Longda I very much agree with you about moving quickly to make the transition, but I do not believe in any way that starting with JStorm is going to reduce that transition time.
>> My proposal is to give everyone about 2 weeks to finish merging new features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call for a release.  At the same time development work to port storm to java begins.  You said it took 4 developers 1 year to port storm to java the first time for JStorm.  We have 14+ active developers and over one hundred contributors not including those from the JStorm community.  If numbers scale linearly, I know they don't completely, we should be able to do a complete port with no JStorm reference in around 100 days.  With a copy and paste for a lot of this from the JStorm codebase, I would expect to be able to do it in 1 month of development, possibly less if the JStorm community can really help out too.  So by January we should be ready to begin pulling in features from JStorm that make sense.  Looking at the feature matrix in https://github.com/apache/storm/pull/877 there are a few potentially big improvements that we would want to pull in, but they require architectural changes in some cases that I don't want to just do lightly.  I would propose that once the code has been ported to java we reopen for all new features in parallel with the JStorm feature migration, but I am open to others' opinions as well.
>>  - Bobby
>> 
>> 
>>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <zh...@alibaba-inc.com> wrote:
>> 
>> 
>>  Sorry for changing the Subject.
>> 
>> I am +1 for releasing Storm 2.0 with java core, which is merged with JStorm.
>> 
>> I think the change of this release will be the biggest one in history. It will probably take a long time to develop. At the same time, Heron is going to open source, and the latest release of Flink provides the compatibility to Storm’s API. These might be the threat to Storm. So I suggest we start the development of Storm 2.0 as quickly as possible. In order to accelerate the development cycle, I proposed to take JStorm 2.1.0 core and UI as the base version since this version is stable and compatible with API of Storm 1.0. Please refer to the phases below for the detailed merging plan.
>> 
>> Note: We provide a demo of JStorm’s web UI. Please refer to storm.taobao.org . I think JStorm will give a totally different view to you.
>> 
>> I would like to share the experience of initial development of JStorm (Migrate from clojure core to java core). 
>> Our team(4 developers) have spent almost one year to finish the migration. We took 4 months to release the first JStorm version, and 6 months to make JStorm stable. During this period, we tried to switch more than online 100 applications with different scenarios from Storm to JStorm, and many bugs were fixed. Then more and more applications were switched to JStorm in Alibaba.
>> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and 2000+ applications are running on them. The JStorm Clusters here can handle 1.5 PB/2 Trillion messages per day. The use cases are not only in BigData field but also in many other online scenarios.
>> Besides it, we have experienced the November 11th Shopping Festival of Alibaba for last three years. At that day, the computation in our cluster increased several times than usual. All applications worked well during the peak time. I can say the stability of JStorm is no doubt today. Actually, besides Alibaba, the most powerful Chinese IT company are also using JStorm.
>> 
>> 
>> Phase 1:
>>  
>> Define the target of Storm 2.0. List the requirement of Storm 2.0
>> 1. Open a new Umbrella Jira (https://issues.apache.org/jira/browse/STORM-717)
>> 2. Create one 2.0 branch, 
>> 2.1 Copy modules from JStorm, one module from one module
>> 2.2 The sequence is extern modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
>> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
>> 3.1 Discuss solution for each difference(jira)
>> 3.2 Once the solution is finalized, we can start the merging. (Some issues could be start concurrently. It depends on the discussion.)
>> 
>> This phase mainly tries to define the target and finalize the solution. Hopefully it can be finished in 2 months (before 2016/1/31).
>> 
>> 
>> Phase 2:
>> Release Storm 2.0 beta
>> 1. Based on phase 1's discussion, finish all features of Storm 2.0
>> 2. Integrate all modules, make the simplest storm example can run on the system.
>> 3. Test with all example and modules in Storm code base.
>> 4. All daily test can be passed.
>>  
>> Hopefully this phase could be finished in 2 months (before 2016/3/31).
>> 
>> 
>> Phase 3:
>> Persuade some user to have a try.
>> Alibaba will try to run some online applications on the beta version
>> 
>> Hopefully this phase could be finished in 1 month (before 2016/4/30).
>> 
>> 
>> Any comments are welcome.
>> 
>> 
>> Thanks
>> Longda
>> ------------------------------------------------------------------
>> From: P. Taylor Goetz <pt...@gmail.com>
>> Send Time: 2015-11-19 (Thursday) 06:23
>> To: dev <de...@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
>> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>> All I have at this point is a placeholder wiki entry [1], and a lot of local notes that likely would only make sense to me.
>> 
>> Let me know your wiki username and I’ll give you permissions. The same goes for anyone else who wants to help.
>> 
>> -Taylor
>> 
>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
>> 
>> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
>> > 
>> > Taylor and others I was hoping to get started filing JIRA and planning on how we are going to do the java migration + JStorm merger.  Is anyone else starting to do this?  If not would anyone object to me starting on it? - Bobby
>> > 
>> > 
>> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <pt...@gmail.com> wrote:
>> > 
>> > 
>> > Thanks for putting this together Basti, that comparison helps a lot.
>> > 
>> > And thanks Bobby for converting it into markdown. I was going to just attach the spreadsheet to JIRA, but markdown is a much better solution.
>> > 
>> > -Taylor
>> > 
>> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
>> >> 
>> >> I translated the excel spreadsheet into a markdown file and put up a pull request for it.
>> >> https://github.com/apache/storm/pull/877
>> >> I did a few edits to it to make it work with Markdown, and to add in a few of my own comments.  I also put in a field for JIRAs to be able to track the migration.
>> >> Overall I think your evaluation was very good.  We have a fair amount of work ahead of us to decide what version of various features we want to go forward with.
>> >>   - Bobby
>> >> 
>> >> 
>> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
>> >> 
>> >> 
>> >> Hi Bobby & Jungtaek,
>> >> 
>> >> Thanks for your reply.
>> >> I totally agree that compatibility is the most important thing. Actually, JStorm has been compatible with the user API of Storm.
>> >> As you mentioned below, we indeed still have some features different between Storm and JStorm. I have tried to list them (minor update or improvements are not included).
>> >> Please refer to attachment for details. If any missing, please help to point out. (The current working features are probably missing here.)
>> >> Just have a look at these differences. For the missing features in JStorm, I did not see any obstacle which will block the merge to JStorm.
>> >> For the features which has different solution between Storm and JStorm, we can evaluate the solution one by one to decision which one is appropriate.
>> >> After the finalization of evaluation, I think JStorm team can take the merging job and publish a stable release in 2 months.
>> >> But anyway, the detailed implementation for these features with different solution is transparent to user. So, from user's point of view, there is not any compatibility problem.
>> >> 
>> >> Besides compatibility, by our experience, stability is also important and is not an easy job. 4 people in JStorm team took almost one year to finish the porting from "clojure core" to "java core", and to make it stable. Of course, we have many devs in community to make the porting job faster. But it still needs a long time to run many online complex topologies to find bugs and fix them. So, that is the reason why I proposed to do merging and build on a stable "java core".
>> >> 
>> >> -----Original Message-----
>> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
>> >> Sent: Wednesday, November 11, 2015 10:51 PM
>> >> To: dev@storm.apache.org
>> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>> >> 
>> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.  Migrating the APIs to org.apache.storm is a big non-backwards compatible move, and a major version bump to 2.x seems like a good move there.
>> >> +1 for the release plan
>> >> 
>> >> I would like the move for user facing APIs to org.apache to be one of the last things we do.  Translating clojure code into java and moving it to org.apache I am not too concerned about.
>> >> 
>> >> Basti,
>> >> We have two code bases that have diverged significantly from one another in terms of functionality.  The storm code now or soon will have A Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware Scheduling, a distributed cache like API, log searching, security, massive performance improvements, shaded almost all of our dependencies, a REST API for programmatically accessing everything on the UI, and I am sure I am missing a few other things.  JStorm also has many changes including cgroup isolation, restructured zookeeper layout, classpath isolation, and more too.
>> >> No matter what we do it will be a large effort to port changes from one code base to another, and from clojure to java.  I proposed this initially because it can be broken up into incremental changes.  It may take a little longer, but we will always have a working codebase that is testable and compatible with the current storm release, at least until we move the user facing APIs to be under org.apache.  This lets the community continue to build and test the master branch and report problems that they find, which is incredibly valuable.  I personally don't think it will be much easier, especially if we are intent on always maintaining compatibility with storm. - Bobby
>> >> 
>> >> 
>> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
>> >> 
>> >> 
>> >> Hi Taylor,
>> >> 
>> >> 
>> >> 
>> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
>> >> 
>> >> Do you mean community plan to create a fresh new “java core” based on current “clojure core” firstly, and then migrate the features from JStorm?
>> >> 
>> >> If so, it confused me.  It is really a huge job which might require a long developing time to make it stable, while JStorm is already a stable version.
>> >> 
>> >> The release planned for after Nov 11th has already been running online stably for several months in Alibaba.
>> >> 
>> >> Besides this, there are many valuable internal requirements in Alibaba, so the fast evolution of JStorm is foreseeable in the next few months.
>> >> 
>> >> If the “java core” is totally fresh new, it might bring many problems for the coming merge.
>> >> 
>> >> So, from the point of this view,  I think it is much better and easier to migrate the features of “clojure core” basing on JStorm for the “java core”.
>> >> 
>> >> Please correct me, if any misunderstanding.
>> >> 
>> >> 
>> >> 
>> >> Regards
>> >> 
>> >> Basti
>> >> 
>> >> 
>> >> 
>> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
>> >> Sent: 2015-11-11 5:32
>> >> To: dev@storm.apache.org
>> >> Subject: [DISCUSS] Plan for Merging JStorm Code
>> >> 
>> >> 
>> >> 
>> >> Based on a number of discussions regarding merging the JStorm code, I’ve tried to distill the ideas presented and inserted some of my own. The result is below.
>> >> 
>> >> 
>> >> 
>> >> I’ve divided the plan into three phases, though they are not necessarily sequential — obviously some tasks can take place in parallel.
>> >> 
>> >> 
>> >> 
>> >> None of this is set in stone, just presented for discussion. Any and all comments are welcome.
>> >> 
>> >> 
>> >> 
>> >> -------
>> >> 
>> >> 
>> >> 
>> >> Phase 1 - Plan for 0.11.x Release
>> >> 
>> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
>> >> 
>> >> 2. Announce feature-freeze for 0.11.x
>> >> 
>> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
>> >> 
>> >> 4. Release 0.11.0 (or whatever version # we want to use)
>> >> 
>> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
>> >> 
>> >> 1. Determine/document unique features in JStorm (e.g. classpath isolation, cgroups, etc.) and create JIRA for migrating the feature.
>> >> 
>> >> 2. Create JIRA for migrating each clojure component (or logical group of components) to Java. Assumes tests will be ported as well.
>> >> 
>> >> 3. Discuss/establish style guide for Java coding conventions. Consider using Oracle’s or Google’s Java conventions as a base — they are both pretty solid.
>> >> 
>> >> 4. align package names (e.g backtype.storm --> org.apache.storm / com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> Phase 3 - Migrate Clojure --> Java
>> >> 
>> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever possible (core functionality only, features distinct to JStorm migrated separately).
>> >> 
>> >> 2. Port JStorm-specific features.
>> >> 
>> >> 3. Begin releasing preview/beta versions.
>> >> 
>> >> 4. Code cleanup (across the board) and refactoring using established coding conventions, and leveraging PMD/Checkstyle reports for reference. (Note: good opportunity for new contributors.)
>> >> 
>> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift feature freeze.
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> Notes:
>> >> 
>> >> We should consider bumping up to version 1.0 sometime soon and then switching to semantic versioning [3] from then on.
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> With the exception of package name alignment, the "jstorm-import" branch will largely be read-only throughout the process.
>> >> 
>> >> 
>> >> 
>> >> During migration, it's probably easiest to operate with two local clones of the Apache Storm repo: one for working (i.e. checked out to working branch) and one for reference/copying (i.e. checked out to "jstorm-import").
>> >> 
>> >> 
>> >> 
>> >> Feature-freeze probably only needs to be enforced against core functionality. Components under "external" can likely be exempt, but we should figure out a process for accepting and releasing new features during the migration.
>> >> 
>> >> 
>> >> 
>> >> Performance testing should be continuous throughout the process. Since we don't really have ASF infrastructure for performance testing, we will need a volunteer(s) to host and run the performance tests. Performance test results can be posted to the wiki [2]. It would probably be a good idea to establish a baseline with the 0.10.0 release.
>> >> 
>> >> 
>> >> 
>> >> I’ve attached an analysis document Sean Zhong put together a while back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3 release but is still relevant and has a lot of good information.
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> [1] https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
>> >> 
>> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
>> >> 
>> >> [3] http://semver.org
>> >> 
>> >> [4] https://issues.apache.org/jira/browse/STORM-717
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> -Taylor
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> > 
>> > 
> 
> 
> 
> -- 
> Name : 임 정택
> Blog : http://www.heartsavior.net / http://dev.heartsavior.net
> Twitter : http://twitter.com/heartsavior
> LinkedIn : http://www.linkedin.com/in/heartsavior

Re: [DISCUSS] Storm 2.0 plan

Posted by 임정택 <ka...@gmail.com>.
Sorry Longda, but I have to say that I also disagree with changing the
codebase.

The feature matrix shows how far Apache Storm and JStorm have diverged, and
that is only from the feature point of view. Even if the feature matrices
were identical, the change would not be safe, because a feature matrix
doesn't contain the details.

I mean, users could be scared off when expected behaviors are missing, even
small ones. User experience is one of the most important parts of the
project, and if the UX changes are huge, upgrading a Storm cluster to 2.0 is
not much easier than migrating to Heron. That would be the worst scenario I
can imagine after merging.

The safest way to merge is to apply JStorm's great features to Apache Storm.
I think porting Apache Storm to Java is not tightly related to merging
JStorm. I agree that the merge is a trigger, but Apache Storm itself could
be ported to another language such as Java, Scala, or something else that is
more popular than Clojure.

And I'm also not scared of Flink, Heron, Spark, etc.
That doesn't mean other projects are not greater than Storm. I'm just saying
each project has its own strengths.
For example, every conference is talking about Spark, and as a Spark user I
think Spark is really great. If you are a little bit familiar with Scala,
you can just apply Scala-like functional methods to an RDD. Really easy to
use.
But that doesn't mean Spark can replace Storm in all kinds of use cases.
Recently I've seen some articles on why Storm is preferred for realtime
stream processing.

Competition should give us positive motivation. I hope our roadmap is
focused not on defeating competitors, but on delivering great features,
better performance, and better UX to the Storm community. It's not a
commercial product, it's an open source project!

tl;dr. Please don't change the codebase unless we plan to release a brand new
project. It breaks UX completely, which could make users leave.

I'm open to other opinions as well.

Best,
Jungtaek Lim (HeartSaVioR)


2015-11-20 0:00 GMT+09:00 Bobby Evans <ev...@yahoo-inc.com.invalid>:

> I disagree completely.  You claim that JStorm is compatible with storm
> 1.0.  I don't believe that it is 100% compatible.  There has been more than
> 2 years of software development happening on both sides.  Security was not
> done in a day, and porting it over to JStorm is not going to happen
> quickly, and because of the major architectural changes between storm and
> JStorm I believe we would have to make some serious enhancements to fully
> support a secure TopologyMaster, but I need to look into it more.  The blob
> store is another piece of code that has taken a very long time to develop.
> There are numerous others.  The big features are not the ones that make me
> nervous because we can plan for them, it is the hundreds of small JIRA and
> features that will result in minor incompatibilities.  If we start with
> storm itself, and follow the same process that we have been doing up until
> now, if there is a reason to stop the port, add in an important feature, and
> do a release, we can.  We will know that we have compatibility, whereas
> starting with JStorm we know from the start that we do not, without adding
> features X, Y, Z, ....
>
> I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams,
> etc...  We just did some major performance enhancements that will be
> released with STORM 1.0.  We now have up to 6x the throughput that we had
> before with minimal changes to the latency (20 ms vs 5 ms).  We have
> automatic back-pressure so if someone was running with acking enabled just
> for flow control they can now process close to 16x the throughput they
> could before with the same hardware.  This puts our throughput very much on
> par with flink and Spark, but with a much lower latency compared to either
> of them.  Plus from what I have heard Flink is still calling the streaming
> API beta, and their storm API compatibility is very rudimentary.  They are
> also going to have more and more problems maintaining compatibility as we
> add in new features and functionality.
>
> Spark only really works well when it is running with several seconds of
> latency. Not everyone needs sub-second processing, but when your platform
> is completely unable to handle it, it locks you out of a lot of use cases.
> Their throughput is decent and can scale very high when you are willing to
> tolerate similarly very high latencies.
> Who knows about Heron until they actually release their code, but it is
> missing lots of critical features, and the one they touted, better
> performance, is a moot point with storm 1.0.  The only thing we really are
> lacking is advertising, we don't have a big company really pushing storm
> and getting it in the news all the time (Sorry Hortonworks, but I really
> have not seen much about it in the news).  I am trying to do more, but
> there is only so much I can do.
> Longda I very much agree with you about moving quickly to make the
> transition, but I do not believe in any way that starting with JStorm is
> going to reduce that transition time.
> My proposal is to give everyone about 2 weeks to finish merging new
> features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call
> for a release.  At the same time development work to port storm to java
> begins.  You said it took 4 developers 1 year to port storm to java the
> first time for JStorm.  We have 14+ active developers and over one hundred
> contributors not including those from the JStorm community.  If numbers
> scale linearly, I know they don't completely, we should be able to do a
> complete port with no JStorm reference in around 100 days.  With a copy and
> paste for a lot of this from the JStorm codebase, I would expect to be able
> to do it in 1 month of development, possibly less if the JStorm community
> can really help out too.  So by January we should be ready to begin pulling
> in features from JStorm that make sense.  Looking at the feature matrix in
> https://github.com/apache/storm/pull/877 there are a few potentially big
> improvements that we would want to pull in, but they require architectural
> changes in some cases that I don't want to just do lightly.  I would
> propose that once the code has been ported to java we reopen for all new
> features in parallel with the JStorm feature migration, but I am open to
> others' opinions as well.
>  - Bobby
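
As a rough check of the linear-scaling estimate quoted above, the sketch below redoes the arithmetic. It assumes "1 year" means 365 days and takes 14 as the team size (the lower bound of "14+"). The function name `linear_port_estimate` is only illustrative, and the result ignores the coordination overhead that the estimate itself admits does not scale linearly.

```python
# Back-of-the-envelope check of the linear-scaling estimate quoted above.
JSTORM_DEVELOPERS = 4    # from the thread: "4 developers"
JSTORM_PORT_DAYS = 365   # assumption: "1 year" taken as 365 calendar days
STORM_DEVELOPERS = 14    # assumption: lower bound of "14+ active developers"


def linear_port_estimate(ref_devs, ref_days, team_devs):
    """Days needed if the total effort (developer-days) splits evenly across the team."""
    developer_days = ref_devs * ref_days
    return developer_days / team_devs


estimate = linear_port_estimate(JSTORM_DEVELOPERS, JSTORM_PORT_DAYS, STORM_DEVELOPERS)
print(f"{estimate:.0f} days")  # prints "104 days"
```

Under these assumptions the result lands close to the "around 100 days" figure; a larger effective team would shorten it and any non-linear overhead would lengthen it.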
>
>
>     On Thursday, November 19, 2015 12:14 AM, Longda Feng <zhongyan.feng@alibaba-inc.com> wrote:
>
>
>  Sorry for changing the Subject.
>
>
> I am +1 for releasing Storm 2.0 with java core, which is merged with JStorm.
>
>
> I think the change of this release will be the biggest one in history. It will probably take a long time to develop. At the same time, Heron is going to open source, and the latest release of Flink provides the compatibility to Storm’s API. These might be the threat to Storm. So I suggest we start the development of Storm 2.0 as quickly as possible. In order to accelerate the development cycle, I proposed to take JStorm 2.1.0 core and UI as the base version since this version is stable and compatible with API of Storm 1.0. Please refer to the phases below for the detailed merging plan.
>
> Note: We provide a demo of JStorm’s web UI. Please refer to storm.taobao.org.
> I think JStorm will give a totally different view to you.
>
>
> I would like to share the experience of initial development of JStorm (Migrate from clojure core to java core).
>
> Our team(4 developers) have spent almost one year to finish the migration. We took 4 months to release the first JStorm version, and 6 months to make JStorm stable. During this period, we tried to switch more than online 100 applications with different scenarios from Storm to JStorm, and many bugs were fixed. Then more and more applications were switched to JStorm in Alibaba.
>
> Currently, there are 7000+ nodes of JStorm clusters in Alibaba and 2000+ applications are running on them. The JStorm Clusters here can handle 1.5 PB/2 Trillion messages per day. The use cases are not only in BigData field but also in many other online scenarios.
>
> Besides it, we have experienced the November 11th Shopping Festival of Alibaba for last three years. At that day, the computation in our cluster increased several times than usual. All applications worked well during the peak time. I can say the stability of JStorm is no doubt today. Actually, besides Alibaba, the most powerful Chinese IT company are also using JStorm.
>
>
> Phase 1:
>
> Define the target of Storm 2.0. List the requirement of Storm 2.0
> 1. Open a new Umbrella Jira (https://issues.apache.org/jira/browse/STORM-717)
> 2. Create one 2.0 branch,
> 2.1 Copy modules from JStorm, one module from one module
>
> 2.2 The sequence is extern modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others
> 3. Create jira for all differences between JStorm 2.1.0 and Storm 1.0
> 3.1 Discuss solution for each difference(jira)
>
> 3.2 Once the solution is finalized, we can start the merging. (Some issues could be start concurrently. It depends on the discussion.)
>
>
> This phase mainly tries to define the target and finalize the solution. Hopefully it can be finished in 2 months (before 2016/1/31).
>
>
> Phase 2:
> Release Storm 2.0 beta
> 1. Based on phase 1's discussion, finish all features of Storm 2.0
>
> 2. Integrate all modules, make the simplest storm example can run on the system.
> 3. Test with all example and modules in Storm code base.
> 4. All daily test can be passed.
>
> Hopefully this phase could be finished in 2 months (before 2016/3/31).
>
>
> Phase 3:
> Persuade some user to have a try.
> Alibaba will try to run some online applications on the beta version
>
> Hopefully this phase could be finished in 1 month (before 2016/4/30).
>
>
> Any comments are welcome.
>
>
> Thanks
> Longda
> ------------------------------------------------------------------
> From: P. Taylor Goetz <pt...@gmail.com>
> Send Time: 2015-11-19 (Thursday) 06:23
> To: dev <dev@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>
> All I have at this point is a placeholder wiki entry [1], and a lot of local notes that likely would only make sense to me.
>
>
> Let me know your wiki username and I’ll give you permissions. The same goes for anyone else who wants to help.
>
> -Taylor
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
>
> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <evans@yahoo-inc.com.INVALID> wrote:
> >
>
> > Taylor and others I was hoping to get started filing JIRA and planning on how we are going to do the java migration + JStorm merger.  Is anyone else starting to do this?  If not would anyone object to me starting on it? - Bobby
> >
> >
> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <ptgoetz@gmail.com> wrote:
> >
> >
> > Thanks for putting this together Basti, that comparison helps a lot.
> >
>
> > And thanks Bobby for converting it into markdown. I was going to just attach the spreadsheet to JIRA, but markdown is a much better solution.
> >
> > -Taylor
> >
> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans <evans@yahoo-inc.com.INVALID> wrote:
> >>
>
> >> I translated the excel spreadsheet into a markdown file and put up a pull request for it.
> >> https://github.com/apache/storm/pull/877
>
> >> I did a few edits to it to make it work with Markdown, and to add in a few of my own comments.  I also put in a field for JIRAs to be able to track the migration.
>
> >> Overall I think your evaluation was very good.  We have a fair amount of work ahead of us to decide what version of various features we want to go forward with.
> >>   - Bobby
> >>
> >>
> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <basti.lj@alibaba-inc.com> wrote:
> >>
> >>
> >> Hi Bobby & Jungtaek,
> >>
> >> Thanks for your reply.
>
> >> I totally agree that compatibility is the most important thing. Actually, JStorm has been compatible with the user API of Storm.
>
> >> As you mentioned below, we indeed still have some features different between Storm and JStorm. I have tried to list them (minor update or improvements are not included).
>
> >> Please refer to attachment for details. If any missing, please help to point out. (The current working features are probably missing here.)
>
> >> Just have a look at these differences. For the missing features in JStorm, I did not see any obstacle which will block the merge to JStorm.
>
> >> For the features which has different solution between Storm and JStorm, we can evaluate the solution one by one to decision which one is appropriate.
>
> >> After the finalization of evaluation, I think JStorm team can take the merging job and publish a stable release in 2 months.
>
> >> But anyway, the detailed implementation for these features with different solution is transparent to user. So, from user's point of view, there is not any compatibility problem.
> >>
>
> >> Besides compatibility, by our experience, stability is also important and is not an easy job. 4 people in JStorm team took almost one year to finish the porting from "clojure core" to "java core", and to make it stable. Of course, we have many devs in community to make the porting job faster. But it still needs a long time to run many online complex topologies to find bugs and fix them. So, that is the reason why I proposed to do merging and build on a stable "java core".
> >>
> >> -----Original Message-----
> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> >> Sent: Wednesday, November 11, 2015 10:51 PM
> >> To: dev@storm.apache.org
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >>
>
> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.  Migrating the APIs to org.apache.storm is a big non-backwards compatible move, and a major version bump to 2.x seems like a good move there.
> >> +1 for the release plan
> >>
>
> >> I would like the move for user facing APIs to org.apache to be one of the last things we do.  Translating clojure code into java and moving it to org.apache I am not too concerned about.
> >>
> >> Basti,
>
> >> We have two code bases that have diverged significantly from one another in terms of functionality.  The storm code now has, or soon will have, a Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware Scheduling, a distributed cache like API, log searching, security, massive performance improvements, shading of almost all of our dependencies, a REST API for programmatically accessing everything on the UI, and I am sure I am missing a few other things.  JStorm also has many changes including cgroup isolation, restructured zookeeper layout, classpath isolation, and more too.
>
> >> No matter what we do it will be a large effort to port changes from one code base to another, and from clojure to java.  I proposed this initially because it can be broken up into incremental changes.  It may take a little longer, but we will always have a working codebase that is testable and compatible with the current storm release, at least until we move the user facing APIs to be under org.apache.  This lets the community continue to build and test the master branch and report problems that they find, which is incredibly valuable.  I personally don't think it will be much easier, especially if we are intent on always maintaining compatibility with storm. - Bobby
> >>
> >>
> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <basti.lj@alibaba-inc.com> wrote:
> >>
> >>
> >> Hi Taylor,
> >>
> >>
> >>
> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> >>
> >> Do you mean the community plans to first create a brand-new “java core” based on the current “clojure core”, and then migrate the features from JStorm?
> >>
> >> If so, that confuses me.  It is a really huge job which might require a long development time to become stable, while JStorm is already a stable version.
> >>
> >> The release planned for after Nov 11th has already been running stably online for several months at Alibaba.
> >>
> >> Besides this, there are many valuable internal requirements at Alibaba, so fast evolution of JStorm is foreseeable in the next few months.
> >>
> >> If the “java core” is brand new, it might bring many problems for the coming merge.
> >>
> >> So, from this point of view, I think it is much better and easier to migrate the features of “clojure core” onto JStorm as the base for the “java core”.
> >>
> >> Please correct me if I have misunderstood anything.
> >>
> >>
> >>
> >> Regards
> >>
> >> Basti
> >>
> >>
> >>
> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> >> Sent: November 11, 2015 5:32
> >> To: dev@storm.apache.org
> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> >>
> >>
> >>
>
> >> Based on a number of discussions regarding merging the JStorm code, I’ve tried to distill the ideas presented and inserted some of my own. The result is below.
> >>
> >>
> >>
>
> >> I’ve divided the plan into three phases, though they are not necessarily sequential — obviously some tasks can take place in parallel.
> >>
> >>
> >>
>
> >> None of this is set in stone, just presented for discussion. Any and all comments are welcome.
> >>
> >>
> >>
> >> -------
> >>
> >>
> >>
> >> Phase 1 - Plan for 0.11.x Release
> >>
> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> >>
> >> 2. Announce feature-freeze for 0.11.x
> >>
> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> >>
> >> 4. Release 0.11.0 (or whatever version # we want to use)
> >>
> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> >>
> >>
> >>
> >>
> >>
> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> >>
>
> >> 1. Determine/document unique features in JStorm (e.g. classpath isolation, cgroups, etc.) and create JIRA for migrating the feature.
> >>
>
> >> 2. Create JIRA for migrating each clojure component (or logical group of components) to Java. Assumes tests will be ported as well.
> >>
>
> >> 3. Discuss/establish style guide for Java coding conventions. Consider using Oracle’s or Google’s Java conventions as a base — they are both pretty solid.
> >>
>
> >> 4. Align package names (e.g. backtype.storm --> org.apache.storm / com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> >>
> >>
> >>
> >>
> >>
> >> Phase 3 - Migrate Clojure --> Java
> >>
>
> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever possible (core functionality only, features distinct to JStorm migrated separately).
> >>
> >> 2. Port JStorm-specific features.
> >>
> >> 3. Begin releasing preview/beta versions.
> >>
>
> >> 4. Code cleanup (across the board) and refactoring using established coding conventions, and leveraging PMD/Checkstyle reports for reference. (Note: good opportunity for new contributors.)
> >>
>
> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift feature freeze.
> >>
> >>
> >>
> >>
> >>
> >> Notes:
> >>
>
> >> We should consider bumping up to version 1.0 sometime soon and then switching to semantic versioning [3] from then on.
> >>
> >>
> >>
> >>
> >>
>
> >> With the exception of package name alignment, the "jstorm-import" branch will largely be read-only throughout the process.
> >>
> >>
> >>
>
> >> During migration, it's probably easiest to operate with two local clones of the Apache Storm repo: one for working (i.e. checked out to working branch) and one for reference/copying (i.e. checked out to "jstorm-import").
> >>
> >>
> >>
>
> >> Feature-freeze probably only needs to be enforced against core functionality. Components under "external" can likely be exempt, but we should figure out a process for accepting and releasing new features during the migration.
> >>
> >>
> >>
>
> >> Performance testing should be continuous throughout the process. Since we don't really have ASF infrastructure for performance testing, we will need a volunteer(s) to host and run the performance tests. Performance test results can be posted to the wiki [2]. It would probably be a good idea to establish a baseline with the 0.10.0 release.
> >>
> >>
> >>
>
> >> I’ve attached an analysis document Sean Zhong put together a while back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3 release but is still relevant and has a lot of good information.
> >>
> >>
> >>
> >>
> >>
> >> [1] https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> >>
> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> >>
> >> [3] http://semver.org
> >>
> >> [4] https://issues.apache.org/jira/browse/STORM-717
> >>
> >>
> >>
> >>
> >>
> >> -Taylor
> >>
> >>
> >>
> >>
> >>
> >>
> >
> >
>
>
>
>
>
>



-- 
Name : 임 정택
Blog : http://www.heartsavior.net / http://dev.heartsavior.net
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

Re: [DISCUSS] Storm 2.0 plan

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
I disagree completely.  You claim that JStorm is compatible with Storm 1.0.  I don't believe that it is 100% compatible.  There have been more than 2 years of software development happening on both sides.  Security was not done in a day, and porting it over to JStorm is not going to happen quickly.  Because of the major architectural changes between Storm and JStorm, I believe we would have to make some serious enhancements to fully support a secure TopologyMaster, but I need to look into it more.  The blob store is another piece of code that has taken a very long time to develop.  There are numerous others.  The big features are not the ones that make me nervous, because we can plan for them; it is the hundreds of small JIRAs and features that will result in minor incompatibilities.  If we start with Storm itself and follow the same process that we have been following up until now, then if there is a reason to stop the port, add in an important feature, and do a release, we can.  We will know that we have compatibility, whereas if we start with JStorm we know from the start that we do not, not without adding features X, Y, Z, ....

I personally am not scared of Flink, Heron, Spark, Apex, IBM Streams, etc...  We just did some major performance enhancements that will be released with Storm 1.0.  We now have up to 6x the throughput that we had before with minimal changes to the latency (20 ms vs 5 ms).  We have automatic back-pressure, so if someone was running with acking enabled just for flow control they can now process close to 16x the throughput they could before with the same hardware.  This puts our throughput very much on par with Flink and Spark, but with a much lower latency compared to either of them.  Plus, from what I have heard, Flink is still calling the streaming API beta, and their Storm API compatibility is very rudimentary.  They are also going to have more and more problems maintaining compatibility as we add in new features and functionality.

Spark only really works well when it is running with several seconds of latency. Not everyone needs sub-second processing, but when your platform is completely unable to handle it, it locks you out of a lot of use cases.  Their throughput is decent and can scale very high when you are willing to tolerate similarly very high latencies.

Who knows about Heron until they actually release their code, but it is missing lots of critical features, and the one they touted, better performance, is a moot point with Storm 1.0.  The only thing we really are lacking is advertising; we don't have a big company really pushing Storm and getting it in the news all the time (Sorry Hortonworks, but I really have not seen much about it in the news).  I am trying to do more, but there is only so much I can do.

Longda, I very much agree with you about moving quickly to make the transition, but I do not believe in any way that starting with JStorm is going to reduce that transition time.

My proposal is to give everyone about 2 weeks to finish merging new features into Storm 1.0.  On Dec 1st we create a storm-1.x branch and call for a release.  At the same time, development work to port Storm to Java begins.  You said it took 4 developers 1 year to port Storm to Java the first time for JStorm.  We have 14+ active developers and over one hundred contributors, not including those from the JStorm community.  If numbers scale linearly (I know they don't completely), we should be able to do a complete port with no JStorm reference in around 100 days.  With a copy and paste for a lot of this from the JStorm codebase, I would expect to be able to do it in 1 month of development, possibly less if the JStorm community can really help out too.  So by January we should be ready to begin pulling in features from JStorm that make sense.  Looking at the feature matrix in https://github.com/apache/storm/pull/877 there are a few potentially big improvements that we would want to pull in, but they require architectural changes in some cases that I don't want to do lightly.  I would propose that once the code has been ported to Java we reopen for all new features in parallel with the JStorm feature migration, but I am open to others' opinions as well.
 - Bobby 


    On Thursday, November 19, 2015 12:14 AM, Longda Feng <zh...@alibaba-inc.com> wrote:
 

 Sorry for changing the Subject.

I am +1 for releasing Storm 2.0 with java core, which is merged with JStorm.

I think the change in this release will be the biggest one in history. It will probably take a long time to develop. At the same time, Heron is going to be open sourced, and the latest release of Flink provides compatibility with Storm’s API. These might be threats to Storm. So I suggest we start the development of Storm 2.0 as quickly as possible. In order to accelerate the development cycle, I propose to take the JStorm 2.1.0 core and UI as the base version, since this version is stable and compatible with the API of Storm 1.0. Please refer to the phases below for the detailed merging plan.

Note: We provide a demo of JStorm’s web UI. Please refer to storm.taobao.org. I think JStorm will give you a totally different view.

I would like to share our experience from the initial development of JStorm (migrating from clojure core to java core).
Our team (4 developers) spent almost one year finishing the migration. We took 4 months to release the first JStorm version, and 6 months to make JStorm stable. During this period, we tried to switch more than 100 online applications with different scenarios from Storm to JStorm, and many bugs were fixed. Then more and more applications were switched to JStorm at Alibaba.
Currently, there are 7000+ nodes of JStorm clusters at Alibaba, with 2000+ applications running on them. The JStorm clusters here can handle 1.5 PB/2 trillion messages per day. The use cases are not only in the big data field but also in many other online scenarios.
Besides that, we have been through Alibaba's November 11th Shopping Festival for the last three years. On that day, the computation in our cluster increases to several times the usual level. All applications worked well during the peak time. I can say the stability of JStorm is beyond doubt today. In fact, besides Alibaba, the most powerful Chinese IT companies are also using JStorm.


Phase 1:
 
Define the target of Storm 2.0. List the requirements of Storm 2.0.
1. Open a new umbrella JIRA (https://issues.apache.org/jira/browse/STORM-717)
2. Create a 2.0 branch.
2.1 Copy modules from JStorm, one module at a time.
2.2 The sequence is external modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others.
3. Create a JIRA for each difference between JStorm 2.1.0 and Storm 1.0.
3.1 Discuss the solution for each difference (JIRA).
3.2 Once a solution is finalized, we can start the merging. (Some issues could be started concurrently. It depends on the discussion.)

This phase mainly tries to define the target and finalize the solutions. Hopefully this phase could be finished in 2 months (before 2016/1/31).


Phase 2:
Release Storm 2.0 beta
1. Based on phase 1's discussion, finish all features of Storm 2.0.
2. Integrate all modules so that the simplest Storm example can run on the system.
3. Test with all examples and modules in the Storm code base.
4. All daily tests pass.
 
Hopefully this phase could be finished in 2 months (before 2016/3/31).


Phase 3:
Persuade some users to try it.
Alibaba will try to run some online applications on the beta version.

Hopefully this phase could be finished in 1 month (before 2016/4/30).


Any comments are welcome.


Thanks
Longda

------------------------------------------------------------------
From: P. Taylor Goetz <pt...@gmail.com>
Send Time: 2015-11-19 (Thursday) 06:23
To: dev <de...@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
Subject: Re: [DISCUSS] Plan for Merging JStorm Code

All I have at this point is a placeholder wiki entry [1], and a lot of local notes that likely would only make sense to me.

Let me know your wiki username and I’ll give you permissions. The same goes for anyone else who wants to help.

-Taylor

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109

> On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> 
> Taylor and others I was hoping to get started filing JIRA and planning on how we are going to do the java migration + JStorm merger.  Is anyone else starting to do this?  If not would anyone object to me starting on it? - Bobby
> 
> 
>    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <pt...@gmail.com> wrote:
> 
> 
> Thanks for putting this together Basti, that comparison helps a lot.
> 
> And thanks Bobby for converting it into markdown. I was going to just attach the spreadsheet to JIRA, but markdown is a much better solution.
> 
> -Taylor
> 
>> On Nov 12, 2015, at 12:03 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
>> 
>> I translated the excel spreadsheet into a markdown file and put up a pull request for it.
>> https://github.com/apache/storm/pull/877
>> I did a few edits to it to make it work with Markdown, and to add in a few of my own comments.  I also put in a field for JIRAs to be able to track the migration.
>> Overall I think your evaluation was very good.  We have a fair amount of work ahead of us to decide what version of various features we want to go forward with.
>>   - Bobby
>> 
>> 
>>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
>> 
>> 
>> Hi Bobby & Jungtaek,
>> 
>> Thanks for your reply.
>> I totally agree that compatibility is the most important thing. Actually, JStorm is already compatible with the user API of Storm.
>> As you mentioned below, there are indeed still some features that differ between Storm and JStorm. I have tried to list them (minor updates and improvements are not included).
>> Please refer to the attachment for details. If anything is missing, please point it out. (Features currently being worked on are probably missing here.)
>> Looking at these differences, for the features missing in JStorm, I did not see any obstacle that would block merging them into JStorm.
>> For the features that have different solutions in Storm and JStorm, we can evaluate the solutions one by one to decide which is appropriate.
>> Once the evaluation is finalized, I think the JStorm team can take on the merging job and publish a stable release in 2 months.
>> In any case, the detailed implementation of the features with different solutions is transparent to users. So, from the user's point of view, there is no compatibility problem.
>> 
>> Besides compatibility, in our experience, stability is also important and is not an easy job. 4 people in the JStorm team took almost one year to finish the porting from "clojure core" to "java core" and to make it stable. Of course, we have many devs in the community to make the porting job faster. But it still takes a long time to run many complex online topologies to find bugs and fix them. That is the reason why I proposed to do the merging and build on a stable "java core".
>> 
>> -----Original Message-----
>> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
>> Sent: Wednesday, November 11, 2015 10:51 PM
>> To: dev@storm.apache.org
>> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
>> 
>> +1 for doing a 1.0 release based off of the clojure 0.11.x code.  Migrating the APIs to org.apache.storm is a big non-backwards compatible move, and a major version bump to 2.x seems like a good move there.
>> +1 for the release plan
>> 
>> I would like the move for user facing APIs to org.apache to be one of the last things we do.  Translating clojure code into java and moving it to org.apache I am not too concerned about.
>> 
>> Basti,
>> We have two code bases that have diverged significantly from one another in terms of functionality.  The storm code now has, or soon will have, a Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware Scheduling, a distributed cache like API, log searching, security, massive performance improvements, shading of almost all of our dependencies, a REST API for programmatically accessing everything on the UI, and I am sure I am missing a few other things.  JStorm also has many changes including cgroup isolation, restructured zookeeper layout, classpath isolation, and more too.
>> No matter what we do it will be a large effort to port changes from one code base to another, and from clojure to java.  I proposed this initially because it can be broken up into incremental changes.  It may take a little longer, but we will always have a working codebase that is testable and compatible with the current storm release, at least until we move the user facing APIs to be under org.apache.  This lets the community continue to build and test the master branch and report problems that they find, which is incredibly valuable.  I personally don't think it will be much easier, especially if we are intent on always maintaining compatibility with storm. - Bobby
>> 
>> 
>>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
>> 
>> 
>> Hi Taylor,
>> 
>> 
>> 
>> Thanks for the merge plan. I have a question about “Phase 2.2”.
>> 
>> Do you mean the community plans to first create a brand-new “java core” based on the current “clojure core”, and then migrate the features from JStorm?
>> 
>> If so, that confuses me.  It is a really huge job which might require a long development time to become stable, while JStorm is already a stable version.
>> 
>> The release planned for after Nov 11th has already been running stably online for several months at Alibaba.
>> 
>> Besides this, there are many valuable internal requirements at Alibaba, so fast evolution of JStorm is foreseeable in the next few months.
>> 
>> If the “java core” is brand new, it might bring many problems for the coming merge.
>> 
>> So, from this point of view, I think it is much better and easier to migrate the features of “clojure core” onto JStorm as the base for the “java core”.
>> 
>> Please correct me if I have misunderstood anything.
>> 
>> 
>> 
>> Regards
>> 
>> Basti
>> 
>> 
>> 
>> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
>> Sent: November 11, 2015 5:32
>> To: dev@storm.apache.org
>> Subject: [DISCUSS] Plan for Merging JStorm Code
>> 
>> 
>> 
>> Based on a number of discussions regarding merging the JStorm code, I’ve tried to distill the ideas presented and inserted some of my own. The result is below.
>> 
>> 
>> 
>> I’ve divided the plan into three phases, though they are not necessarily sequential — obviously some tasks can take place in parallel.
>> 
>> 
>> 
>> None of this is set in stone, just presented for discussion. Any and all comments are welcome.
>> 
>> 
>> 
>> -------
>> 
>> 
>> 
>> Phase 1 - Plan for 0.11.x Release
>> 
>> 1. Determine feature set for 0.11.x and publish to wiki [1].
>> 
>> 2. Announce feature-freeze for 0.11.x
>> 
>> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
>> 
>> 4. Release 0.11.0 (or whatever version # we want to use)
>> 
>> 5. Bug fixes and subsequent releases from 0.11.x-branch
>> 
>> 
>> 
>> 
>> 
>> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
>> 
>> 1. Determine/document unique features in JStorm (e.g. classpath isolation, cgroups, etc.) and create JIRA for migrating the feature.
>> 
>> 2. Create JIRA for migrating each clojure component (or logical group of components) to Java. Assumes tests will be ported as well.
>> 
>> 3. Discuss/establish style guide for Java coding conventions. Consider using Oracle’s or Google’s Java conventions as a base — they are both pretty solid.
>> 
>> 4. Align package names (e.g. backtype.storm --> org.apache.storm / com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
>> 
>> 
>> 
>> 
>> 
>> Phase 3 - Migrate Clojure --> Java
>> 
>> 1. Port code/tests to Java, leveraging existing JStorm code wherever possible (core functionality only, features distinct to JStorm migrated separately).
>> 
>> 2. Port JStorm-specific features.
>> 
>> 3. Begin releasing preview/beta versions.
>> 
>> 4. Code cleanup (across the board) and refactoring using established coding conventions, and leveraging PMD/Checkstyle reports for reference. (Note: good opportunity for new contributors.)
>> 
>> 5. Release 0.12.0 (or whatever version # we want to use) and lift feature freeze.
>> 
>> 
>> 
>> 
>> 
>> Notes:
>> 
>> We should consider bumping up to version 1.0 sometime soon and then switching to semantic versioning [3] from then on.
>> 
>> 
>> 
>> 
>> 
>> With the exception of package name alignment, the "jstorm-import" branch will largely be read-only throughout the process.
>> 
>> 
>> 
>> During migration, it's probably easiest to operate with two local clones of the Apache Storm repo: one for working (i.e. checked out to working branch) and one for reference/copying (i.e. checked out to "jstorm-import").
>> 
>> 
>> 
>> Feature-freeze probably only needs to be enforced against core functionality. Components under "external" can likely be exempt, but we should figure out a process for accepting and releasing new features during the migration.
>> 
>> 
>> 
>> Performance testing should be continuous throughout the process. Since we don't really have ASF infrastructure for performance testing, we will need a volunteer(s) to host and run the performance tests. Performance test results can be posted to the wiki [2]. It would probably be a good idea to establish a baseline with the 0.10.0 release.
>> 
>> 
>> 
>> I’ve attached an analysis document Sean Zhong put together a while back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3 release but is still relevant and has a lot of good information.
>> 
>> 
>> 
>> 
>> 
>> [1] https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
>> 
>> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
>> 
>> [3] http://semver.org
>> 
>> [4] https://issues.apache.org/jira/browse/STORM-717
>> 
>> 
>> 
>> 
>> 
>> -Taylor
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 




  

Re: [DISCUSS] Storm 2.0 plan

Posted by Harsha <ma...@harsha.io>.
Longda,

 "2.1 Copy modules from JStorm, one module from one module
> 2.2 The sequence is extern
> modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others"

Are you suggesting we just copy the JStorm code in place of the Clojure code? If
so, that's not going to work. There might be some code that can easily be
replaced with JStorm's, but not everything will be that
straightforward, especially given the feature disparity between Storm and
JStorm.
We should be moving the code to Java, i.e. rewriting parts of the code where
needed, and if something can be picked up from JStorm we should do
that.

Thanks,
Harsha

 

On Wed, Nov 18, 2015, at 10:13 PM, Longda Feng wrote:
> Sorry for changing the Subject.
> 
> I am +1 for releasing Storm 2.0 with java core, which is merged with
> JStorm.
> 
> I think the change in this release will be the biggest one in history. It
> will probably take a long time to develop. At the same time, Heron is
> going to be open sourced, and the latest release of Flink provides
> compatibility with Storm’s API. These might be threats to Storm. So I
> suggest we start the development of Storm 2.0 as quickly as possible. In
> order to accelerate the development cycle, I propose to take the JStorm
> 2.1.0 core and UI as the base version, since this version is stable and
> compatible with the API of Storm 1.0. Please refer to the phases below for
> the detailed merging plan.
> 
> Note: We provide a demo of JStorm’s web UI. Please refer to
> storm.taobao.org. I think JStorm will give you a totally different view.
> 
> I would like to share our experience from the initial development of
> JStorm (migrating from clojure core to java core).
> Our team (4 developers) spent almost one year finishing the
> migration. We took 4 months to release the first JStorm version, and 6
> months to make JStorm stable. During this period, we tried to switch more
> than 100 online applications with different scenarios from Storm to
> JStorm, and many bugs were fixed. Then more and more applications were
> switched to JStorm at Alibaba.
> Currently, there are 7000+ nodes of JStorm clusters at Alibaba, with 2000+
> applications running on them. The JStorm clusters here can handle 1.5
> PB/2 trillion messages per day. The use cases are not only in the big data
> field but also in many other online scenarios.
> Besides that, we have been through Alibaba's November 11th Shopping
> Festival for the last three years. On that day, the computation in our
> cluster increases to several times the usual level. All applications
> worked well during the peak time. I can say the stability of JStorm is
> beyond doubt today. In fact, besides Alibaba, the most powerful Chinese IT
> companies are also using JStorm.
> 
> 
> Phase 1:
>  
> Define the target of Storm 2.0. List the requirements of Storm 2.0.
> 1. Open a new umbrella JIRA
> (https://issues.apache.org/jira/browse/STORM-717)
> 2. Create a 2.0 branch.
> 2.1 Copy modules from JStorm, one module at a time.
> 2.2 The sequence is external
> modules/client/utils/nimbus/supervisor/drpc/worker & task/web ui/others.
> 3. Create a JIRA for each difference between JStorm 2.1.0 and Storm 1.0.
> 3.1 Discuss the solution for each difference (JIRA).
> 3.2 Once a solution is finalized, we can start the merging. (Some
> issues could be started concurrently. It depends on the discussion.)
> 
> This phase mainly tries to define the target and finalize the solutions.
> Hopefully this phase could be finished in 2 months (before 2016/1/31).
> 
> 
> Phase 2:
> Release Storm 2.0 beta
> 1. Based on phase 1's discussion, finish all features of Storm 2.0.
> 2. Integrate all modules so that the simplest Storm example can run on the
> system.
> 3. Test with all examples and modules in the Storm code base.
> 4. All daily tests pass.
>  
> Hopefully this phase could be finished in 2 months (before 2016/3/31).
> 
> 
> Phase 3:
> Persuade some users to try it.
> Alibaba will try to run some online applications on the beta version.
> 
> Hopefully this phase could be finished in 1 month (before 2016/4/30).
> 
> 
> Any comments are welcome.
> 
> 
> Thanks
> Longda
> 
> ------------------------------------------------------------------
> From: P. Taylor Goetz <pt...@gmail.com>
> Send Time: 2015-11-19 (Thursday) 06:23
> To: dev <de...@storm.apache.org>, Bobby Evans <ev...@yahoo-inc.com>
> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> 
> All I have at this point is a placeholder wiki entry [1], and a lot of
> local notes that likely would only make sense to me.
> 
> Let me know your wiki username and I’ll give you permissions. The same
> goes for anyone else who wants to help.
> 
> -Taylor
> 
> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61328109
> 
> > On Nov 18, 2015, at 2:08 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> > 
> > Taylor and others I was hoping to get started filing JIRA and planning on how we are going to do the java migration + JStorm merger.  Is anyone else starting to do this?  If not would anyone object to me starting on it? - Bobby
> > 
> > 
> >    On Thursday, November 12, 2015 12:04 PM, P. Taylor Goetz <pt...@gmail.com> wrote:
> > 
> > 
> > Thanks for putting this together Basti, that comparison helps a lot.
> > 
> > And thanks Bobby for converting it into markdown. I was going to just attach the spreadsheet to JIRA, but markdown is a much better solution.
> > 
> > -Taylor
> > 
> >> On Nov 12, 2015, at 12:03 PM, Bobby Evans <ev...@yahoo-inc.com.INVALID> wrote:
> >> 
> >> I translated the excel spreadsheet into a markdown file and put up a pull request for it.
> >> https://github.com/apache/storm/pull/877
> >> I did a few edits to it to make it work with Markdown, and to add in a few of my own comments.  I also put in a field for JIRAs to be able to track the migration.
> >> Overall I think your evaluation was very good.  We have a fair amount of work ahead of us to decide what version of various features we want to go forward with.
> >>   - Bobby
> >> 
> >> 
> >>     On Thursday, November 12, 2015 9:37 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
> >> 
> >> 
> >> Hi Bobby & Jungtaek,
> >> 
> >> Thanks for your replay.
> >> I totally agree that compatibility is the most important thing. Actually, JStorm is already compatible with the user API of Storm.
> >> As you mentioned below, there are indeed still some features that differ between Storm and JStorm. I have tried to list them (minor updates and improvements are not included).
> >> Please refer to the attachment for details. If anything is missing, please point it out. (Features currently being worked on are probably missing here.)
> >> Looking at these differences, for the features missing in JStorm I did not see any obstacle that would block merging them into JStorm.
> >> For the features that have different solutions in Storm and JStorm, we can evaluate the solutions one by one to decide which one is appropriate.
> >> Once the evaluation is finalized, I think the JStorm team can take on the merging job and publish a stable release in 2 months.
> >> In any case, the detailed implementation of the features with different solutions is transparent to the user. So, from the user's point of view, there is no compatibility problem.
> >> 
> >> Besides compatibility, in our experience stability is also important and is not easy to achieve. Four people on the JStorm team took almost one year to finish the port from "clojure core" to "java core" and to make it stable.
> >> Of course, we have many devs in the community who can make the porting job faster. But it still takes a long time running many complex online topologies to find bugs and fix them. That is why I proposed doing the merge and building on a stable "java core".
> >> 
> >> -----Original Message-----
> >> From: Bobby Evans [mailto:evans@yahoo-inc.com.INVALID]
> >> Sent: Wednesday, November 11, 2015 10:51 PM
> >> To: dev@storm.apache.org
> >> Subject: Re: [DISCUSS] Plan for Merging JStorm Code
> >> 
> >> +1 for doing a 1.0 release based off of the clojure 0.11.x code.  Migrating the APIs to org.apache.storm is a big non-backwards compatible move, and a major version bump to 2.x seems like a good move there.
> >> +1 for the release plan
> >> 
> >> I would like the move for user facing APIs to org.apache to be one of the last things we do.  Translating clojure code into java and moving it to org.apache I am not too concerned about.
> >> 
> >> Basti,
> >> We have two code bases that have diverged significantly from one another in terms of functionality.  The storm code now or soon will have A Heartbeat Server, Nimbus HA (Different Implementation), Resource Aware Scheduling, a distributed cache like API, log searching, security, massive performance improvements, shaded almost all of our dependencies, a REST API for programmatically accessing everything on the UI, and I am sure I am missing a few other things.  JStorm also has many changes including cgroup isolation, restructured zookeeper layout, classpath isolation, and more too.
> >> No matter what we do it will be a large effort to port changes from one code base to another, and from clojure to java.  I proposed this initially because it can be broken up into incremental changes.  It may take a little longer, but we will always have a working codebase that is testable and compatible with the current storm release, at least until we move the user facing APIs to be under org.apache.  This lets the community continue to build and test the master branch and report problems that they find, which is incredibly valuable.  I personally don't think it will be much easier, especially if we are intent on always maintaining compatibility with storm. - Bobby
> >> 
> >> 
> >>     On Wednesday, November 11, 2015 5:42 AM, 刘键(Basti Liu) <ba...@alibaba-inc.com> wrote:
> >> 
> >> 
> >> Hi Taylor,
> >> 
> >> 
> >> 
> >> Thanks for the merge plan. I have a question about “Phase 2.2”.
> >> 
> >> Do you mean the community plans to first create a brand-new “java core” based on the current “clojure core”, and then migrate the features from JStorm?
> >> 
> >> If so, that confuses me.  It is a huge job that might require a long development time to become stable, while JStorm is already stable.
> >> 
> >> The release planned for after Nov 11th has already been running stably in production at Alibaba for several months.
> >> 
> >> Besides this, there are many valuable internal requirements in Alibaba, so fast evolution of JStorm is foreseeable over the next few months.
> >> 
> >> If the “java core” is brand new, it might cause many problems for the coming merge.
> >> 
> >> So, from this point of view, I think it is much better and easier to build the “java core” on JStorm and migrate the features of the “clojure core” into it.
> >> 
> >> Please correct me if I have misunderstood anything.
> >> 
> >> 
> >> 
> >> Regards
> >> 
> >> Basti
> >> 
> >> 
> >> 
> >> From: P. Taylor Goetz [mailto:ptgoetz@gmail.com]
> >> Sent: November 11, 2015 5:32
> >> To: dev@storm.apache.org
> >> Subject: [DISCUSS] Plan for Merging JStorm Code
> >> 
> >> 
> >> 
> >> Based on a number of discussions regarding merging the JStorm code, I’ve tried to distill the ideas presented and inserted some of my own. The result is below.
> >> 
> >> 
> >> 
> >> I’ve divided the plan into three phases, though they are not necessarily sequential — obviously some tasks can take place in parallel.
> >> 
> >> 
> >> 
> >> None of this is set in stone, just presented for discussion. Any and all comments are welcome.
> >> 
> >> 
> >> 
> >> -------
> >> 
> >> 
> >> 
> >> Phase 1 - Plan for 0.11.x Release
> >> 
> >> 1. Determine feature set for 0.11.x and publish to wiki [1].
> >> 
> >> 2. Announce feature-freeze for 0.11.x
> >> 
> >> 3. Create 0.11.x branch from master (Phase 2.4 can begin.)
> >> 
> >> 4. Release 0.11.0 (or whatever version # we want to use)
> >> 
> >> 5. Bug fixes and subsequent releases from 0.11.x-branch
> >> 
> >> 
> >> 
> >> 
> >> 
> >> Phase 2 - Prepare for Merge ("master" and "jstorm-import" branches)
> >> 
> >> 1. Determine/document unique features in JStorm (e.g. classpath isolation, cgroups, etc.) and create JIRA for migrating the feature.
> >> 
> >> 2. Create JIRA for migrating each clojure component (or logical group of components) to Java. Assumes tests will be ported as well.
> >> 
> >> 3. Discuss/establish style guide for Java coding conventions. Consider using Oracle’s or Google’s Java conventions as a base — they are both pretty solid.
> >> 
> >> 4. Align package names (e.g. backtype.storm --> org.apache.storm / com.alibaba.jstorm --> org.apache.storm) (Phase 3 can begin)
> >> 
> >> 
> >> 
> >> 
> >> 
> >> Phase 3 - Migrate Clojure --> Java
> >> 
> >> 1. Port code/tests to Java, leveraging existing JStorm code wherever possible (core functionality only, features distinct to JStorm migrated separately).
> >> 
> >> 2. Port JStorm-specific features.
> >> 
> >> 3. Begin releasing preview/beta versions.
> >> 
> >> 4. Code cleanup (across the board) and refactoring using established coding conventions, and leveraging PMD/Checkstyle reports for reference. (Note: good opportunity for new contributors.)
> >> 
> >> 5. Release 0.12.0 (or whatever version # we want to use) and lift feature freeze.
> >> 
> >> 
> >> 
> >> 
> >> 
> >> Notes:
> >> 
> >> We should consider bumping up to version 1.0 sometime soon and then switching to semantic versioning [3] from then on.
> >> 
> >> 
> >> 
> >> 
> >> 
> >> With the exception of package name alignment, the "jstorm-import" branch will largely be read-only throughout the process.
> >> 
> >> 
> >> 
> >> During migration, it's probably easiest to operate with two local clones of the Apache Storm repo: one for working (i.e. checked out to working branch) and one for reference/copying (i.e. checked out to "jstorm-import").
> >> 
> >> 
> >> 
> >> Feature-freeze probably only needs to be enforced against core functionality. Components under "external" can likely be exempt, but we should figure out a process for accepting and releasing new features during the migration.
> >> 
> >> 
> >> 
> >> Performance testing should be continuous throughout the process. Since we don't really have ASF infrastructure for performance testing, we will need a volunteer(s) to host and run the performance tests. Performance test results can be posted to the wiki [2]. It would probably be a good idea to establish a baseline with the 0.10.0 release.
> >> 
> >> 
> >> 
> >> I’ve attached an analysis document Sean Zhong put together a while back to the JStorm merge JIRA [4]. The analysis was against the 0.9.3 release but is still relevant and has a lot of good information.
> >> 
> >> 
> >> 
> >> 
> >> 
> >> [1] https://cwiki.apache.org/confluence/display/STORM/Release+0.11.0+Feature+Set
> >> 
> >> [2] https://cwiki.apache.org/confluence/display/STORM/Storm+Home
> >> 
> >> [3] http://semver.org
> >> 
> >> [4] https://issues.apache.org/jira/browse/STORM-717
> >> 
> >> 
> >> 
> >> 
> >> 
> >> -Taylor
> >> 
> >> 
> >> 
> >> 
> >> 
> >> 
> > 
> > 
> 
>