Posted to user@storm.apache.org by le...@tutanota.com on 2016/06/01 09:43:40 UTC

Re: Storm unique strengths

Hi Aaron,

Thank you very much for the link. I found it quite insightful. It is one of
the few benchmarks I have encountered where Storm comes out on top in terms
of latency, although the at-most-once trade-off is quite harsh.

Regards
Leon

31. May 2016 15:37 by Aaron.Dossett@target.com:


> Hi Leon,
> This isn’t an advocacy piece per se, but this analysis by several members of
> the Storm community may be helpful.  For a particular use case you can
> compare performance and then assess whether the features,
> user-friendliness, or API of a particular framework is worth switching to.
> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
> From: "leon_mclare@tutanota.com" <leon_mclare@tutanota.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Monday, May 30, 2016 at 3:28 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Storm unique strengths
>
> Hi Storm team,
>
> There are a lot of online comparisons between Storm and other Data Stream
> Management Systems, yet few of them originate from Storm
> committers/advocates.
> I am trying to identify the aspects of Storm that make it
> stand out among its direct competitors. Currently there is significant
> competition from Apache Flink, although less so from Spark, whose
> latency is on the order of seconds.
>
> In my experience Storm offers unique support for DSLs, as well as a
> very flexible concept of Spouts and Bolts. Most other aspects, however,
> seem to have been improved upon by Flink.
>
> Would you be able to direct me to resources that make the case for
> Storm?
>
> Thanks in advance.
>
> Leon
> 

Re: Storm unique strengths

Posted by "Nikos R. Katsipoulakis" <ni...@gmail.com>.
Hello Taylor,

Thank you very much for the clarifying email. I am sure that it will be a
vital reference point for many new and current users of Storm. I would like
to suggest that a summary of your email become part of Apache Storm's
documentation.

Kind Regards,
Nikos


On Thu, Jun 2, 2016 at 10:35 AM, P. Taylor Goetz <pt...@gmail.com> wrote:

> There are a few things to keep in mind when evaluating Heron and Storm:
>
> First is performance. Twitter benchmarked Heron against a very old,
> pre-Apache version of Storm (back when the transport layer was based on
> 0mq), so their claims of performance improvements over Storm are likely
> significantly overblown. There have been an enormous number of performance
> improvements since then, and the Storm 1.0 release likely erases most of
> the performance gain claimed by the Heron project.
>
> Second, despite their claims, Heron is not API compatible with the latest
> release of Apache Storm. It may be somewhat compatible with the 0.9.x
> series of releases, but 0.10.x is likely to have some compatibility issues
> (I haven’t tested this out so I don’t know for sure), and it’s certainly
> not compatible with 1.0.
>
> Finally, let's look at a few things that Storm has that Heron does not. Off
> the top of my head I can think of:
>
> * End-to-end security (Kerberos, etc.), including secure integration with
> other Apache Hadoop projects like ZooKeeper, HDFS, HBase, etc.
> * Trident API (microbatching, exactly-once processing, etc.)
> * Distributed Remote Procedure Calls (DRPC)
> * Built-in windowing support
> * State management (stateful bolts with automatic checkpointing)
> * Distributed Cache API
> * Kafka integration (though I believe this is coming)
> * Integration with HDFS, Hive, HBase, Cassandra, Solr, Elasticsearch,
> Redis, MongoDB, JDBC, MQTT, and Azure Event Hubs.
> * Scheduler framework independence (Heron requires Apache Mesos)
> * Partial key groupings
> * Declarative topology wiring (i.e. Flux)
>
> Is Heron a drop-in replacement for Storm? Probably not.
>
> -Taylor


-- 
Nikos R. Katsipoulakis,
Department of Computer Science
University of Pittsburgh

Re: Storm unique strengths

Posted by "P. Taylor Goetz" <pt...@gmail.com>.
There are a few things to keep in mind when evaluating Heron and Storm:

First is performance. Twitter benchmarked Heron against a very old, pre-Apache version of Storm (back when the transport layer was based on 0mq), so their claims of performance improvements over Storm are likely significantly overblown. There have been an enormous number of performance improvements since then, and the Storm 1.0 release likely erases most of the performance gain claimed by the Heron project.

Second, despite their claims, Heron is not API compatible with the latest release of Apache Storm. It may be somewhat compatible with the 0.9.x series of releases, but 0.10.x is likely to have some compatibility issues (I haven’t tested this out so I don’t know for sure), and it’s certainly not compatible with 1.0.

Finally, let's look at a few things that Storm has that Heron does not. Off the top of my head I can think of:

* End-to-end security (Kerberos, etc.), including secure integration with other Apache Hadoop projects like ZooKeeper, HDFS, HBase, etc.
* Trident API (microbatching, exactly-once processing, etc.)
* Distributed Remote Procedure Calls (DRPC)
* Built-in windowing support
* State management (stateful bolts with automatic checkpointing)
* Distributed Cache API
* Kafka integration (though I believe this is coming)
* Integration with HDFS, Hive, HBase, Cassandra, Solr, Elasticsearch, Redis, MongoDB, JDBC, MQTT, and Azure Event Hubs.
* Scheduler framework independence (Heron requires Apache Mesos)
* Partial key groupings
* Declarative topology wiring (i.e. Flux)

Is Heron a drop-in replacement for Storm? Probably not.

-Taylor

> On Jun 2, 2016, at 9:27 AM, leon_mclare@tutanota.com wrote:
> 
> Hi Marc,
> 
> I had come across Heron a couple of weeks ago. It was indeed quite interesting. Thanks for the hint.
> 
> Regards
> Leon


Re: RE: Storm unique strengths

Posted by le...@tutanota.com.
Hi Marc,

I had come across Heron a couple of weeks ago. It was indeed quite 
interesting. Thanks for the hint.

Regards
Leon


1. Jun 2016 11:47 by M.Roos@f1-outsourcing.eu:


>
> Maybe also take into account the new Heron
>
> https://blog.twitter.com/2016/open-sourcing-twitter-heron

Re: Storm unique strengths

Posted by Jungtaek Lim <ka...@gmail.com>.
Hi Leon,

One thing to note is that we addressed some performance issues after the
benchmark was done. I don't have an environment to rerun the same benchmark,
but it would be worth trying again with a recent release (1.0.1).

Thanks,
Jungtaek Lim (HeartSaVioR)

On Wed, Jun 1, 2016 at 6:43 PM, <le...@tutanota.com> wrote:

> Hi Aaron,
>
> Thank you very much for the link. I found it quite insightful. It is one
> of the few benchmarks I have encountered where Storm comes out on top in
> terms of latency, although the at-most-once trade-off is quite harsh.
>
> Regards
> Leon

RE: Storm unique strengths

Posted by Marc Roos <M....@f1-outsourcing.eu>.
 
Maybe also take into account the new Heron

https://blog.twitter.com/2016/open-sourcing-twitter-heron


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -. 
F1 Outsourcing Development Sp. z o.o.
Poland 

t:  +48 (0)124466845
f:  +48 (0)124466843
e:  marc@f1-outsourcing.eu

