Posted to user@storm.apache.org by le...@tutanota.com on 2016/06/01 09:43:40 UTC
Re: Storm unique strengths
Hi Aaron,
Thank you very much for the link. I found it quite insightful. It is one of
the few benchmarks I have encountered where Storm comes out on top in terms
of latency, although the at-most-once trade-off is quite harsh.
Regards
Leon
31. May 2016 15:37 by Aaron.Dossett@target.com:
> Hi Leon,
> This isn’t an advocacy piece per se, but this analysis by several members of
> the Storm community may be helpful. For a particular use case you can
> compare performance and then assess whether the features,
> user-friendliness, or API of a particular framework is worth switching to.
> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
> From: "leon_mclare@tutanota.com" <leon_mclare@tutanota.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Monday, May 30, 2016 at 3:28 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Storm unique strengths
>
> Hi Storm team,
>
> There are a lot of online comparisons between Storm and other Data Stream
> Management Systems, yet few of them originate from Storm
> committers/advocates.
> I am trying to identify the aspects of Storm that make it
> stand out among its direct competitors. Currently there is significant
> competition from Apache Flink, although less so from Spark due to its
> seconds-level latency restriction.
>
> In my experience Storm offers unique support for DSLs, as well as a
> very flexible concept of Spouts and Bolts. Most other aspects, however,
> seem to have been improved upon by Flink.
>
> Would you be able to direct me to resources that argue more towards Storm's
> case?
>
> Thanks in advance.
>
> Leon
>
Re: Storm unique strengths
Posted by "Nikos R. Katsipoulakis" <ni...@gmail.com>.
Hello Taylor,
Thank you very much for the clarification email. I am sure that it will be a
vital reference point for many new and current users of Storm. I would like
to suggest that a summary of your email become part of Apache Storm's
documentation.
Kind Regards,
Nikos
On Thu, Jun 2, 2016 at 10:35 AM, P. Taylor Goetz <pt...@gmail.com> wrote:
> There are a few things to keep in mind when evaluating Heron and Storm:
>
> First is performance. Twitter benchmarked Heron against a very old,
> pre-Apache version of Storm (back when the transport layer was based on
> 0mq), so their claims of performance improvements over Storm are likely
> significantly overblown. There have been an enormous number of performance
> improvements since then, and the Storm 1.0 release likely erases most of
> the performance gain claimed by the Heron project.
>
> Second, despite their claims, Heron is not API compatible with the latest
> release of Apache Storm. It may be somewhat compatible with the 0.9.x
> series of releases, but 0.10.x is likely to have some compatibility issues
> (I haven’t tested this out so I don’t know for sure), and it’s certainly
> not compatible with 1.0.
>
> Finally, let's look at a few things that Storm has that Heron does not. Off
> the top of my head I can think of:
>
> * End-to-end security (Kerberos, etc.), including secure integration with
> other Apache Hadoop projects like ZooKeeper, HDFS, HBase, etc.
> * Trident API (microbatching, exactly-once processing, etc.)
> * Distributed Remote Procedure Calls (DRPC)
> * Built-in windowing support
> * State management (stateful bolts with automatic checkpointing)
> * Distributed Cache API
> * Kafka integration (though I believe this is coming)
> * Integration with HDFS, Hive, HBase, Cassandra, Solr, Elasticsearch,
> Redis, MongoDB, JDBC, MQTT, and Azure Event Hubs.
> * Scheduler framework independence (Heron requires Apache Mesos)
> * Partial key groupings
> * Declarative topology wiring (i.e. Flux)
>
> Is Heron a drop-in replacement for Storm? Probably not.
>
> -Taylor
--
Nikos R. Katsipoulakis,
Department of Computer Science
University of Pittsburgh
Re: Storm unique strengths
Posted by "P. Taylor Goetz" <pt...@gmail.com>.
There are a few things to keep in mind when evaluating Heron and Storm:
First is performance. Twitter benchmarked Heron against a very old, pre-Apache version of Storm (back when the transport layer was based on 0mq), so their claims of performance improvements over Storm are likely significantly overblown. There have been an enormous number of performance improvements since then, and the Storm 1.0 release likely erases most of the performance gain claimed by the Heron project.
Second, despite their claims, Heron is not API compatible with the latest release of Apache Storm. It may be somewhat compatible with the 0.9.x series of releases, but 0.10.x is likely to have some compatibility issues (I haven’t tested this out so I don’t know for sure), and it’s certainly not compatible with 1.0.
Finally, let's look at a few things that Storm has that Heron does not. Off the top of my head I can think of:
* End-to-end security (Kerberos, etc.), including secure integration with other Apache Hadoop projects like ZooKeeper, HDFS, HBase, etc.
* Trident API (microbatching, exactly-once processing, etc.)
* Distributed Remote Procedure Calls (DRPC)
* Built-in windowing support
* State management (stateful bolts with automatic checkpointing)
* Distributed Cache API
* Kafka integration (though I believe this is coming)
* Integration with HDFS, Hive, HBase, Cassandra, Solr, Elasticsearch, Redis, MongoDB, JDBC, MQTT, and Azure Event Hubs.
* Scheduler framework independence (Heron requires Apache Mesos)
* Partial key groupings
* Declarative topology wiring (i.e. Flux)
Is Heron a drop-in replacement for Storm? Probably not.
-Taylor
> On Jun 2, 2016, at 9:27 AM, leon_mclare@tutanota.com wrote:
>
> Hi Marc,
>
> I had come across Heron a couple of weeks ago. It was indeed quite interesting. Thanks for the hint.
>
> Regards
> Leon
Re: RE: Storm unique strengths
Posted by le...@tutanota.com.
Hi Marc,
I had come across Heron a couple of weeks ago. It was indeed quite
interesting. Thanks for the hint.
Regards
Leon
1. Jun 2016 11:47 by M.Roos@f1-outsourcing.eu:
>
> Maybe also take into account the new Heron:
>
> https://blog.twitter.com/2016/open-sourcing-twitter-heron
>
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -.
> F1 Outsourcing Development Sp. z o.o.
> Poland
>
> t: +48 (0)124466845
> f: +48 (0)124466843
> e: marc@f1-outsourcing.eu
Re: Storm unique strengths
Posted by Jungtaek Lim <ka...@gmail.com>.
Hi Leon,
One thing to note is that we addressed some performance issues after the
benchmark was done. I don't have an environment to rerun the same benchmark,
but it is worth giving it a try with a recent release (1.0.1).
Thanks,
Jungtaek Lim (HeartSaVioR)
On Wed, Jun 1, 2016 at 6:43 PM, <le...@tutanota.com> wrote:
> Hi Aaron,
>
> Thank you very much for the link. I found it quite insightful. It is one
> of the few benchmarks I have encountered where Storm comes out on top in
> terms of latency, although the at-most-once trade-off is quite harsh.
>
> Regards
> Leon
RE: Storm unique strengths
Posted by Marc Roos <M....@f1-outsourcing.eu>.
Maybe also take into account the new Heron:
https://blog.twitter.com/2016/open-sourcing-twitter-heron
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -.
F1 Outsourcing Development Sp. z o.o.
Poland
t: +48 (0)124466845
f: +48 (0)124466843
e: marc@f1-outsourcing.eu