Posted to user@storm.apache.org by "Michael G. Noll" <mi...@michael-noll.com> on 2015/03/12 21:08:56 UTC

Re: Storm Metrics - Tuples per second - End-to-End delay

Martin,

We recently open-sourced storm-graphite, which sends Storm's built-in metrics to Graphite (and also to InfluxDB, since InfluxDB provides a Graphite-compatible API).

https://github.com/verisign/storm-graphite

Maybe this helps,
Michael



> On 15.02.2015, at 14:47, Martin Illecker <mi...@apache.org> wrote:
> 
> Hi Yash,
> 
> But won't I have to build a custom consumer that extends the LoggingMetricsConsumer [1] in order to aggregate the metrics?
> Do you know how I can calculate the total end-to-end latency of my topology?
> (By simply accumulating the completion time of each bolt?)
> 
> Please can you share your StatsDMetricsConsumer?
> 
> Thanks!
> Best regards
> Martin
> 
> [1] https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/LoggingMetricsConsumer.java
> 
> 2015-02-15 2:13 GMT+01:00 Yashwant Ganti <ya...@gmail.com>:
>> Okay. Then yes, a LoggingMetricsConsumer configured with a parallelism of 1 should work, since it would receive all the metrics. Note, though, that if the topology is rebalanced, the location of this metrics consumer can change (to a different worker on the same supervisor, or to a different supervisor altogether).
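>> 
>> Registering it is essentially one line on the topology config; a rough, untested sketch:
>> 
>>     import backtype.storm.Config;
>>     import backtype.storm.metric.LoggingMetricsConsumer;
>> 
>>     public class MetricsConfig {
>>       public static Config build() {
>>         Config conf = new Config();
>>         // A parallelism hint of 1 gives exactly one consumer instance,
>>         // which then receives the data points from every task in the topology.
>>         conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
>>         return conf;
>>       }
>>     }
>> 
>> You would then pass the returned Config to StormSubmitter.submitTopology as usual.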
>> 
>> For what it's worth, we haven't observed any significant performance hit in our production topology, which has a single instance of a StatsDMetricsConsumer running. 
>> 
>> - Yash
>> 
>>> On Sat, Feb 14, 2015 at 1:51 PM, Martin Illecker <mi...@apache.org> wrote:
>>> Hi Yash,
>>> 
>>> I would prefer a solution within Storm only, without involving an external service,
>>> because the performance impact should be kept as small as possible.
>>> 
>>> Is that even possible in Storm (aggregating CountMetrics or end-to-end latencies with a single global LoggingMetricsConsumer)?
>>> 
>>> Best regards
>>> Martin
>>> 
>>> 
>>> 2015-02-14 22:05 GMT+01:00 Yashwant Ganti <ya...@gmail.com>:
>>>> Hi Martin, 
>>>> 
>>>> Do you need the metric information to be written to logs? If that is not a hard constraint, replacing the 'LoggingMetricsConsumer' with a component that sends the metric data to a metrics aggregation daemon such as StatsD can solve your issue. All you need to make sure of is that every metric corresponding to a task is uniquely identified across the topology; StatsD will then take care of the aggregation for you.
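>>>> 
>>>> Roughly, the skeleton of such a consumer looks like this (simplified, untested sketch; 'StatsdClient' is just a placeholder for whichever StatsD client library you use):
>>>> 
>>>>     import java.util.Collection;
>>>>     import java.util.Map;
>>>>     import backtype.storm.metric.api.IMetricsConsumer;
>>>>     import backtype.storm.task.IErrorReporter;
>>>>     import backtype.storm.task.TopologyContext;
>>>> 
>>>>     public class StatsDMetricsConsumer implements IMetricsConsumer {
>>>>       private StatsdClient statsd; // placeholder for a real StatsD client
>>>> 
>>>>       public void prepare(Map stormConf, Object registrationArgument,
>>>>                           TopologyContext context, IErrorReporter errorReporter) {
>>>>         statsd = new StatsdClient("statsd.example.com", 8125); // placeholder endpoint
>>>>       }
>>>> 
>>>>       public void handleDataPoints(TaskInfo taskInfo, Collection<DataPoint> dataPoints) {
>>>>         for (DataPoint p : dataPoints) {
>>>>           // Make the metric name unique per component and task so that StatsD
>>>>           // can aggregate the values across the whole topology.
>>>>           String name = taskInfo.srcComponentId + "." + taskInfo.srcTaskId + "." + p.name;
>>>>           if (p.value instanceof Number) { // built-in metrics may also send maps; skipped here
>>>>             statsd.gauge(name, ((Number) p.value).doubleValue()); // placeholder call
>>>>           }
>>>>         }
>>>>       }
>>>> 
>>>>       public void cleanup() { }
>>>>     }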
>>>> 
>>>> Regards,
>>>> Yash
>>>> 
>>>>> On Sat, Feb 14, 2015 at 4:30 AM, Martin Illecker <mi...@apache.org> wrote:
>>>>> Hello,
>>>>> 
>>>>> 1) I would like to measure and aggregate the tuples per second for a bolt that runs on multiple workers and multiple executors.
>>>>> 
>>>>> To do so, I used the CountMetric [1] together with a LoggingMetricsConsumer, following [2].
>>>>> But the results were spread across the logs of multiple workers and their executors.
>>>>> How can I aggregate this data and get the average number of tuples per second every 10 seconds?
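>>>>> 
>>>>> For reference, this is roughly what I am doing in the bolt, following [2] (simplified sketch):
>>>>> 
>>>>>     import java.util.Map;
>>>>>     import backtype.storm.metric.api.CountMetric;
>>>>>     import backtype.storm.task.OutputCollector;
>>>>>     import backtype.storm.task.TopologyContext;
>>>>>     import backtype.storm.topology.OutputFieldsDeclarer;
>>>>>     import backtype.storm.topology.base.BaseRichBolt;
>>>>>     import backtype.storm.tuple.Tuple;
>>>>> 
>>>>>     public class CountingBolt extends BaseRichBolt {
>>>>>       private transient CountMetric tupleCount;
>>>>>       private OutputCollector collector;
>>>>> 
>>>>>       public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
>>>>>         this.collector = collector;
>>>>>         // The counter is flushed to the registered metrics consumer(s) every 10 seconds.
>>>>>         tupleCount = context.registerMetric("tuple_count", new CountMetric(), 10);
>>>>>       }
>>>>> 
>>>>>       public void execute(Tuple tuple) {
>>>>>         tupleCount.incr(); // counted per task; consumers receive one data point per task per interval
>>>>>         collector.ack(tuple);
>>>>>       }
>>>>> 
>>>>>       public void declareOutputFields(OutputFieldsDeclarer declarer) { }
>>>>>     }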
>>>>> 
>>>>> 2) Furthermore, I would also like to measure the end-to-end delay of the whole topology.
>>>>> Is there a better way than propagating the emit time from the spout to the last bolt?
>>>>> And, similar to 1), how can I then aggregate the calculated end-to-end delay across multiple workers and supervisors?
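>>>>> 
>>>>> What I currently have in mind for 2) is roughly the following (simplified, untested sketch; it assumes the spout puts System.currentTimeMillis() into a tuple field named "emitMillis"):
>>>>> 
>>>>>     import java.util.Map;
>>>>>     import backtype.storm.metric.api.MeanReducer;
>>>>>     import backtype.storm.metric.api.ReducedMetric;
>>>>>     import backtype.storm.task.OutputCollector;
>>>>>     import backtype.storm.task.TopologyContext;
>>>>>     import backtype.storm.topology.OutputFieldsDeclarer;
>>>>>     import backtype.storm.topology.base.BaseRichBolt;
>>>>>     import backtype.storm.tuple.Tuple;
>>>>> 
>>>>>     // Last bolt in the topology.
>>>>>     public class LatencyBolt extends BaseRichBolt {
>>>>>       private transient ReducedMetric latency;
>>>>>       private OutputCollector collector;
>>>>> 
>>>>>       public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
>>>>>         this.collector = collector;
>>>>>         // Mean end-to-end delay per 10-second window, reported separately per task.
>>>>>         latency = context.registerMetric(
>>>>>             "e2e_latency_ms", new ReducedMetric(new MeanReducer()), 10);
>>>>>       }
>>>>> 
>>>>>       public void execute(Tuple tuple) {
>>>>>         long emitted = tuple.getLongByField("emitMillis");
>>>>>         latency.update(System.currentTimeMillis() - emitted);
>>>>>         collector.ack(tuple);
>>>>>       }
>>>>> 
>>>>>       public void declareOutputFields(OutputFieldsDeclarer declarer) { }
>>>>>     }
>>>>> 
>>>>> (The reported value would still be a per-task mean, and clock skew between hosts would distort it when the spout and the last bolt run on different machines.)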
>>>>> 
>>>>> What would be the best way to get these aggregated measurements of tuples per second and end-to-end delay without impacting performance?
>>>>> I would prefer a single global LoggingMetricsConsumer.
>>>>> 
>>>>> Thanks!
>>>>> Best regards
>>>>> Martin
>>>>> 
>>>>> [1] https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/api/CountMetric.java
>>>>> [2] https://www.endgame.com/blog/storm-metrics-how-to.html
>