Posted to dev@storm.apache.org by "Jungtaek Lim (JIRA)" <ji...@apache.org> on 2016/06/13 04:22:20 UTC

[jira] [Resolved] (STORM-1698) Asynchronous MetricsConsumerBolt

     [ https://issues.apache.org/jira/browse/STORM-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved STORM-1698.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0
                   2.0.0

Merged into master and 1.x-branch.

> Asynchronous MetricsConsumerBolt
> --------------------------------
>
>                 Key: STORM-1698
>                 URL: https://issues.apache.org/jira/browse/STORM-1698
>             Project: Apache Storm
>          Issue Type: Sub-task
>          Components: storm-core
>    Affects Versions: 1.0.0, 2.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>
> Currently, MetricsConsumerBolt delegates data points to MetricsConsumer synchronously.
> When MetricsConsumer cannot keep up, backpressure is triggered once (queue size + overflow buffer size) reaches the high watermark, which slows down the topology as a result.
> Slowing down itself is not a problem because that's what backpressure is for. The actual problem is that backpressure only throttles spouts, not metrics. If MetricsConsumerBolt cannot keep up with incoming tuples, backpressure never ends and the topology just hangs. If we turn off backpressure, we have an unbounded queue and the worker could eventually throw an OOME.
> Making MetricsConsumerBolt asynchronous can resolve this issue. One downside of making it async is that it becomes hard to see whether MetricsConsumerBolt is keeping up (capacity will always be around 0).
> I don't have an idea for that yet, but I think it's still better than the current behavior.
> Before reaching consensus on a huge change to metrics, I'd love to improve the current metrics without breaking backward compatibility. It could be applied to 1.x-branch, and even 0.10.x-branch.
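
For readers unfamiliar with the pattern, the asynchronous handoff described in the issue can be sketched roughly as below. This is not the merged patch and uses no Storm APIs: the class name AsyncMetricsHandoff, its methods, and the drop-when-full policy are all assumptions made for illustration. The idea is that the executor thread only enqueues data points into a bounded queue and returns immediately, while a separate worker thread feeds the (possibly slow) consumer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch: hands data points to a slow consumer on a separate thread.
// Names and the drop-newest policy are illustrative assumptions, not Storm code.
final class AsyncMetricsHandoff<T> implements AutoCloseable {
    private final BlockingQueue<T> queue;      // bounded, so memory use stays capped
    private final Consumer<T> consumer;        // stands in for a metrics consumer
    private final AtomicLong dropped = new AtomicLong();
    private final Thread worker;
    private volatile boolean running = true;

    AsyncMetricsHandoff(int capacity, Consumer<T> consumer) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.consumer = consumer;
        this.worker = new Thread(this::drain, "metrics-consumer-worker");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    // Called from the tuple-processing thread; never blocks.
    // Returns false when the queue is full and the item was dropped.
    boolean submit(T dataPoints) {
        boolean accepted = queue.offer(dataPoints);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    long droppedCount() {
        return dropped.get();
    }

    int pending() {
        return queue.size();
    }

    // Worker loop: keeps draining until close() is called and the queue is empty.
    private void drain() {
        try {
            while (running || !queue.isEmpty()) {
                T item = queue.poll(100, TimeUnit.MILLISECONDS);
                if (item == null) {
                    continue;
                }
                try {
                    consumer.accept(item);
                } catch (RuntimeException e) {
                    // A failing consumer should not kill the worker thread.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops accepting the "running" state and waits for queued items to be consumed.
    @Override
    public void close() throws InterruptedException {
        running = false;
        worker.join();
    }
}
```

Because submit() never blocks, a slow consumer can no longer hold the executor's receive queue at the high watermark; the cost is that data points are discarded once the bounded queue fills up. Counters such as droppedCount() and pending() are one possible way to regain some of the visibility that is lost when capacity always reads around 0, though that is only a suggestion here.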



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)