Posted to user@storm.apache.org by Chinmay Soman <ch...@gmail.com> on 2015/02/05 20:01:00 UTC

Query regarding Metrics per worker vs per task

Hey all,

I'm trying to monitor CPU/memory usage per worker (i.e. per JVM). In our
setup, we send these metrics to Graphite through a Graphite reporter, under
the registered metric name. Currently, that name is built from the
following components:

<topology-specific-name>.class-name.metric-name.componentID.taskID

The problem with this naming is that each task redundantly reports the same
value (in the case of JVM-related metrics). What I really want is one metric
per worker instead of one per task.

Is there any way to get a logical worker ID? From the documentation it
seems there is a worker port (which probably keeps changing). Or is
there a better way to do this?

Please let me know.

-- 
Thanks and regards

Chinmay Soman

Re: Query regarding Metrics per worker vs per task

Posted by Kosala Dissanayake <um...@gmail.com>.

Don't know whether there is a method to retrieve an ID for the worker.
Maybe try using the PID of the process?
https://stackoverflow.com/questions/35842/how-can-a-java-program-get-its-own-process-id
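For illustration, here is a minimal sketch of the PID idea. It assumes the JVM reports its runtime name as "pid@hostname", which is a common HotSpot convention but not something the Java specification guarantees. The class and method names (WorkerPid, parsePid, currentPid) are made up for this example. On Java 9 and later, ProcessHandle.current().pid() avoids the parsing.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Storm.
final class WorkerPid {

    // Extracts the part before '@' from a runtime name such as "12345@host".
    // If there is no '@', the input is returned unchanged.
    static String parsePid(String runtimeName) {
        int at = runtimeName.indexOf('@');
        return at > 0 ? runtimeName.substring(0, at) : runtimeName;
    }

    // Assumption: the runtime name has the form "pid@hostname".
    static String currentPid() {
        return parsePid(ManagementFactory.getRuntimeMXBean().getName());
    }
}
```

Note that a PID changes whenever the supervisor restarts the worker, so a metric keyed on it starts a new series in Graphite after every restart.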


Re: Query regarding Metrics per worker vs per task

Posted by Yury Ruchin <yu...@gmail.com>.

Hi,

The worker slot ports are statically configured for a Storm cluster (
https://storm.apache.org/documentation/Setting-up-a-Storm-cluster.html), and
my understanding is that the configured ports should not change. After all,
these ports are what the workers' receive threads use for interprocess
communication (
http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/
).

I think this can (and should) be checked experimentally: kill a worker
process and observe which port the worker re-created by the supervisor
occupies.

Regards,
Yury


Re: Query regarding Metrics per worker vs per task

Posted by "P. Taylor Goetz" <pt...@gmail.com>.

The combination of host + port is enough to uniquely identify a worker in the cluster. Just be aware that if a worker dies or is killed, it might be reassigned to a different slot (host/port).

-Taylor
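
A rough sketch of how the host + port suggestion could be turned into a per-worker Graphite name. Everything here (the WorkerMetrics class, its method names, the name layout) is hypothetical and not part of Storm. The host and port are passed in as plain arguments: in a real bolt or spout the port would presumably come from TopologyContext.getThisWorkerPort() and the host from something like InetAddress.getLocalHost().getHostName(), but that wiring is an assumption and is left out so the example runs on its own. Because all tasks of a worker share one JVM, a static flag is one simple way to let only the first task report the JVM-level metrics.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of Storm.
final class WorkerMetrics {

    // One flag per JVM, i.e. one per worker process.
    private static final AtomicBoolean CLAIMED = new AtomicBoolean(false);

    // Builds "<topology>.<metric>.<host>.<port>". Dots in the host name are
    // replaced so that Graphite does not split it into several path segments.
    static String workerMetricName(String topology, String metric,
                                   String host, int port) {
        String safeHost = host.replace('.', '_');
        return topology + "." + metric + "." + safeHost + "." + port;
    }

    // Returns true for exactly one caller per JVM; every other task in the
    // same worker gets false and should skip the JVM-level metrics.
    static boolean claimJvmReporter() {
        return CLAIMED.compareAndSet(false, true);
    }
}
```

A task would call claimJvmReporter() once during setup and register the CPU/memory metrics only if it returns true. As noted above, a restarted worker may land on a different slot, in which case the series simply continues under a new host/port name.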

