Posted to dev@storm.apache.org by "Shyam Rajendran (JIRA)" <ji...@apache.org> on 2015/06/30 20:12:08 UTC

[jira] [Comment Edited] (STORM-919) Gathering worker and supervisor process information (CPU/Memory)

    [ https://issues.apache.org/jira/browse/STORM-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608750#comment-14608750 ] 

Shyam Rajendran edited comment on STORM-919 at 6/30/15 6:11 PM:
----------------------------------------------------------------

- Changed the worker and supervisor heartbeats to carry system stats (CPU and JVM memory stats, in KB).
- Changed SystemBolt.java to include CPU metrics.
- Tested the changes by running in local mode (on a Mac).
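
For readers following along, here is a minimal sketch of how CPU and JVM memory stats of this kind can be read from inside a JVM process. It is not the STORM-919 patch: the class name ProcessStatsSketch, the map keys, and the choice of the com.sun.management extension bean are assumptions made for illustration only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not taken from the STORM-919 patch.
final class ProcessStatsSketch {

    // CPU load of this JVM process as a fraction of 1.0.
    // Returns -1.0 when the platform bean is not the com.sun.management
    // extension; the extension itself may also report a negative value
    // when the load is not yet available.
    static double processCpuLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad();
        }
        return -1.0;
    }

    // Collects a small stats map; memory values are converted from bytes to KB.
    static Map<String, Object> collect() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        Map<String, Object> stats = new LinkedHashMap<>();
        stats.put("cpuUtil", processCpuLoad());
        stats.put("heapUsedKB", heap.getUsed() / 1024);
        stats.put("heapCommittedKB", heap.getCommitted() / 1024);
        stats.put("nonHeapUsedKB", nonHeap.getUsed() / 1024);
        return stats;
    }

    public static void main(String[] args) {
        System.out.println(collect());
    }
}
```

A map like this could then be attached to whatever structure the heartbeat already carries; how the actual change does that is not shown here.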

Output after enabling logging metrics [conf.registerMetricsConsumer(backtype.storm.metric.LoggingMetricsConsumer.class)]:

Note the addition of the cpu/Util data point.

aPoint [uptimeSecs = 130.046]> #<DataPoint [__ack-count = {}]> #<DataPoint [__transfer-count = {}]> #<DataPoint [__recv-iconnection = {dequeuedMessages=4891, enqueued={/10.74.127.190:49539=2903, /10.74.127.190:49534=1988}, pending=[0]}]> #<DataPoint [newWorkerEvent = 0]> #<DataPoint [__send-iconnection = {"6700/207e5608-3c16-43f1-927f-a9b3396839bd" {"queue_length" 0, "reconnects" 0, "enqueued" 879, "src" "/10.74.127.190:49537", "dest" "/10.74.127.190:6700", "sent" 879, "lostOnSend" 0}, "6702/207e5608-3c16-43f1-927f-a9b3396839bd" {"queue_length" 0, "reconnects" 0, "enqueued" 2879, "src" "/10.74.127.190:49536", "dest" "/10.74.127.190:6702", "sent" 2879, "lostOnSend" 0}}]> #<DataPoint [__fail-count = {}]> #<DataPoint [GC/PSMarkSweep = {count=0, timeMs=0}]> #<DataPoint [startTimeSecs = 1.435686256876E9]> #<DataPoint [__emit-count = {}]> #<DataPoint [memory/nonHeap = {unusedBytes=1495152, virtualFreeBytes=-55431057, initBytes=2555904, committedBytes=56926208, maxBytes=-1, usedBytes=55431056}]> #<DataPoint [__process-latency = {}]> #<DataPoint [__receive = {read_pos=1, write_pos=2, capacity=1024, population=1}]> #<DataPoint [__transfer = {read_pos=7458, write_pos=7458, capacity=1024, population=0}]> #<DataPoint [GC/PSScavenge = {count=2, timeMs=6}]> #<DataPoint [__execute-latency = {}]> #

<DataPoint [cpu/Util = 0.021242744000722888]> 

#<DataPoint [__sendqueue = {read_pos=0, write_pos=0, capacity=1024, population=0}]> #<DataPoint [memory/heap = {unusedBytes=344822264, virtualFreeBytes=658870776, initBytes=268435456, committedBytes=402128896, maxBytes=716177408, usedBytes=57306632}]> #<DataPoint [__execute-count = {}]>]]
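
A note on reading the memory/heap and memory/nonHeap data points: the logged numbers are consistent with unusedBytes = committedBytes - usedBytes and virtualFreeBytes = maxBytes - usedBytes. That relationship is inferred from the values above, not confirmed against the source. It would explain the negative virtualFreeBytes for nonHeap, where maxBytes is -1 (undefined). A small sketch that checks the arithmetic with the logged values (the class and method names are made up for illustration):

```java
import java.lang.management.MemoryUsage;

// Hypothetical helper that recomputes the derived fields seen in the log.
final class MemoryDerivedFields {

    // Committed but currently unused memory.
    static long unusedBytes(MemoryUsage u) {
        return u.getCommitted() - u.getUsed();
    }

    // Headroom up to the maximum; negative when max is undefined (-1).
    static long virtualFreeBytes(MemoryUsage u) {
        return u.getMax() - u.getUsed();
    }

    public static void main(String[] args) {
        // MemoryUsage(init, used, committed, max), values copied from the log above.
        MemoryUsage heap = new MemoryUsage(268435456L, 57306632L, 402128896L, 716177408L);
        MemoryUsage nonHeap = new MemoryUsage(2555904L, 55431056L, 56926208L, -1L);

        System.out.println("heap unusedBytes=" + unusedBytes(heap)
                + " virtualFreeBytes=" + virtualFreeBytes(heap));
        System.out.println("nonHeap unusedBytes=" + unusedBytes(nonHeap)
                + " virtualFreeBytes=" + virtualFreeBytes(nonHeap));
    }
}
```

Under these formulas the heap values come out as 344822264 and 658870776, and the nonHeap values as 1495152 and -55431057, matching the log.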


was (Author: shyamrajendran):
- Changed the worker and supervisor heartbeats to carry system stats (CPU and JVM memory stats, in KB).
- Changed SystemBolt.java to include CPU metrics.
- Tested the changes by running in local mode (on a Mac).

Output after enabling logging metrics [conf.registerMetricsConsumer(backtype.storm.metric.LoggingMetricsConsumer.class)]:
Note the 
aPoint [uptimeSecs = 130.046]> #<DataPoint [__ack-count = {}]> #<DataPoint [__transfer-count = {}]> #<DataPoint [__recv-iconnection = {dequeuedMessages=4891, enqueued={/10.74.127.190:49539=2903, /10.74.127.190:49534=1988}, pending=[0]}]> #<DataPoint [newWorkerEvent = 0]> #<DataPoint [__send-iconnection = {"6700/207e5608-3c16-43f1-927f-a9b3396839bd" {"queue_length" 0, "reconnects" 0, "enqueued" 879, "src" "/10.74.127.190:49537", "dest" "/10.74.127.190:6700", "sent" 879, "lostOnSend" 0}, "6702/207e5608-3c16-43f1-927f-a9b3396839bd" {"queue_length" 0, "reconnects" 0, "enqueued" 2879, "src" "/10.74.127.190:49536", "dest" "/10.74.127.190:6702", "sent" 2879, "lostOnSend" 0}}]> #<DataPoint [__fail-count = {}]> #<DataPoint [GC/PSMarkSweep = {count=0, timeMs=0}]> #<DataPoint [startTimeSecs = 1.435686256876E9]> #<DataPoint [__emit-count = {}]> #<DataPoint [memory/nonHeap = {unusedBytes=1495152, virtualFreeBytes=-55431057, initBytes=2555904, committedBytes=56926208, maxBytes=-1, usedBytes=55431056}]> #<DataPoint [__process-latency = {}]> #<DataPoint [__receive = {read_pos=1, write_pos=2, capacity=1024, population=1}]> #<DataPoint [__transfer = {read_pos=7458, write_pos=7458, capacity=1024, population=0}]> #<DataPoint [GC/PSScavenge = {count=2, timeMs=6}]> #<DataPoint [__execute-latency = {}]> #

<DataPoint [cpu/Util = 0.021242744000722888]> 

#<DataPoint [__sendqueue = {read_pos=0, write_pos=0, capacity=1024, population=0}]> #<DataPoint [memory/heap = {unusedBytes=344822264, virtualFreeBytes=658870776, initBytes=268435456, committedBytes=402128896, maxBytes=716177408, usedBytes=57306632}]> #<DataPoint [__execute-count = {}]>]]

> Gathering worker and supervisor process information (CPU/Memory)
> ----------------------------------------------------------------
>
>                 Key: STORM-919
>                 URL: https://issues.apache.org/jira/browse/STORM-919
>             Project: Apache Storm
>          Issue Type: New Feature
>            Reporter: Shyam Rajendran
>            Assignee: Shyam Rajendran
>            Priority: Minor
>
> It would be useful to make supervisor and worker process information, such as CPU utilization (%), JVM memory, and network bandwidth, available to Nimbus, where a resource-aware scheduler could use it later on. As a first step, the information can be piggybacked on the existing heartbeats to ZooKeeper or to Pacemaker, as required.
> Related JIRAs:
> STORM-177
> STORM-891
> STORM-899


