Posted to dev@storm.apache.org by "Shyam Rajendran (JIRA)" <ji...@apache.org> on 2015/06/30 20:12:08 UTC
[jira] [Comment Edited] (STORM-919) Gathering worker and supervisor
process information (CPU/Memory)
[ https://issues.apache.org/jira/browse/STORM-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608750#comment-14608750 ]
Shyam Rajendran edited comment on STORM-919 at 6/30/15 6:11 PM:
----------------------------------------------------------------
- Changed the worker and supervisor heartbeats to carry system stats (CPU and JVM memory stats, in KB).
- Changed SystemBolt.java to include CPU metrics.
- Tested the changes by running in local mode (on a Mac).
+ Output after enabling the logging metrics consumer [ conf.registerMetricsConsumer(backtype.storm.metric.LoggingMetricsConsumer.class) ]
Note the addition of the cpu/Util data point:
aPoint [uptimeSecs = 130.046]> #<DataPoint [__ack-count = {}]> #<DataPoint [__transfer-count = {}]> #<DataPoint [__recv-iconnection = {dequeuedMessages=4891, enqueued={/10.74.127.190:49539=2903, /10.74.127.190:49534=1988}, pending=[0]}]> #<DataPoint [newWorkerEvent = 0]> #<DataPoint [__send-iconnection = {"6700/207e5608-3c16-43f1-927f-a9b3396839bd" {"queue_length" 0, "reconnects" 0, "enqueued" 879, "src" "/10.74.127.190:49537", "dest" "/10.74.127.190:6700", "sent" 879, "lostOnSend" 0}, "6702/207e5608-3c16-43f1-927f-a9b3396839bd" {"queue_length" 0, "reconnects" 0, "enqueued" 2879, "src" "/10.74.127.190:49536", "dest" "/10.74.127.190:6702", "sent" 2879, "lostOnSend" 0}}]> #<DataPoint [__fail-count = {}]> #<DataPoint [GC/PSMarkSweep = {count=0, timeMs=0}]> #<DataPoint [startTimeSecs = 1.435686256876E9]> #<DataPoint [__emit-count = {}]> #<DataPoint [memory/nonHeap = {unusedBytes=1495152, virtualFreeBytes=-55431057, initBytes=2555904, committedBytes=56926208, maxBytes=-1, usedBytes=55431056}]> #<DataPoint [__process-latency = {}]> #<DataPoint [__receive = {read_pos=1, write_pos=2, capacity=1024, population=1}]> #<DataPoint [__transfer = {read_pos=7458, write_pos=7458, capacity=1024, population=0}]> #<DataPoint [GC/PSScavenge = {count=2, timeMs=6}]> #<DataPoint [__execute-latency = {}]> #
<DataPoint [cpu/Util = 0.021242744000722888]>
#<DataPoint [__sendqueue = {read_pos=0, write_pos=0, capacity=1024, population=0}]> #<DataPoint [memory/heap = {unusedBytes=344822264, virtualFreeBytes=658870776, initBytes=268435456, committedBytes=402128896, maxBytes=716177408, usedBytes=57306632}]> #<DataPoint [__execute-count = {}]>]]
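For readers checking the numbers in the output above: the derived memory fields are consistent with unusedBytes = committedBytes - usedBytes and virtualFreeBytes = maxBytes - usedBytes, which also explains the negative virtualFreeBytes for memory/nonHeap, where maxBytes is -1 (undefined). The following is a minimal JDK-only sketch of that arithmetic, not the actual patch. The class name SystemStatsSketch and both method names are hypothetical, and the use of com.sun.management.OperatingSystemMXBean.getProcessCpuLoad() for the CPU value is an assumption; the comment does not say how cpu/Util is computed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the code from the STORM-919 patch.
public class SystemStatsSketch {

    // Builds the same field names that appear in the memory/heap and
    // memory/nonHeap data points from a standard MemoryUsage snapshot.
    public static Map<String, Long> memoryFields(MemoryUsage usage) {
        Map<String, Long> fields = new LinkedHashMap<>();
        fields.put("maxBytes", usage.getMax());            // -1 when undefined
        fields.put("committedBytes", usage.getCommitted());
        fields.put("initBytes", usage.getInit());
        fields.put("usedBytes", usage.getUsed());
        // Goes negative when max is -1, as in the nonHeap data point.
        fields.put("virtualFreeBytes", usage.getMax() - usage.getUsed());
        fields.put("unusedBytes", usage.getCommitted() - usage.getUsed());
        return fields;
    }

    // ASSUMPTION: one possible source for a cpu/Util-style value.
    // Returns a fraction of total CPU capacity, or a negative value
    // when the JVM cannot provide it.
    public static double processCpuLoad() {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuLoad();
        }
        return -1.0;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        System.out.println("memory/heap = " + memoryFields(heap));
        System.out.println("memory/nonHeap = " + memoryFields(nonHeap));
        System.out.println("cpu/Util = " + processCpuLoad());
    }
}
```

Feeding the heap and nonHeap values from the log into memoryFields reproduces the unusedBytes and virtualFreeBytes figures shown there.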
> Gathering worker and supervisor process information (CPU/Memory)
> ----------------------------------------------------------------
>
> Key: STORM-919
> URL: https://issues.apache.org/jira/browse/STORM-919
> Project: Apache Storm
> Issue Type: New Feature
> Reporter: Shyam Rajendran
> Assignee: Shyam Rajendran
> Priority: Minor
>
> It would be useful to make supervisor and worker process information, such as % CPU utilization, JVM memory, and network bandwidth, available to Nimbus, so that a resource-aware scheduler can be implemented later on. As a start, the information can be piggybacked on the existing heartbeats to ZooKeeper or to Pacemaker, as required.
> Related JIRAs
> STORM-177
> STORM-891
> STORM-899