Posted to user@storm.apache.org by "Nick R. Katsipoulakis" <ni...@gmail.com> on 2015/06/24 20:14:38 UTC

Worker thread memory

Hello all,

I am working on an EC2 Storm cluster, and I want the workers on the
supervisor machines to use 4 GB of memory, so I added the following line
on the machine that hosts the nimbus:

worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70 -XX:-CMSConcurrentMTEnabled -Djava.net.preferIPv4Stack=true"
However, when I look at the workers' logs (on each of the other machines
that runs a supervisor), I do not see these options in the line that
launches the worker. Instead, I find the following line:

2015-06-24T17:52:45.349+0000 b.s.d.worker [INFO] Launching worker for
tpch-q5-top-2-1435168361 on 5568726d-ad65-4a7c-ba52-32eed83276ad:6703 with
id 829f36fc-eeb9-4eef-ae89-9fb6565e9108 and conf {"dev.zookeeper.path"
"/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil,
"topology.builtin.metrics.bucket.size.secs" 60,
"topology.fall.back.on.java.serialization" true,
"topology.max.error.report.per.interval" 5, "zmq.linger.millis" 5000,
"topology.skip.missing.kryo.registrations" false,
"storm.messaging.netty.client_worker_threads" 4, "ui.childopts" "-Xmx768m",
"storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true,
"topology.trident.batch.emit.interval.millis" 500, "
storm.messaging.netty.flush.check.interval.ms" 10,
"nimbus.monitor.freq.secs" 10, "logviewer.childopts" "-Xmx128m",
"java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", "storm.home"
"/opt/apache-storm-0.9.4", "topology.executor.send.buffer.size" 1024,
"storm.local.dir" "/mnt/storm", "storm.messaging.netty.buffer_size"
10485760, "supervisor.worker.start.timeout.secs" 120,
"topology.enable.message.timeouts" true, "nimbus.cleanup.inbox.freq.secs"
600, "nimbus.inbox.jar.expiration.secs" 3600, "drpc.worker.threads" 64,
"storm.meta.serialization.delegate"
"backtype.storm.serialization.DefaultSerializationDelegate",
"topology.worker.shared.thread.pool.size" 4, "nimbus.host" "52.25.74.163",
"storm.messaging.netty.min_wait_ms" 100, "storm.zookeeper.port" 2181,
"transactional.zookeeper.port" nil, "topology.executor.receive.buffer.size"
1024, "transactional.zookeeper.servers" nil, "storm.zookeeper.root"
"/storm", "storm.zookeeper.retry.intervalceiling.millis" 30000,
"supervisor.enable" true, "storm.messaging.netty.server_worker_threads" 4,
"storm.zookeeper.servers" ["172.31.28.73" "172.31.38.251" "172.31.38.252"],
"transactional.zookeeper.root" "/transactional", "topology.acker.executors"
nil, "topology.transfer.buffer.size" 1024, "topology.worker.childopts" nil,
"drpc.queue.size" 128, "worker.childopts" "-Xmx768m",
"supervisor.heartbeat.frequency.secs" 5,
"topology.error.throttle.interval.secs" 10, "zmq.hwm" 0, "drpc.port" 3772,
"supervisor.monitor.frequency.secs" 3, "drpc.childopts" "-Xmx768m",
"topology.receiver.buffer.size" 8, "task.heartbeat.frequency.secs" 3,
"topology.tasks" nil, "storm.messaging.netty.max_retries" 100,
"topology.spout.wait.strategy"
"backtype.storm.spout.SleepSpoutWaitStrategy",
"nimbus.thrift.max_buffer_size" 1048576, "topology.max.spout.pending" nil,
"storm.zookeeper.retry.interval" 1000, "
topology.sleep.spout.wait.strategy.time.ms" 1, "nimbus.topology.validator"
"backtype.storm.nimbus.DefaultTopologyValidator", "supervisor.slots.ports"
[6700 6701 6702 6703], "topology.environment" nil, "topology.debug" false,
"nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60,
"topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10,
"topology.workers" 1, "supervisor.childopts" "-Xmx256m",
"nimbus.thrift.port" 6627, "topology.stats.sample.rate" 0.05,
"worker.heartbeat.frequency.secs" 1, "topology.tuple.serializer"
"backtype.storm.serialization.types.ListDelegateSerializer",
"topology.disruptor.wait.strategy"
"com.lmax.disruptor.BlockingWaitStrategy", "topology.multilang.serializer"
"backtype.storm.multilang.JsonSerializer", "nimbus.task.timeout.secs" 30,
"storm.zookeeper.connection.timeout" 15000, "topology.kryo.factory"
"backtype.storm.serialization.DefaultKryoFactory", "drpc.invocations.port"
3773, "logviewer.port" 8000, "zmq.threads" 1, "storm.zookeeper.retry.times"
5, "topology.worker.receiver.thread.count" 1, "storm.thrift.transport"
"backtype.storm.security.auth.SimpleTransportPlugin",
"topology.state.synchronization.timeout.secs" 60,
"supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs"
600, "storm.messaging.transport" "backtype.storm.messaging.netty.Context", "
logviewer.appender.name" "A1", "storm.messaging.netty.max_wait_ms" 1000,
"drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false, "ui.port"
8080, "nimbus.childopts" "-Xmx1024m", "storm.cluster.mode" "distributed",
"topology.max.task.parallelism" nil,
"storm.messaging.netty.transfer.batch.size" 262144, "topology.classpath"
nil}

which, as you can see, uses topology.worker.childopts: nil and
worker.childopts: -Xmx768m. My question is the following: do I need to add
the line above to the storm.yaml files of my supervisor nodes in order to
allow each worker JVM to use up to 4 GB of memory? Also, am I setting the
right value for what I am trying to achieve?
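
For reference, this is roughly what I expect each supervisor's storm.yaml
to look like after the change (a sketch only, reusing the cluster values
from the conf dump above):

storm.zookeeper.servers:
  - "172.31.28.73"
  - "172.31.38.251"
  - "172.31.38.252"
nimbus.host: "52.25.74.163"
storm.local.dir: "/mnt/storm"
supervisor.slots.ports: [6700, 6701, 6702, 6703]
worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70 -XX:-CMSConcurrentMTEnabled -Djava.net.preferIPv4Stack=true"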

Thanks,
Nick

Re: Worker thread memory

Posted by Nathan Leung <nc...@gmail.com>.
You have to specify how many workers you want when building the topology;
if you look up TopologyBuilder there are examples of how to set the number
of workers in the config map. Use the nimbus log only to find out where it
put the worker process, and then check the logs of the worker process
itself to see why it failed to start up.
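
Something along these lines (a minimal sketch against the 0.9.x API;
MySpout and MyBolt are placeholders, not components from your topology):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class SubmitExample {
    public static void main(String[] args) throws Exception {
        // MySpout and MyBolt stand in for your actual components.
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new MySpout(), 2);
        builder.setBolt("bolt", new MyBolt(), 4).shuffleGrouping("spout");

        Config conf = new Config();
        // One worker per slot: 4 supervisor nodes x 4 slots each.
        conf.setNumWorkers(16);

        StormSubmitter.submitTopology("my-topology", conf,
                builder.createTopology());
    }
}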

On Thu, Jun 25, 2015 at 11:27 AM, Nick R. Katsipoulakis <
nick.katsip@gmail.com> wrote:

> I see.
>
> Well, I took a look at nimbus.log and everything looks fine, so it still
> seems strange that this is happening. On top of that, another strange
> thing is that all my bolts are placed on the same supervisor and in the
> same worker (which does not seem like a smart placement for Storm to
> make). My topology defines a total parallelism hint of 23 tasks, and I
> have 4 supervisor nodes, each with 4 worker processes.
>
> Nick
>
> 2015-06-25 11:22 GMT-04:00 Nathan Leung <nc...@gmail.com>:
>
>> I'm not sure, but if I had to wager a guess: the former is set on the
>> supervisor and applies to all topologies run on that supervisor, whereas
>> the latter is set per topology.
>>
>> On Thu, Jun 25, 2015 at 11:19 AM, Nick R. Katsipoulakis <
>> nick.katsip@gmail.com> wrote:
>>
>>> I see. I will try to debug and see what's going on. Also, what is the
>>> difference between worker.childopts and topology.worker.childopts?
>>>
>>> Thanks,
>>> Nick
>>>
>>> 2015-06-25 11:10 GMT-04:00 Nathan Leung <nc...@gmail.com>:
>>>
>>>> The nimbus log will tell you which port the worker was started on (look
>>>> for the worker hash; it gives supervisor node and port assignments, but
>>>> requires some decoding). Then take a look at the worker log. Maybe your
>>>> initialization is taking too long?
>>>>
>>>> On Thu, Jun 25, 2015 at 11:06 AM, Nick R. Katsipoulakis <
>>>> nick.katsip@gmail.com> wrote:
>>>>
>>>>> Yes, I see the following message which I have not seen before:
>>>>>
>>>>> 2015-06-24T19:05:28.745+0000 b.s.d.supervisor [INFO]
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
>>>>> 2015-06-24T19:05:29.245+0000 b.s.d.supervisor [INFO]
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
>>>>> 2015-06-24T19:05:29.746+0000 b.s.d.supervisor [INFO]
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
>>>>> 2015-06-24T19:05:30.246+0000 b.s.d.supervisor [INFO]
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
>>>>> 2015-06-24T19:05:30.646+0000 b.s.d.supervisor [INFO] Removing code for
>>>>> storm id tpch-q5-top-5-1435172243
>>>>> 2015-06-24T19:05:30.747+0000 b.s.d.supervisor [INFO]
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
>>>>> 2015-06-24T19:05:31.247+0000 b.s.d.supervisor [INFO]
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
>>>>>
>>>>> 2015-06-24T19:06:50.327+0000 b.s.d.supervisor [INFO] Worker
>>>>> fa3de772-cc61-4394-97e2-fcbd85190dd4 failed to start
>>>>> 2015-06-24T19:06:50.329+0000 b.s.d.supervisor [INFO] Shutting down and
>>>>> clearing state for id fa3de772-cc61-4394-97e2-fcbd85190dd4. Current
>>>>> supervisor time: 1435172810. State: :not-started, Heartbeat: nil
>>>>> 2015-06-24T19:06:50.329+0000 b.s.d.supervisor [INFO] Shutting down
>>>>> 58e551ba-f944-4aec-9c8f-5621053021dd:fa3de772-cc61-4394-97e2-fcbd85190dd4
>>>>> 2015-06-24T19:06:50.330+0000 b.s.d.supervisor [INFO] Shut down
>>>>> 58e551ba-f944-4aec-9c8f-5621053021dd:fa3de772-cc61-4394-97e2-fcbd85190dd4
>>>>> 2015-06-24T19:08:39.743+0000 b.s.d.supervisor [INFO] Shutting down
>>>>> supervisor 58e551ba-f944-4aec-9c8f-5621053021dd
>>>>> 2015-06-24T19:08:39.745+0000 b.s.event [INFO] Event manager interrupted
>>>>> 2015-06-24T19:08:39.745+0000 b.s.event [INFO] Event manager interrupted
>>>>> 2015-06-24T19:08:39.748+0000 o.a.s.z.ZooKeeper [INFO] Session:
>>>>> 0x24e26a304b50025 closed
>>>>> 2015-06-24T19:08:39.748+0000 o.a.s.z.ClientCnxn [INFO] EventThread
>>>>> shut down
>>>>>
>>>>> But there is no indication of why this is happening.
>>>>>
>>>>> Thanks,
>>>>> Nick
>>>>>
>>>>> 2015-06-25 10:52 GMT-04:00 Nathan Leung <nc...@gmail.com>:
>>>>>
>>>>>> Any problems in supervisor or nimbus logs?
>>>>>>
>>>>>> On Thu, Jun 25, 2015 at 10:49 AM, Nick R. Katsipoulakis <
>>>>>> nick.katsip@gmail.com> wrote:
>>>>>>
>>>>>>> I am using m4.xlarge instances, each with 4 workers per supervisor.
>>>>>>> Yes, they are listed.
>>>>>>>
>>>>>>> Nick
>>>>>>>
>>>>>>> 2015-06-25 10:47 GMT-04:00 Nathan Leung <nc...@gmail.com>:
>>>>>>>
>>>>>>>> How big are your EC2 instances?  Are your supervisors listed in the
>>>>>>>> storm UI?
>>>>>>>>
>>>>>>>> On Thu, Jun 25, 2015 at 10:43 AM, Nick R. Katsipoulakis <
>>>>>>>> nick.katsip@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Nathan,
>>>>>>>>>
>>>>>>>>> I attempted to put the following line
>>>>>>>>>
>>>>>>>>> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70 -XX:-CMSConcurrentMTEnabled -Djava.net.preferIPv4Stack=true"
>>>>>>>>>
>>>>>>>>> in the supervisor config files, but for some reason no workers were
>>>>>>>>> spawned on those machines. To be more precise, I submitted my
>>>>>>>>> topology (with storm jar ...) and waited for it to start executing,
>>>>>>>>> but nothing happened. Any ideas of what the reason might have been?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Nick
>>>>>>>>>
>>>>>>>>> 2015-06-25 10:39 GMT-04:00 Nathan Leung <nc...@gmail.com>:
>>>>>>>>>
>>>>>>>>>> In general, worker options need to be set in the supervisor
>>>>>>>>>> config files.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 25, 2015 at 10:07 AM, Nick R. Katsipoulakis <
>>>>>>>>>> nick.katsip@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello sy.pan
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the link. I will try the suggestions.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Nick
>>>>>>>>>>>
>>>>>>>>>>> 2015-06-24 22:35 GMT-04:00 sy.pan <sh...@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> FYI:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> https://mail-archives.apache.org/mod_mbox/storm-user/201504.mbox/%3CCAFBccRCAdux8SL8D99tOMrBG9HkMo3gkg-qdV-qKMC-6zXs8ow@mail.gmail.com%3E
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Nikolaos Romanos Katsipoulakis,
>>>>>>>>>>> University of Pittsburgh, PhD candidate
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Nikolaos Romanos Katsipoulakis,
>>>>>>>>> University of Pittsburgh, PhD candidate
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Nikolaos Romanos Katsipoulakis,
>>>>>>> University of Pittsburgh, PhD candidate
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Nikolaos Romanos Katsipoulakis,
>>>>> University of Pittsburgh, PhD candidate
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Nikolaos Romanos Katsipoulakis,
>>> University of Pittsburgh, PhD candidate
>>>
>>
>>
>
>
> --
> Nikolaos Romanos Katsipoulakis,
> University of Pittsburgh, PhD candidate
>

Re: Worker thread memory

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
I see.

Well, I took a look at nimbus.log and everything looks fine, so it still
seems strange that this is happening. On top of that, another strange thing
is that all my bolts are placed on the same supervisor and in the same
worker (which does not seem like a smart placement for Storm to make). My
topology defines a total parallelism hint of 23 tasks, and I have 4
supervisor nodes, each with 4 worker processes.

Nick

-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Worker thread memory

Posted by Nathan Leung <nc...@gmail.com>.
I'm not sure, but if I had to wager a guess: the former is set on the
supervisor and applies to all topologies run on that supervisor, whereas
the latter is set per topology.
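
If that is right, the per-topology value would be set in code when you
submit, something like this (a sketch; as far as I know,
backtype.storm.Config exposes a TOPOLOGY_WORKER_CHILDOPTS constant for
this key):

import backtype.storm.Config;

public class TopologyOptsExample {
    public static void main(String[] args) {
        Config conf = new Config();
        // Sets "topology.worker.childopts" for this topology's workers only;
        // worker.childopts in storm.yaml remains the supervisor-wide default.
        conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx4096m");
        conf.setNumWorkers(4);
        // ... then pass conf to StormSubmitter.submitTopology as usual.
    }
}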


Re: Worker thread memory

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
I see. I will try to debug and see what's going on. Also, what is the
difference between worker.childopts and topology.worker.childopts?

Thanks,
Nick


-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Worker thread memory

Posted by Nathan Leung <nc...@gmail.com>.
The nimbus log will tell you which port the worker was started on (look for
the worker hash; it gives supervisor node and port assignments, but requires
some decoding). Then take a look at the worker log. Maybe your
initialization is taking too long?


Re: Worker thread memory

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Yes, I see the following messages, which I have not seen before:

2015-06-24T19:05:28.745+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:29.245+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:29.746+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:30.246+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:30.646+0000 b.s.d.supervisor [INFO] Removing code for
storm id tpch-q5-top-5-1435172243
2015-06-24T19:05:30.747+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:31.247+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started

2015-06-24T19:06:50.327+0000 b.s.d.supervisor [INFO] Worker
fa3de772-cc61-4394-97e2-fcbd85190dd4 failed to start
2015-06-24T19:06:50.329+0000 b.s.d.supervisor [INFO] Shutting down and
clearing state for id fa3de772-cc61-4394-97e2-fcbd85190dd4. Current
supervisor time: 1435172810. State: :not-started, Heartbeat: nil
2015-06-24T19:06:50.329+0000 b.s.d.supervisor [INFO] Shutting down
58e551ba-f944-4aec-9c8f-5621053021dd:fa3de772-cc61-4394-97e2-fcbd85190dd4
2015-06-24T19:06:50.330+0000 b.s.d.supervisor [INFO] Shut down
58e551ba-f944-4aec-9c8f-5621053021dd:fa3de772-cc61-4394-97e2-fcbd85190dd4
2015-06-24T19:08:39.743+0000 b.s.d.supervisor [INFO] Shutting down
supervisor 58e551ba-f944-4aec-9c8f-5621053021dd
2015-06-24T19:08:39.745+0000 b.s.event [INFO] Event manager interrupted
2015-06-24T19:08:39.745+0000 b.s.event [INFO] Event manager interrupted
2015-06-24T19:08:39.748+0000 o.a.s.z.ZooKeeper [INFO] Session:
0x24e26a304b50025 closed
2015-06-24T19:08:39.748+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut down

But there is no indication of why the above is happening.

Thanks,
Nick
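
A note on what this sequence means: after spawning a worker, the
supervisor polls for the worker's first heartbeat and gives up after
supervisor.worker.start.timeout.secs (120 in the conf quoted earlier),
which is exactly the "still hasn't started ... failed to start ...
State: :not-started, Heartbeat: nil" pattern above. When the worker JVM
dies immediately (for instance, on a malformed option string), there is
often nothing in the worker log at all. The quickest check is to hand
the childopts to a bare JVM; using the flags from earlier in this
thread, typos included:

  java -Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
       -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70 \
       -XX: -CMSConcurrentMTEnabled Djava.net.preferIPv4Stack=true -version

A JVM given that exact string exits at once ("-XX:" with nothing after
the colon and the bare "-CMSConcurrentMTEnabled" are not valid options,
and "Djava.net.preferIPv4Stack=true" is missing its leading dash), which
would produce precisely this failed-to-start loop.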



-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Worker thread memory

Posted by Nathan Leung <nc...@gmail.com>.
Any problems in supervisor or nimbus logs?


Re: Worker thread memory

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
I am using m4.xlarge instances, each running a supervisor with 4 worker slots.
Yes, they are listed.

Nick
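
One thing worth checking with that sizing, assuming the standard
m4.xlarge shape of 4 vCPUs and 16 GiB of RAM: 4 slots x 4 GiB of heap is
16 GiB of worker heap alone, before JVM overhead, the supervisor process
(-Xmx256m in the conf), and the OS. If all four slots fill, the node is
overcommitted, and workers can fail to launch or be killed by the kernel
OOM killer; fewer slots per node or a smaller -Xmx would leave headroom.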



-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Worker thread memory

Posted by Nathan Leung <nc...@gmail.com>.
How big are your EC2 instances?  Are your supervisors listed in the storm
UI?


Re: Worker thread memory

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Nathan,

I attempted to put the following line

worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70
-XX: -CMSConcurrentMTEnabled Djava.net.preferIPv4Stack=true"

in the supervisor config files, but for some reason workers were not
spawned on those machines. To be more precise, I submitted my topology
(with storm jar...) and waited for it to start executing, but nothing
happened. Any ideas about what the reason might have been?

Thanks,
Nick
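
As transcribed, that options string has problems that will keep a JVM
from starting: "-XX: -CMSConcurrentMTEnabled" is split by a stray space
(neither token is a valid option on its own), and
"Djava.net.preferIPv4Stack=true" is missing its leading "-";
-XX:+UseConcMarkSweepGC also appears twice, which is merely redundant. A
cleaned-up line for each supervisor's storm.yaml would look like the
following (a sketch, assuming that disabling CMSConcurrentMTEnabled was
the intent — the "-XX:-" prefix, minus sign and no space, is the HotSpot
syntax for turning a boolean flag off):

  worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70 -XX:-CMSConcurrentMTEnabled -Djava.net.preferIPv4Stack=true"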



-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Worker thread memory

Posted by Nathan Leung <nc...@gmail.com>.
In general worker options need to be set in the supervisor config files.
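
For example, each supervisor node's storm.yaml might carry something
like this (a sketch reusing the addresses from the conf dump in this
thread; adjust to your cluster):

  storm.zookeeper.servers:
      - "172.31.28.73"
      - "172.31.38.251"
      - "172.31.38.252"
  nimbus.host: "52.25.74.163"
  storm.local.dir: "/mnt/storm"
  supervisor.slots.ports: [6700, 6701, 6702, 6703]
  worker.childopts: "-Xmx4096m"

The supervisor daemon reads storm.yaml once at startup, so it must be
restarted after the edit before newly launched workers pick up the
options.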


Re: Worker thread memory

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Hello sy.pan

Thank you for the link. I will try the suggestions.

Cheers,
Nick

2015-06-24 22:35 GMT-04:00 sy.pan <sh...@gmail.com>:

> FYI:
>
>
> https://mail-archives.apache.org/mod_mbox/storm-user/201504.mbox/%3CCAFBccRCAdux8SL8D99tOMrBG9HkMo3gkg-qdV-qKMC-6zXs8ow@mail.gmail.com%3E
>
>
> On 2015-06-25, 02:14, Nick R. Katsipoulakis <ni...@gmail.com> wrote:
>
> [full quote of the original message trimmed; see the first post in this thread]


-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate
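
If editing storm.yaml on every supervisor is inconvenient, the heap can instead be requested per topology through topology.worker.childopts, which travels with the topology and replaces the nil seen in the conf dump above. A minimal sketch against the 0.9.x backtype.storm API (the class name and topology wiring are placeholders, not the poster's actual tpch-q5 topology):

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.topology.TopologyBuilder;

public class SubmitWithBiggerHeap {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        // builder.setSpout(...) / builder.setBolt(...) as usual

        Config conf = new Config();
        // Appended to the worker launch command in addition to the
        // supervisor-side worker.childopts; with HotSpot the last -Xmx
        // on the line wins, so this raises the worker heap to 4 GB.
        conf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx4096m");

        StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
    }
}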

Re: Worker thread memory

Posted by "sy.pan" <sh...@gmail.com>.
FYI:

https://mail-archives.apache.org/mod_mbox/storm-user/201504.mbox/%3CCAFBccRCAdux8SL8D99tOMrBG9HkMo3gkg-qdV-qKMC-6zXs8ow@mail.gmail.com%3E


> On 25 Jun 2015, at 02:14, Nick R. Katsipoulakis <ni...@gmail.com> wrote:
> 
> [full quote of the original message trimmed; see the first post in this thread]
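
Whichever knob is changed, the result is easy to verify: the "Launching worker ... with ... conf" log line quoted above prints the merged configuration, and the worker JVM can also report the arguments it was actually started with. A small sketch using the standard java.lang.management API (the class is illustrative; the same loop inside a bolt's prepare() would log the worker's real flags):

import java.lang.management.ManagementFactory;

public class PrintJvmArgs {
    public static void main(String[] args) {
        // Prints every flag this JVM was launched with (e.g. -Xmx4096m),
        // i.e. exactly what worker.childopts and topology.worker.childopts
        // contributed to the launch command.
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            System.out.println(arg);
        }
    }
}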