You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by "Nick R. Katsipoulakis" <ni...@gmail.com> on 2015/06/25 21:57:09 UTC

Problems with Storm EC2 setup

Hello all,

I apologize for the long message, but I have no idea what is going wrong in
my setup and I tried to give a lot of info about my cluster. I have the
following EC2 setup:

1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus

2) 4x m4.xlarge nodes for my Supervisors.

All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
Storm v0.9.4. The storm.yaml I am currently having in the nimbus node
(only) has the following values:

storm.home: "/opt/apache-storm-0.9.4"
storm.local.dir: "/mnt/storm"
storm.zookeeper.servers:
     - "172.31.28.73"
     - "172.31.38.251"
     - "172.31.38.252"
storm.zookeeper.port: 2181
storm.zookeeper.root: "/storm"
storm.zookeeper.session.timeout: 20000
storm.zookeeper.connection.timeout: 15000
storm.zookeeper.retry.times: 5
storm.zookeeper.retry.interval: 1000
storm.zookeeper.retry.invervalceiling.millis: 30000
storm.cluster.mode: "distributed"
storm.local.mode.zmq: false
storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
storm.messaging.transport: "backtype.storm.messaging.netty.Context"

nimbus.host: "127.0.0.1"
nimbus.thrift.port: 6627
nimbus.thrift.max_buffer_size: 1048576
nimbus.thrift.threads: 256
nimbus.childopts: "-Xmx256m"
nimbus.task.timeout.secs: 30
nimbus.supervisor.timeout.secs: 60
nimbus.monitor.freq.secs: 10
nimbus.cleanup.inbox.freq.secs: 600
nimbus.inbox.jar.expiration.secs: 3600
nimbus.task.launch.secs: 120
nimbus.reassign: true
nimbus.file.copy.expiration.secs: 600
nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"

ui.port: 8080
ui.childopts: "-Xmx768m"
logviewer.port: 8000
logviewer.childopts: "-Xmx256m"
logviewer.appender.name: "A1"

drpc.port: 3772
drpc.worker.threads: 64
drpc.queue.size: 128
drpc.invocations.port: 3773
drpc.request.timeout.secs: 600
drpc.childopts: "-Xmx768m"

transactional.zookeeper.root: "/transactional"
transactional.zookeeper.servers: null
transactional.zookeeper.port: null

supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703
supervisor.childopts: "-Xmx256m"
supervisor.worker.start.timeout.secs: 120
supervisor.worker.timeout.secs: 30
supervisor.monitor.frequency.secs: 3
supervisor.heartbeat.frequency.secs: 5
supervisor.enable: true

worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:NewSize=128m
-XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
-Djava.net.preferIPv4Stack=true"
worker.heartbeat.frequency.secs: 1

task.heartbeat.frequency.secs: 3
task.refresh.poll.secs: 10

zmq.threads: 1
zmq.linger.millis: 5000
zmq.hwm: 0

storm.messaging.netty.server_worker_threads: 4
storm.messaging.netty.client_worker_threads: 4
storm.messaging.netty.buffer_size: 10485760
storm.messaging.netty.max_retries: 100
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100
topology.enable.message.timeouts: true
topology.debug: false
topology.optimize: true
topology.workers: 1
topology.acker.executors: null
topology.tasks: null
topology.message.timeout.secs: 30
topology.skip.missing.kryo.registrations: false
topology.max.task.parallelism: null
topology.max.spout.pending: null
topology.state.synchronization.timeout.secs: 60
topology.stats.sample.rate: 0.05
topology.builtin.metrics.bucket.size.secs: 60
topology.fall.back.on.java.serialization: true
topology.worker.childopts: null
topology.executor.receive.buffer.size: 1024
topology.executor.send.buffer.size: 1024
topology.receiver.buffer.size: 8
topology.transfer.buffer.size: 1024
topology.tick.tuple.freq.secs: null
topology.worker.shared.thread.pool.size: 4
topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
topology.sleep.spout.wait.strategy.time.ms: 1
topology.error.throttle.interval.secs: 10
topology.max.error.report.per.interval: 5
topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
topology.tuple.serializer:
"backtype.storm.serialization.types.ListDelegateSerializer"
topology.trident.batch.emit.interval.millis: 500

dev.zookeeper.path: "/tmp/dev-storm-zookeeper"

The problem is that every time I submit a topology, I got a lot of Netty
messages in my worker logs (found in the supervisor machines) and
many of them had similar to the following messages:

2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
[5]
2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
Starting
2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
172.31.38.252:2181 sessionTimeout=20000
watcher=org.apache.storm.curator.ConnectionState@5172aa5a
2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
authenticate using SASL (unknown error)
2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
established to 172.31.28.73/172.31.28.73:2181, initiating session
2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
establishment complete on server 172.31.28.73/172.31.28.73:2181, sessionid
= 0x14e2b0caa01005f, negotiated timeout = 20000
2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
State change: CONNECTED
2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
:connected:none
2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut down
2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
0x14e2b0caa01005f closed
2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
[5]
2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
Starting
2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
172.31.38.252:2181/storm sessionTimeout=20000
watcher=org.apache.storm.curator.ConnectionState@3f308697
2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
authenticate using SASL (unknown error)
2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
established to 172.31.28.73/172.31.28.73:2181, initiating session
2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
establishment complete on server 172.31.28.73/172.31.28.73:2181, sessionid
= 0x14e2b0caa010061, negotiated timeout = 20000
2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
State change: CONNECTED
2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
transport plugin:backtype.storm.messaging.netty.Context
2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
[stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
10485760
2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
scheduled to run in 0 ms
2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
scheduled to run in 103 ms
2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
10485760
2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
scheduled to run in 0 ms
2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
10485760
2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
scheduled to run in 0 ms
2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
10485760
2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6702
scheduled to run in 0 ms
2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
10485760
2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
scheduled to run in 0 ms
2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
10485760
2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
scheduled to run in 103 ms
2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29

and

2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
scheduled to run in 0 ms
2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6700
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6700
scheduled to run in 103 ms
2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
scheduled to run in 105 ms
2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
scheduled to run in 102 ms
2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6702
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6702
scheduled to run in 102 ms
2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6700
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6700
scheduled to run in 103 ms
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6702
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6703
scheduled to run in 103 ms
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
172.31.19.252:6701
2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6702
scheduled to run in 102 ms
2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
172.31.19.253:6701
2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
scheduled to run in 105 ms
2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
scheduled to run in 105 ms
2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
scheduled to run in 106 ms
2015-06-25T19:42:33.275+0000 b.s

Why am I getting the above. Initially, I thought that the input rate of
tuples in my topology is too high, and Netty's buffers are filled up too
fast. However, I submitted a debug topology
that sent one tuple every 1 second and I still got the above messages.

Am I doing something wrong in my configuration? Why do I have the previous
Netty messages, which obviously show that something is going wrong? Please,
any hint on my setup will be really helpful.

Regards,
Nick

Re: Problems with Storm EC2 setup

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Hello all,

I know that this mailing list might not be the right place to ask the
following question, but, I have searched through the archives and asked for
help in the users' mailing list and could not get an answer. Please, any
ideas for the following?

Thank you very much for your time.

Regards,
Nick

2015-06-25 15:57 GMT-04:00 Nick R. Katsipoulakis <ni...@gmail.com>:

> Hello all,
>
> I apologize for the long message, but I have no idea what is going wrong
> in my setup and I tried to give a lot of info about my cluster. I have the
> following EC2 setup:
>
> 1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus
>
> 2) 4x m4.xlarge nodes for my Supervisors.
>
> All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
> Storm v0.9.4. The storm.yaml I am currently having in the nimbus node
> (only) has the following values:
>
> storm.home: "/opt/apache-storm-0.9.4"
> storm.local.dir: "/mnt/storm"
> storm.zookeeper.servers:
>      - "172.31.28.73"
>      - "172.31.38.251"
>      - "172.31.38.252"
> storm.zookeeper.port: 2181
> storm.zookeeper.root: "/storm"
> storm.zookeeper.session.timeout: 20000
> storm.zookeeper.connection.timeout: 15000
> storm.zookeeper.retry.times: 5
> storm.zookeeper.retry.interval: 1000
> storm.zookeeper.retry.invervalceiling.millis: 30000
> storm.cluster.mode: "distributed"
> storm.local.mode.zmq: false
> storm.thrift.transport:
> "backtype.storm.security.auth.SimpleTransportPlugin"
> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>
> nimbus.host: "127.0.0.1"
> nimbus.thrift.port: 6627
> nimbus.thrift.max_buffer_size: 1048576
> nimbus.thrift.threads: 256
> nimbus.childopts: "-Xmx256m"
> nimbus.task.timeout.secs: 30
> nimbus.supervisor.timeout.secs: 60
> nimbus.monitor.freq.secs: 10
> nimbus.cleanup.inbox.freq.secs: 600
> nimbus.inbox.jar.expiration.secs: 3600
> nimbus.task.launch.secs: 120
> nimbus.reassign: true
> nimbus.file.copy.expiration.secs: 600
> nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"
>
> ui.port: 8080
> ui.childopts: "-Xmx768m"
> logviewer.port: 8000
> logviewer.childopts: "-Xmx256m"
> logviewer.appender.name: "A1"
>
> drpc.port: 3772
> drpc.worker.threads: 64
> drpc.queue.size: 128
> drpc.invocations.port: 3773
> drpc.request.timeout.secs: 600
> drpc.childopts: "-Xmx768m"
>
> transactional.zookeeper.root: "/transactional"
> transactional.zookeeper.servers: null
> transactional.zookeeper.port: null
>
> supervisor.slots.ports:
>      - 6700
>      - 6701
>      - 6702
>      - 6703
> supervisor.childopts: "-Xmx256m"
> supervisor.worker.start.timeout.secs: 120
> supervisor.worker.timeout.secs: 30
> supervisor.monitor.frequency.secs: 3
> supervisor.heartbeat.frequency.secs: 5
> supervisor.enable: true
>
> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:NewSize=128m
> -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
> -Djava.net.preferIPv4Stack=true"
> worker.heartbeat.frequency.secs: 1
>
> task.heartbeat.frequency.secs: 3
> task.refresh.poll.secs: 10
>
> zmq.threads: 1
> zmq.linger.millis: 5000
> zmq.hwm: 0
>
> storm.messaging.netty.server_worker_threads: 4
> storm.messaging.netty.client_worker_threads: 4
> storm.messaging.netty.buffer_size: 10485760
> storm.messaging.netty.max_retries: 100
> storm.messaging.netty.max_wait_ms: 1000
> storm.messaging.netty.min_wait_ms: 100
> topology.enable.message.timeouts: true
> topology.debug: false
> topology.optimize: true
> topology.workers: 1
> topology.acker.executors: null
> topology.tasks: null
> topology.message.timeout.secs: 30
> topology.skip.missing.kryo.registrations: false
> topology.max.task.parallelism: null
> topology.max.spout.pending: null
> topology.state.synchronization.timeout.secs: 60
> topology.stats.sample.rate: 0.05
> topology.builtin.metrics.bucket.size.secs: 60
> topology.fall.back.on.java.serialization: true
> topology.worker.childopts: null
> topology.executor.receive.buffer.size: 1024
> topology.executor.send.buffer.size: 1024
> topology.receiver.buffer.size: 8
> topology.transfer.buffer.size: 1024
> topology.tick.tuple.freq.secs: null
> topology.worker.shared.thread.pool.size: 4
> topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
> topology.sleep.spout.wait.strategy.time.ms: 1
> topology.error.throttle.interval.secs: 10
> topology.max.error.report.per.interval: 5
> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
> topology.tuple.serializer:
> "backtype.storm.serialization.types.ListDelegateSerializer"
> topology.trident.batch.emit.interval.millis: 500
>
> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
>
> The problem is that every time I submit a topology, I got a lot of Netty
> messages in my worker logs (found in the supervisor machines) and
> many of them had similar to the following messages:
>
> 2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
> [5]
> 2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
> Starting
> 2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
> 172.31.38.252:2181 sessionTimeout=20000
> watcher=org.apache.storm.curator.ConnectionState@5172aa5a
> 2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
> established to 172.31.28.73/172.31.28.73:2181, initiating session
> 2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
> establishment complete on server 172.31.28.73/172.31.28.73:2181,
> sessionid = 0x14e2b0caa01005f, negotiated timeout = 20000
> 2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
> State change: CONNECTED
> 2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
> :connected:none
> 2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut
> down
> 2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
> 0x14e2b0caa01005f closed
> 2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
> [5]
> 2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
> Starting
> 2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
> 172.31.38.252:2181/storm sessionTimeout=20000
> watcher=org.apache.storm.curator.ConnectionState@3f308697
> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
> established to 172.31.28.73/172.31.28.73:2181, initiating session
> 2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
> establishment complete on server 172.31.28.73/172.31.28.73:2181,
> sessionid = 0x14e2b0caa010061, negotiated timeout = 20000
> 2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
> State change: CONNECTED
> 2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
> 2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
> transport plugin:backtype.storm.messaging.netty.Context
> 2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
> for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
> 2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
> Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
> 2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
> [stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
> 2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
> 10485760
> 2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6703 scheduled to run in 0 ms
> 2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6703 scheduled to run in 103 ms
> 2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
> 10485760
> 2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6703 scheduled to run in 0 ms
> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
> 10485760
> 2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6702 scheduled to run in 0 ms
> 2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
> 10485760
> 2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6702 scheduled to run in 0 ms
> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
> 10485760
> 2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6701 scheduled to run in 0 ms
> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
> 10485760
> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6703 scheduled to run in 103 ms
> 2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
>
> and
>
> 2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6700 scheduled to run in 0 ms
> 2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6700 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2
> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6700 scheduled to run in 103 ms
> 2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6703 scheduled to run in 105 ms
> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6700 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6700 scheduled to run in 102 ms
> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6702 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6702 scheduled to run in 102 ms
> 2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6700 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6700 scheduled to run in 103 ms
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6703 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6702 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6703 scheduled to run in 103 ms
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6701
> 2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6702 scheduled to run in 102 ms
> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6701
> 2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2
> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
> 172.31.13.184:6703 scheduled to run in 105 ms
> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2
> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6701 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6701 scheduled to run in 105 ms
> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2
> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6702 failed: java.lang.RuntimeException: Returned channel
> was actually not established
> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
> 172.31.19.254:6702 scheduled to run in 106 ms
> 2015-06-25T19:42:33.275+0000 b.s
>
> Why am I getting the above. Initially, I thought that the input rate of
> tuples in my topology is too high, and Netty's buffers are filled up too
> fast. However, I submitted a debug topology
> that sent one tuple every 1 second and I still got the above messages.
>
> Am I doing something wrong in my configuration? Why do I have the previous
> Netty messages, which obviously show that something is going wrong? Please,
> any hint on my setup will be really helpful.
>
> Regards,
> Nick
>



-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Fwd: Problems with Storm EC2 setup

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Hello,

I apologize for forwarding my question again, but it seems to be a general
issue in Storm 0.9.4 with Netty (I looked through the user mailing lists
yesterday and found similar problems). I mapped my host IPs to aliases in
the /etc/hosts file but the problem persists. Is it an open problem (or
design decision point) in Storm currently, or can I change something in my
configuration to resolve it.

Thank you very much for your time.

Regards,
Nick

---------- Forwarded message ----------
From: Nick R. Katsipoulakis <ni...@gmail.com>
Date: 2015-06-25 15:57 GMT-04:00
Subject: Problems with Storm EC2 setup
To: user@storm.apache.org


Hello all,

I apologize for the long message, but I have no idea what is going wrong in
my setup and I tried to give a lot of info about my cluster. I have the
following EC2 setup:

1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus

2) 4x m4.xlarge nodes for my Supervisors.

All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
Storm v0.9.4. The storm.yaml I am currently having in the nimbus node
(only) has the following values:

storm.home: "/opt/apache-storm-0.9.4"
storm.local.dir: "/mnt/storm"
storm.zookeeper.servers:
     - "172.31.28.73"
     - "172.31.38.251"
     - "172.31.38.252"
storm.zookeeper.port: 2181
storm.zookeeper.root: "/storm"
storm.zookeeper.session.timeout: 20000
storm.zookeeper.connection.timeout: 15000
storm.zookeeper.retry.times: 5
storm.zookeeper.retry.interval: 1000
storm.zookeeper.retry.invervalceiling.millis: 30000
storm.cluster.mode: "distributed"
storm.local.mode.zmq: false
storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
storm.messaging.transport: "backtype.storm.messaging.netty.Context"

nimbus.host: "127.0.0.1"
nimbus.thrift.port: 6627
nimbus.thrift.max_buffer_size: 1048576
nimbus.thrift.threads: 256
nimbus.childopts: "-Xmx256m"
nimbus.task.timeout.secs: 30
nimbus.supervisor.timeout.secs: 60
nimbus.monitor.freq.secs: 10
nimbus.cleanup.inbox.freq.secs: 600
nimbus.inbox.jar.expiration.secs: 3600
nimbus.task.launch.secs: 120
nimbus.reassign: true
nimbus.file.copy.expiration.secs: 600
nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"

ui.port: 8080
ui.childopts: "-Xmx768m"
logviewer.port: 8000
logviewer.childopts: "-Xmx256m"
logviewer.appender.name: "A1"

drpc.port: 3772
drpc.worker.threads: 64
drpc.queue.size: 128
drpc.invocations.port: 3773
drpc.request.timeout.secs: 600
drpc.childopts: "-Xmx768m"

transactional.zookeeper.root: "/transactional"
transactional.zookeeper.servers: null
transactional.zookeeper.port: null

supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703
supervisor.childopts: "-Xmx256m"
supervisor.worker.start.timeout.secs: 120
supervisor.worker.timeout.secs: 30
supervisor.monitor.frequency.secs: 3
supervisor.heartbeat.frequency.secs: 5
supervisor.enable: true

worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:NewSize=128m
-XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
-Djava.net.preferIPv4Stack=true"
worker.heartbeat.frequency.secs: 1

task.heartbeat.frequency.secs: 3
task.refresh.poll.secs: 10

zmq.threads: 1
zmq.linger.millis: 5000
zmq.hwm: 0

storm.messaging.netty.server_worker_threads: 4
storm.messaging.netty.client_worker_threads: 4
storm.messaging.netty.buffer_size: 10485760
storm.messaging.netty.max_retries: 100
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100
topology.enable.message.timeouts: true
topology.debug: false
topology.optimize: true
topology.workers: 1
topology.acker.executors: null
topology.tasks: null
topology.message.timeout.secs: 30
topology.skip.missing.kryo.registrations: false
topology.max.task.parallelism: null
topology.max.spout.pending: null
topology.state.synchronization.timeout.secs: 60
topology.stats.sample.rate: 0.05
topology.builtin.metrics.bucket.size.secs: 60
topology.fall.back.on.java.serialization: true
topology.worker.childopts: null
topology.executor.receive.buffer.size: 1024
topology.executor.send.buffer.size: 1024
topology.receiver.buffer.size: 8
topology.transfer.buffer.size: 1024
topology.tick.tuple.freq.secs: null
topology.worker.shared.thread.pool.size: 4
topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
topology.sleep.spout.wait.strategy.time.ms: 1
topology.error.throttle.interval.secs: 10
topology.max.error.report.per.interval: 5
topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
topology.tuple.serializer:
"backtype.storm.serialization.types.ListDelegateSerializer"
topology.trident.batch.emit.interval.millis: 500

dev.zookeeper.path: "/tmp/dev-storm-zookeeper"

The problem is that every time I submit a topology, I got a lot of Netty
messages in my worker logs (found in the supervisor machines) and
many of them had similar to the following messages:

2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
[5]
2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
Starting
2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
172.31.38.252:2181 sessionTimeout=20000
watcher=org.apache.storm.curator.ConnectionState@5172aa5a
2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
authenticate using SASL (unknown error)
2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
established to 172.31.28.73/172.31.28.73:2181, initiating session
2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
establishment complete on server 172.31.28.73/172.31.28.73:2181, sessionid
= 0x14e2b0caa01005f, negotiated timeout = 20000
2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
State change: CONNECTED
2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
:connected:none
2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut down
2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
0x14e2b0caa01005f closed
2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
[5]
2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
Starting
2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
172.31.38.252:2181/storm sessionTimeout=20000
watcher=org.apache.storm.curator.ConnectionState@3f308697
2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
authenticate using SASL (unknown error)
2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
established to 172.31.28.73/172.31.28.73:2181, initiating session
2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
establishment complete on server 172.31.28.73/172.31.28.73:2181, sessionid
= 0x14e2b0caa010061, negotiated timeout = 20000
2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
State change: CONNECTED
2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
transport plugin:backtype.storm.messaging.netty.Context
2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
[stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
10485760
2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
scheduled to run in 0 ms
2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
scheduled to run in 103 ms
2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
10485760
2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
scheduled to run in 0 ms
2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
10485760
2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
scheduled to run in 0 ms
2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
10485760
2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6702
scheduled to run in 0 ms
2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
10485760
2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29
2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
scheduled to run in 0 ms
2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
10485760
2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
scheduled to run in 103 ms
2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
maxRetries too large (100). Pinning to 29

and

2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
[INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
[100]
2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
scheduled to run in 0 ms
2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6700
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6700
scheduled to run in 103 ms
2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
scheduled to run in 105 ms
2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
scheduled to run in 102 ms
2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6702
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6702
scheduled to run in 102 ms
2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6700
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6700
scheduled to run in 103 ms
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6702
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6703
scheduled to run in 103 ms
2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
172.31.19.252:6701
2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2 to
Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6702
scheduled to run in 102 ms
2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
172.31.19.253:6701
2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
scheduled to run in 105 ms
2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
scheduled to run in 105 ms
2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
failed: java.lang.RuntimeException: Returned channel was actually not
established
2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3 to
Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
scheduled to run in 106 ms
2015-06-25T19:42:33.275+0000 b.s

Why am I getting the above. Initially, I thought that the input rate of
tuples in my topology is too high, and Netty's buffers are filled up too
fast. However, I submitted a debug topology
that sent one tuple every 1 second and I still got the above messages.

Am I doing something wrong in my configuration? Why do I have the previous
Netty messages, which obviously show that something is going wrong? Please,
any hint on my setup will be really helpful.

Regards,
Nick



-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Problems with Storm EC2 setup

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Hey,

So, I updated my /etc/hosts files with the hostnames and IP addresses, but
I still get the same error. You are right, somebody asked the same thing
yesterday and it received no answer.

Cheers,
Nick

Date: Wed, 24 Jun 2015 09:42:01 -0700
Message-ID: <CA...@mail.gmail.com>
Subject: AWS Hostnames not resolving properly Netty-Client-ip-10
From: Dillian Murphey <cr...@gmail.com>
To: user@storm.apache.org
Content-Type: multipart/alternative; boundary=001a113d2b16a1f6890519462fda

--001a113d2b16a1f6890519462fda
Content-Type: text/plain; charset=UTF-8

I'm seeing this reported elsewhere but finding no replies.

Does anyone know what the problem is here with host hostnames are resolved
by Storm when running on AWS?

Clearly that is not a hostname, prefixed with Netty-Client.

2015-06-24T07:13:07.856+0000 b.s.m.n.Client [INFO] connection attempt 24 to
Netty-Client-ip-10-9-255-20.us <http://netty-client-ip-10-9-255-20.us/>
-west-2.compute.internal/10.9.255.20:6711 scheduled to run in 387 ms

2015-06-24T07:13:08.244+0000 b.s.m.n.Client [ERROR] connection attempt 24
to Netty-Client-ip-10-9-255-20.us <http://netty-client-ip-10-9-255-20.us/>
-west-2.compute.internal/10.9.255.20:6711 failed:
java.lang.RuntimeException: Returned channel was actually not established



I think a work-around is to manually edit the /etc/hosts file. I had a
different issue I believe preventing me from fully testing that.



2015-06-25 16:14 GMT-04:00 Fan Jiang <dc...@gmail.com>:

> I remember someone post the same issue yesterday. The problem is that your
> host is somehow resolved as "Netty-Client-*", which is not pingable. You
> may modify /etc/hosts to map the hostnames to IP addresses appropriately if
> it is allowed.
>
> —
> Sincerely,
> Fan Jiang
>
>
> On Thu, Jun 25, 2015 at 3:57 PM, Nick R. Katsipoulakis <
> nick.katsip@gmail.com> wrote:
>
>>  Hello all,
>>
>> I apologize for the long message, but I have no idea what is going wrong
>> in my setup and I tried to give a lot of info about my cluster. I have the
>> following EC2 setup:
>>
>> 1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus
>>
>> 2) 4x m4.xlarge nodes for my Supervisors.
>>
>> All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
>> Storm v0.9.4. The storm.yaml I am currently having in the nimbus node
>> (only) has the following values:
>>
>> storm.home: "/opt/apache-storm-0.9.4"
>> storm.local.dir: "/mnt/storm"
>> storm.zookeeper.servers:
>>      - "172.31.28.73"
>>      - "172.31.38.251"
>>      - "172.31.38.252"
>> storm.zookeeper.port: 2181
>> storm.zookeeper.root: "/storm"
>> storm.zookeeper.session.timeout: 20000
>> storm.zookeeper.connection.timeout: 15000
>> storm.zookeeper.retry.times: 5
>> storm.zookeeper.retry.interval: 1000
>> storm.zookeeper.retry.invervalceiling.millis: 30000
>> storm.cluster.mode: "distributed"
>> storm.local.mode.zmq: false
>> storm.thrift.transport:
>> "backtype.storm.security.auth.SimpleTransportPlugin"
>> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>>
>> nimbus.host: "127.0.0.1"
>> nimbus.thrift.port: 6627
>> nimbus.thrift.max_buffer_size: 1048576
>> nimbus.thrift.threads: 256
>> nimbus.childopts: "-Xmx256m"
>> nimbus.task.timeout.secs: 30
>> nimbus.supervisor.timeout.secs: 60
>> nimbus.monitor.freq.secs: 10
>> nimbus.cleanup.inbox.freq.secs: 600
>> nimbus.inbox.jar.expiration.secs: 3600
>> nimbus.task.launch.secs: 120
>> nimbus.reassign: true
>> nimbus.file.copy.expiration.secs: 600
>> nimbus.topology.validator:
>> "backtype.storm.nimbus.DefaultTopologyValidator"
>>
>> ui.port: 8080
>> ui.childopts: "-Xmx768m"
>> logviewer.port: 8000
>> logviewer.childopts: "-Xmx256m"
>>  logviewer.appender.name: "A1"
>>
>> drpc.port: 3772
>> drpc.worker.threads: 64
>> drpc.queue.size: 128
>> drpc.invocations.port: 3773
>> drpc.request.timeout.secs: 600
>> drpc.childopts: "-Xmx768m"
>>
>> transactional.zookeeper.root: "/transactional"
>> transactional.zookeeper.servers: null
>> transactional.zookeeper.port: null
>>
>> supervisor.slots.ports:
>>      - 6700
>>      - 6701
>>      - 6702
>>      - 6703
>> supervisor.childopts: "-Xmx256m"
>> supervisor.worker.start.timeout.secs: 120
>> supervisor.worker.timeout.secs: 30
>> supervisor.monitor.frequency.secs: 3
>> supervisor.heartbeat.frequency.secs: 5
>> supervisor.enable: true
>>
>> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:NewSize=128m
>> -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
>> -Djava.net.preferIPv4Stack=true"
>> worker.heartbeat.frequency.secs: 1
>>
>> task.heartbeat.frequency.secs: 3
>> task.refresh.poll.secs: 10
>>
>> zmq.threads: 1
>> zmq.linger.millis: 5000
>> zmq.hwm: 0
>>
>> storm.messaging.netty.server_worker_threads: 4
>> storm.messaging.netty.client_worker_threads: 4
>> storm.messaging.netty.buffer_size: 10485760
>> storm.messaging.netty.max_retries: 100
>> storm.messaging.netty.max_wait_ms: 1000
>> storm.messaging.netty.min_wait_ms: 100
>> topology.enable.message.timeouts: true
>> topology.debug: false
>> topology.optimize: true
>> topology.workers: 1
>> topology.acker.executors: null
>> topology.tasks: null
>> topology.message.timeout.secs: 30
>> topology.skip.missing.kryo.registrations: false
>> topology.max.task.parallelism: null
>> topology.max.spout.pending: null
>> topology.state.synchronization.timeout.secs: 60
>> topology.stats.sample.rate: 0.05
>> topology.builtin.metrics.bucket.size.secs: 60
>> topology.fall.back.on.java.serialization: true
>> topology.worker.childopts: null
>> topology.executor.receive.buffer.size: 1024
>> topology.executor.send.buffer.size: 1024
>> topology.receiver.buffer.size: 8
>> topology.transfer.buffer.size: 1024
>> topology.tick.tuple.freq.secs: null
>> topology.worker.shared.thread.pool.size: 4
>> topology.disruptor.wait.strategy:
>> "com.lmax.disruptor.BlockingWaitStrategy"
>> topology.spout.wait.strategy:
>> "backtype.storm.spout.SleepSpoutWaitStrategy"
>>  topology.sleep.spout.wait.strategy.time.ms: 1
>> topology.error.throttle.interval.secs: 10
>> topology.max.error.report.per.interval: 5
>> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
>> topology.tuple.serializer:
>> "backtype.storm.serialization.types.ListDelegateSerializer"
>> topology.trident.batch.emit.interval.millis: 500
>>
>> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
>>
>> The problem is that every time I submit a topology, I got a lot of Netty
>> messages in my worker logs (found in the supervisor machines) and
>> many of them had similar to the following messages:
>>
>> 2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
>> [5]
>> 2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
>> Starting
>> 2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
>> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
>> 172.31.38.252:2181 sessionTimeout=20000
>> watcher=org.apache.storm.curator.ConnectionState@5172aa5a
>> 2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
>> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>> 2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
>> established to 172.31.28.73/172.31.28.73:2181, initiating session
>> 2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
>> establishment complete on server 172.31.28.73/172.31.28.73:2181,
>> sessionid = 0x14e2b0caa01005f, negotiated timeout = 20000
>> 2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
>> State change: CONNECTED
>> 2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
>> :connected:none
>> 2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut
>> down
>> 2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
>> 0x14e2b0caa01005f closed
>> 2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
>> [5]
>> 2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
>> Starting
>> 2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
>> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
>> 172.31.38.252:2181/storm sessionTimeout=20000
>> watcher=org.apache.storm.curator.ConnectionState@3f308697
>> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
>> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
>> established to 172.31.28.73/172.31.28.73:2181, initiating session
>> 2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
>> establishment complete on server 172.31.28.73/172.31.28.73:2181,
>> sessionid = 0x14e2b0caa010061, negotiated timeout = 20000
>> 2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
>> State change: CONNECTED
>> 2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
>> 2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
>> transport plugin:backtype.storm.messaging.netty.Context
>> 2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
>> for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
>> 2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
>> Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
>> 2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
>> [stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
>> 2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6702 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>>
>> and
>>
>> 2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6700 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6702 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6700 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6701
>> 2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6702 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6701
>> 2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 scheduled to run in 106 ms
>> 2015-06-25T19:42:33.275+0000 b.s
>>
>> Why am I getting the above. Initially, I thought that the input rate of
>> tuples in my topology is too high, and Netty's buffers are filled up too
>> fast. However, I submitted a debug topology
>> that sent one tuple every 1 second and I still got the above messages.
>>
>> Am I doing something wrong in my configuration? Why do I have the
>> previous Netty messages, which obviously show that something is going
>> wrong? Please, any hint on my setup will be really helpful.
>>
>> Regards,
>> Nick
>>
>
>


-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Problems with Storm EC2 setup

Posted by "Nick R. Katsipoulakis" <ni...@gmail.com>.
Thank you Fan for your response. Can you please give me some more
information on how to do what you suggest?

Regards,
Nick

2015-06-25 16:14 GMT-04:00 Fan Jiang <dc...@gmail.com>:

> I remember someone post the same issue yesterday. The problem is that your
> host is somehow resolved as "Netty-Client-*", which is not pingable. You
> may modify /etc/hosts to map the hostnames to IP addresses appropriately if
> it is allowed.
>
> —
> Sincerely,
> Fan Jiang
>
>
> On Thu, Jun 25, 2015 at 3:57 PM, Nick R. Katsipoulakis <
> nick.katsip@gmail.com> wrote:
>
>>  Hello all,
>>
>> I apologize for the long message, but I have no idea what is going wrong
>> in my setup and I tried to give a lot of info about my cluster. I have the
>> following EC2 setup:
>>
>> 1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus
>>
>> 2) 4x m4.xlarge nodes for my Supervisors.
>>
>> All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
>> Storm v0.9.4. The storm.yaml I am currently having in the nimbus node
>> (only) has the following values:
>>
>> storm.home: "/opt/apache-storm-0.9.4"
>> storm.local.dir: "/mnt/storm"
>> storm.zookeeper.servers:
>>      - "172.31.28.73"
>>      - "172.31.38.251"
>>      - "172.31.38.252"
>> storm.zookeeper.port: 2181
>> storm.zookeeper.root: "/storm"
>> storm.zookeeper.session.timeout: 20000
>> storm.zookeeper.connection.timeout: 15000
>> storm.zookeeper.retry.times: 5
>> storm.zookeeper.retry.interval: 1000
>> storm.zookeeper.retry.invervalceiling.millis: 30000
>> storm.cluster.mode: "distributed"
>> storm.local.mode.zmq: false
>> storm.thrift.transport:
>> "backtype.storm.security.auth.SimpleTransportPlugin"
>> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>>
>> nimbus.host: "127.0.0.1"
>> nimbus.thrift.port: 6627
>> nimbus.thrift.max_buffer_size: 1048576
>> nimbus.thrift.threads: 256
>> nimbus.childopts: "-Xmx256m"
>> nimbus.task.timeout.secs: 30
>> nimbus.supervisor.timeout.secs: 60
>> nimbus.monitor.freq.secs: 10
>> nimbus.cleanup.inbox.freq.secs: 600
>> nimbus.inbox.jar.expiration.secs: 3600
>> nimbus.task.launch.secs: 120
>> nimbus.reassign: true
>> nimbus.file.copy.expiration.secs: 600
>> nimbus.topology.validator:
>> "backtype.storm.nimbus.DefaultTopologyValidator"
>>
>> ui.port: 8080
>> ui.childopts: "-Xmx768m"
>> logviewer.port: 8000
>> logviewer.childopts: "-Xmx256m"
>>  logviewer.appender.name: "A1"
>>
>> drpc.port: 3772
>> drpc.worker.threads: 64
>> drpc.queue.size: 128
>> drpc.invocations.port: 3773
>> drpc.request.timeout.secs: 600
>> drpc.childopts: "-Xmx768m"
>>
>> transactional.zookeeper.root: "/transactional"
>> transactional.zookeeper.servers: null
>> transactional.zookeeper.port: null
>>
>> supervisor.slots.ports:
>>      - 6700
>>      - 6701
>>      - 6702
>>      - 6703
>> supervisor.childopts: "-Xmx256m"
>> supervisor.worker.start.timeout.secs: 120
>> supervisor.worker.timeout.secs: 30
>> supervisor.monitor.frequency.secs: 3
>> supervisor.heartbeat.frequency.secs: 5
>> supervisor.enable: true
>>
>> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:NewSize=128m
>> -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
>> -Djava.net.preferIPv4Stack=true"
>> worker.heartbeat.frequency.secs: 1
>>
>> task.heartbeat.frequency.secs: 3
>> task.refresh.poll.secs: 10
>>
>> zmq.threads: 1
>> zmq.linger.millis: 5000
>> zmq.hwm: 0
>>
>> storm.messaging.netty.server_worker_threads: 4
>> storm.messaging.netty.client_worker_threads: 4
>> storm.messaging.netty.buffer_size: 10485760
>> storm.messaging.netty.max_retries: 100
>> storm.messaging.netty.max_wait_ms: 1000
>> storm.messaging.netty.min_wait_ms: 100
>> topology.enable.message.timeouts: true
>> topology.debug: false
>> topology.optimize: true
>> topology.workers: 1
>> topology.acker.executors: null
>> topology.tasks: null
>> topology.message.timeout.secs: 30
>> topology.skip.missing.kryo.registrations: false
>> topology.max.task.parallelism: null
>> topology.max.spout.pending: null
>> topology.state.synchronization.timeout.secs: 60
>> topology.stats.sample.rate: 0.05
>> topology.builtin.metrics.bucket.size.secs: 60
>> topology.fall.back.on.java.serialization: true
>> topology.worker.childopts: null
>> topology.executor.receive.buffer.size: 1024
>> topology.executor.send.buffer.size: 1024
>> topology.receiver.buffer.size: 8
>> topology.transfer.buffer.size: 1024
>> topology.tick.tuple.freq.secs: null
>> topology.worker.shared.thread.pool.size: 4
>> topology.disruptor.wait.strategy:
>> "com.lmax.disruptor.BlockingWaitStrategy"
>> topology.spout.wait.strategy:
>> "backtype.storm.spout.SleepSpoutWaitStrategy"
>>  topology.sleep.spout.wait.strategy.time.ms: 1
>> topology.error.throttle.interval.secs: 10
>> topology.max.error.report.per.interval: 5
>> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
>> topology.tuple.serializer:
>> "backtype.storm.serialization.types.ListDelegateSerializer"
>> topology.trident.batch.emit.interval.millis: 500
>>
>> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
>>
>> The problem is that every time I submit a topology, I got a lot of Netty
>> messages in my worker logs (found in the supervisor machines) and
>> many of them had similar to the following messages:
>>
>> 2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
>> [5]
>> 2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
>> Starting
>> 2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
>> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
>> 172.31.38.252:2181 sessionTimeout=20000
>> watcher=org.apache.storm.curator.ConnectionState@5172aa5a
>> 2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
>> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>> 2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
>> established to 172.31.28.73/172.31.28.73:2181, initiating session
>> 2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
>> establishment complete on server 172.31.28.73/172.31.28.73:2181,
>> sessionid = 0x14e2b0caa01005f, negotiated timeout = 20000
>> 2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
>> State change: CONNECTED
>> 2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
>> :connected:none
>> 2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut
>> down
>> 2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
>> 0x14e2b0caa01005f closed
>> 2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
>> [5]
>> 2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
>> Starting
>> 2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
>> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
>> 172.31.38.252:2181/storm sessionTimeout=20000
>> watcher=org.apache.storm.curator.ConnectionState@3f308697
>> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
>> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
>> established to 172.31.28.73/172.31.28.73:2181, initiating session
>> 2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
>> establishment complete on server 172.31.28.73/172.31.28.73:2181,
>> sessionid = 0x14e2b0caa010061, negotiated timeout = 20000
>> 2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
>> State change: CONNECTED
>> 2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
>> 2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
>> transport plugin:backtype.storm.messaging.netty.Context
>> 2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
>> for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
>> 2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
>> Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
>> 2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
>> [stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
>> 2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6702 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>>
>> and
>>
>> 2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6700 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6702 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6700 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6701
>> 2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6702 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6701
>> 2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 scheduled to run in 106 ms
>> 2015-06-25T19:42:33.275+0000 b.s
>>
>> Why am I getting the above. Initially, I thought that the input rate of
>> tuples in my topology is too high, and Netty's buffers are filled up too
>> fast. However, I submitted a debug topology
>> that sent one tuple every 1 second and I still got the above messages.
>>
>> Am I doing something wrong in my configuration? Why do I have the
>> previous Netty messages, which obviously show that something is going
>> wrong? Please, any hint on my setup will be really helpful.
>>
>> Regards,
>> Nick
>>
>
>


-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate

Re: Problems with Storm EC2 setup

Posted by Fan Jiang <dc...@gmail.com>.
I remember someone post the same issue yesterday. The problem is that your host is somehow resolved as "Netty-Client-*", which is not pingable. You may modify /etc/hosts to map the hostnames to IP addresses appropriately if it is allowed.



—
Sincerely,
Fan Jiang

On Thu, Jun 25, 2015 at 3:57 PM, Nick R. Katsipoulakis
<ni...@gmail.com> wrote:

> Hello all,
> I apologize for the long message, but I have no idea what is going wrong in
> my setup and I tried to give a lot of info about my cluster. I have the
> following EC2 setup:
> 1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus
> 2) 4x m4.xlarge nodes for my Supervisors.
> All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
> Storm v0.9.4. The storm.yaml I am currently having in the nimbus node
> (only) has the following values:
> storm.home: "/opt/apache-storm-0.9.4"
> storm.local.dir: "/mnt/storm"
> storm.zookeeper.servers:
>      - "172.31.28.73"
>      - "172.31.38.251"
>      - "172.31.38.252"
> storm.zookeeper.port: 2181
> storm.zookeeper.root: "/storm"
> storm.zookeeper.session.timeout: 20000
> storm.zookeeper.connection.timeout: 15000
> storm.zookeeper.retry.times: 5
> storm.zookeeper.retry.interval: 1000
> storm.zookeeper.retry.invervalceiling.millis: 30000
> storm.cluster.mode: "distributed"
> storm.local.mode.zmq: false
> storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
> nimbus.host: "127.0.0.1"
> nimbus.thrift.port: 6627
> nimbus.thrift.max_buffer_size: 1048576
> nimbus.thrift.threads: 256
> nimbus.childopts: "-Xmx256m"
> nimbus.task.timeout.secs: 30
> nimbus.supervisor.timeout.secs: 60
> nimbus.monitor.freq.secs: 10
> nimbus.cleanup.inbox.freq.secs: 600
> nimbus.inbox.jar.expiration.secs: 3600
> nimbus.task.launch.secs: 120
> nimbus.reassign: true
> nimbus.file.copy.expiration.secs: 600
> nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"
> ui.port: 8080
> ui.childopts: "-Xmx768m"
> logviewer.port: 8000
> logviewer.childopts: "-Xmx256m"
> logviewer.appender.name: "A1"
> drpc.port: 3772
> drpc.worker.threads: 64
> drpc.queue.size: 128
> drpc.invocations.port: 3773
> drpc.request.timeout.secs: 600
> drpc.childopts: "-Xmx768m"
> transactional.zookeeper.root: "/transactional"
> transactional.zookeeper.servers: null
> transactional.zookeeper.port: null
> supervisor.slots.ports:
>      - 6700
>      - 6701
>      - 6702
>      - 6703
> supervisor.childopts: "-Xmx256m"
> supervisor.worker.start.timeout.secs: 120
> supervisor.worker.timeout.secs: 30
> supervisor.monitor.frequency.secs: 3
> supervisor.heartbeat.frequency.secs: 5
> supervisor.enable: true
> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:NewSize=128m
> -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
> -Djava.net.preferIPv4Stack=true"
> worker.heartbeat.frequency.secs: 1
> task.heartbeat.frequency.secs: 3
> task.refresh.poll.secs: 10
> zmq.threads: 1
> zmq.linger.millis: 5000
> zmq.hwm: 0
> storm.messaging.netty.server_worker_threads: 4
> storm.messaging.netty.client_worker_threads: 4
> storm.messaging.netty.buffer_size: 10485760
> storm.messaging.netty.max_retries: 100
> storm.messaging.netty.max_wait_ms: 1000
> storm.messaging.netty.min_wait_ms: 100
> topology.enable.message.timeouts: true
> topology.debug: false
> topology.optimize: true
> topology.workers: 1
> topology.acker.executors: null
> topology.tasks: null
> topology.message.timeout.secs: 30
> topology.skip.missing.kryo.registrations: false
> topology.max.task.parallelism: null
> topology.max.spout.pending: null
> topology.state.synchronization.timeout.secs: 60
> topology.stats.sample.rate: 0.05
> topology.builtin.metrics.bucket.size.secs: 60
> topology.fall.back.on.java.serialization: true
> topology.worker.childopts: null
> topology.executor.receive.buffer.size: 1024
> topology.executor.send.buffer.size: 1024
> topology.receiver.buffer.size: 8
> topology.transfer.buffer.size: 1024
> topology.tick.tuple.freq.secs: null
> topology.worker.shared.thread.pool.size: 4
> topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
> topology.sleep.spout.wait.strategy.time.ms: 1
> topology.error.throttle.interval.secs: 10
> topology.max.error.report.per.interval: 5
> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
> topology.tuple.serializer:
> "backtype.storm.serialization.types.ListDelegateSerializer"
> topology.trident.batch.emit.interval.millis: 500
> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
> The problem is that every time I submit a topology, I got a lot of Netty
> messages in my worker logs (found in the supervisor machines) and
> many of them had similar to the following messages:
> 2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
> [5]
> 2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
> Starting
> 2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
> 172.31.38.252:2181 sessionTimeout=20000
> watcher=org.apache.storm.curator.ConnectionState@5172aa5a
> 2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
> established to 172.31.28.73/172.31.28.73:2181, initiating session
> 2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
> establishment complete on server 172.31.28.73/172.31.28.73:2181, sessionid
> = 0x14e2b0caa01005f, negotiated timeout = 20000
> 2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
> State change: CONNECTED
> 2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
> :connected:none
> 2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut down
> 2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
> 0x14e2b0caa01005f closed
> 2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
> [5]
> 2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
> Starting
> 2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
> 172.31.38.252:2181/storm sessionTimeout=20000
> watcher=org.apache.storm.curator.ConnectionState@3f308697
> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
> established to 172.31.28.73/172.31.28.73:2181, initiating session
> 2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
> establishment complete on server 172.31.28.73/172.31.28.73:2181, sessionid
> = 0x14e2b0caa010061, negotiated timeout = 20000
> 2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
> State change: CONNECTED
> 2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
> 2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
> transport plugin:backtype.storm.messaging.netty.Context
> 2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
> for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
> 2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
> Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
> 2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
> [stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
> 2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
> 10485760
> 2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
> scheduled to run in 0 ms
> 2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
> scheduled to run in 103 ms
> 2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
> 10485760
> 2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
> scheduled to run in 0 ms
> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
> 10485760
> 2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
> scheduled to run in 0 ms
> 2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
> 10485760
> 2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6702
> scheduled to run in 0 ms
> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
> 10485760
> 2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> 2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
> scheduled to run in 0 ms
> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
> connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
> 10485760
> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
> scheduled to run in 103 ms
> 2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
> maxRetries too large (100). Pinning to 29
> and
> 2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
> [100]
> 2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
> scheduled to run in 0 ms
> 2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6700
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6700
> scheduled to run in 103 ms
> 2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6703
> scheduled to run in 105 ms
> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6700
> scheduled to run in 102 ms
> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6702
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6702
> scheduled to run in 102 ms
> 2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6700
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6700
> scheduled to run in 103 ms
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6703
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6702
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/172.31.19.252:6703
> scheduled to run in 103 ms
> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
> 172.31.19.252:6701
> 2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2 to
> Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/172.31.19.253:6702
> scheduled to run in 102 ms
> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
> 172.31.19.253:6701
> 2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/172.31.13.184:6703
> scheduled to run in 105 ms
> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6701
> scheduled to run in 105 ms
> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
> failed: java.lang.RuntimeException: Returned channel was actually not
> established
> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3 to
> Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/172.31.19.254:6702
> scheduled to run in 106 ms
> 2015-06-25T19:42:33.275+0000 b.s
> Why am I getting the above. Initially, I thought that the input rate of
> tuples in my topology is too high, and Netty's buffers are filled up too
> fast. However, I submitted a debug topology
> that sent one tuple every 1 second and I still got the above messages.
> Am I doing something wrong in my configuration? Why do I have the previous
> Netty messages, which obviously show that something is going wrong? Please,
> any hint on my setup will be really helpful.
> Regards,
> Nick