Posted to dev@storm.apache.org by "Erik Weathers (JIRA)" <ji...@apache.org> on 2015/09/23 02:40:04 UTC
[jira] [Updated] (STORM-763) nimbus reassigned worker A to another
machine, but other worker's netty client can't connect to the new worker A
[ https://issues.apache.org/jira/browse/STORM-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Weathers updated STORM-763:
--------------------------------
Description:
Debian 3.16.3-2~bpo70+1 (2014-09-21) x86_64 GNU/Linux
java version "1.7.0_03"
storm 0.9.4
cluster 50+ machines
My topology has 50+ workers, and it can't emit 50000 thousand tuples in ten minutes.
Sometimes a worker is reassigned to another machine by nimbus because of a task heartbeat timeout:
{code}
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[440 440] not alive
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[90 90] not alive
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[510 510] not alive
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[160 160] not alive
{code}
I can see in the Storm UI that the reassigned worker has already started, but the other workers keep writing errors to their logs:
{code}
2015-04-08T16:56:43.091+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:45.660+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:45.660+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:45.715+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:45.716+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.277+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:46.278+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.306+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:46.306+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.586+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:46.586+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.835+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
{code}
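To see how many messages are being dropped per destination, the client log lines above can be tallied. The following is a minimal sketch in Python, assuming the line format shown in the excerpt; the function name `tally_netty_errors` and both regular expressions are illustrative and are not part of Storm.

```python
import re
from collections import Counter

# Patterns inferred from the log excerpt in this report (an assumption,
# not an official description of Storm's log format).
DROP_RE = re.compile(
    r"b\.s\.m\.n\.Client \[ERROR\] dropping (\d+) message\(s\) destined for (\S+)"
)
UNAVAILABLE_RE = re.compile(
    r"b\.s\.m\.n\.Client \[ERROR\] connection to (\S+) is unavailable"
)


def tally_netty_errors(lines):
    """Return (dropped, unavailable) Counters keyed by destination string.

    dropped counts messages reported as dropped; unavailable counts the
    number of "connection ... is unavailable" lines.
    """
    dropped = Counter()
    unavailable = Counter()
    for line in lines:
        match = DROP_RE.search(line)
        if match:
            dropped[match.group(2)] += int(match.group(1))
            continue
        match = UNAVAILABLE_RE.search(line)
        if match:
            unavailable[match.group(1)] += 1
    return dropped, unavailable
```

Passing the lines of a worker log file to `tally_netty_errors` gives one count per destination, which makes it easier to confirm whether only the reassigned worker's address is affected.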
The worker on the destination host has already started, and I can telnet to 192.168.163.19 5700.
So why can't the netty client connect to that ip:port?
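The telnet check described above can be repeated from each affected worker host with a small script. This is a minimal sketch in Python, assuming a plain TCP connect is an acceptable stand-in for telnet; the helper name `can_connect` and the timeout value are illustrative.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False


# Example call, using the address from the log excerpt in this report:
#   can_connect("192.168.163.19", 5700)
```

A `True` result only shows that the port accepts TCP connections from that host. It does not show that the existing netty client in the sending worker has reconnected.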
was:
Debian 3.16.3-2~bpo70+1 (2014-09-21) x86_64 GNU/Linux
java version "1.7.0_03"
storm 0.9.4
cluster 50+ machines
My topology has 50+ workers, and it can't emit 50000 thousand tuples in ten minutes.
Sometimes a worker is reassigned to another machine by nimbus because of a task heartbeat timeout:
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[440 440] not alive
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[90 90] not alive
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[510 510] not alive
2015-04-08T16:51:23.026+0800 b.s.d.nimbus [INFO] Executor my_topology-22-1428243953:[160 160] not alive
I can see in the Storm UI that the reassigned worker has already started, but the other workers keep writing errors to their logs:
2015-04-08T16:56:43.091+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:45.660+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:45.660+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:45.715+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:45.716+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.277+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:46.278+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.306+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:46.306+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.586+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
2015-04-08T16:56:46.586+0800 b.s.m.n.Client [ERROR] dropping 1 message(s) destined for Netty-Client-host_19/192.168.163.19:5700
2015-04-08T16:56:46.835+0800 b.s.m.n.Client [ERROR] connection to Netty-Client-host_19/192.168.163.19:5700 is unavailable
The worker on the destination host has already started, and I can telnet to 192.168.163.19 5700.
So why can't the netty client connect to that ip:port?
> nimbus reassigned worker A to another machine, but other worker's netty client can't connect to the new worker A
> -----------------------------------------------------------------------------------------------------------------
>
> Key: STORM-763
> URL: https://issues.apache.org/jira/browse/STORM-763
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 0.9.4
> Environment: Debian 3.16.3-2~bpo70+1 (2014-09-21) x86_64 GNU/Linux
> java version "1.7.0_03"
> storm 0.9.4
> cluster 50+ machines
> Reporter: 3in
> Assignee: Enno Shioji
> Fix For: 0.9.6
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)