Posted to user@storm.apache.org by Junfeng Chen <da...@gmail.com> on 2016/09/27 09:14:26 UTC

Workers restart every 2 minutes problem

I found that the workers in the cluster keep restarting themselves, and the
executor uptime never exceeds 2 minutes.
In every restart loop, the workers run well after a restart, executing and
emitting tuples normally. Then, when the uptime approaches 2 minutes, all
statistics are reset to zero, including the uptime.

Here is the log output just before each restart:

2016-09-27 17:10:38.831 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6734
2016-09-27 17:10:38.831 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm08/132.228.28.151:6731
2016-09-27 17:10:38.831 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6731
2016-09-27 17:10:38.832 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm08/132.228.28.151:6738
2016-09-27 17:10:38.832 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6738
2016-09-27 17:10:38.833 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm08/132.228.28.151:6741
2016-09-27 17:10:38.833 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6741
2016-09-27 17:10:38.833 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm08/132.228.28.151:6737
2016-09-27 17:10:38.833 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6737
2016-09-27 17:10:38.834 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm08/132.228.28.151:6740
2016-09-27 17:10:38.834 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6740
2016-09-27 17:10:38.834 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm08/132.228.28.151:6751
2016-09-27 17:10:38.834 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm08/132.228.28.151:6751
2016-09-27 17:10:40.894 o.a.s.d.worker [INFO] Shutting down worker
DPI-10-1474620300 e0060b80-1f9b-4ea3-8ccf-fec15a2293d9 6737
2016-09-27 17:10:40.895 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm07/132.228.28.150:6748
2016-09-27 17:10:40.895 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm07/132.228.28.150:6748
2016-09-27 17:10:40.895 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm02/132.228.28.137:6737
2016-09-27 17:10:40.896 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm02/132.228.28.137:6737
2016-09-27 17:10:40.896 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm07/132.228.28.150:6733
2016-09-27 17:10:40.896 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to
send 0 pending messages to Netty-Client-storm07/132.228.28.150:6733
2016-09-27 17:10:40.896 o.a.s.m.n.Client [INFO] closing Netty Client
Netty-Client-storm07/132.228.28.150:6752
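
To see how regular these shutdowns are from the worker side, the "Shutting
down worker" lines can be pulled out of a worker log with a small script like
the one below (a rough sketch: the log path is passed as an argument, and the
line format is assumed to match the excerpt above).

import re
import sys

# Matches worker log lines such as:
#   2016-09-27 17:10:40.894 o.a.s.d.worker [INFO] Shutting down worker DPI-10-1474620300 <supervisor-id> 6737
SHUTDOWN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*Shutting down worker (\S+) (\S+) (\d+)"
)

with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        m = SHUTDOWN.match(line)
        if m:
            timestamp, storm_id, supervisor_id, port = m.groups()
            print(timestamp, storm_id, "port", port)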


Regards,
Junfeng Chen

Re: Workers restart every 2 minutes problem

Posted by Junfeng Chen <da...@gmail.com>.
I found these entries in supervisor.log. It seems the supervisor restarts the
workers every 2 minutes:

2016-09-27 17:12:56.855 o.a.s.d.supervisor [INFO] Shut down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:8abce5ea-02ca-41c6-8bdc-02f1e45919fa
2016-09-27 17:12:56.855 o.a.s.d.supervisor [INFO] Shutting down and
clearing state for id d2afc0da-4ace-458f-8eff-54d70d5a0e56. Current
supervisor time: 1474967572. State: :disallowed, Heartbeat: {:time-secs
1474967572, :storm-id "DPI-10-1474620300", :executors [[301 301] [245 245]
[-1 -1] [21 21] [189 189] [133 133] [77 77]], :port 6734}
2016-09-27 17:12:56.855 o.a.s.d.supervisor [INFO] Shutting down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:d2afc0da-4ace-458f-8eff-54d70d5a0e56
2016-09-27 17:12:56.855 o.a.s.config [INFO] GET worker-user
d2afc0da-4ace-458f-8eff-54d70d5a0e56
2016-09-27 17:12:56.858 o.a.s.d.supervisor [INFO] Sleep 1 seconds for
execution of cleanup threads on worker.
2016-09-27 17:12:56.944 o.a.s.d.supervisor [INFO] Worker Process
8abce5ea-02ca-41c6-8bdc-02f1e45919fa exited with code: 137
2016-09-27 17:12:57.869 o.a.s.config [INFO] REMOVE worker-user
d2afc0da-4ace-458f-8eff-54d70d5a0e56
2016-09-27 17:12:57.869 o.a.s.d.supervisor [INFO] Shut down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:d2afc0da-4ace-458f-8eff-54d70d5a0e56
2016-09-27 17:12:57.869 o.a.s.d.supervisor [INFO] Shutting down and
clearing state for id ded8d43c-f01f-421a-ae39-048a9ac0842d. Current
supervisor time: 1474967572. State: :disallowed, Heartbeat: {:time-secs
1474967572, :storm-id "DPI-10-1474620300", :executors [[80 80] [248 248]
[24 24] [192 192] [-1 -1] [304 304] [136 136]], :port 6735}
2016-09-27 17:12:57.869 o.a.s.d.supervisor [INFO] Shutting down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:ded8d43c-f01f-421a-ae39-048a9ac0842d
2016-09-27 17:12:57.869 o.a.s.config [INFO] GET worker-user
ded8d43c-f01f-421a-ae39-048a9ac0842d
2016-09-27 17:12:57.872 o.a.s.d.supervisor [INFO] Sleep 1 seconds for
execution of cleanup threads on worker.
2016-09-27 17:12:57.941 o.a.s.d.supervisor [INFO] Worker Process
d2afc0da-4ace-458f-8eff-54d70d5a0e56 exited with code: 137
2016-09-27 17:12:58.883 o.a.s.config [INFO] REMOVE worker-user
ded8d43c-f01f-421a-ae39-048a9ac0842d
2016-09-27 17:12:58.883 o.a.s.d.supervisor [INFO] Shut down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:ded8d43c-f01f-421a-ae39-048a9ac0842d
2016-09-27 17:12:58.883 o.a.s.d.supervisor [INFO] Shutting down and
clearing state for id 5a51f57d-5c4d-4a99-9f04-5e5ff192ea05. Current
supervisor time: 1474967572. State: :disallowed, Heartbeat: {:time-secs
1474967572, :storm-id "DPI-10-1474620300", :executors [[186 186] [18 18]
[-1 -1] [298 298] [242 242] [74 74] [130 130]], :port 6733}
2016-09-27 17:12:58.883 o.a.s.d.supervisor [INFO] Shutting down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:5a51f57d-5c4d-4a99-9f04-5e5ff192ea05
2016-09-27 17:12:58.883 o.a.s.config [INFO] GET worker-user
5a51f57d-5c4d-4a99-9f04-5e5ff192ea05
2016-09-27 17:12:58.886 o.a.s.d.supervisor [INFO] Sleep 1 seconds for
execution of cleanup threads on worker.
2016-09-27 17:12:59.033 o.a.s.d.supervisor [INFO] Worker Process
ded8d43c-f01f-421a-ae39-048a9ac0842d exited with code: 137
2016-09-27 17:12:59.896 o.a.s.config [INFO] REMOVE worker-user
5a51f57d-5c4d-4a99-9f04-5e5ff192ea05
2016-09-27 17:12:59.896 o.a.s.d.supervisor [INFO] Shut down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:5a51f57d-5c4d-4a99-9f04-5e5ff192ea05
2016-09-27 17:12:59.896 o.a.s.d.supervisor [INFO] Shutting down and
clearing state for id e955c4dc-6744-43b7-b37d-530db3b5a5eb. Current
supervisor time: 1474967572. State: :disallowed, Heartbeat: {:time-secs
1474967572, :storm-id "DPI-10-1474620300", :executors [[177 177] [65 65] [9
9] [121 121] [-1 -1] [289 289] [233 233]], :port 6730}
2016-09-27 17:12:59.896 o.a.s.d.supervisor [INFO] Shutting down
e0060b80-1f9b-4ea3-8ccf-fec15a2293d9:e955c4dc-6744-43b7-b37d-530db3b5a5eb
2016-09-27 17:12:59.896 o.a.s.config [INFO] GET worker-user
e955c4dc-6744-43b7-b37d-530db3b5a5eb
2016-09-27 17:12:59.899 o.a.s.d.supervisor [INFO] Sleep 1 seconds for
execution of cleanup threads on worker.
2016-09-27 17:12:59.962 o.a.s.d.supervisor [INFO] Worker Process
5a51f57d-5c4d-4a99-9f04-5e5ff192ea05 exited with code: 137
2016-09-27 17:13:00.908 o.a.s.config [INFO] REMOVE worker-user
e955c4dc-6744-43b7-b37d-530db3b5a5eb
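
The exit code 137 above looks like the usual 128 + signal convention, i.e. the
worker processes are being killed with SIGKILL (9) rather than exiting on their
own. To check whether the kills really come on a fixed ~2-minute cycle, the
following sketch parses supervisor.log for the "exited with code" lines and
prints the gap between consecutive exits (the log path and timestamp format are
assumed from the excerpt above).

import re
import signal
import sys
from datetime import datetime

# Matches supervisor.log lines such as:
#   2016-09-27 17:12:56.944 o.a.s.d.supervisor [INFO] Worker Process <worker-id> exited with code: 137
EXIT_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*Worker Process (\S+) exited with code: (\d+)"
)

def describe(code):
    # Shell convention: exit codes above 128 usually mean "killed by signal (code - 128)".
    if code > 128:
        return f"{code} ({signal.Signals(code - 128).name})"
    return str(code)

previous = None
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        m = EXIT_LINE.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        gap = f"{(ts - previous).total_seconds():.1f}s after previous exit" if previous else "first exit"
        print(ts, "worker", m.group(2), "exit", describe(int(m.group(3))), gap)
        previous = ts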


Regards,
Junfeng Chen
