Posted to dev@storm.apache.org by GitBox <gi...@apache.org> on 2020/11/17 06:39:04 UTC
[GitHub] [storm] RuiLi8080 opened a new pull request #3353: [STORM-3713] fix race-condition by applying submitLock to leaderCallBack
RuiLi8080 opened a new pull request #3353:
URL: https://github.com/apache/storm/pull/3353
## What is the purpose of the change
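Apply `submitLock` to `leaderCallBack` so that the leader callback cannot run concurrently with other operations guarded by the same lock, which avoids the race condition.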
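The idea can be illustrated with a minimal, self-contained sketch. This is not the actual Nimbus code: only the name `submitLock` comes from this PR, and `SubmitLockSketch`, `leaderCallback`, `killTopology`, `runScenario` and the `events` list are hypothetical names chosen for the example. It assumes both code paths synchronize on the same lock object, so a kill request that arrives while the callback is still syncing has to wait until the callback finishes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration only; it does not mirror Nimbus internals.
class SubmitLockSketch {
    // Shared lock: every state-changing path must take it.
    private final Object submitLock = new Object();
    // Records the order in which the guarded sections ran.
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());
    // Signals that the callback is inside its critical section.
    private final CountDownLatch callbackHoldsLock = new CountDownLatch(1);

    // Stands in for the leader callback; the pause plays the role of the injected sleep.
    void leaderCallback(long pauseMillis) throws InterruptedException {
        synchronized (submitLock) {
            events.add("sync-start");
            callbackHoldsLock.countDown();
            Thread.sleep(pauseMillis);
            events.add("sync-end");
        }
    }

    // Stands in for a topology transition such as KILL.
    void killTopology(String topoId) {
        synchronized (submitLock) {
            events.add("kill:" + topoId);
        }
    }

    // Starts the callback, then issues a kill while the callback still holds the lock.
    static List<String> runScenario(long pauseMillis) throws InterruptedException {
        SubmitLockSketch sketch = new SubmitLockSketch();
        Thread leader = new Thread(() -> {
            try {
                sketch.leaderCallback(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        leader.start();
        sketch.callbackHoldsLock.await();   // the callback now owns submitLock
        Thread killer = new Thread(() -> sketch.killTopology("wc"));
        killer.start();                     // blocks on submitLock until the callback exits
        leader.join();
        killer.join();
        return new ArrayList<>(sketch.events);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runScenario(200));
    }
}
```

In this sketch the kill is always recorded after `sync-end`, however long the pause is. Without the `synchronized` block in `leaderCallback`, the kill could be recorded between `sync-start` and `sync-end`.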
## How was the change tested
First, we reproduced the NPE by adding a 60s sleep right before this step: https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/cluster/StormClusterStateImpl.java#L222
Once the sleep starts, we restart ZooKeeper to trigger a leader re-election and kill the test topology.
With the lock in place, the race condition no longer occurs, even with the 60s sleep. Note the 60s gap in the timestamps below.
Nimbus log:
```
2020-11-17 06:24:25.114 o.a.s.c.StormClusterStateImpl main-EventThread [INFO] syncRemoteAssignments sleeps for 60s
2020-11-17 06:24:36.126 o.a.s.d.n.Nimbus pool-34-thread-28 [INFO] TRANSITION: wc-1-1605594107 KILL null true
... 60s sleep ...
2020-11-17 06:25:26.704 o.a.s.d.n.Nimbus timer [INFO] TRANSITION: wc-1-1605594107 GAIN_LEADERSHIP null false
2020-11-17 06:25:26.742 o.a.s.d.n.Nimbus timer [INFO] Delaying event REMOVE for 30 secs for wc-1-1605594107
2020-11-17 06:25:55.149 o.a.s.d.n.Nimbus timer [INFO] TRANSITION: wc-1-1605594107 REMOVE null false
2020-11-17 06:25:55.154 o.a.s.d.n.Nimbus timer [INFO] Killing topology: wc-1-1605594107
```
Client console log:
```
-bash-4.2$ storm kill wc
Running: /home/y/share/yjava_jdk/java/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/home/y/lib64/storm/2.3.0.y -Dstorm.log.dir=/home/y/lib64/storm/2.3.0.y/logs -Djava.library.path=/home/y/lib64:/usr/local/lib64:/usr/lib64:/lib64: -Dstorm.conf.file= -cp /home/y/lib64/storm/2.3.0.y/*:/home/y/lib64/storm/2.3.0.y/lib/*:/home/y/lib64/storm/2.3.0.y/extlib/*:/home/y/lib64/storm/2.3.0.y/extlib-daemon/*:/home/y/lib64/storm/current/conf:/home/y/lib64/storm/2.3.0.y/bin org.apache.storm.command.KillTopology wc
06:24:35.567 [main] INFO o.a.s.v.ConfigValidation - Will use [class org.apache.storm.DaemonConfig, class org.apache.storm.Config] for validation
06:24:35.715 [main] WARN o.a.s.v.ConfigValidation - Field public static final java.lang.String org.apache.storm.DaemonConfig.STORM_RESOURCE_ISOLATION_PLUGIN does not have validator annotation
06:24:35.726 [main] WARN o.a.s.v.ConfigValidation - topology.backpressure.enable is a deprecated config please see class org.apache.storm.Config.TOPOLOGY_BACKPRESSURE_ENABLE for more information.
06:24:35.868 [main] INFO o.a.s.m.n.Login - Successfully logged in to context StormClient using /etc/grid-keytabs/jaas.conf
06:24:35.871 [Refresh-TGT] INFO o.a.s.m.n.Login - TGT refresh thread started.
06:24:35.897 [Refresh-TGT] INFO o.a.s.m.n.Login - TGT valid starting at: Tue Nov 17 05:56:26 UTC 2020
06:24:35.897 [Refresh-TGT] INFO o.a.s.m.n.Login - TGT expires: Wed Nov 18 05:56:26 UTC 2020
06:24:35.898 [Refresh-TGT] INFO o.a.s.m.n.Login - TGT refresh sleeping until: Wed Nov 18 02:13:43 UTC 2020
06:24:36.077 [main] INFO o.a.s.u.NimbusClient - Found leader nimbus : openstorm3blue-n4.blue.ygrid.yahoo.com:50560
... 60s sleep ...
06:25:25.181 [main] INFO o.a.s.c.KillTopology - Killed topology: wc
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org