You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by mashengchen <gi...@git.apache.org> on 2018/01/04 11:07:38 UTC
[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...
GitHub user mashengchen opened a pull request:
https://github.com/apache/trafodion/pull/1369
TRAFODION-2877 can't restart DCSServer when switch to backup-master
DcsServer does not be restarted by DcsMaster on HDP.
Steps To Reproduce
1. check Hadoop and instance status
2. kill active DcsMaster
3. repeat Step 1
4. kill DcsServer on the active DcsMaster node
5. repeat Step 1
6. DcsServer never been restarted
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mashengchen/incubator-trafodion dcsServerNotRestart
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/trafodion/pull/1369.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1369
----
commit 2595998299e1a6bb281544e00faab1f7c7ceddda
Author: aven <sh...@...>
Date: 2018-01-04T10:15:54Z
TRAFODION-2877 restart DCSServer when switch to backup-master
----
---
[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...
Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:
https://github.com/apache/trafodion/pull/1369#discussion_r159811900
--- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
@@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
}
}
} else {
- if (LOG.isDebugEnabled())
- LOG.debug("No restart for "
- + znodePath
- + "\nbecause DcsServer start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(serverStartTimestamp))
- + "] was before DcsMaster start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(startupTimestamp)) + "]");
+ LOG.info("No restart for "
--- End diff --
You are right, I should change the log, thanks.
---
[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer after s...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/trafodion/pull/1369
---
[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...
Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:
https://github.com/apache/trafodion/pull/1369#discussion_r159793870
--- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
@@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
}
}
} else {
- if (LOG.isDebugEnabled())
- LOG.debug("No restart for "
- + znodePath
- + "\nbecause DcsServer start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(serverStartTimestamp))
- + "] was before DcsMaster start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(startupTimestamp)) + "]");
+ LOG.info("No restart for "
--- End diff --
yes . if backup-master take over from the failed master, then the 'isFollower' is true, the result will be the code does not run the if conditions, and dcs server not restart
---
[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...
Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:
https://github.com/apache/trafodion/pull/1369#discussion_r159730607
--- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
@@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
}
}
} else {
- if (LOG.isDebugEnabled())
- LOG.debug("No restart for "
- + znodePath
- + "\nbecause DcsServer start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(serverStartTimestamp))
- + "] was before DcsMaster start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(startupTimestamp)) + "]");
+ LOG.info("No restart for "
--- End diff --
Looks like there is another reason for no restart, namely (master.isFollower() && runningServers.size() < configuredServers.size()) ?
---
[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...
Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:
https://github.com/apache/trafodion/pull/1369#discussion_r159806194
--- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
@@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
}
}
} else {
- if (LOG.isDebugEnabled())
- LOG.debug("No restart for "
- + znodePath
- + "\nbecause DcsServer start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(serverStartTimestamp))
- + "] was before DcsMaster start time ["
- + DateFormat.getDateTimeInstance().format(
- new Date(startupTimestamp)) + "]");
+ LOG.info("No restart for "
--- End diff --
RIght... My point was that the text in the logging seems to need updating, because the 'if' conditions have changed. What do you think?
---