You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by mashengchen <gi...@git.apache.org> on 2018/01/04 11:07:38 UTC

[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...

GitHub user mashengchen opened a pull request:

    https://github.com/apache/trafodion/pull/1369

    TRAFODION-2877 can't restart DCSServer when switch to backup-master

    DcsServer does not be restarted by DcsMaster on HDP.
    Steps To Reproduce
    1. check Hadoop and instance status
    2. kill active DcsMaster
    3. repeat Step 1
    4. kill DcsServer on the active DcsMaster node
    5. repeat Step 1
    6. DcsServer never been restarted

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mashengchen/incubator-trafodion dcsServerNotRestart

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafodion/pull/1369.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1369
    
----
commit 2595998299e1a6bb281544e00faab1f7c7ceddda
Author: aven <sh...@...>
Date:   2018-01-04T10:15:54Z

    TRAFODION-2877 restart DCSServer when switch to backup-master

----


---

[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1369#discussion_r159811900
  
    --- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
    @@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
                             }
                         }
                     } else {
    -                    if (LOG.isDebugEnabled())
    -                        LOG.debug("No restart for "
    -                                + znodePath
    -                                + "\nbecause DcsServer start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(serverStartTimestamp))
    -                                + "] was before DcsMaster start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(startupTimestamp)) + "]");
    +                    LOG.info("No restart for "
    --- End diff --
    
    You are right, I should change the log, thanks.


---

[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer after s...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/trafodion/pull/1369


---

[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...

Posted by mashengchen <gi...@git.apache.org>.
Github user mashengchen commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1369#discussion_r159793870
  
    --- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
    @@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
                             }
                         }
                     } else {
    -                    if (LOG.isDebugEnabled())
    -                        LOG.debug("No restart for "
    -                                + znodePath
    -                                + "\nbecause DcsServer start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(serverStartTimestamp))
    -                                + "] was before DcsMaster start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(startupTimestamp)) + "]");
    +                    LOG.info("No restart for "
    --- End diff --
    
    yes . if backup-master take over from the failed master, then the 'isFollower' is true, the result will be the code does not run the if conditions, and dcs server not restart


---

[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1369#discussion_r159730607
  
    --- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
    @@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
                             }
                         }
                     } else {
    -                    if (LOG.isDebugEnabled())
    -                        LOG.debug("No restart for "
    -                                + znodePath
    -                                + "\nbecause DcsServer start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(serverStartTimestamp))
    -                                + "] was before DcsMaster start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(startupTimestamp)) + "]");
    +                    LOG.info("No restart for "
    --- End diff --
    
    Looks like there is another reason for no restart, namely (master.isFollower() && runningServers.size() < configuredServers.size()) ?


---

[GitHub] trafodion pull request #1369: TRAFODION-2877 can't restart DCSServer when sw...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1369#discussion_r159806194
  
    --- Diff: dcs/src/main/java/org/trafodion/dcs/master/ServerManager.java ---
    @@ -240,15 +240,14 @@ public ScriptContext call() throws Exception {
                             }
                         }
                     } else {
    -                    if (LOG.isDebugEnabled())
    -                        LOG.debug("No restart for "
    -                                + znodePath
    -                                + "\nbecause DcsServer start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(serverStartTimestamp))
    -                                + "] was before DcsMaster start time ["
    -                                + DateFormat.getDateTimeInstance().format(
    -                                        new Date(startupTimestamp)) + "]");
    +                    LOG.info("No restart for "
    --- End diff --
    
    RIght... My point was that the text in the logging seems to need updating, because the 'if' conditions have changed. What do you think?


---