Posted to dev@hawq.apache.org by "Shubham Sharma (JIRA)" <ji...@apache.org> on 2017/11/14 02:04:00 UTC

[jira] [Created] (HAWQ-1549) Re-syncing standby fails even when stop mode is fast

Shubham Sharma created HAWQ-1549:
------------------------------------

             Summary:  Re-syncing standby fails even when stop mode is fast
                 Key: HAWQ-1549
                 URL: https://issues.apache.org/jira/browse/HAWQ-1549
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Command Line Tools, Standby master
            Reporter: Shubham Sharma
            Assignee: Radar Lei


Re-syncing the standby from the hawq command line fails while a client connection is open, even when the fast stop mode is requested.

Here are the reproduction steps:

1 - Open a client connection to HAWQ using psql.
2 - From a different terminal, run the command: hawq init standby -n -v -M fast
3 - The standby re-sync fails with the following error:

{code}
20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-There are other connections to this instance, shutdown mode smart aborted

20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-Either remove connections, or use 'hawq stop master -M fast' or 'hawq stop master -M immediate'

20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[WARNING]:-See hawq stop --help for all options

20171113:03:49:21:158354 hawq_stop:hdp3:gpadmin-[ERROR]:-Active connections. Aborting shutdown...

20171113:03:49:21:158143 hawq_init:hdp3:gpadmin-[ERROR]:-Stop hawq cluster failed, exit
{code}

4 - Expected behaviour: when -M fast (stop mode) is passed, the re-sync should terminate existing client connections instead of aborting.

The source of this issue appears to be the _resync_standby method in tools/bin/hawq_ctl. The stop command it builds does not include the stop mode passed in the arguments, so hawq stop runs in smart mode, as the log above shows.

{code}
    def _resync_standby(self):
        logger.info("Re-sync standby")
        cmd = "%s; hawq stop master -a;" % source_hawq_env
        check_return_code(local_ssh(cmd, logger), logger, "Stop hawq cluster failed, exit")
        ......
        ......
{code}
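
One possible shape for the fix is sketched below. It is only an illustration, not the actual patch: build_stop_master_cmd and VALID_STOP_MODES are hypothetical names, and it assumes the stop mode parsed from -M is available to _resync_standby as a plain string. The mode names are taken from the hawq_stop warnings in the log above.

```python
# Hypothetical helper, not part of hawq_ctl: shows how the stop mode
# could be forwarded to "hawq stop master" during a standby re-sync.

# Mode names as they appear in the hawq_stop log messages above.
VALID_STOP_MODES = ("smart", "fast", "immediate")


def build_stop_master_cmd(source_hawq_env, stop_mode="smart"):
    """Return the shell command that stops the master with the given mode."""
    if stop_mode not in VALID_STOP_MODES:
        raise ValueError("unknown stop mode: %r" % (stop_mode,))
    # Same command as in _resync_standby, with "-M <mode>" appended.
    return "%s; hawq stop master -a -M %s;" % (source_hawq_env, stop_mode)
```

Inside _resync_standby, the existing cmd assignment could then call this helper with whatever attribute holds the parsed -M value, leaving the check_return_code call unchanged.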

I can start working on this and will submit a PR once the changes are done.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)