Posted to user@hawq.apache.org by xi...@iluvatar.ai on 2018/10/30 09:48:52 UTC
Why does "hawq stop cluster" not stop gpsyncmaster?
Hi!
After I ran "hawq stop cluster -a", I found that some gpadmin processes were still running:
gpadmin 61866 0.4 5.2 811448 419620 ? S 17:29 0:00 /usr/local/apache-hawq/bin/gpsyncmaster -D /data/hawq/masterdd -i -p 1809
gpadmin 61882 0.0 0.0 302688 7200 ? Ss 17:29 0:00 postgres: port 1809, logger process
gpadmin 61883 0.0 0.0 812000 7384 ? S 17:29 0:00 postgres: port 1809, WAL Redo Server process
gpadmin 61907 0.0 0.1 812300 8128 ? Ss 17:29 0:00 postgres: port 1809, gpsyncagent process con2 idle
Then I ran "hawq start cluster -a", and it failed:
20181030:17:29:05:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Starting standby master '192.168.10.18'
20181030:17:29:05:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Start standby master service
20181030:17:29:04:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-Checking standby master status
20181030:17:29:04:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-Monitoring logs
20181030:17:29:08:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-checking if syncmaster is running
20181030:17:29:08:061879 hawqstandbywatch.py:dx-computing2:gpadmin-[INFO]:-syncmaster appears ok, pid 61866
20181030:17:29:09:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Standby master started successfully
20181030:17:29:09:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Starting master node '192.168.10.17'
20181030:17:29:09:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Start master service
20181030:17:29:10:2789929 hawq_start:dx-computing:gpadmin-[INFO]:-Checking if standby is synced with master
20181030:17:29:10:2789929 hawq_start:dx-computing:gpadmin-[ERROR]:-Failed to connect to database, this script can only be run when the database is up
Traceback (most recent call last):
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 1459, in <module>
    start_hawq(opts, hawq_dict)
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 1233, in start_hawq
    instance.run()
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 765, in run
    check_return_code(self._start_all_nodes())
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 701, in _start_all_nodes
    check_return_code(self.start_master(), logger, "Master start failed, exit", \
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 618, in start_master
    sync_result = self._check_standby_sync()
  File "/usr/local/apache-hawq/bin/hawq_ctl", line 671, in _check_standby_sync
    for row in rows:
UnboundLocalError: local variable 'rows' referenced before assignment
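The UnboundLocalError at the end of the traceback is probably a secondary symptom of the "Failed to connect to database" error logged just before it. Below is a minimal sketch of how that combination can arise, assuming the sync check logs the connection failure and then continues. It is not the actual hawq_ctl code; the function names and the run_query parameter are hypothetical.

```python
import logging

logger = logging.getLogger("hawq_start_sketch")


def failing_query():
    """Hypothetical stand-in for a catalog query against a master that is not up."""
    raise ConnectionError("Failed to connect to database")


def check_standby_sync_buggy(run_query):
    """Logs the connection error, then falls through and uses the unassigned name."""
    try:
        rows = run_query()
    except Exception as exc:
        logger.error("%s", exc)
    # If run_query() raised, 'rows' was never assigned, so this loop
    # raises UnboundLocalError instead of reporting the real problem.
    for row in rows:
        return row
    return None


def check_standby_sync_guarded(run_query):
    """Same check, but returns early when the query could not be run."""
    try:
        rows = run_query()
    except Exception as exc:
        logger.error("%s", exc)
        return None
    for row in rows:
        return row
    return None
```

In the guarded variant the caller only sees the connection failure, which here is the more useful error for diagnosing why the master did not come up.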
So why does "hawq stop cluster" not stop gpsyncmaster on the standby node?
I am using HAWQ 2.2. Would upgrading solve this?