Posted to issues@ambari.apache.org by "Shubham Gupta (Jira)" <ji...@apache.org> on 2021/05/23 13:19:00 UTC

[jira] [Created] (AMBARI-25675) On giving Restart command, on Server side the desired state gets changed to STARTED but on Agent Side desired state remains INSTALLED.

Shubham Gupta created AMBARI-25675:
--------------------------------------

             Summary: On giving Restart command, on Server side the desired state gets changed to STARTED but on Agent Side desired state remains INSTALLED.
                 Key: AMBARI-25675
                 URL: https://issues.apache.org/jira/browse/AMBARI-25675
             Project: Ambari
          Issue Type: Bug
          Components: ambari-agent
    Affects Versions: 2.7.0
            Reporter: Shubham Gupta


Currently, if a component is in the INSTALLED state and we issue a Restart command, the desired state changes to STARTED on the Server side but remains INSTALLED on the Agent side.

The RESTART command received and created on Ambari Agent is:
{'requiredConfigTimestamp': 1621237797633, u'commandParams': {u'hooks_folder': u'stack-hooks', u'custom_command': u'RESTART', u'script': u'scripts/oozie_server.py', u'version': u'4.1.4.0', u'command_timeout': u'1800', u'HAS_RESOURCE_FILTERS': u'true', u'script_type': u'PYTHON'}, u'roleCommand': u'CUSTOM_COMMAND', u'repositoryFile': {u'resolved': True, u'repoVersion': u'4.1.4.0', u'repositories': [{u'mirrorsList': None, u'tags': [], u'ambariManaged': True, u'baseUrl': u'https://hdi31distrorelease.blob.core.windows.net/repos/HDInsight/ubuntu16/4.x/4.1.4.0/MDP.list ', u'repoName': u'HDInsight', u'components': None, u'distribution': None, u'repoId': u'HDInsight-4.1-repo-1', u'applicableServices': []}], u'feature': {u'preInstalled': True, u'scoped': False}, u'stackName': u'HDInsight', u'repoVersionId': 1, u'repoFileName': u'ambari-hdinsight-1'}, u'clusterId': u'2', u'commandType': u'EXECUTION_COMMAND', u'clusterName': u'gshubhamhadoop40test', u'serviceName': u'OOZIE', u'role': u'OOZIE_SERVER', u'requestId': 75454, u'taskId': 13304, u'roleParams': {u'component_category': u'MASTER'}, u'componentVersionMap': {u'HDFS': {u'DATANODE': u'4.1.4.0', u'ZKFC': u'4.1.4.0', u'JOURNALNODE': u'4.1.4.0', u'HDFS_CLIENT': u'4.1.4.0', u'NAMENODE': u'4.1.4.0'}, u'ZOOKEEPER': {u'ZOOKEEPER_SERVER': u'4.1.4.0', u'ZOOKEEPER_CLIENT': u'4.1.4.0'}, u'SQOOP': {u'SQOOP': u'4.1.4.0'}, u'HIVE': {u'HIVE_METASTORE': u'4.1.4.0', u'HIVE_SERVER': u'4.1.4.0', u'HIVE_CLIENT': u'4.1.4.0'}, u'PIG': {u'PIG': u'4.1.4.0'}, u'TEZ': {u'TEZ_CLIENT': u'4.1.4.0'}, u'MAPREDUCE2': {u'MAPREDUCE2_CLIENT': u'4.1.4.0', u'HISTORYSERVER': u'4.1.4.0'}, u'YARN': {u'NODEMANAGER': u'4.1.4.0', u'APP_TIMELINE_SERVER': u'4.1.4.0', u'RESOURCEMANAGER': u'4.1.4.0', u'YARN_CLIENT': u'4.1.4.0'}, u'OOZIE': {u'OOZIE_CLIENT': u'4.1.4.0', u'OOZIE_SERVER': u'4.1.4.0'}}, u'commandId': u'75454-0'}

When updating the desired state, the agent currently checks: command['custom_command'] == CustomCommand.restart

This is incorrect because the custom_command key is nested inside commandParams, not at the top level of the command. The correct check is: command['commandParams']['custom_command'] == CustomCommand.restart
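
To illustrate the difference between the two lookups, here is a minimal sketch, not the actual agent code. The helper names is_restart_top_level and is_restart_in_command_params are hypothetical, the RESTART constant is assumed to match the value of CustomCommand.restart, and the command dict is trimmed to the few keys that matter from the command shown above. The sketch uses dict.get so that a missing key yields False instead of raising KeyError.

```python
# Assumed value of CustomCommand.restart (hypothetical stand-in).
RESTART = "RESTART"

# Trimmed copy of the command shown above; only the relevant keys are kept.
command = {
    "roleCommand": "CUSTOM_COMMAND",
    "serviceName": "OOZIE",
    "role": "OOZIE_SERVER",
    "commandParams": {
        "custom_command": "RESTART",
        "script": "scripts/oozie_server.py",
    },
}


def is_restart_top_level(cmd):
    """Current check: looks for custom_command at the top level of the command."""
    return cmd.get("custom_command") == RESTART


def is_restart_in_command_params(cmd):
    """Proposed check: looks for custom_command inside commandParams."""
    return cmd.get("commandParams", {}).get("custom_command") == RESTART


print(is_restart_top_level(command))          # key is absent at the top level
print(is_restart_in_command_params(command))  # key is found under commandParams
```

With the command above, the top-level lookup never matches, so the desired state is not updated, while the nested lookup does match.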

Problems due to this:
If we stop a component and then issue RESTART, the desired state on the agent side stays INSTALLED while the current state moves to STARTED. If the process is then killed manually or shuts down on its own, the current state falls back to INSTALLED and the component is not recovered, since CURRENT state == DESIRED state == INSTALLED.

With this bug fix, we should be able to recover those tasks as well.
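
The recovery problem can be sketched with a simplified model. This is an assumption-laden illustration, not the agent's real recovery logic: the ComponentState class and the needs_recovery and simulate functions are hypothetical, and recovery is reduced to a single rule (recover only when the current state differs from the desired state).

```python
INSTALLED = "INSTALLED"
STARTED = "STARTED"


class ComponentState:
    """Hypothetical holder for the agent-side view of one component."""

    def __init__(self):
        # The component has been stopped, so both states begin as INSTALLED.
        self.current = INSTALLED
        self.desired = INSTALLED


def needs_recovery(state):
    """Simplified rule: recover only when current and desired states differ."""
    return state.current != state.desired


def simulate(restart_detected):
    """Run RESTART followed by a process kill and report whether recovery triggers."""
    state = ComponentState()

    # RESTART arrives: the process starts regardless of the check.
    state.current = STARTED
    # The desired state is only updated if the agent recognises the RESTART.
    if restart_detected:
        state.desired = STARTED

    # The process is killed or shuts down on its own.
    state.current = INSTALLED
    return needs_recovery(state)


print(simulate(restart_detected=False))  # current check: RESTART not recognised
print(simulate(restart_detected=True))   # proposed check: RESTART recognised
```

In this model the unrecognised RESTART leaves both states at INSTALLED after the kill, so nothing triggers recovery; once the RESTART is recognised, the desired state is STARTED and the mismatch makes the component eligible for recovery.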



--
This message was sent by Atlassian Jira
(v8.3.4#803005)