Posted to issues@ambari.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2018/06/27 14:00:00 UTC
[jira] [Commented] (AMBARI-24201) Command reschedule does not work
causing blueprint deployments to timeout
[ https://issues.apache.org/jira/browse/AMBARI-24201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16525121#comment-16525121 ]
Hudson commented on AMBARI-24201:
---------------------------------
SUCCESS: Integrated in Jenkins build Ambari-trunk-Commit #9538 (See [https://builds.apache.org/job/Ambari-trunk-Commit/9538/])
AMBARI-24201. Command reschedule does not work causing blueprint (aonishuk: [https://gitbox.apache.org/repos/asf?p=ambari.git&a=commit&h=781b4bfe9879ce56837b913a7ad6db46908bb684])
* (edit) ambari-agent/src/main/python/ambari_agent/ActionQueue.py
> Command reschedule does not work causing blueprint deployments to timeout
> ---------------------------------------------------------------------------
>
> Key: AMBARI-24201
> URL: https://issues.apache.org/jira/browse/AMBARI-24201
> Project: Ambari
> Issue Type: Bug
> Reporter: Andrew Onischuk
> Assignee: Andrew Onischuk
> Priority: Major
> Fix For: 2.7.0
>
> Attachments: AMBARI-24201.patch, AMBARI-24201.patch, AMBARI-24201.patch
>
>
> When a stage times out or delivery fails during a blueprint install, the server
> usually reschedules the running command by sending a cancel command along with
> a repeated execution command.
> The bug is that the agent cancels the command that needs to be newly scheduled.
>
>
> 2018-06-27 01:34:58,105 WARN [agent-message-retry-0] MessageEmitter:255 - Reschedule execution command emitting, retry: 1, messageId: 19
>
>
>
> ..., u'cancelCommands': [{u'commandType': u'CANCEL_COMMAND', u'target_task_id': 145, u'reason': u'Stage timeout'}]}}, u'requiredConfigTimestamp': 1530060845474}
> INFO 2018-06-27 01:34:58,121 ActionQueue.py:115 - Canceling command with taskId = 145
> INFO 2018-06-27 01:34:58,121 ActionQueue.py:134 - Canceling EXECUTION_COMMAND for service ZOOKEEPER and role ZOOKEEPER_CLIENT with taskId 145
> WARNING 2018-06-27 01:34:58,121 CustomServiceOrchestrator.py:129 - Unable to find process associated with taskId = 145
> INFO 2018-06-27 01:34:58,122 ActionQueue.py:103 - Adding EXECUTION_COMMAND for role ZOOKEEPER_CLIENT for service ZOOKEEPER of cluster_id 2 to the queue.
> INFO 2018-06-27 01:34:58,122 security.py:135 - Event to server at /reports/responses (correlation_id=870): {'status': 'OK', 'messageId': '19'}
> INFO 2018-06-27 01:34:58,142 __init__.py:57 - Event from server at /user/ (correlation_id=870): {u'status': u'OK'}
> INFO 2018-06-27 01:34:59,293 ActionQueue.py:238 - Executing command with id = 10-0, taskId = 145 for role = ZOOKEEPER_CLIENT of cluster_id 2.
> INFO 2018-06-27 01:34:59,294 security.py:135 - Event to server at /reports/commands_status (correlation_id=871): {'clusters': {u'2': [{'status': 'IN_PROGRESS', 'taskId': 145, 'tmpout': '/var/lib/ambari-agent/data/output-145.txt', 'roleCommand': u'INSTALL', 'structuredOut': '/var/lib/ambari-agent/data/structured-out-145.json', 'clusterId': u'2', 'serviceName': u'ZOOKEEPER', 'role': u'ZOOKEEPER_CLIENT', 'actionId': u'10-0', 'tmperr': '/var/lib/ambari-agent/data/errors-145.txt'}]}}
> INFO 2018-06-27 01:34:59,295 ActionQueue.py:279 - Command execution metadata - taskId = 145, retry enabled = True, max retry duration (sec) = 1200, log_output = True
> INFO 2018-06-27 01:34:59,296 ActionQueue.py:285 - Command with taskId = 145 canceled
> ERROR 2018-06-27 01:34:59,296 ActionQueue.py:221 - Exception while processing EXECUTION_COMMAND command
> Traceback (most recent call last):
>   File "/usr/lib/ambari-agent/lib/ambari_agent/ActionQueue.py", line 214, in process_command
>     self.execute_command(command)
>   File "/usr/lib/ambari-agent/lib/ambari_agent/ActionQueue.py", line 354, in execute_command
>     commandresult['stdout'] += '\n\nCommand completed successfully!\n' if status == self.COMPLETED_STATUS else '\n\nCommand failed after ' + str(numAttempts) + ' tries\n'
> UnboundLocalError: local variable 'commandresult' referenced before assignment
>
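The failure mode described above can be reproduced with a toy model. The sketch below is not Ambari's ActionQueue and not the committed patch. The class MiniActionQueue, the flag clear_cancel_on_requeue, the function handle_reschedule and the "executionCommands" key are all hypothetical names chosen for illustration. It assumes the agent remembers canceled task ids in a set and checks that set just before running a command, which matches the order of the log lines but is otherwise a guess. One plausible remedy, shown here as an assumption, is to let a newly queued command clear an earlier cancel marker for the same taskId. The result dictionary is also created before the cancel check, so the canceled path cannot hit an unbound variable.

```python
import queue


class MiniActionQueue:
    """Toy model of an agent command queue; NOT Ambari's real ActionQueue."""

    def __init__(self, clear_cancel_on_requeue):
        self.command_queue = queue.Queue()
        self.task_ids_to_cancel = set()
        # Hypothetical switch: False reproduces the bug, True applies the assumed remedy.
        self.clear_cancel_on_requeue = clear_cancel_on_requeue

    def cancel(self, task_id):
        # No process is running for the task yet, so only a marker is recorded.
        self.task_ids_to_cancel.add(task_id)

    def put(self, command):
        if self.clear_cancel_on_requeue:
            # Assumed remedy: a freshly scheduled command supersedes an earlier cancel.
            self.task_ids_to_cancel.discard(command["taskId"])
        self.command_queue.put(command)

    def execute_next(self):
        command = self.command_queue.get()
        task_id = command["taskId"]
        # Created before the cancel check so every path has a result to report.
        result = {"taskId": task_id, "status": "CANCELED", "stdout": ""}
        if task_id in self.task_ids_to_cancel:
            self.task_ids_to_cancel.discard(task_id)
            return result
        result["status"] = "COMPLETED"
        result["stdout"] = "Command completed successfully!"
        return result


def handle_reschedule(action_queue, message):
    # Same order as in the log: cancels are handled first, then the new commands.
    for cancel in message["cancelCommands"]:
        action_queue.cancel(cancel["target_task_id"])
    for command in message["executionCommands"]:  # key name is an assumption
        action_queue.put(command)
    return action_queue.execute_next()


message = {
    "cancelCommands": [{"target_task_id": 145, "reason": "Stage timeout"}],
    "executionCommands": [{"taskId": 145, "role": "ZOOKEEPER_CLIENT"}],
}

buggy = handle_reschedule(MiniActionQueue(clear_cancel_on_requeue=False), message)
fixed = handle_reschedule(MiniActionQueue(clear_cancel_on_requeue=True), message)
print(buggy["status"])  # CANCELED
print(fixed["status"])  # COMPLETED
```

With the flag off, the rescheduled command for taskId 145 is dropped by its own cancel marker, as in the log. With the flag on, it runs to completion.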
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)