Posted to dev@slider.apache.org by "ouchengeng (JIRA)" <ji...@apache.org> on 2017/07/05 08:46:00 UTC

[jira] [Updated] (SLIDER-1232) Provider misses docker entry in stop_command when apps are docker containers

     [ https://issues.apache.org/jira/browse/SLIDER-1232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ouchengeng updated SLIDER-1232:
-------------------------------
    Description: 
When apps run as Docker containers, AgentProviderService tells slider-agent that it is in Docker mode, so DockerManager.py can install and start the containers.
For the stop command, however, the provider does not signal Docker mode: the docker entry is missing from the command.
This causes the following exception:
```
ERROR 2017-07-05 05:14:25,830 CustomServiceOrchestrator.py:169 - Caught an exception while executing command: <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'startswith'
Traceback (most recent call last):
  File "/h8/hadoop/yarn/local/usercache/df/appcache/application_1499081562804_0093/filecache/10/slider-agent.tar.gz/slider-agent/agent/CustomServiceOrchestrator.py", line 115, in runCommand
    script_path = self.resolve_script_path(self.base_dir, script, script_type)
  File "/h8/hadoop/yarn/local/usercache/df/appcache/application_1499081562804_0093/filecache/10/slider-agent.tar.gz/slider-agent/agent/CustomServiceOrchestrator.py", line 196, in resolve_script_path
    path = os.path.realpath(posixpath.join(base_dir, script))
  File "/usr/lib64/python2.7/posixpath.py", line 75, in join
    if b.startswith('/'):
AttributeError: 'NoneType' object has no attribute 'startswith'
INFO 2017-07-05 05:14:25,831 ActionQueue.py:188 - Stop command received
INFO 2017-07-05 05:14:25,932 AgentToggleLogger.py:40 - Queue result: {'componentStatus': [],
 'reports': [{'actionId': u'16-1',
              'clusterName': u'cgou',
              'exitcode': 1,
              'reportResult': True,
              'role': u'MATCHER',
              'roleCommand': u'STOP',
              'serviceName': u'cgou',
              'status': 'FAILED',
              'stderr': "Caught an exception while executing command: <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'startswith'",
              'stdout': "Caught an exception while executing command: <type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'startswith'",
              'structuredOut': '{}',
              'taskId': 16}]}
```

In AgentProviderService.java there are addInstallCommand/addInstallDockerCommand and addStartCommand/addStartDockerCommand methods, but there is no corresponding stop method for Docker. As a result, slider-agent/ActionQueue.py cannot recognize the stop command as a Docker command, which leads to the exception above.
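
To illustrate the failure mode, here is a minimal sketch in Python. It is not the actual slider-agent code: the dispatcher `route_command`, the `commandParams`/`script` keys, the image name, and the base directory are assumptions made up for the example. Only the `posixpath.join(base_dir, script)` call and the `roleCommand` field are taken from the log above. The sketch shows how a command without a docker entry falls through to script resolution, where a missing script (`None`) makes the join fail.

```python
import posixpath


def resolve_script_path(base_dir, script):
    # Same call as in the traceback. With script=None this raises
    # AttributeError on Python 2.7 and TypeError on Python 3.
    return posixpath.join(base_dir, script)


def route_command(command, base_dir="/app/package"):
    """Hypothetical dispatcher: take the Docker path only if a 'docker' entry exists."""
    if "docker" in command:
        return "docker:" + command["roleCommand"]
    # No docker entry: treat the command as script-based (assumed key names).
    script = command.get("commandParams", {}).get("script")
    try:
        return "script:" + resolve_script_path(base_dir, script)
    except (AttributeError, TypeError) as exc:
        return "failed:" + type(exc).__name__


# START carries a docker entry; STOP does not, mirroring the reported bug.
start_cmd = {"roleCommand": "START", "docker": {"image": "example/image"}}
stop_cmd = {"roleCommand": "STOP"}

print(route_command(start_cmd))
print(route_command(stop_cmd))
```

Under these assumptions, the START command is routed to the Docker path, while the STOP command ends in the failed branch, which matches the 'status': 'FAILED' report in the log.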


> Provider misses docker entry in stop_command when apps are docker containers
> ----------------------------------------------------------------------------
>
>                 Key: SLIDER-1232
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1232
>             Project: Slider
>          Issue Type: Bug
>          Components: agent-provider
>    Affects Versions: Slider 0.91, Slider 0.92
>            Reporter: ouchengeng
>            Assignee: ouchengeng
>            Priority: Critical
>             Fix For: Slider 0.92
>
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)