Posted to dev@slider.apache.org by Rohith Sharma K S <ro...@huawei.com> on 2015/10/06 11:09:17 UTC

Slider-agent can not be started

Hi

I am trying to deploy HBase using Slider. I created the HBase package with the Hadoop-2.6 distribution and submitted the job using "python slider create t6 --template appConfig.json --resources resources.json".
The Slider master started running, but launching the HMaster failed. The attached slider-agent.log contains the following error.

I suspect I am missing some configuration. Could anyone help me understand why the error below occurs and how to resolve it? I have attached the full slider-agent.log to this mail.

INFO 2015-10-06 13:38:21,067 main.py:259 - Unable to extract AM host details from ZK, retrying ...
ERROR 2015-10-06 13:38:31,077 Registry.py:63 - Could not connect to zk registry at /registry/users/rohith/services/org-apache-slider/t6 in quorum 10.18.130.110:54000. Error: 'NoneType' object has no attribute 'strip'
INFO 2015-10-06 13:38:31,078 Registry.py:69 - AM Host = , AM Secured Port = , ping port =
INFO 2015-10-06 13:38:31,078 main.py:259 - Unable to extract AM host details from ZK, retrying ...
INFO 2015-10-06 13:38:41,081 Controller.py:140 - Registering with the server at https://localhost:8441/ws/v1/slider/agents/container_e04_1444115477719_0002_01_000016___HBASE_MASTER/register with data '{"tags": "", "timestamp": 1444118921081, "expectedState": 0, "responseId": -1, "actualState": 0, "logFolders": {}, "agentVersion": "1", "allocatedPorts": {}, "appVersion": null, "publicHostname": "host-10-18-130-110", "label": "container_e04_1444115477719_0002_01_000016___HBASE_MASTER"}'
INFO 2015-10-06 13:38:41,081 security.py:89 - SSL Connect being called.. connecting to the server
ERROR 2015-10-06 13:38:41,082 Controller.py:625 - Exception raised
Traceback (most recent call last):
  File "/home/rohith/os/tmp2.6/nm-local-dir/usercache/rohith/appcache/application_1444115477719_0002/filecache/24/slider-agent.tar.gz/slider-agent/agent/Controller.py", line 619, in sendRequest
    self.cachedconnect = security.CachedHTTPSConnection(self.config)
  File "/home/rohith/os/tmp2.6/nm-local-dir/usercache/rohith/appcache/application_1444115477719_0002/filecache/24/slider-agent.tar.gz/slider-agent/agent/security.py", line 106, in __init__
    self.connect()
  File "/home/rohith/os/tmp2.6/nm-local-dir/usercache/rohith/appcache/application_1444115477719_0002/filecache/24/slider-agent.tar.gz/slider-agent/agent/security.py", line 111, in connect
    self.httpsconn.connect()
  File "/home/rohith/os/tmp2.6/nm-local-dir/usercache/rohith/appcache/application_1444115477719_0002/filecache/24/slider-agent.tar.gz/slider-agent/agent/security.py", line 49, in connect
    sock=self.create_connection()
  File "/home/rohith/os/tmp2.6/nm-local-dir/usercache/rohith/appcache/application_1444115477719_0002/filecache/24/slider-agent.tar.gz/slider-agent/agent/security.py", line 90, in create_connection
    sock = socket.create_connection((self.host, self.port), 60)
  File "/usr/lib64/python2.6/socket.py", line 512, in create_connection




Thanks & Regards
Rohith Sharma K S
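
The ERROR line in the log above names both the registry path and the ZK quorum the agent tried. As a rough aid for checking those values against the cluster configuration, here is a minimal sketch that pulls them out of such a line. The function name parse_zk_error and the regular expression are illustrative only (they are not part of the Slider agent), and the sketch assumes every quorum entry has the form host:port.

```python
import re

# Pattern written against the ERROR line format seen in the log above.
ZK_ERROR = re.compile(
    r"Could not connect to zk registry at (?P<path>\S+) "
    r"in quorum (?P<quorum>\S+?)\. Error:"
)


def parse_zk_error(line):
    """Return (registry_path, [(host, port), ...]), or None if no match.

    Hypothetical helper: assumes each quorum entry is "host:port".
    """
    match = ZK_ERROR.search(line)
    if match is None:
        return None
    servers = []
    for entry in match.group("quorum").split(","):
        host, _, port = entry.rpartition(":")
        servers.append((host, int(port)))
    return match.group("path"), servers
```

Feeding it the Registry.py:63 line from this mail gives the path /registry/users/rohith/services/org-apache-slider/t6 and a single server, 10.18.130.110 on port 54000.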


Re: Slider-agent can not be started

Posted by Gour Saha <gs...@hortonworks.com>.
Steve,
I think you meant the max Python version that works, right? It is 2.7.8.
The Slider agent has issues with 2.7.9.

A bug has been filed for it -
https://issues.apache.org/jira/browse/SLIDER-942

-Gour
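
For anyone wanting to check a node quickly, here is a minimal sketch of a version guard based only on the upper bound mentioned above (2.7.8 works, 2.7.9 has issues). The name agent_python_ok is made up for illustration, and no lower bound is enforced because none is stated in this thread.

```python
import sys

# Upper bound reported in this thread; the minimum version is not stated here.
MAX_WORKING = (2, 7, 8)


def agent_python_ok(version_info=None):
    """True if the interpreter version is at or below the reported maximum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) <= MAX_WORKING


if __name__ == "__main__":
    current = tuple(sys.version_info[:3])
    print("python %d.%d.%d ok for agent: %s" % (current + (agent_python_ok(),)))
```

Run it with the same interpreter the NodeManager would use to launch the agent, since that may differ from the one on your login shell's PATH.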

On 10/7/15, 4:03 AM, "Steve Loughran" <st...@hortonworks.com> wrote:

>
>Gour -what's the minimum python version that works?
>


Re: Slider-agent can not be started

Posted by Steve Loughran <st...@hortonworks.com>.
Gour, what's the minimum Python version that works?

RE: Slider-agent can not be started

Posted by Rohith Sharma K S <ro...@huawei.com>.
Hi Gour

Yes, I am using 0.81.0-incubating-SNAPSHOT. The quorum is running at 10.18.130.110:54000. The YARN cluster also uses the same quorum for ZKRMStateStore.


I have sent the logs and configuration to your personal mail. Kindly let me know if you require more details.


Thanks & Regards
Rohith Sharma K S



-----Original Message-----
From: Gour Saha [mailto:gsaha@hortonworks.com] 
Sent: 06 October 2015 21:18
To: dev@slider.incubator.apache.org
Subject: Re: Slider-agent can not be started

Rohit,
I am assuming you are using the latest 0.81.0 version of Slider (or the develop branch). If not please do so.

Can you share the AM log (copy paste please)? Note, file attachments are dropped in the apache DLs. So the slider-agent.log also did not come through.

From the error it shows that the agent is unable to connect to zk to retrieve the AM host/port details. The quorum value says 10.18.130.110:54000. Typically zk port is 2181, but it can be different in your env, so just wanted to confirm if that is the right value.

-Gour



Re: Slider-agent can not be started

Posted by Gour Saha <gs...@hortonworks.com>.
Rohith,
I am assuming you are using the latest 0.81.0 version of Slider (or the develop branch). If not, please do so.

Can you share the AM log (copy and paste, please)? Note that file attachments are dropped on the Apache DLs, so the slider-agent.log did not come through either.

The error shows that the agent is unable to connect to ZK to retrieve the AM host/port details. The quorum value says 10.18.130.110:54000. Typically the ZK port is 2181, but it can be different in your env, so I just wanted to confirm that this is the right value.

-Gour
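
One way to confirm that a ZooKeeper server is really listening on the quorum address is to send it the "ruok" four-letter command and look for "imok". The sketch below does that with the standard socket module only. The function name zk_ruok is hypothetical, and the host and port must come from your own configuration. Newer ZooKeeper releases can restrict four-letter commands, so a False result does not by itself prove the port is wrong.

```python
import socket


def zk_ruok(host, port, timeout=5.0):
    """Send 'ruok' to a ZooKeeper server; return True if it answers 'imok'.

    Hypothetical helper: returns False on any connection or I/O failure.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    try:
        sock.sendall(b"ruok")
        reply = sock.recv(16)
    except (socket.error, socket.timeout):
        return False
    finally:
        sock.close()
    return reply == b"imok"
```

For the values in this thread the call would be zk_ruok("10.18.130.110", 54000), run from the node where the agent container was launched.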
