Posted to dev@ambari.apache.org by "Andy Alvarado (JIRA)" <ji...@apache.org> on 2013/04/09 00:28:16 UTC

[jira] [Commented] (AMBARI-1839) Ambari agent will not startup

    [ https://issues.apache.org/jira/browse/AMBARI-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625910#comment-13625910 ] 

Andy Alvarado commented on AMBARI-1839:
---------------------------------------

***FROM ONE OF THE NODES THAT IS FAILING

==> ambari-agent.log <==
INFO 2013-04-08 22:11:48,319 shell.py:50 - Killing stale processes
INFO 2013-04-08 22:11:48,319 shell.py:58 - Killed stale processes
INFO 2013-04-08 22:11:48,320 main.py:148 - Connecting to the server at: https://ambari:8440
INFO 2013-04-08 22:11:48,320 NetUtil.py:69 - DEBUG: Trying to connect to the server at https://ambari:8440
INFO 2013-04-08 22:11:48,320 NetUtil.py:45 - DEBUG:: Connecting to the following url https://ambari:8440/cert/ca
INFO 2013-04-08 22:11:48,433 NetUtil.py:52 - DEBUG: Calling url received 200
INFO 2013-04-08 22:11:48,433 main.py:156 - Creating certs
INFO 2013-04-08 22:11:48,433 security.py:137 - Server certicate exists, ok
INFO 2013-04-08 22:11:48,434 security.py:142 - Agent key not exists, generating request
INFO 2013-04-08 22:11:48,434 security.py:190 - openssl req -new -newkey rsa:1024 -nodes -keyout /var/lib/ambari-agent/keys/hadoop4.key  -subj /OU=hadoop4/        -out /var/lib/ambari-agent/keys/hadoop4.csr

==> ambari-agent.out <==
INFO 2013-04-08 22:11:48,440 security.py:150 - Agent certificate not exists, sending sign request
WARNING: can't open config file: /usr/local/ssl/openssl.cnf
Unable to load config info from /usr/local/ssl/openssl.cnf
Connecting to the server at https://ambari:8440...
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 168, in <module>
    main()
  File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 158, in main
    certMan.initSecurity()
  File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 195, in initSecurity
    self.checkCertExists()
  File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 151, in checkCertExists
    self.reqSignCrt()
  File "/usr/lib/python2.6/site-packages/ambari_agent/security.py", line 166, in reqSignCrt
    agent_crt_req_f = open(self.getAgentCrtReqName())
IOError: [Errno 2] No such file or directory: '/var/lib/ambari-agent/keys/hadoop4.csr'
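
One possible reading of the output above: the openssl warnings about /usr/local/ssl/openssl.cnf appear just before the traceback, so the CSR may never have been written, and the later open() then fails with Errno 2. The sketch below is not the ambari_agent code. It only illustrates, under that assumption, how a caller could check an external command's exit status and output file before opening the file, and surface the command's stderr if the file is missing. The name run_and_require_output is hypothetical.

```python
import os
import subprocess


def run_and_require_output(cmd, expected_path):
    """Run cmd (a list of arguments) and confirm that it produced expected_path.

    Hypothetical helper for illustration; not part of ambari_agent.
    Raises RuntimeError carrying the command's stderr if the command exits
    non-zero or the expected file does not exist afterwards.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    _out, err = proc.communicate()
    if proc.returncode != 0 or not os.path.isfile(expected_path):
        raise RuntimeError(
            "command %r failed (exit code %s) or output file %s is missing; stderr: %s"
            % (cmd, proc.returncode, expected_path,
               err.decode("utf-8", "replace").strip())
        )
    return expected_path
```

With a check like this, passing the logged openssl req arguments as cmd and the .csr path as expected_path would report the openssl error text directly instead of a bare "No such file or directory" from a later open().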
                
> Ambari agent will not startup
> -----------------------------
>
>                 Key: AMBARI-1839
>                 URL: https://issues.apache.org/jira/browse/AMBARI-1839
>             Project: Ambari
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.2.1
>         Environment: CentOS 6.4
> JDK - java version "1.6.0_43"
>            Reporter: Andy Alvarado
>             Fix For: 1.3.1
>
>
> I am following the documentation here:  http://incubator.apache.org/ambari/1.2.1/installing-hadoop-using-ambari/content/ambari-chap1.html
> Node registration fails with the following output.
> ***FROM AMBARI CONSOLE
> STDOUT
> STDERR
> STDOUT
> STDERR
> STDOUT
> Verifying Python version compatibility...
> Using python  /usr/bin/python2.6
> Checking for previously running Ambari Agent...
> /var/run/ambari-agent/ambari-agent.pid found with no process. Removing 15877...
> Starting ambari-agent
> Verifying ambari-agent process status...
> ERROR: ambari-agent start failed for unknown reason
> ('hostname: ok hadoop1
> ip: ok 198.23.67.58
> cpu: ok Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
> Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
> ****TRUNCATED FOR SPACE
> memory: ok 62.8843 GB
> disks: ok
>  Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda3             457G  2.1G  432G   1% /
> tmpfs                  32G     0   32G   0% /dev/shm
> /dev/sda1              99M   53M   41M  57% /boot
> os: ok CentOS release 6.4 (Final)
> iptables: ok
>  Chain INPUT (policy ACCEPT 21056 packets, 82M bytes)
>  pkts bytes target     prot opt in     out     source               destination         
> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>  pkts bytes target     prot opt in     out     source               destination         
> Chain OUTPUT (policy ACCEPT 20186 packets, 2041K bytes)
>  pkts bytes target     prot opt in     out     source               destination
> selinux: ok SELINUX=disabled
> SELINUXTYPE=targeted
> yum: ok yum-3.2.29-40.el6.centos.noarch
> rpm: ok rpm-4.8.0-32.el6.x86_64
> openssl: ok openssl-1.0.0-27.el6_4.2.x86_64
> curl: ok curl-7.19.7-35.el6.x86_64
> wget: ok wget-1.12-1.8.el6.x86_64
> net-snmp: UNAVAILABLE
> net-snmp-utils: UNAVAILABLE
> ntpd: UNAVAILABLE
> ruby: UNAVAILABLE
> puppet: UNAVAILABLE
> nagios: UNAVAILABLE
> ganglia: UNAVAILABLE
> passenger: UNAVAILABLE
> hadoop: UNAVAILABLE
> yum_repos: ok
>  HDP-UTILS-1.1.0.15 Hortonworks Data Platform Utils Version - HDP-UTILS-1.    52
> zypper_repos: UNAVAILABLE
> ', None)
> ('INFO 2013-04-08 21:47:18,375 main.py:148 - Connecting to the server at: https://ambari:8440
> INFO 2013-04-08 21:47:18,375 NetUtil.py:69 - DEBUG: Trying to connect to the server at https://ambari:8440
> INFO 2013-04-08 21:47:18,375 NetUtil.py:45 - DEBUG:: Connecting to the following url https://ambari:8440/cert/ca
> INFO 2013-04-08 21:47:18,518 NetUtil.py:52 - DEBUG: Calling url received 200
> INFO 2013-04-08 21:47:18,518 main.py:156 - Creating certs
> INFO 2013-04-08 21:47:18,518 security.py:137 - Server certicate exists, ok
> INFO 2013-04-08 21:47:18,518 security.py:142 - Agent key not exists, generating request
> INFO 2013-04-08 21:47:18,519 security.py:190 - openssl req -new -newkey rsa:1024 -nodes -keyout /var/lib/ambari-agent/keys/hadoop1.key\t-subj /OU=hadoop1/        -out /var/lib/ambari-agent/keys/hadoop1.csr
> INFO 2013-04-08 21:47:18,525 security.py:150 - Agent certificate not exists, sending sign request
> INFO 2013-04-08 22:11:43,254 shell.py:50 - Killing stale processes
> INFO 2013-04-08 22:11:43,254 shell.py:58 - Killed stale processes
> INFO 2013-04-08 22:11:43,255 main.py:148 - Connecting to the server at: https://ambari:8440
> INFO 2013-04-08 22:11:43,255 NetUtil.py:69 - DEBUG: Trying to connect to the server at https://ambari:8440
> INFO 2013-04-08 22:11:43,255 NetUtil.py:45 - DEBUG:: Connecting to the following url https://ambari:8440/cert/ca
> INFO 2013-04-08 22:11:43,423 NetUtil.py:52 - DEBUG: Calling url received 200
> INFO 2013-04-08 22:11:43,423 main.py:156 - Creating certs
> INFO 2013-04-08 22:11:43,423 security.py:137 - Server certicate exists, ok
> INFO 2013-04-08 22:11:43,424 security.py:142 - Agent key not exists, generating request
> INFO 2013-04-08 22:11:43,424 security.py:190 - openssl req -new -newkey rsa:1024 -nodes -keyout /var/lib/ambari-agent/keys/hadoop1.key\t-subj /OU=hadoop1/        -out /var/lib/ambari-agent/keys/hadoop1.csr
> INFO 2013-04-08 22:11:43,431 security.py:150 - Agent certificate not exists, sending sign request
> ', None)
> STDERR
> Connection to hadoop1 closed.
> Registering with the server...
> Registration with the server failed.
> ****FROM THE NODE
> ==> ambari-server.out <==
> INFO: Scanning for root resource and provider classes in the packages:
>   org.apache.ambari.server.security.unsecured.rest
> Apr 8, 2013 10:10:44 PM com.sun.jersey.api.core.ScanningResourceConfig logClasses
> INFO: Root resource classes found:
>   class org.apache.ambari.server.security.unsecured.rest.CertificateSign
>   class org.apache.ambari.server.security.unsecured.rest.CertificateDownload
> Apr 8, 2013 10:10:44 PM com.sun.jersey.api.core.ScanningResourceConfig init
> INFO: No provider classes found.
> Apr 8, 2013 10:10:44 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
> INFO: Initiating Jersey application, version 'Jersey: 1.11 12/09/2011 10:27 AM'
> ==> ambari-server.log <==
> 22:13:23,445  INFO QueryImpl:130 - Executing resource query: {Host=null}
> 22:13:23,445  INFO ClusterControllerImpl:92 - Using resource provider org.apache.ambari.server.controller.internal.HostResourceProvider for request type Host
> 22:13:26,460  INFO QueryImpl:130 - Executing resource query: {Host=null}
> 22:13:26,460  INFO ClusterControllerImpl:92 - Using resource provider org.apache.ambari.server.controller.internal.HostResourceProvider for request type Host
> 22:13:29,469  INFO QueryImpl:130 - Executing resource query: {Host=null}
> 22:13:29,469  INFO ClusterControllerImpl:92 - Using resource provider org.apache.ambari.server.controller.internal.HostResourceProvider for request type Host

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira