Posted to issues@ambari.apache.org by "Myroslav Papirkovskyi (JIRA)" <ji...@apache.org> on 2017/02/10 14:29:41 UTC

[jira] [Resolved] (AMBARI-19930) The service check status was set to TIMEOUT even if service check was failed

     [ https://issues.apache.org/jira/browse/AMBARI-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Myroslav Papirkovskyi resolved AMBARI-19930.
--------------------------------------------
    Resolution: Fixed

Pushed to trunk and branch-2.5

> The service check status was set to TIMEOUT even if service check was failed
> ----------------------------------------------------------------------------
>
>                 Key: AMBARI-19930
>                 URL: https://issues.apache.org/jira/browse/AMBARI-19930
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Myroslav Papirkovskyi
>
> Steps to reproduce:
> * Install a cluster with Hadoop, Tez, HBase, Hive, Spark
> * Enable Wire encryption
> * Run Tez service check
> Here, agent.service.check.task.timeout is set to 600 sec. The Tez application was started in the background. The service check then looks for the SUCCESS file for only a couple of minutes. In this particular instance, the application took 5 minutes to run, so the check for the SUCCESS file on HDFS failed.
> In this scenario, the status of the service check should be FAILED instead of TIMEDOUT.
> {code}
> stderr:   /var/lib/ambari-agent/data/errors-370.txt
> stdout:   /var/lib/ambari-agent/data/output-370.txt
> 2017-02-08 03:55:55,017 - HdfsResource['/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz'] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'source': '/usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz', 'dfs_type': '', 'default_fs': 'hdfs://host:8020', 'replace_existing_files': False, 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs@EXAMPLE.COM', 'user': 'hdfs', 'owner': 'hdfs', 'group': 'hadoop', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'file', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0444}
> 2017-02-08 03:55:55,017 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM'] {'user': 'hdfs'}
> 2017-02-08 03:55:55,096 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET --negotiate -u : -k '"'"'https://host:50470/webhdfs/v1/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmpoIadeN 2>/tmp/tmp6nFiLj''] {'logoutput': None, 'quiet': False}
> 2017-02-08 03:55:55,292 - call returned (0, '')
> 2017-02-08 03:55:55,293 - DFS file /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz is identical to /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz, skipping the copying
> 2017-02-08 03:55:55,293 - Will attempt to copy tez tarball from /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz to DFS at /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz.
> 2017-02-08 03:55:55,293 - HdfsResource[None] {'security_enabled': True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab', 'dfs_type': '', 'default_fs': 'hdfs://host:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs@EXAMPLE.COM', 'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp']}
> 2017-02-08 03:55:55,294 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/smokeuser.headless.keytab ambari-qa-cl1@EXAMPLE.COM;'] {'user': 'ambari-qa'}
> 2017-02-08 03:55:55,389 - ExecuteHadoop['jar /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'try_sleep': 5, 'tries': 3, 'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'user': 'ambari-qa', 'conf_dir': '/usr/hdp/current/hadoop-client/conf'}
> 2017-02-08 03:55:55,390 - Execute['hadoop --config /usr/hdp/current/hadoop-client/conf jar /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'logoutput': None, 'try_sleep': 5, 'environment': {}, 'tries': 3, 'user': 'ambari-qa', 'path': ['/usr/hdp/current/hadoop-client/bin']}{code}
> {code}
> Requests: {
> aborted_task_count: 0,
> cluster_name: "cl1",
> completed_task_count: 1,
> create_time: 1486526151743,
> end_time: 1486526463038,
> exclusive: false,
> failed_task_count: 0,
> id: 29,
> inputs: "{}",
> operation_level: null,
> progress_percent: 100,
> queued_task_count: 0,
> request_context: "WE API TEZ Service Check",
> request_schedule: null,
> request_status: "TIMEDOUT",
> resource_filters: [
> {
> service_name: "TEZ"
> }
> ],
> start_time: 1486526151751,
> task_count: 1,
> timed_out_task_count: 1,
> type: "COMMAND"
> },{code}
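
The timestamps in the request output above can be checked against the configured timeout. The following is a minimal sketch, not Ambari's implementation: the function names are hypothetical, the status label COMPLETED is assumed for illustration, and only the 600-second timeout, the two timestamps, and the TIMEDOUT label come from the report. It shows that the request ended well before the task timeout, which is why FAILED is the expected status.

```python
# Hypothetical sketch; these names are illustrative and are not Ambari's API.

# agent.service.check.task.timeout, as stated in the report (seconds).
TASK_TIMEOUT_SEC = 600

# start_time and end_time (milliseconds since epoch) from the request output.
START_TIME_MS = 1486526151751
END_TIME_MS = 1486526463038


def elapsed_seconds(start_ms, end_ms):
    """Return the wall-clock duration of the request in seconds."""
    return (end_ms - start_ms) / 1000.0


def expected_status(check_succeeded, elapsed_sec, timeout_sec):
    """Classify a finished service check.

    TIMEDOUT is reserved for tasks that ran for the whole timeout;
    a check that gave up earlier without success is FAILED.
    """
    if check_succeeded:
        return "COMPLETED"  # assumed label for a successful check
    if elapsed_sec >= timeout_sec:
        return "TIMEDOUT"
    return "FAILED"


elapsed = elapsed_seconds(START_TIME_MS, END_TIME_MS)
status = expected_status(False, elapsed, TASK_TIMEOUT_SEC)
print("elapsed: %.1f s, expected status: %s" % (elapsed, status))
```

With the values from this report, the request ran for about 311 seconds, roughly half of the 600-second timeout, so the sketch classifies it as FAILED rather than TIMEDOUT.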



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)