Posted to user@ambari.apache.org by Greg Hill <gr...@RACKSPACE.COM> on 2016/02/03 19:55:10 UTC

custom action times out even though it never even started

In our Ambari setup, we inject some custom actions.  Generally this has worked well, but lately I've been testing a specific one and the behavior I'm seeing confuses me.  We have one custom action that downloads a script from a URL and runs it.  However, even though I set the timeout on the script to an hour, it randomly "times out" on specific servers in a matter of seconds, before it even attempts to run the script on that host.  Is there a separate timeout, based on whether ambari-agent starts running the action within a certain amount of time, that I need to tweak here?  It's random, but it usually affects at least one host in the cluster.

I should note that this is done after I restart the Ambari server, so it's possible that the agent hasn't fully re-established communications.  Should I check the host status before posting my Request to run the script to make sure it has gotten back to HEALTHY?

Ambari 2.1.1 if that matters (we're going to update to 2.2.1 when it's out).

Any help appreciated here.

Greg


Re: custom action times out even though it never even started

Posted by Greg Hill <gr...@RACKSPACE.COM>.
Thanks.  I added some code to wait until the agents returned to HEALTHY state and it seems to be a lot more reliable now.

Greg

From: Sumit Mohanty <sm...@hortonworks.com>
Reply-To: user@ambari.apache.org
Date: Wednesday, February 3, 2016 at 11:11 PM
To: user@ambari.apache.org
Subject: Re: custom action times out even though it never even started


Generally, if a host is in the heartbeat-lost or unknown state, commands sent to it time out immediately. Adding a health check will certainly help.



Re: custom action times out even though it never even started

Posted by Sumit Mohanty <sm...@hortonworks.com>.
Generally, if a host is in the heartbeat-lost or unknown state, commands sent to it time out immediately. Adding a health check will certainly help.
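
For readers who want to try the health check discussed in this thread, here is a minimal sketch of one way to poll host status before posting the request. It is not the code Greg actually used. It assumes the Ambari REST API exposes per-host status at /api/v1/clusters/<cluster>/hosts with a Hosts/host_status field, and the function names, parameters, and default timings are all hypothetical.

```python
import base64
import json
import time
import urllib.request


def fetch_host_statuses(base_url, cluster, user, password):
    """Return {host_name: host_status} for a cluster.

    Assumption: the endpoint and the Hosts/host_status field below match
    your Ambari version; verify against your server before relying on it.
    """
    url = f"{base_url}/api/v1/clusters/{cluster}/hosts?fields=Hosts/host_status"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={
        "Authorization": f"Basic {token}",
        "X-Requested-By": "ambari",
    })
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return {
        item["Hosts"]["host_name"]: item["Hosts"].get("host_status", "UNKNOWN")
        for item in body.get("items", [])
    }


def wait_until_healthy(fetch, hosts, timeout=300.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until every host in hosts reports HEALTHY.

    fetch is any zero-argument callable returning {host_name: status}.
    Returns the set of hosts that are still not HEALTHY when the deadline
    passes; an empty set means every host is ready.
    """
    deadline = clock() + timeout
    while True:
        statuses = fetch()
        # A host missing from the response is treated as not ready.
        pending = {h for h in hosts if statuses.get(h) != "HEALTHY"}
        if not pending or clock() >= deadline:
            return pending
        sleep(interval)
```

A caller would run something like wait_until_healthy(lambda: fetch_host_statuses(url, cluster, user, password), hosts) after restarting the Ambari server, and post the custom action request only if the returned set is empty. Passing the fetch step in as a callable keeps the polling loop easy to test without a live server.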
