Posted to user@ambari.apache.org by Greg Hill <gr...@RACKSPACE.COM> on 2016/02/19 15:59:57 UTC

Making the agent retry custom actions

This is for Ambari 2.1.1, so apologies if this has since been fixed. We saw a failure today in one of our custom actions, caused by a temporary network hiccup:

Caught an exception while executing custom service command: <class 'ambari_agent.FileCache.CachingException'>: Can not download file from url https://ambari.local:443/resources//custom_actions/.hash : <urlopen error timed out>; Can not download file from url https://ambari.local:443/resources//custom_actions/.hash : <urlopen error timed out>

Is there some way to tell the agent not to fail here and instead keep retrying until it can download the file from the server? If it takes too long, we'll handle timing out the build and cleaning up ourselves.

The 'tolerate_download_failures' setting doesn't trigger a retry; it just makes the agent fall back to its local cache. The file isn't in the local cache yet, so with that setting enabled the action fails with a file-missing exception instead.
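
To make the request concrete, the behavior we're after looks roughly like the sketch below. This is plain Python 3 using urllib, not Ambari agent code: download_with_retry, fetch_url, and every parameter and default here are hypothetical names chosen for illustration, and the real agent may structure its downloads quite differently.

```python
import time
import urllib.request


def fetch_url(url, timeout=10):
    """Download url and return the body as bytes (plain urllib, not Ambari code)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_retry(url, fetch=fetch_url, delay=5.0, max_delay=60.0,
                        sleep=time.sleep):
    """Call fetch(url) until it succeeds, sleeping between attempts.

    Retries indefinitely on OSError, which covers urllib.error.URLError and
    socket timeouts. The wait doubles after each failure, capped at max_delay.
    All names and defaults are assumptions made for this sketch.
    """
    while True:
        try:
            return fetch(url)
        except OSError as err:
            print("download of %s failed (%s); retrying in %.0fs" % (url, err, delay))
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

There is deliberately no attempt limit in the loop, since the caller (our build, in this case) would be the one to enforce an overall timeout and clean up.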

Greg