Posted to builds@apache.org by Jukka Zitting <ju...@gmail.com> on 2010/04/19 09:48:45 UTC
[hudson] Slaves going offline
Hi,
Every now and then I see Hudson marking slaves as offline due to "Ping
response time is too long or timed out", but when I check the slave
it's still running OK. The problem seems to be caused by extra load
on the master [1] or perhaps by the slave-status plugin [2].
Do we need the slave-status plugin for anything? If not, I'd like to
disable it for now to see if that makes a difference.
We may also want to review the other plugins we have installed. Do we
need them all?
[1] http://issues.hudson-ci.org/browse/HUDSON-6196
[2] http://www.echelog.com/logs/browse/hudson/1268866800
BR,
Jukka Zitting
Re: [hudson] Slaves going offline
Posted by Niklas Gustavsson <ni...@protocol7.com>.
On Mon, Apr 19, 2010 at 9:48 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Do we need the slave-status plugin for anything? If not, I'd like to
> disable it for now to see if that makes a difference.
We use it to monitor the health of Hudson on the Windows slave (the
slave has had a tendency to crash) using the regular Nagios
monitoring. So, if we disable the slave-status plugin, we need to make
sure to disable the Nagios check as well.
In the long run, we need a way to monitor at least the Windows slave.
But in the short run I have no issue with trying to disable the plugin
to see if it's causing these problems.
/niklas