Posted to dev@spamassassin.apache.org by John Hardin <jh...@impsec.org> on 2011/10/29 22:10:42 UTC

ruleqa monitoring on spamassassin2.zones.apache.org

All:

I was able to access automc's crontab on spamassassin2.zones.apache.org, 
so I set up two monitors: one that checks to see if the ruleqa daemon is 
currently running (if it isn't then see 
http://wiki.apache.org/spamassassin/InfraNotes for how to restart it), and 
another that attempts to use ntpdate to check the difference between the 
clocks on spamassassin2.zones.apache.org and spamassassin.zones.apache.org 
and warn if the difference gets "too great" (which is yet to be 
intelligently defined).

Both checks are currently run once daily, and will email a warning to the 
dev list if they fail.
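
A minimal sketch of the kind of "is the daemon running" check described above, not the actual script installed in automc's crontab. The pattern `ruleqa` is a hypothetical stand-in, since the daemon's real command line is not given in this thread, and `daemon_is_running` and `check_daemon` are names made up for illustration.

```python
import re
import subprocess

# Hypothetical pattern: the real ruleqa daemon's command line is not given here.
DAEMON_PATTERN = r"ruleqa"


def daemon_is_running(ps_output, pattern=DAEMON_PATTERN):
    """Return True if any command line in the process listing matches pattern."""
    regex = re.compile(pattern)
    return any(regex.search(line) for line in ps_output.splitlines())


def check_daemon():
    """Return a warning string if the daemon looks absent, else None."""
    # 'ps -e -o args' prints the command line of every process.
    # Caveat: if this monitor's own command line contains the pattern,
    # it will match itself, so choose the pattern accordingly.
    listing = subprocess.run(
        ["ps", "-e", "-o", "args"],
        capture_output=True, text=True, check=True,
    ).stdout
    if daemon_is_running(listing):
        return None
    return "WARNING: ruleqa daemon does not appear to be running"
```

A cron job built on this would print the warning only on failure. Cron normally mails any output a job produces, so a silent success and a noisy failure is usually enough to get the once-daily warning email described above.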

As the clocks are successfully synchronized at the moment, I don't have 
any way to test whether the clock drift check will actually work. Perhaps 
after we've gotten a rules update out we could ask someone in Infra to 
temporarily change the clock on one or the other box and see whether 
"ntpdate -q" will actually detect the problem.

Suggestions for a better way to check for clock drift are welcome.
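
One way to exercise the monitor's own logic without touching a production clock is to feed it synthetic "ntpdate -q" output. The sketch below assumes the per-server line looks like `server <addr>, stratum <n>, offset <seconds>, delay <seconds>`. The address, the numbers, the 60-second threshold and the helper names are all made up for illustration. It does not prove that ntpdate itself would report a real skew, only that the script would react to one.

```python
import re

# Matches the signed offset (in seconds) in an assumed 'ntpdate -q' server line.
OFFSET_RE = re.compile(r"\boffset\s+([-+]?\d+(?:\.\d+)?)")


def first_offset(output):
    """Return the first offset found in the output as a float, or None."""
    match = OFFSET_RE.search(output)
    return float(match.group(1)) if match else None


def check_drift(output, threshold_seconds):
    """Return a warning string if the offset is missing or too large, else None."""
    offset = first_offset(output)
    if offset is None:
        return "WARNING: could not find an offset in ntpdate output"
    # Compare the absolute value: the local clock may be ahead or behind.
    if abs(offset) >= threshold_seconds:
        return "WARNING: clock offset is %.1f seconds" % offset
    return None


def fake_output(offset):
    """Hypothetical helper: build a synthetic server line with the given offset."""
    return "server 192.0.2.10, stratum 2, offset %.6f, delay 0.02591\n" % offset
```

For example, `check_drift(fake_output(-125.5), 60)` returns a warning, while `check_drift(fake_output(0.004), 60)` returns `None`.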

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   ...the Fates notice those who buy chainsaws...
                                               -- www.darwinawards.com
-----------------------------------------------------------------------
  2 days until Halloween

Re: ruleqa monitoring on spamassassin2.zones.apache.org

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
On 10/30/2011 1:51 PM, John Hardin wrote:
> On Sun, 30 Oct 2011, John Hardin wrote:
>
>> I suppose there should be a test for that situation, as one way this 
>> could happen again is ntpd on one box or the other dies and the 
>> clocks start drifting... I'll add a check for that.
>
> I must have been oxygen-starved Saturday. Both zones are now checking 
> their clocks against the clock on www.apache.org and will warn if the 
> difference reaches or exceeds ten minutes.
>
New version looks much less O2 deficient ;-)


Regards,
KAM

Re: ruleqa monitoring on spamassassin2.zones.apache.org

Posted by John Hardin <jh...@impsec.org>.
On Sun, 30 Oct 2011, John Hardin wrote:

> I suppose there should be a test for that situation, as one way this could 
> happen again is ntpd on one box or the other dies and the clocks start 
> drifting... I'll add a check for that.

I must have been oxygen-starved Saturday. Both zones are now checking 
their clocks against the clock on www.apache.org and will warn if the 
difference reaches or exceeds ten minutes.


Re: ruleqa monitoring on spamassassin2.zones.apache.org

Posted by John Hardin <jh...@impsec.org>.
On Sun, 30 Oct 2011, Kevin A. McGrail wrote:

>>  another that attempts to use ntpdate to check the difference between the
>>  clocks on spamassassin2.zones.apache.org and spamassassin.zones.apache.org
>>  and warn if the difference gets "too great" (which is yet to be
>>  intelligently defined).
>
> Thanks for doing this.  As for testing it, I'm hoping the time skew is a 
> unique situation that never repeats because I don't see a better way than you 
> devised.
>
> Overall, though, theoretically Infra fixed ntpupdate on the master zone for 
> that box so even if they changed the time, my guess is their ntpupdate would 
> fix it very quickly.

Agreed. Checking for clock drift in an NTP environment is inherently kinda 
silly. :)

> And unfortunately, purposefully changing the time would change all the 
> zones on that box.
>
> So it's not something I would want to even ask to be tested on a 
> production box.

Yeah, there's that, too. However, if the temporary test clock drift was 10 
minutes that would be enough to test the monitor and (he says hopefully) 
not materially affect the other zones - which, I point out, were affected 
when the clock was skewed on its own earlier, right?

> However, I don't know that your system for time skew check will work without 
> the zones1 running ntpd.  Isn't it currently just returning 0 skew because it 
> can't contact zones?

If there wasn't an NTP daemon available on the target host it would 
explicitly say no time daemon could be contacted.

I suppose there should be a test for that situation, as one way this could 
happen again is ntpd on one box or the other dies and the clocks start 
drifting... I'll add a check for that.
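
A sketch of how such a check might tell "no time daemon answered" apart from "zero skew". It assumes that when the target cannot be reached, ntpdate reports the message "no server suitable for synchronization found" and may still print a server line with stratum 0 and an offset of 0.000000. Both sample shapes are assumptions and should be confirmed against real output on the zones. `classify` is a made-up name, and the caller is assumed to pass stdout and stderr combined.

```python
import re

# Assumed per-server line: "server <addr>, stratum <n>, offset <s>, delay <s>"
SERVER_RE = re.compile(
    r"^server\s+\S+,\s+stratum\s+(\d+),\s+offset\s+([-+]?\d+(?:\.\d+)?),",
    re.MULTILINE,
)
# Assumed failure message printed when no NTP daemon responds.
NO_SERVER_TEXT = "no server suitable for synchronization found"


def classify(output):
    """Return ('unreachable', None) or ('ok', offset_in_seconds)."""
    if NO_SERVER_TEXT in output:
        return ("unreachable", None)
    # Ignore stratum 0 entries: they are assumed to mean "no usable reply",
    # so their 0.000000 offset must not be read as a perfect clock.
    usable = [
        float(offset)
        for stratum, offset in SERVER_RE.findall(output)
        if int(stratum) > 0
    ]
    if not usable:
        return ("unreachable", None)
    return ("ok", usable[0])
```

Treating "unreachable" as its own warning condition means a dead ntpd on the reference host produces an alert instead of a silently reported offset of zero.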


Re: ruleqa monitoring on spamassassin2.zones.apache.org

Posted by "Kevin A. McGrail" <KM...@PCCC.com>.
> I was able to access automc's crontab on 
> spamassassin2.zones.apache.org, so I set up two monitors: one that 
> checks to see if the ruleqa daemon is currently running (if it isn't 
> then see http://wiki.apache.org/spamassassin/InfraNotes for how to 
> restart it), and another that attempts to use ntpdate to check the 
> difference between the clocks on spamassassin2.zones.apache.org and 
> spamassassin.zones.apache.org and warn if the difference gets "too 
> great" (which is yet to be intelligently defined).
>
> Both checks are currently run once daily, and will email a warning to 
> the dev list if they fail.
>
> As the clocks are successfully synchronized at the moment, I don't 
> have any way to test whether the clock drift check will actually work. 
> Perhaps after we've gotten a rules update out we could ask someone in 
> Infra to temporarily change the clock on one or the other box and see 
> whether "ntpdate -q" will actually detect the problem.
>
> Suggestions for a better way to check for clock drift are welcome.
>
Hi John,

Thanks for doing this.  As for testing it, I'm hoping the time skew is a 
unique situation that never repeats because I don't see a better way 
than you devised.

Overall, though, theoretically Infra fixed ntpupdate on the master zone 
for that box so even if they changed the time, my guess is their 
ntpupdate would fix it very quickly.   And unfortunately, purposefully 
changing the time would change all the zones on that box.

So it's not something I would want to even ask to be tested on a 
production box.

However, I don't know that your system for time skew check will work 
without the zones1 running ntpd.  Isn't it currently just returning 0 skew 
because it can't contact zones?

regards,
kAM