Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2011/10/09 06:41:14 UTC

[Bug 6671] New: Updates not happening due to lack of bb corpora since August 27th

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

             Bug #: 6671
           Summary: Updates not happening due to lack of bb corpora since
                    August 27th
           Product: Spamassassin
           Version: 3.4.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: Masses
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: Darxus@ChaosReigns.com
    Classification: Unclassified


Yesterday's weekly net mass-check again didn't have enough non-spam for score
regeneration to happen:
http://www.chaosreigns.com/dnswl/tot.svg
(We had 102,436 out of the needed 150,000 non-spams, 68.3%.)
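The percentage above can be checked with a short sketch. The function name is hypothetical; only the two figures (102,436 and 150,000) come from the report.

```python
# Figures taken from the report above; REQUIRED_HAM is the threshold it cites.
REQUIRED_HAM = 150_000


def ham_coverage(ham_count, required=REQUIRED_HAM):
    """Return (percent of the threshold reached, messages still missing)."""
    percent = round(100.0 * ham_count / required, 1)
    shortfall = max(required - ham_count, 0)
    return percent, shortfall


percent, shortfall = ham_coverage(102_436)
print(f"{percent}% of threshold, {shortfall:,} non-spam short")
```

The same function gives the gap in absolute terms, which is the number that new ham contributions would have to cover.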

And again, none of the bb corpora showed up in ruleqa:
http://ruleqa.spamassassin.org/?daterev=20111008

I'm guessing there is a direct relationship.

bb corpora are the ones where people upload their emails and mass-check is run
on a spamassassin server.  Can somebody look into why these aren't making it to
ruleqa?  

Is this sufficiently documented somewhere?

Missing ham-net-bb-guenther_fraud - last seen 20110820.
Missing ham-net-bb-jhardin - last seen 20110820.
Missing ham-net-bb-jhardin_fraud - last seen 20110820.
Missing ham-net-bb-jm - last seen 20110820.

It looks like ruleqa didn't run on 2011-08-27, and the bb corpora haven't been
included since.  That does correspond exactly to when net runs dropped below
the 150,000 non-spam threshold.

Last time they were included: http://ruleqa.spamassassin.org/?daterev=20110820
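The script that produced the "Missing ... last seen" lines is not shown in this thread. A minimal sketch of that kind of check, assuming the log names for each ruleqa run have already been collected into a dictionary (the function name and the run contents below are hypothetical), might look like this:

```python
def missing_corpora(runs, latest):
    """Map each corpus absent from `latest` to the last daterev that had it.

    `runs` maps a daterev string (YYYYMMDD) to the set of log names seen in it.
    """
    current = runs[latest]
    last_seen = {}
    for daterev in sorted(runs):          # YYYYMMDD strings sort chronologically
        if daterev >= latest:
            break
        for corpus in runs[daterev]:
            last_seen[corpus] = daterev   # later runs overwrite earlier ones
    return {c: d for c, d in last_seen.items() if c not in current}


# Illustrative data only: names follow the report's pattern, run contents are invented.
runs = {
    "20110820": {"ham-net-bb-jm", "ham-net-bb-jhardin", "ham-net-darxus"},
    "20111008": {"ham-net-darxus"},
}
for corpus, seen in sorted(missing_corpora(runs, "20111008").items()):
    print(f"Missing {corpus} - last seen {seen}.")
```

Tracking the last run in which each corpus appeared, rather than only comparing two runs, is what makes the "last seen" date available.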

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #28 from Darxus <Da...@ChaosReigns.com> 2011-10-24 14:55:23 UTC ---
(In reply to comment #27)
> The time issue was fixed.

Nice.  I don't suppose they set up ntpd / ntpdate, by any chance?

Today (2011-10-24) and yesterday (2011-10-23) are the first days in a while
(2011-08-24?) that the nightly (non-net) ruleqa output includes non-bb
corpora.  That's encouraging.

> The date for the Release Candidate is an estimate.  I'm far more worried about
> the rules operation.

Sure.  And if everybody were working on rule generation I wouldn't have brought
it up.  So I'm wondering what else we need to do for a release while we wait to
see if rule generation works this coming Saturday.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

John Hardin <jh...@impsec.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #33 from John Hardin <jh...@impsec.org> 2011-10-30 23:36:05 UTC ---
Okay, fixing the time discrepancy between the zones seems to have revived
masscheck producing rule updates. Now we just need to figure out why
72_scores.cf is being omitted from the update tarball... (bug #6644)

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

John Hardin <jh...@impsec.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jhardin@impsec.org

--- Comment #1 from John Hardin <jh...@impsec.org> 2011-10-09 16:45:57 UTC ---
Data point - a snapshot of the current ruleqa output:

1180567: 2011-10-09 05:57:14
khopesh: auto-generated rules

   20111009-r1180567-n
   bb-jhardin_fraud bb-jm danmcdonald darxus-trap darxus grenier jarif kgolding
   llanga wt-ackbar wt-en1 wt-en2-flh wt-en3 wt-hamtrap wt-homeone wt-jp1 [+]

(...Network masscheck omitted here...)

1179962: 2011-10-07 05:57:18
khopesh: auto-generated rules

   20111009-r1179962-n
   bb-guenther_fraud bb-jhardin [+] 


How is the masscheck from 2011-10-07 using logs dated 20111009?

I suspect two processes are running, and the one that posts last is using a
smaller corpus. I wager the large-corpus "1180567: 2011-10-09 05:57:14"
masscheck results will soon be overwritten by another masscheck using only the
fraud corpora.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #32 from Kevin A. McGrail <km...@pccc.com> 2011-10-25 13:53:21 UTC ---
> Okay, it indeed looks like the clock variance is what was causing the masscheck
> results to be discarded. It's now got a couple of days of full results that it
> hasn't destroyed.

Good call to Darxus.  I never would have checked without his impetus.


> I'll see if I can set up some monitoring tasks in automc's cron. If I can get
> that working should notifications be sent to the ruleqa list or the dev list?

My $0.02 is Dev, please.  RuleQA should be low volume. And people on Dev will
likely know how to use a rule to filter things...

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #2 from John Hardin <jh...@impsec.org> 2011-10-09 17:09:38 UTC ---
From the automc/freqsd log it looks like it's deciding to go back and overwrite
older results outputs. I don't know if this is normal behavior; I'm grabbing
the log so that I can perform more analysis locally.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #22 from Darxus <Da...@ChaosReigns.com> 2011-10-21 20:54:57 UTC ---
Why aren't those interested people contributing yet?

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #29 from AXB <ax...@gmail.com> 2011-10-24 14:58:23 UTC ---
(In reply to comment #28)

> > The date for the Release Candidate is an estimate.  I'm far more worried about
> > the rules operation.
> 
> Sure.  And if everybody were working on rule generation I wouldn't have brought
> it up.  So I'm wondering what else we need to do for a release while we wait to
> see if rule generation works this coming Saturday.

See that trunk gets the scores file for the auto-promoted rules, so that they
are not scored 1.0 because the file is missing.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #8 from Kevin A. McGrail <km...@pccc.com> 2011-10-19 20:03:05 UTC ---
(In reply to comment #7)
> Time looks interesting, BTW:
> 
> zones: Wed Oct 19 19:48:38 GMT 2011
> zones2: Wed Oct 19 20:51:45 UTC 2011
> 
> Far as I know GMT and UTC are identical zones, yes?
> 
> I'm getting in touch with Infra now.

Jira ticket open: https://issues.apache.org/jira/browse/INFRA-4054

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

Darxus <Da...@ChaosReigns.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Darxus@ChaosReigns.com

--- Comment #4 from Darxus <Da...@ChaosReigns.com> 2011-10-17 00:05:37 UTC ---
Yesterday was the eighth week without a -net run including the bb corpora.  (In
the final output.)

Who has sufficient access to look at this?

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #30 from Kevin A. McGrail <km...@pccc.com> 2011-10-24 15:20:49 UTC ---
(In reply to comment #28)
> (In reply to comment #27)
> > The time issue was fixed.
> 
> Nice.  I don't suppose they set up ntpd / ntpdate, by any chance?

NTPD is supposed to be on the master zone but they gave me no feedback as to
the technical nature of the fix beyond:

Zones and Zones2 are on two different hardware virtual machines.

Ntpd/ntpupdate ONLY run on the master zone not on virtual zones.

So my assumption is they fixed ntpd on the master zone.

> 
> Today (2011-10-24) and yesterday (2011-10-23) are the first days in a while
> (2011-08-24?) that the nightly (non-net) ruleqa output includes non-bb corpora.
>  That's encouraging.

Excellent.  That was my prediction/hope.  I will sacrifice an intern to appease
the computer gods if needed ;-)

> 
> > The date for the Release Candidate is an estimate.  I'm far more worried about
> > the rules operation.
> 
> Sure.  And if everybody were working on rule generation I wouldn't have brought
> it up.  So I'm wondering what else we need to do for a release while we wait to
> see if rule generation works this coming Saturday.

Good question.  I'll look.  Off-hand, Mark has done a great job of moving the
project towards a release with IPv6 support.  I'll respond to dev about that.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #20 from Darxus <Da...@ChaosReigns.com> 2011-10-21 20:25:52 UTC ---
(In reply to comment #19)
> > Missing ham-net-bb-guenther_fraud
> 
> Since this has been mentioned quite a few times: Please do note that this is a
> hand classified corpus of *fraud* spam, intended for the SOUGHT_FRAUD rule-set.
> 
> Naturally, the mentioned ham counterpart to the fraud corpus does not exist,

I realize that.  The ruleqa page does actually say the ham corpus existed:

ham-net-bb-guenther_fraud.20110820-r1159860-n.log:
started: 20110820T090310Z;
submitted: 20110820T090048Z;
size: 4547 bytes

It was there, although empty.  It is now missing.  What that indicates is that
either the ham or spam part is missing, or both.  To avoid redundancy, I
generally list only the ham half of each missing corpus from my script's
output.

So when I say "ham-net-bb-guenther_fraud" is missing, what I mean is that some
part of "bb-guenther_fraud" is missing.  Sorry that bothers you.  I could start
stripping the ham-/spam- part off, but I'm afraid that could result in missing
interesting output.


(In reply to comment #17)
> (In reply to comment #16)
> > But I am a little concerned that near a quarter of the ham we've been using for
> > ruleqa / score generation is about 3-4 years old, from jm.
> 
> We will get more!  Though I will state that we are mostly pretty good at

How?

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #21 from Kevin A. McGrail <km...@pccc.com> 2011-10-21 20:31:57 UTC ---
> > We will get more!  Though I will state that we are mostly pretty good at
> 
> How?

We have lots of people interested in contributing.  My corpora aren't even
included at the moment because something with rsync broke.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #15 from Darxus <Da...@ChaosReigns.com> 2011-10-21 18:30:07 UTC ---
We have bigger problems:  Bringing back the bb corpora will not be sufficient
to enable score generation / rule updates.

As of the last net run, we have 38,740 fewer hams than required.  Only hams no
more than 2 months old are used.  Of the missing bb corpora, number of hams in
the last 3 months when last seen:

bb-guenther_fraud  0 (all spam)
bb-jhardin         471
bb-jhardin_fraud   0 (all spam)
bb-jm              0 - most recent is 2010-11

So once we get bb corpora included again, we'll still be short at least 38,269
hams, having 74.5% of the required 150,000 no more than 2 months old.

The reason I gave counts over the last three months is, for some reason, when
the bb corpora were last included, ham counts were including ham back to around
2006: 
http://ruleqa.spamassassin.org/20110820-r1159860-n/RCVD_IN_XBL/detail?s_corpus=1#corpus
For example, it says there were 70,329 hams in bb-jm, and if you look at the
yearly / monthly counts, that would need to include hams back to 2006.  Counts
in the latest net run make sense for only including the last 2 months of ham. 
This is particularly weird because even before the age threshold was changed
for bug #6557 to match score generation, the threshold was still only 6 months.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #17 from Kevin A. McGrail <km...@pccc.com> 2011-10-21 18:53:47 UTC ---
(In reply to comment #16)
> Please disregard my last comment.  I got the ham and spam age limits backwards. 
> 
> But I am a little concerned that near a quarter of the ham we've been using for
> ruleqa / score generation is about 3-4 years old, from jm.

We will get more!  Though I will state that we are mostly pretty good at
guessing what the limits should be for a lot of rules.  AutoMC isn't the end of
the world.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #12 from Kevin A. McGrail <km...@pccc.com> 2011-10-19 20:10:46 UTC ---
(In reply to comment #11)
> (In reply to comment #8)
> > Jira ticket open: https://issues.apache.org/jira/browse/INFRA-4054
> 
> Did ntpdate not work because the time was too far off?  I know it'll do that. 
> I think -b is the flag you want to force it.

This is the error I got:

19 Oct 21:13:05 ntpdate[17267]: Can't set time of day: Not owner

Since I think this is a virtualized box, I think it has to go upstream.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #19 from Karsten Bräckelmann <gu...@rudersport.de> 2011-10-21 20:09:23 UTC ---
> Missing ham-net-bb-guenther_fraud

Since this has been mentioned quite a few times: Please do note that this is a
hand classified corpus of *fraud* spam, intended for the SOUGHT_FRAUD rule-set.

Naturally, the mentioned ham counterpart to the fraud corpus does not exist,
and is not really expected to, the only exception being that it occasionally
holds a very few ham samples, purely to prevent FPs -- e.g. forged facebook
notifications.

Please stop mentioning that this corpus is "lacking" recent ham.

Same goes for the jhardin fraud corpus, I believe.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #16 from Darxus <Da...@ChaosReigns.com> 2011-10-21 18:50:35 UTC ---
Please disregard my last comment.  I got the ham and spam age limits backwards. 

But I am a little concerned that near a quarter of the ham we've been using for
ruleqa / score generation is about 3-4 years old, from jm.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #13 from Darxus <Da...@ChaosReigns.com> 2011-10-19 20:21:48 UTC ---
Sounds like this is running on Solaris?  And a "zone" is a Solaris virtual
machine.  So it makes sense that the "global zone administrator" (someone with
access to the host OS) would need to fix this.
http://hub.opensolaris.org/bin/view/Community+Group+zones/faq#HQ:CanazonebeanNTPclientorserver3F

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #14 from Kevin A. McGrail <km...@pccc.com> 2011-10-19 20:23:55 UTC ---
(In reply to comment #13)
> Sounds like this is running on Solaris?  And a "zone" is a Solaris virtual
> machine.  So it makes sense that the "global zone administrator" (someone with
> access to the host OS) would need to fix this.
> http://hub.opensolaris.org/bin/view/Community+Group+zones/faq#HQ:CanazonebeanNTPclientorserver3F

That was my take as well.  Already kicked up to ASF Infra via a Jira ticket.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #18 from AXB <ax...@gmail.com> 2011-10-21 19:05:41 UTC ---
(In reply to comment #17)
> (In reply to comment #16)
> > Please disregard my last comment.  I got the ham and spam age limits backwards. 
> > 
> > But I am a little concerned that near a quarter of the ham we've been using for
> > ruleqa / score generation is about 3-4 years old, from jm.
> 
> We will get more!  Though I will state that we are mostly pretty good at
> guessing what the limits should be for a lot of rules.  AutoMC isn't the end of
> the world.


The CID img thing and its issues go way back to the SARE days, when stock
spam was full of them.

I'll re-enable the old SARE auto masschecker instance (for a chosen few) so
rules can be tested before they're put on sandbox.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #11 from Darxus <Da...@ChaosReigns.com> 2011-10-19 20:06:20 UTC ---
(In reply to comment #8)
> Jira ticket open: https://issues.apache.org/jira/browse/INFRA-4054

Did ntpdate not work because the time was too far off?  I know it'll do that. 
I think -b is the flag you want to force it.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #10 from Kevin A. McGrail <km...@pccc.com> 2011-10-19 20:05:32 UTC ---
(In reply to comment #9)
> (In reply to comment #7)
> > zones: Wed Oct 19 19:48:38 GMT 2011
> > zones2: Wed Oct 19 20:51:45 UTC 2011
> 
> I was hoping it would be farther off.
> 
> > Far as I know GMT and UTC are identical zones, yes?
> 
> Yup.  Difference is basically leap seconds.
> http://geography.about.com/od/timeandtimezones/a/gmtutc.htm

Time was a very good thought.

As I recall, there are some safety valves in the cron jobs, so if the time is
off it could definitely mess with masscheck.  I use ntpdate on all my boxes so
I never even thought about time being wrong.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #9 from Darxus <Da...@ChaosReigns.com> 2011-10-19 20:03:51 UTC ---
(In reply to comment #7)
> zones: Wed Oct 19 19:48:38 GMT 2011
> zones2: Wed Oct 19 20:51:45 UTC 2011

I was hoping it would be farther off.

> Far as I know GMT and UTC are identical zones, yes?

Yup.  Difference is basically leap seconds.
http://geography.about.com/od/timeandtimezones/a/gmtutc.htm

Re: [Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by John Hardin <jh...@impsec.org>.
On Sun, 23 Oct 2011, bugzilla-daemon@issues.apache.org wrote:

> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671
>
> --- Comment #26 from Darxus <Da...@ChaosReigns.com> 2011-10-23 15:08:01 UTC ---
> Yesterday was the 9th week that rule updates didn't happen due to this problem.
>
> It's been 3.8 days since a ticket was opened for Apache Infrastructure to
> correctly set the time on the two relevant machines, which might fix it, with
> no response at all.

I just dropped a note to infrastructure@apache.org, we'll see if that gets 
any attention.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The United States has become a place where entertainers and
   professional athletes are mistaken for people of importance.
                                         -- Maureen Johnson Smith Long
-----------------------------------------------------------------------
  318 days since the first successful private orbital launch (SpaceX)

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #26 from Darxus <Da...@ChaosReigns.com> 2011-10-23 15:08:01 UTC ---
Yesterday was the 9th week that rule updates didn't happen due to this problem.

It's been 3.8 days since a ticket was opened for Apache Infrastructure to
correctly set the time on the two relevant machines, which might fix it, with
no response at all.

And 23 days since a SA v3.4.0 Release Candidate was supposed to be released,
that's being held up, at least in part, by this problem.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #3 from John Hardin <jh...@impsec.org> 2011-10-15 22:53:37 UTC ---
(In reply to comment #1)
> Data point - a snapshot of the current ruleqa output:
> 
> 1180567: 2011-10-09 05:57:14
> khopesh: auto-generated rules
> 
>    20111009-r1180567-n
>    bb-jhardin_fraud bb-jm danmcdonald darxus-trap darxus grenier jarif
>    kgolding llanga wt-ackbar wt-en1 wt-en2-flh wt-en3 wt-hamtrap wt-homeone
>    wt-jp1 [+]

> I wager the large-corpus "1180567: 2011-10-09 05:57:14" masscheck
> results will soon be overwritten by another masscheck

Yup:

1180567: 2011-10-09 05:57:14
khopesh: auto-generated rules

   20111010-r1180567-n
   bb-guenther_fraud bb-jhardin [+]

   [-]
   ham-bb-guenther_fraud.20111010-r1180567-n.log:
   started: 20111010T090246Z;
   submitted: 20111010T080113Z;
   size: 4402 bytes

   ham-bb-jhardin.20111010-r1180567-n.log:
   started: 20111010T090522Z;
   submitted: 20111010T110436Z;
   size: 9011673 bytes

   spam-bb-guenther_fraud.20111010-r1180567-n.log:
   started: 20111010T090246Z;
   submitted: 20111010T080113Z;
   size: 1702133 bytes

   spam-bb-jhardin.20111010-r1180567-n.log:
   started: 20111010T090522Z;
   submitted: 20111010T110436Z;
   size: 1604099 bytes

(end of corpus list. bb-jm et al. are _gone_ now)

I still don't know enough about this to figure out why it's overwriting or
deleting old results.
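One detail in the listing above: both bb-guenther_fraud logs carry a "submitted" stamp about an hour earlier than their "started" stamp, while the bb-jhardin logs do not. Whether that is related to the overwriting is an assumption, not something this comment establishes. A minimal sketch for flagging such stanzas, assuming the listing has been saved as plain text (the regular expression and function name are hypothetical):

```python
import re
from datetime import datetime

# Matches one stanza of the ruleqa log listing as it appears above.
STANZA = re.compile(
    r"(?P<name>\S+\.log):\s*"
    r"started:\s*(?P<started>\d{8}T\d{6}Z);\s*"
    r"submitted:\s*(?P<submitted>\d{8}T\d{6}Z);"
)
STAMP = "%Y%m%dT%H%M%SZ"


def submitted_before_started(listing):
    """Yield (log name, seconds) where `submitted` precedes `started`."""
    for m in STANZA.finditer(listing):
        started = datetime.strptime(m["started"], STAMP)
        submitted = datetime.strptime(m["submitted"], STAMP)
        if submitted < started:
            yield m["name"], int((started - submitted).total_seconds())


listing = """
ham-bb-guenther_fraud.20111010-r1180567-n.log:
started: 20111010T090246Z;
submitted: 20111010T080113Z;
size: 4402 bytes

ham-bb-jhardin.20111010-r1180567-n.log:
started: 20111010T090522Z;
submitted: 20111010T110436Z;
size: 9011673 bytes
"""
for name, gap in submitted_before_started(listing):
    print(name, gap)
```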

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #27 from Kevin A. McGrail <km...@pccc.com> 2011-10-24 12:56:10 UTC ---
(In reply to comment #26)
> Yesterday was the 9th week that rule updates didn't happen due to this problem.

That concurs closely with my timing, yes.

> It's been 3.8 days since a ticket was opened for Apache Infrastructure to
> correctly set the time on the two relevant machines, which might fix it, with
> no response at all.

The time issue was fixed.

> And 23 days since a SA v3.4.0 Release Candidate was supposed to be released,
> that's being held up, at least in part, by this problem.

The date for the Release Candidate is an estimate.  I'm far more worried about
the rules operation.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #7 from Kevin A. McGrail <km...@pccc.com> 2011-10-19 19:56:58 UTC ---
Time looks interesting, BTW:

zones: Wed Oct 19 19:48:38 GMT 2011
zones2: Wed Oct 19 20:51:45 UTC 2011

Far as I know GMT and UTC are identical zones, yes?

I'm getting in touch with Infra now.
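For reference, the gap between the two readings can be computed with a short sketch. It assumes the two `date` outputs were taken at nearly the same moment, which the comment does not state, so the result is only approximate. The function name is hypothetical.

```python
from datetime import datetime

FORMAT = "%a %b %d %H:%M:%S %Z %Y"   # layout of the `date` output quoted above


def skew_seconds(reading_a, reading_b):
    """Seconds by which reading_b is ahead of reading_a (both GMT/UTC)."""
    a = datetime.strptime(reading_a, FORMAT)
    b = datetime.strptime(reading_b, FORMAT)
    return int((b - a).total_seconds())


zones = "Wed Oct 19 19:48:38 GMT 2011"
zones2 = "Wed Oct 19 20:51:45 UTC 2011"
print(skew_seconds(zones, zones2))
```

Treating GMT and UTC as the same offset is fine at this scale; the parsed values are naive datetimes, so the subtraction is a plain wall-clock difference.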

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com

--- Comment #6 from Kevin A. McGrail <km...@pccc.com> 2011-10-19 18:17:07 UTC ---
(In reply to comment #5)
> Is the date / time on the machine running this set reasonably accurately?
> 
> I'd be happy to look at the problem if you want to give me access.  I believe I
> am very qualified.

Email me off-list and perhaps we can share a session and discuss what might be
the issue over the phone at the same time?

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #25 from Darxus <Da...@ChaosReigns.com> 2011-10-22 18:35:37 UTC ---
Would it be worth mentioning in the jira ticket that this is holding up a
release?  Any guesses on how long it takes The Apache Software Foundation to
set the time on two computers?

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #31 from John Hardin <jh...@impsec.org> 2011-10-25 13:50:27 UTC ---
(In reply to comment #30)
> (In reply to comment #28)
> > (In reply to comment #27)
> > 
> > Today (2011-10-24) and yesterday (2011-10-23) are the first days in a while
> > (2011-08-24?) that the nightly (non-net) ruleqa output includes non-bb corpora.
> >  That's encouraging.
> 
> Excellent.

Okay, it indeed looks like the clock variance is what was causing the masscheck
results to be discarded. It's now got a couple of days of full results that it
hasn't destroyed.

I'll see if I can set up some monitoring tasks in automc's cron. If I can get
that working should notifications be sent to the ruleqa list or the dev list?

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #5 from Darxus <Da...@ChaosReigns.com> 2011-10-19 18:07:09 UTC ---
Is the date / time on the machine running this set reasonably accurately?

I'd be happy to look at the problem if you want to give me access.  I believe I
am very qualified.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #23 from Kevin A. McGrail <km...@pccc.com> 2011-10-21 21:03:55 UTC ---
(In reply to comment #22)
> Why aren't those interested people contributing yet?

Because I haven't re-opened giving out rsync accounts because of a security
hole we found that I'd rather not discuss in bugzilla.

[Bug 6671] Updates not happening due to lack of bb corpora since August 27th

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

--- Comment #24 from John Hardin <jh...@impsec.org> 2011-10-22 01:00:12 UTC ---
(In reply to comment #19)

> Same goes for the jhardin fraud corpus, I believe.

Correct.
