Posted to dev@spamassassin.apache.org by John Hardin <jh...@impsec.org> on 2010/04/29 15:32:24 UTC

sa-updates wedged?

All:

The current 3.3.x autoupdate is for revision 932302:

jhardin@dendarii ~/develop/spamassassin $ dig TXT 2.3.3.updates.spamassassin.org

;; ANSWER SECTION:
2.3.3.updates.spamassassin.org.	2786 IN	TXT	"932302"

...yet we're well past that revision.
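
The check above can be sketched in Python. This is a minimal sketch, not how sa-update itself is implemented: `update_txt_name` and `is_behind` are hypothetical helpers, and the assumption that the record name is the release version with its components reversed (3.3.2 becoming 2.3.3) is inferred from the query shown above.

```python
def update_txt_name(version: str, zone: str = "updates.spamassassin.org") -> str:
    """Build the DNS name whose TXT record holds the published rules revision.

    Assumption: the name is the dotted version with its components reversed,
    followed by the update zone, as in the dig query above.
    """
    parts = version.split(".")
    return ".".join(reversed(parts)) + "." + zone


def is_behind(published_txt: str, current_revision: int) -> bool:
    """Return True when the published revision is older than current_revision.

    published_txt is the TXT record payload, with or without its quotes.
    """
    return int(published_txt.strip('"')) < current_revision
```

Calling `update_txt_name("3.3.2")` yields the name queried above; the TXT payload can then be compared against whatever revision the repository is currently at.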

There's a fairly important FP bugfix in my autopromoted TO_EQ_FM rules 
that hasn't gone out. Is there something blocking the automatic update 
generation process?

If not, can we push a rules update?

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The Constitution is a written instrument. As such its meaning does
   not alter. That which it meant when adopted, it means now.
                     -- U.S. Supreme Court
                        SOUTH CAROLINA v. US, 199 U.S. 437, 448 (1905)
-----------------------------------------------------------------------
  9 days until the 65th anniversary of VE day

Re: sa-updates wedged?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
On 15/05/2010 4:31 PM, John Hardin wrote:
> On Fri, 14 May 2010, John Hardin wrote:
> 
>> On Mon, 10 May 2010, Daryl C. W. O'Shea wrote:
>>
>>>  I'd have to check the logs, but it could be that we're not meeting the
>>>  minimum ham/spam results that are required to generate an update.
>>>  I've got it set to a minimum of 150,000 ham and spam each.
>>
>> It looks like the current ruleqa ham corpus is ~127k.
> 
> Sorry, that should be the _spam_ corpus. 129925 as of the latest run.
> The ham corpus is nearly 260k.

My net enabled check was successful this week, so an update got
published last night.

Stats for usable net-checked (set 1) messages:

 HAM: 173841 (150000 required)
SPAM: 1050099 (150000 required)

For Friday's non-net-checked (set 0) messages:

 HAM: 186535 (150000 required)
SPAM: 1053687 (150000 required)

Only ham that is 39 months old or less is used.
Only spam that is 3 months old or less is used.
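
The gate described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual update-generation script: the function names are hypothetical, and truncating message age to whole calendar months is an assumption about how "N months old or less" is counted.

```python
from datetime import date

MIN_HAM = 150_000          # minimum usable ham results, from the stats above
MIN_SPAM = 150_000         # minimum usable spam results, from the stats above
MAX_HAM_AGE_MONTHS = 39    # ham older than this is ignored
MAX_SPAM_AGE_MONTHS = 3    # spam older than this is ignored


def months_between(older: date, newer: date) -> int:
    """Whole calendar months elapsed from older to newer (partial months dropped)."""
    months = (newer.year - older.year) * 12 + (newer.month - older.month)
    if newer.day < older.day:
        months -= 1
    return months


def is_usable(kind: str, received: date, today: date) -> bool:
    """Apply the age cutoff for a message of kind "ham" or "spam"."""
    limit = MAX_HAM_AGE_MONTHS if kind == "ham" else MAX_SPAM_AGE_MONTHS
    return months_between(received, today) <= limit


def can_publish(usable_ham: int, usable_spam: int) -> bool:
    """An update is generated only when both minimums are met."""
    return usable_ham >= MIN_HAM and usable_spam >= MIN_SPAM
```

With the set 1 figures above (173841 ham, 1050099 spam) `can_publish` passes; with the earlier ruleqa figures quoted in this thread (about 260k ham, 129925 spam) it does not.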

>> Is your NFS server still down? I note the "dos" corpus has been
>> missing for a while now...

I had an old mass-check server running through a million messages at
about 50/min on a single ancient proc that was tying up the socket and
preventing my nightly mass-checks from running.  I killed it earlier
this week.

Daryl


Re: sa-updates wedged?

Posted by John Hardin <jh...@impsec.org>.
On Fri, 14 May 2010, John Hardin wrote:

> On Mon, 10 May 2010, Daryl C. W. O'Shea wrote:
>
>>  I'd have to check the logs, but it could be that we're not meeting the
>>  minimum ham/spam results that are required to generate an update.
>>  I've got it set to a minimum of 150,000 ham and spam each.
>
> It looks like the current ruleqa ham corpus is ~127k.

Sorry, that should be the _spam_ corpus. 129925 as of the latest run. The 
ham corpus is nearly 260k.

> Is your NFS server still down? I note the "dos" corpus has been missing 
> for a while now...

--
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Liberals love sex ed because it teaches kids to be safe around their
   sex organs. Conservatives love gun education because it teaches kids
   to be safe around guns. However, both believe that the other's
   education goals lead to dangers too terrible to contemplate.
-----------------------------------------------------------------------
  7 days since a sunspot last seen - EPA blames CO2 emissions

Re: sa-updates wedged?

Posted by John Hardin <jh...@impsec.org>.
On Mon, 10 May 2010, Daryl C. W. O'Shea wrote:

> I'd have to check the logs, but it could be that we're not meeting the
> minimum ham/spam results that are required to generate an update.  I've
> got it set to a minimum of 150,000 ham and spam each.

It looks like the current ruleqa ham corpus is ~127k.

Is your NFS server still down? I note the "dos" corpus has been missing 
for a while now...

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   News flash: Lowest Common Denominator down 50 points
-----------------------------------------------------------------------
  6 days since a sunspot last seen - EPA blames CO2 emissions

Re: sa-updates wedged?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
On 11/05/2010 10:51 AM, Tony Finch wrote:
> On Mon, 10 May 2010, Daryl C. W. O'Shea wrote:
>>
>> Yeah, since early April the ham results have fallen below the 150k
>> message threshold to about 143k messages.  150k was already quite a bit
>> lower than I was really comfortable with but I guess we could lower it
>> if necessary.
> 
> Are the thresholds different for the 3.4 sa-update stream?

3.4 rules get published as long as they meet the promotion criteria
(which I believe requires a much much lower threshold). There is no
score generation step.  Rules get published without scores so they get
the default for the rule type.

Daryl


Re: sa-updates wedged?

Posted by Tony Finch <do...@dotat.at>.
On Mon, 10 May 2010, Daryl C. W. O'Shea wrote:
>
> Yeah, since early April the ham results have fallen below the 150k
> message threshold to about 143k messages.  150k was already quite a bit
> lower than I was really comfortable with but I guess we could lower it
> if necessary.

Are the thresholds different for the 3.4 sa-update stream?

Tony.
-- 
f.anthony.n.finch  <do...@dotat.at>  http://dotat.at/  【ツ】
FISHER GERMAN BIGHT: NORTH OR NORTHWEST 5 TO 7, OCCASIONALLY GALE 8 IN FISHER
AT FIRST, DECREASING 4 OR 5, BECOMING VARIABLE OR NORTHEAST 3 OR 4 LATER.
MODERATE OR ROUGH, BECOMING SLIGHT OR MODERATE. SHOWERS THEN FAIR. GOOD.

RE: sa-updates wedged?

Posted by "Randal, Phil" <pr...@herefordshire.gov.uk>.
Daryl C. W. O'Shea wrote:
>> I'd have to check the logs, but it could be that we're not meeting 
>> the minimum ham/spam results that are required to generate an update.
>> I've got it set to a minimum of 150,000 ham and spam each.

> Yeah, since early April the ham results have fallen below the 150k
> message threshold to about 143k messages.  150k was already quite a bit
> lower than I was really comfortable with but I guess we could lower it
> if necessary.
>
> Spam results for the weekly mass-check since May 1 are way under the
> 150k threshold at 50k and 35k.  This will block an update for the entire
> week.

It looks like the "weekly" model is broken.  Perhaps we should consider
changing it?

One possibility is a threshold-based one.  Collect ham & spam until some
threshold is exceeded, generate mass-check, promote rules, reset
counters, repeat.
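
A rough Python sketch of that threshold-based loop, under stated assumptions: `CorpusAccumulator` is a hypothetical name, the 150,000 defaults are borrowed from the existing minimums, and "exceeded" is treated as "reached or exceeded". The mass-check and rule promotion steps are left to the caller.

```python
class CorpusAccumulator:
    """Accumulate ham/spam result counts until both thresholds are reached."""

    def __init__(self, min_ham: int = 150_000, min_spam: int = 150_000):
        self.min_ham = min_ham
        self.min_spam = min_spam
        self.ham = 0
        self.spam = 0

    def add(self, ham: int, spam: int) -> bool:
        """Add a batch of results.

        Returns True when both thresholds have been reached, meaning the
        caller should generate the mass-check and promote rules. The counters
        are reset at that point so the next cycle starts from zero.
        """
        self.ham += ham
        self.spam += spam
        if self.ham >= self.min_ham and self.spam >= self.min_spam:
            self.ham = 0
            self.spam = 0
            return True
        return False
```

Each call to `add` would correspond to one batch of submitted results; a run is triggered whenever enough has accumulated, rather than on a fixed weekly schedule.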

On the other hand, spam mutates over time, so last week's data may not
be relevant this week...

Cheers,

Phil


Re: sa-updates wedged?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
On 10/05/2010 10:46 PM, Daryl C. W. O'Shea wrote:
> I'd have to check the logs, but it could be that we're not meeting the
> minimum ham/spam results that are required to generate an update.  I've
> got it set to a minimum of 150,000 ham and spam each.

Yeah, since early April the ham results have fallen below the 150k
message threshold to about 143k messages.  150k was already quite a bit
lower than I was really comfortable with but I guess we could lower it
if necessary.

Spam results for the weekly mass-check since May 1 are way under the
150k threshold at 50k and 35k.  This will block an update for the entire
week.

Daryl


Re: sa-updates wedged?

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
I'd have to check the logs, but it could be that we're not meeting the
minimum ham/spam results that are required to generate an update.  I've
got it set to a minimum of 150,000 ham and spam each.

I know that it's still running.  It looks like this week there weren't
enough results in the weekly mass-check to let it create results all
week (perhaps it should revert to using the previous week's mass-check).
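
The fallback idea could look something like this in Python. It is only a sketch of the suggestion, not existing behaviour: `pick_weekly_results` and its parameters are hypothetical, and how many weeks back it would be reasonable to go is an open question.

```python
from typing import List, Optional, Tuple

# (label, usable ham count, usable spam count), newest week first
WeeklyResult = Tuple[str, int, int]


def pick_weekly_results(
    weeks: List[WeeklyResult],
    min_ham: int = 150_000,
    min_spam: int = 150_000,
    max_fallback: int = 1,
) -> Optional[str]:
    """Pick the newest weekly mass-check that meets both minimums.

    Looks at the current week first, then at up to max_fallback earlier
    weeks. Returns the chosen week's label, or None if none qualifies.
    """
    for label, ham, spam in weeks[: max_fallback + 1]:
        if ham >= min_ham and spam >= min_spam:
            return label
    return None
```

Keeping `max_fallback` small limits how stale the scoring data can get, at the cost of still blocking updates after several thin weeks in a row.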

Daryl


On 29/04/2010 9:32 AM, John Hardin wrote:
> All:
> 
> The current 3.3.x autoupdate is for revision 932302:
> 
> jhardin@dendarii ~/develop/spamassassin $ dig TXT 2.3.3.updates.spamassassin.org
> 
> ;; ANSWER SECTION:
> 2.3.3.updates.spamassassin.org.    2786 IN    TXT    "932302"
> 
> ...yet we're well past that revision.
> 
> There's a fairly important FP bugfix in my autopromoted TO_EQ_FM rules
> that hasn't gone out. Is there something blocking the automatic update
> generation process?
> 
> If not, can we push a rules update?
>