Posted to users@spamassassin.apache.org by Da...@ChaosReigns.com on 2010/10/30 19:12:54 UTC

Massive drop in spam in network mass checks in the last two weeks

In the last two network mass checks, today and a week ago, only 3.1% and
3.4%, respectively, of the corpora have been spam.  Why?

Total number of spams in the network mass check corpora over the last five
weeks:

20101002 585728
20101009 150082
20101016 154352
20101023 6858
20101030 6244

The percentage of the corpora which is spam dropped similarly over the same
period.
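
To put numbers on the size of each step, here is a minimal sketch that
computes the week-over-week change from the totals listed above.  The
names weekly_spam and week_over_week are only illustrative; the counts
are copied from the list and nothing else is assumed about the ruleqa
data.

```python
# Weekly spam totals copied from the list above: (date, number of spams).
weekly_spam = [
    ("20101002", 585728),
    ("20101009", 150082),
    ("20101016", 154352),
    ("20101023", 6858),
    ("20101030", 6244),
]


def week_over_week(rows):
    """Return (previous_date, date, percent_change) for each consecutive pair."""
    changes = []
    for (prev_date, prev_count), (date, count) in zip(rows, rows[1:]):
        # Percent change relative to the previous week's total.
        pct = (count - prev_count) / prev_count * 100.0
        changes.append((prev_date, date, pct))
    return changes


changes = week_over_week(weekly_spam)
for prev_date, date, pct in changes:
    print(f"{prev_date} -> {date}: {pct:+.1f}%")
```

By this calculation the total fell by roughly 74% in the first week and
by roughly 96% between 20101016 and 20101023, with little change in the
weeks in between and after.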

The bottom two graphs on http://www.chaosreigns.com/dnswl/
show this data (total spam and ham, and percentage of corpora which
is ham).  (I recommend Firefox for this page.  I opened a bug against
Chrome for failure to scale embedded SVGs properly.)

Such an extreme drop has happened three times before in as many years,
but this is the first time it was the result of a multi-week trend,
and the first time it stayed so low two weeks in a row.

The data is from http://ruleqa.spamassassin.org/

-- 
"When in doubt, gas it. It may not solve the problem,
But it ends the suspense." - Steve Moonitz, DoD #2319, 1994
http://www.ChaosReigns.com

Re: Massive drop in spam in network mass checks in the last two weeks

Posted by Da...@ChaosReigns.com.
On 10/30, Daryl C. W. O'Shea wrote:
> I had an IBM Deathstar go on me.  Although I thought it had, moving
> my mail spool and personal home directory to a RAID array never made
> it to the top of my to-do list.  To make it worse, my
> backups-to-disk array failed the week before.  I've never been able
> to justify a tape library for home, so I'm without any backups now.

I am sincerely sorry for your loss.  I've had a couple of similar experiences.

> If anybody at a major data recovery firm feels like helping me out,
> I'd appreciate it.  I'm on the fence right now about spending big $$
> to recover the data.

It is my experience that people *always* decide it's not worth it once they
find out how much money it actually costs.  If you come up with something
different, I'd love to know.  

> >Such an extreme drop has happened three times before in as many years,
> 
> That's a pretty good record given the volunteer nature, I think.

Absolutely.  I only mentioned it to point out this recent event was
different.

> >but this is the first time it was the result of a multi-week trend,
> >and the first time it stayed so low two weeks in a row.
> 
> That's probably a result of me having been running a city council
> election campaign in October and not having time to get mass-checks
> running again on what mail I can collect from caches.
> Unfortunately, for the SpamAssassin community, I was elected so time
> continues to be short on my end.  Although I do think that I got it
> pretty much working last night so there should be results sometime
> today.  I may be having some DNS issues though, so it might be
> another week or two before I have solid results.

Congratulations :)

I guess it's just weird that the failure mode involved dropping spam and
not non-spam.  You might want to mention on the ruleqa site that it's
broken.

Although it looks back to normal now.

Thanks for maintaining it.

-- 
"He who dies with the most toys... still dies." - No Fear
http://www.ChaosReigns.com

Re: Massive drop in spam in network mass checks in the last two weeks

Posted by "Daryl C. W. O'Shea" <sp...@dostech.ca>.
On 30/10/2010 1:12 PM, Darxus@ChaosReigns.com wrote:
> In the last two network mass checks, today and a week ago, only 3.1% and
> 3.4%, respectively, of the corpora have been spam.  Why?

I had an IBM Deathstar go on me.  Although I thought it had, moving my 
mail spool and personal home directory to a RAID array never made it to 
the top of my to-do list.  To make it worse, my backups-to-disk array 
failed the week before.  I've never been able to justify a tape library 
for home, so I'm without any backups now.

If anybody at a major data recovery firm feels like helping me out, I'd 
appreciate it.  I'm on the fence right now about spending big $$ to 
recover the data.

> Such an extreme drop has happened three times before in as many years,

That's a pretty good record given the volunteer nature, I think.

> but this is the first time it was the result of a multi-week trend,
> and the first time it stayed so low two weeks in a row.

That's probably a result of me having been running a city council 
election campaign in October and not having time to get mass-checks 
running again on what mail I can collect from caches.  Unfortunately, 
for the SpamAssassin community, I was elected so time continues to be 
short on my end.  Although I do think that I got it pretty much working 
last night so there should be results sometime today.  I may be having 
some DNS issues though, so it might be another week or two before I have 
solid results.

Regards,

Daryl