Posted to users@spamassassin.apache.org by Lars Jørgensen <lj...@gmail.com> on 2011/10/04 09:07:14 UTC

Rule updates

Hi,

Is it me or has it been a long time since there has been an update to 
the spamassassin ruleset?


-- 
Lars

Re: Rule updates

Posted by Robert Fitzpatrick <ro...@webtent.org>.
On 10/5/2011 5:46 PM, Jim Popovitch wrote:
> On Wed, Oct 5, 2011 at 17:41, RW <rw...@googlemail.com> wrote:
>> The usual reason for a hiatus is that too much spam or ham has aged-out
>> in the corpora, and a top-up is needed.
> 
> So, how do we get it top-up'ed?
> 

Does anyone know whether the 'usual reason' explains why there have been
no rule updates since Aug 27?

--Robert

Re: Rule updates

Posted by John Hardin <jh...@impsec.org>.
On Sun, 30 Oct 2011, Jim Popovitch wrote:

> I just got a new update.  THANKS!!!!
>
> Now, what can I do to contribute to providing updates?

Start generating hand-classified spam and ham corpora, set up SVN to keep 
a local up-to-date snapshot of SA and the rules sandboxes, then start 
running local masschecks against your corpora and uploading the results. 
See:

   http://wiki.apache.org/spamassassin/NightlyMassCheck

The SVN sync, masscheck and upload of the results can pretty easily be 
automated, but keeping your corpora fresh will be an ongoing task.

Especially desirable are ham in non-English languages.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   ...the Fates notice those who buy chainsaws...
                                               -- www.darwinawards.com
-----------------------------------------------------------------------
  Tomorrow: Halloween

Re: Rule updates

Posted by Jim Popovitch <ji...@gmail.com>.
On Wed, Oct 19, 2011 at 13:51, John Hardin <jh...@impsec.org> wrote:
> On Wed, 19 Oct 2011, darxus@chaosreigns.com wrote:
>
>> On 10/19, Jim Popovitch wrote:
>>>
>>> Is the missing entity one person, several people, many people?  Was
>>> there an untimely death?   I believe everyone is now aware that there
>>> exists a problem, how do we bridge the gap?
>>
>> My guess is that the only person familiar with the system is the original
>> author of spamassassin, and he doesn't have time to deal with it.  There
>> are 12 other people on the Project Management Committee, who I assume could
>> all get sufficient access to the machine(s) running it:
>> http://svn.apache.org/repos/asf/spamassassin/trunk/CREDITS
>> And it seems they are all lacking the time to figure it out.
>
> I have access; getting a block of time to focus on figuring out what it's
> doing, and what it's _supposed_ to be doing, is what I'm having trouble
> with.
>

I just got a new update.  THANKS!!!!

Now, what can I do to contribute to providing updates?

-Jim P.

Re: Rule updates

Posted by John Hardin <jh...@impsec.org>.
On Wed, 19 Oct 2011, darxus@chaosreigns.com wrote:

> On 10/19, Jim Popovitch wrote:
>> Is the missing entity one person, several people, many people?  Was
>> there an untimely death?   I believe everyone is now aware that there
>> exists a problem, how do we bridge the gap?
>
> My guess is that the only person familiar with the system is the original
> author of spamassassin, and he doesn't have time to deal with it.  There
> are 12 other people on the Project Management Committee, who I assume could
> all get sufficient access to the machine(s) running it:
> http://svn.apache.org/repos/asf/spamassassin/trunk/CREDITS
> And it seems they are all lacking the time to figure it out.

I have access; getting a block of time to focus on figuring out what it's 
doing, and what it's _supposed_ to be doing, is what I'm having trouble 
with.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Politicians never accuse you of "greed" for wanting other people's
   money, only for wanting to keep your own money.    -- Joseph Sobran
-----------------------------------------------------------------------
  314 days since the first successful private orbital launch (SpaceX)

Re: Rule updates

Posted by da...@chaosreigns.com.
On 10/19, Jim Popovitch wrote:
> Is the missing entity one person, several people, many people?  Was
> there an untimely death?   I believe everyone is now aware that there
> exists a problem, how do we bridge the gap?

My guess is that the only person familiar with the system is the original
author of spamassassin, and he doesn't have time to deal with it.  There
are 12 other people on the Project Management Committee, who I assume could
all get sufficient access to the machine(s) running it:
http://svn.apache.org/repos/asf/spamassassin/trunk/CREDITS
And it seems they are all lacking the time to figure it out.

SpamAssassin can be pretty frustrating to try to work on.

-- 
"Wash daily from nose-tip to tail-tip; drink deeply, but never too deep;
And remember the night is for hunting, and forget not the day is for sleep."
- The Law of the Jungle, Rudyard Kipling
http://www.ChaosReigns.com

Re: Rule updates

Posted by Jim Popovitch <ji...@gmail.com>.
On Wed, Oct 19, 2011 at 12:26,  <da...@chaosreigns.com> wrote:
> On 10/05, Jim Popovitch wrote:
>> On Wed, Oct 5, 2011 at 17:41, RW <rw...@googlemail.com> wrote:
>> > The usual reason for a hiatus is that too much spam or ham has aged-out
>> > in the corpora, and a top-up is needed.
>
> I think it's more accurate to say the usual reason is that too many people
> have stopped automatically submitting data via masscheck, and we need
> more people to submit data.
>
> I have a graphical representation of the problem here:
> http://www.chaosreigns.com/dnswl/tot.svg
> Green is spam, red is non-spam.  They both need to be above the blue line
> (150,000 emails each) for score generation to run to create the rule updates.
> Counts as of the last (net) run:
> Non-spams: 136261  (90.8% of the minimum)
> Spams:     351950 (234.6% of the minimum)
>
>> So, how do we get it top-up'ed?
>
> You contribute your data:
> http://wiki.apache.org/spamassassin/NightlyMassCheck
> The more we have, the more accurately we can calculate optimal rule
> scores, always.  Unfortunately the Project Management Committee has a habit
> of never responding to requests for masscheck accounts.
>
>
> But the current situation appears to be abnormal.  For some reason RuleQA
> / score generation isn't including data submitted by uploading full emails
> (normally just rule hit stats are uploaded).
>
> There is an open bug about that problem here:
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671
>
> It seems there is nobody with the access, knowledge of the system,
> and time required to fix the problem.
>
> There was supposed to be a SpamAssassin v3.4.0 Release Candidate released
> 19 days ago, which seems to be primarily held up by this rule update
> problem.  Which nobody is working on.
>
> --
> "Go forth, and be excellent to one another." - http://www.jhuger.com/fredski.php
> http://www.ChaosReigns.com

Darxus, thanks for the summation of the situation.

Is the missing entity one person, several people, many people?  Was
there an untimely death?   I believe everyone is now aware that there
exists a problem, how do we bridge the gap?

Thanks!

-Jim P.

Re: Rule updates

Posted by da...@chaosreigns.com.
On 10/05, Jim Popovitch wrote:
> On Wed, Oct 5, 2011 at 17:41, RW <rw...@googlemail.com> wrote:
> > The usual reason for a hiatus is that too much spam or ham has aged-out
> > in the corpora, and a top-up is needed.

I think it's more accurate to say the usual reason is that too many people
have stopped automatically submitting data via masscheck, and we need
more people to submit data.

I have a graphical representation of the problem here:
http://www.chaosreigns.com/dnswl/tot.svg
Green is spam, red is non-spam.  They both need to be above the blue line
(150,000 emails each) for score generation to run to create the rule updates.
Counts as of the last (net) run:  
Non-spams: 136261  (90.8% of the minimum)
Spams:     351950 (234.6% of the minimum)
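
The arithmetic behind those percentages can be sketched in Python. This is
only an illustration: the 150,000 threshold and the two counts come from the
figures above, while the function and constant names are invented for this
example and are not part of the RuleQA code. Treating "above the blue line"
as "at or above the minimum" is also an assumption.

```python
# Minimum corpus size per class quoted above (150,000 emails each).
MIN_PER_CLASS = 150_000

def percent_of_minimum(count, minimum=MIN_PER_CLASS):
    """Return count as a percentage of the required minimum, to one decimal."""
    return round(100.0 * count / minimum, 1)

def score_generation_can_run(ham, spam, minimum=MIN_PER_CLASS):
    """Both classes must reach the minimum (assumed: at or above it)."""
    return ham >= minimum and spam >= minimum

ham, spam = 136261, 351950  # counts from the last (net) run

print(percent_of_minimum(ham))              # 90.8
print(percent_of_minimum(spam))             # 234.6
print(score_generation_can_run(ham, spam))  # False
print(MIN_PER_CLASS - ham)                  # 13739 more non-spam needed
```

On these numbers the spam side is well covered and the shortfall is entirely
on the non-spam side.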

> So, how do we get it top-up'ed?

You contribute your data:
http://wiki.apache.org/spamassassin/NightlyMassCheck
The more we have, the more accurately we can calculate optimal rule
scores, always.  Unfortunately the Project Management Committee has a habit
of never responding to requests for masscheck accounts.


But the current situation appears to be abnormal.  For some reason RuleQA
/ score generation isn't including data submitted by uploading full emails
(normally just rule hit stats are uploaded).  

There is an open bug about that problem here:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6671

It seems there is nobody with the access, knowledge of the system,
and time required to fix the problem.

There was supposed to be a SpamAssassin v3.4.0 Release Candidate released
19 days ago, which seems to be primarily held up by this rule update
problem.  Which nobody is working on.

-- 
"Go forth, and be excellent to one another." - http://www.jhuger.com/fredski.php
http://www.ChaosReigns.com

Re: Rule updates

Posted by Jim Popovitch <ji...@gmail.com>.
On Wed, Oct 5, 2011 at 17:41, RW <rw...@googlemail.com> wrote:
> The usual reason for a hiatus is that too much spam or ham has aged-out
> in the corpora, and a top-up is needed.

So, how do we get it top-up'ed?

-Jim P.

Re: Rule updates

Posted by RW <rw...@googlemail.com>.
On Wed, 05 Oct 2011 09:50:08 +0200
Lars Jørgensen wrote:

> On 04-10-2011 15:39, Michael Scheidell wrote:
> > what is 'long'?
> 
> As you can see from your own example, rules were updated daily until 
> August 26th. There haven't been any updates since then. That is 'long' 
> for me.
> 
> I can also see that updates are daily for 3.4.0 currently. Does that 
> mean that updates for 3.3.2 (which I am on) have stopped?

I would guess that the normal rules don't apply because 3.4.0 is a
development branch. The usual reason for a hiatus is that too much spam
or ham has aged-out in the corpora, and a top-up is needed.

Re: Rule updates

Posted by Lars Jørgensen <lj...@gmail.com>.
On 04-10-2011 15:39, Michael Scheidell wrote:
> what is 'long'?

As you can see from your own example, rules were updated daily until 
August 26th. There haven't been any updates since then. That is 'long' 
for me.

I can also see that updates are daily for 3.4.0 currently. Does that 
mean that updates for 3.3.2 (which I am on) have stopped?

> -rw-r--r-- 1 rsync rsync 170211 Oct 4 04:51 1178724.tar.gz <-- 3.4.0
> -rw-r--r-- 1 rsync rsync 170211 Oct 3 04:51 1178340.tar.gz
> -rw-r--r-- 1 rsync rsync 170169 Oct 2 04:51 1178152.tar.gz
> -rw-r--r-- 1 rsync rsync 170169 Oct 1 04:51 1177951.tar.gz
> -rw-r--r-- 1 rsync rsync 170166 Sep 30 04:51 1177560.tar.gz
> -rw-r--r-- 1 rsync rsync 236977 Aug 26 23:32 1162027.tar.gz <-- 3.3.2
> -rw-r--r-- 1 rsync rsync 236957 Aug 25 23:23 1161446.tar.gz
> -rw-r--r-- 1 rsync rsync 236980 Aug 24 23:22 1161015.tar.gz
> -rw-r--r-- 1 rsync rsync 236920 Aug 23 23:18 1160585.tar.gz
> -rwxr--r-- 1 rsync rsync 237167 Aug 22 23:17 1160145.tar.gz


-- 
Lars

Re: Rule updates

Posted by Lars Jørgensen <lj...@gmail.com>.
On 04-10-2011 15:43, Jim Popovitch wrote:
>> what is 'long'?
>
> Since 27-Aug-2011 ?

So, not just me then.


-- 
Lars

Re: Rule updates

Posted by Jim Popovitch <ji...@gmail.com>.
On Tue, Oct 4, 2011 at 09:39, Michael Scheidell
<mi...@secnap.com> wrote:
> On 10/4/11 3:07 AM, Lars Jørgensen wrote:
>>
>> Hi,
>>
>> Is it me or has it been a long time since there has been an update to the
>> spamassassin ruleset?
>>
>>
> what is 'long'?

Since 27-Aug-2011 ?

$ ll /var/lib/spamassassin/3.003001/updates_spamassassin_org/MIRRORED.BY
-rw-r--r-- 1 root root 225 2011-08-27 21:25 /var/lib/spamassassin/3.003001/updates_spamassassin_org/MIRRORED.BY

~$ dig txt 1.3.3.updates.spamassassin.org
 "1162027"


-Jim P.
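
The `dig` query in the post above appears to use the SpamAssassin version
with its components reversed (3.3.1 becomes `1.3.3`), and the directory name
`3.003001` looks like the same version with zero-padded minor and patch
fields. A minimal Python sketch of that naming, inferred only from the
example in this thread; the function names are invented for illustration and
this is not sa-update's own code:

```python
def dotted_from_dir(dir_version):
    """Convert a directory-style version such as '3.003001' to '3.3.1'.

    Assumes three digits each for the minor and patch fields.
    """
    major, rest = dir_version.split(".")
    minor, patch = int(rest[:3]), int(rest[3:])
    return f"{int(major)}.{minor}.{patch}"

def update_dns_name(version, channel="updates.spamassassin.org"):
    """Build the TXT query name by reversing the dotted version components."""
    parts = version.split(".")
    return ".".join(reversed(parts)) + "." + channel

print(dotted_from_dir("3.003001"))  # 3.3.1
print(update_dns_name("3.3.1"))     # 1.3.3.updates.spamassassin.org
```

The TXT answer ("1162027" above) can then be compared with the newest
numbered tarball on a mirror to see whether the installed rules are current.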

Re: Rule updates

Posted by Frank Leonhardt <fr...@extremecomputing.org.uk>.
On 04/10/2011 14:39, Michael Scheidell wrote:
> On 10/4/11 3:07 AM, Lars Jørgensen wrote:
>> Hi,
>>
>> Is it me or has it been a long time since there has been an update to 
>> the spamassassin ruleset?
>>
>>
>
Most common reasons for a problem (IME, on FreeBSD):

Incorrect permissions on directory
Incorrect permissions on /usr/local/share/spamassassin/sa-update-pubkey.txt
Incorrect update key

Check these - especially the permissions! Linux is laxer on the defaults.
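
A quick way to run the permission checks is a small Python script like the
one below. It is a rough sketch, not part of SpamAssassin: the key path is
the one named above, the helper name is made up, and the result only
reflects the user running the script, so it should be run as whichever user
runs sa-update. It does not verify that the update key itself is correct.

```python
import os

# Path taken from the list above; add your own update directory as needed.
PUBKEY = "/usr/local/share/spamassassin/sa-update-pubkey.txt"

def check_access(path):
    """Return 'missing', 'ok' or 'bad permissions' for the current user."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        # A directory must be readable, writable and searchable
        # for updates to be written into it.
        ok = os.access(path, os.R_OK | os.W_OK | os.X_OK)
    else:
        # A plain file such as the public key only needs to be readable.
        ok = os.access(path, os.R_OK)
    return "ok" if ok else "bad permissions"

for path in (PUBKEY,):
    print(path, "->", check_access(path))
```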

-- 
--------------
Sent from my Cray XT5


Re: Rule updates

Posted by Michael Scheidell <mi...@secnap.com>.
On 10/4/11 3:07 AM, Lars Jørgensen wrote:
> Hi,
>
> Is it me or has it been a long time since there has been an update to 
> the spamassassin ruleset?
>
>
what is 'long'?

ls -lt *.tar.gz | grep 'gz$' | head
-rw-r--r--  1 rsync  rsync  170211 Oct  4 04:51 1178724.tar.gz <-- 3.4.0
-rw-r--r--  1 rsync  rsync  170211 Oct  3 04:51 1178340.tar.gz
-rw-r--r--  1 rsync  rsync  170169 Oct  2 04:51 1178152.tar.gz
-rw-r--r--  1 rsync  rsync  170169 Oct  1 04:51 1177951.tar.gz
-rw-r--r--  1 rsync  rsync  170166 Sep 30 04:51 1177560.tar.gz
-rw-r--r--  1 rsync  rsync  236977 Aug 26 23:32 1162027.tar.gz <-- 3.3.2
-rw-r--r--  1 rsync  rsync  236957 Aug 25 23:23 1161446.tar.gz
-rw-r--r--  1 rsync  rsync  236980 Aug 24 23:22 1161015.tar.gz
-rw-r--r--  1 rsync  rsync  236920 Aug 23 23:18 1160585.tar.gz
-rwxr--r--  1 rsync  rsync  237167 Aug 22 23:17 1160145.tar.gz
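
To put a number on the gap in that listing, here is a small Python sketch
that parses a few of the lines above and reports the longest interval
between consecutive tarballs. The year 2011 is assumed (the listing does not
show it), the parsing relies on the exact `ls -l` column layout shown, and
the function names are made up for this example.

```python
from datetime import datetime

# A subset of the listing above, newest first.
LISTING = """\
-rw-r--r--  1 rsync  rsync  170211 Oct  4 04:51 1178724.tar.gz
-rw-r--r--  1 rsync  rsync  170166 Sep 30 04:51 1177560.tar.gz
-rw-r--r--  1 rsync  rsync  236977 Aug 26 23:32 1162027.tar.gz
-rw-r--r--  1 rsync  rsync  236957 Aug 25 23:23 1161446.tar.gz
"""

def parse_line(line, year=2011):
    """Return (date, filename) from one 'ls -l' line; the year is assumed."""
    fields = line.split()
    day = datetime.strptime(f"{fields[5]} {fields[6]} {year}", "%b %d %Y").date()
    return day, fields[8]

def longest_gap(listing):
    """Return (days, older_file, newer_file) for the largest gap."""
    entries = [parse_line(l) for l in listing.splitlines() if l.strip()]
    gaps = [
        ((newer[0] - older[0]).days, older[1], newer[1])
        for newer, older in zip(entries, entries[1:])
    ]
    return max(gaps)

print(longest_gap(LISTING))  # (35, '1162027.tar.gz', '1177560.tar.gz')
```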


-- 
Michael Scheidell, CTO
o: 561-999-5000
d: 561-948-2259
SECNAP Network Security Corporation

    * Best Mobile Solutions Product of 2011
    * Best Intrusion Prevention Product
    * Hot Company Finalist 2011
    * Best Email Security Product
    * Certified SNORT Integrator
