Posted to users@spamassassin.apache.org by Rupert Gallagher <ru...@protonmail.com> on 2018/01/24 11:08:28 UTC

kam corpus

Is this the "official" version of kam.cf?

http://www.pccc.com/downloads/SpamAssassin/contrib/

The file is huge, and consists of ad-hoc rules against spammy keywords.

We use a completely different approach, resulting in few general rules and a short whitelist. We hardly see any kam-esque spam, but we are wise enough to verify. Is there an open corpus of kam-spam that we can process?

Re: kam corpus

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 1/24/2018 6:48 PM, @lbutlr wrote:
>> The file is huge, and consists of ad-hoc rules against spammy keywords.
> Is less than 300K huge?
>
> That does remind me, though, does SpamAssassin automatically load *.cf in /usr/local/etc/mail/SpamAssassin or do extra cf files like KAM need to be added somewhere to be loaded?
>
> I seem to recall having to do something, but it's been a long time since I did anything outside of local.cf

It's a huge file and I need to bring our automation tools to bear on it 
to streamline it.

Any cf file including KAM.cf works if it is placed wherever your 
local.cf goes.

Regards,
KAM
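
A quick way to check which files would be picked up is to list the *.cf files sitting next to local.cf. The sketch below is illustrative only: the helper name list_cf_files is hypothetical, the directory is whatever your installation uses (the /usr/local/etc/mail/SpamAssassin path comes from the question above), and sorting by name is an assumption about load order rather than something stated in this thread.

```python
from pathlib import Path


def list_cf_files(config_dir):
    """Return the names of the *.cf files found directly in config_dir.

    Hypothetical helper: it only inspects the directory and does not
    ask SpamAssassin what it actually loaded. Names are sorted on the
    assumption that files are read in name order.
    """
    directory = Path(config_dir)
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix == ".cf"
    )


if __name__ == "__main__":
    # Path taken from the question above; adjust it for your system.
    site_dir = "/usr/local/etc/mail/SpamAssassin"
    if Path(site_dir).is_dir():
        for name in list_cf_files(site_dir):
            print(name)
```

If KAM.cf shows up in that listing next to local.cf, running spamassassin --lint afterwards is a reasonable way to confirm that the combined rule set still parses.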


Re: kam corpus

Posted by "@lbutlr" <kr...@kreme.com>.
On 24 Jan 2018, at 04:08, Rupert Gallagher <ru...@protonmail.com> wrote:
> Is this the "official" version of kam.cf? 
> 
> http://www.pccc.com/downloads/SpamAssassin/contrib/
> 
> The file is huge, and consists of ad-hoc rules against spammy keywords. 

Is less than 300K huge?

That does remind me, though, does SpamAssassin automatically load *.cf in /usr/local/etc/mail/SpamAssassin or do extra cf files like KAM need to be added somewhere to be loaded?

I seem to recall having to do something, but it's been a long time since I did anything outside of local.cf

-- 
...but the senator, while insisting he was not intoxicated, could not
explain his nudity.


Re: kam corpus

Posted by Rupert Gallagher <ru...@protonmail.com>.
We had three spam messages in about 8 months? I lost count. Our clients are so used to having a clean inbox that they spot a spam message like the proverbial white fly.


On Wed, Jan 24, 2018 at 13:34, Kevin A. McGrail <ke...@mcgrail.com> wrote:

> On 1/24/2018 6:08 AM, Rupert Gallagher wrote:
>
>> Is this the "official" version of kam.cf?
>>
>> http://www.pccc.com/downloads/SpamAssassin/contrib/
>
> Yes.  Are there unofficial versions?
>
>> The file is huge, and consists of ad-hoc rules against spammy keywords.
>>
>> We use a completely different approach, resulting in few general rules and a short whitelist. We hardly see any kam-esque spam, but we are wise enough to verify. Is there an open corpus of kam-spam that we can process?
>
> Sorry, no, we do not provide spam or ham corpora for verification.  I can tell you that we get about 2 problem reports a week on average, with hundreds of millions of mailboxes using our cf.

Re: kam corpus

Posted by Benny Pedersen <me...@junc.eu>.
Kevin A. McGrail wrote on 2018-06-06 18:41:
> I've considered it. I even run channels for others but just haven't
> ever set it up.  Focused on 3.4.2 right now.

3.4.2 is long awaited

I would like to see a wiki page on how to build our own rescores for 
local-only tags. IMHO that could speed up getting very good new tags 
that catch spam more generally. As it is now, we all miss more spam, and 
the rescoring is thus incorrectly biased in how it keeps Bayes learned. 
On the other hand, I reject based on RBLs at the MTA stage and have no 
plan to limit that RBL testing, but it has neutralised Bayes learning, 
which is IMHO good :=)

As long as it works.

Re: kam corpus

Posted by "Kevin A. McGrail" <km...@apache.org>.
I've considered it. I even run channels for others but just haven't ever
set it up.  Focused on 3.4.2 right now.

--
Kevin A. McGrail
VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

On Wed, Jun 6, 2018 at 11:13 AM, Nix <ni...@esperi.org.uk> wrote:

> On 24 Jan 2018, Kevin A. McGrail uttered the following:
>
> > On 1/24/2018 6:08 AM, Rupert Gallagher wrote:
> >>
> >> Is this the "official" version of kam.cf?
> >>
> >> http://www.pccc.com/downloads/SpamAssassin/contrib/
> >>
> > Yes.  Are there unofficial versions?
>
> I've long wondered whether there's an sa-update channel for KAM. It
> seems... inelegant and impolite to your site to do a curl for it at
> intervals and use the last-modified header (though it does work), when
> sa-update's cheaper DNS lookups could do the same job with less overhead.
>
> --
> NULL && (void)
>

Re: kam corpus

Posted by Nix <ni...@esperi.org.uk>.
On 24 Jan 2018, Kevin A. McGrail uttered the following:

> On 1/24/2018 6:08 AM, Rupert Gallagher wrote:
>>
>> Is this the "official" version of kam.cf?
>>
>> http://www.pccc.com/downloads/SpamAssassin/contrib/
>>
> Yes.  Are there unofficial versions?

I've long wondered whether there's an sa-update channel for KAM. It
seems... inelegant and impolite to your site to do a curl for it at
intervals and use the last-modified header (though it does work), when
sa-update's cheaper DNS lookups could do the same job with less overhead.

-- 
NULL && (void)
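
For readers who want the polite version of the "curl at intervals" approach described above, here is a minimal sketch of a conditional download using the Last-Modified value from the previous fetch. It uses only the Python standard library. The function names and the opener parameter are hypothetical, and the URL is left as a parameter because the exact file location is not given in this thread.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None):
    """Build a GET request that asks for the file only if it has changed.

    last_modified is the Last-Modified header value saved from the
    previous download, or None on the first run.
    """
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_if_modified(url, last_modified=None, opener=urllib.request.urlopen):
    """Return (body, new_last_modified); body is None if nothing changed.

    opener is injectable so the logic can be exercised without network
    access; by default it is urllib.request.urlopen.
    """
    request = build_conditional_request(url, last_modified)
    try:
        with opener(request, timeout=30) as response:
            return response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: keep the copy and timestamp we already have.
            return None, last_modified
        raise
```

A cron job could store the returned Last-Modified value in a small state file and pass it back on the next run, so an unchanged file costs the server a 304 response rather than a full transfer. That is still an HTTP request per check, which is the overhead an sa-update channel would avoid.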

Re: kam corpus

Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 1/24/2018 6:08 AM, Rupert Gallagher wrote:
>
> Is this the "official" version of kam.cf?
>
>
> http://www.pccc.com/downloads/SpamAssassin/contrib/
>
Yes.  Are there unofficial versions?

> The file is huge, and consists of ad-hoc rules against spammy keywords.
>
>
> We use a completely different approach, resulting in few general rules 
> and a short whitelist. We hardly see any kam-esque spam, but we are 
> wise enough to verify. Is there an open corpus of kam-spam that we can 
> process?
>
Sorry, no, we do not provide spam or ham corpora for verification.  I 
can tell you that we get about 2 problem reports a week on average, with 
hundreds of millions of mailboxes using our cf.