Posted to users@spamassassin.apache.org by Jari Fredriksson <ja...@iki.fi> on 2019/04/14 08:03:52 UTC

Hive Mind: postfix prescreen and SA ruleqa

We have discussed this in the past. But now I have become worried that not all SA users have access to their border SMTP server, and that they are NOT configuring Postfix with this: https://pastebin.com/LGkdi7NM

Now, I am part of RuleQA. Should I accept everything and pass it to SpamAssassin and into my corpus, or not? Reindl Harald may have his say as a corporate maintainer or something, but the SpamAssassin user base is broader than that.

How can I best support SpamAssassin besides having a mass check automation and mirrors for the sa-update?

Br. jarif

Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by Jari Fredriksson <ja...@iki.fi>.
Bill Cole wrote on 14.4.2019 21:13:
> On 14 Apr 2019, at 4:03, Jari Fredriksson wrote:
> 

>> How can I best support SpamAssassin besides having a mass check 
>> automation and mirrors for the sa-update?
> 
> Those are both large contributions. Thank you for that support.
> 
> The obvious repository of things we need fixed is the Bugzilla. There
> are a lot of open bugs, most of which require some Perl prowess and
> substantial time to fix, because we've done pretty well on attacking
> simple bugs as they are reported. There are gaps and flaws in the
> documentation, both on the Wiki and internal to the code, where many
> authors have used only standard commenting instead of POD, effectively
> hiding documentation. rule development is also potentially quite
> helpful, if you are good at it and have the patience to deal with the
> testing process (e.g. like John Hardin.)

Thank you for your post. I am a professional software developer, currently 
working as a Senior Java Developer at a large multinational. I know 
"tons of" programming languages, but sadly Perl and regexps are not among 
them. I can READ Perl, but I have not absorbed enough of the Perl book I 
own to be productive in a Perl software project. 
The language just feels too backwards to me, so I cannot be bothered to 
learn it.

So I will stick to my day job and ponder other options for this.

-- 
jarif@iki.fi

Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by Bill Cole <sa...@billmail.scconsult.com>.
On 14 Apr 2019, at 4:03, Jari Fredriksson wrote:

> We have had some discussions of this in the past. But now I became 
> worried that all SA users do not have access to their border smtp and 
> are NOT configuring postfix with this: https://pastebin.com/LGkdi7NM

That's far more specified than my own postscreen config, and I'm a 
verifiable anti-spam obsessive...

It's absolutely the case that MOST de facto SA users (many of whom have 
no idea that they ARE SA users) don't control their SMTP edge. Most of 
those who do have control will not replicate any particular pre-SA 
configuration. As a result, what SA sees as a total mailstream is a 
function not just of the fact that every person receives their own unique 
collection of ham and is targeted by a unique collection of spam, but 
also of the fact that the pre-SA filtering is bespoke, at least for 
each server and sometimes for each user.

> Now, I am part of RuleQA. Should I accept everything and pass it so 
> SpamAssassin and to my corpus or not?

I don't know whether there is an official policy on this, so even though 
I am a committer and a PMC member, my opinions in this message are NOT 
official positions of the ASF SpamAssassin Project.

I believe that submissions to RuleQA should include all mail that is 
likely to be seen by SA anywhere. That does NOT include mail from 
systems that talk before greeting banners, pipeline improperly, or use 
blatantly fraudulent HELO names, but it would include mail which is 
easily rejected by sites using a mosaic of DNSBLs akin to those used in 
your postscreen config.
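
As a rough illustration of that rule of thumb (a sketch only, and no more official than the opinion itself), the decision can be written as a predicate over what was observed about the connection. The `ConnectionInfo` type and its field names are hypothetical; neither postscreen nor SA exposes data in this form.

```python
from dataclasses import dataclass


@dataclass
class ConnectionInfo:
    # Hypothetical per-connection observations; names are illustrative only.
    pregreet: bool         # client talked before the greeting banner
    bad_pipelining: bool   # client pipelined improperly
    fraudulent_helo: bool  # client used a blatantly fraudulent HELO name
    dnsbl_listed: bool     # client IP is listed on one or more DNSBLs


def belongs_in_corpus(conn: ConnectionInfo) -> bool:
    """Return True if mail from this connection should go into the corpus."""
    # Protocol-level misbehaviour: such mail is unlikely to reach SA anywhere.
    if conn.pregreet or conn.bad_pipelining or conn.fraudulent_helo:
        return False
    # A DNSBL listing alone is deliberately not a reason to exclude the mail.
    return True
```

Under this sketch, a DNSBL-listed client that otherwise behaves correctly still contributes to the corpus, which is the distinction drawn above.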

> Reindl Harald may have his say as a corporate maintainer or something 
> but the SpamAssassin user base is more.

To my knowledge, he has never made a positive contribution to SA as an 
ongoing open source project and user community, due largely to his 
choice of a confrontational and disrespectful communication style. Being 
frequently wrong in the context of mail systems and mailstreams unlike 
his own doesn't help.

ALL SA users share that problem of myopia. We can't know anything about 
the mailstreams of others or even about the mail aimed at us which we 
reasonably reject before seeing the actual data. Having mail that most 
sites could properly reject by using distributed reputation systems 
(i.e. DNSBLs) in the masscheck submissions makes the output of RuleQA 
more useful in identifying spam from sources that have not yet hit those 
distributed reputation systems.

> How can I best support SpamAssassin besides having a mass check 
> automation and mirrors for the sa-update?

Those are both large contributions. Thank you for that support.

The obvious repository of things we need fixed is the Bugzilla. There 
are a lot of open bugs, most of which require some Perl prowess and 
substantial time to fix, because we've done pretty well on attacking 
simple bugs as they are reported. There are gaps and flaws in the 
documentation, both on the Wiki and internal to the code, where many 
authors have used only standard commenting instead of POD, effectively 
hiding documentation. Rule development is also potentially quite 
helpful, if you are good at it and have the patience to deal with the 
testing process (like John Hardin).

-- 
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole

Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by Henrik K <he...@hege.li>.
On Sun, Apr 14, 2019 at 11:03:52AM +0300, Jari Fredriksson wrote:
> 
> Now, I am part of RuleQA. Should I accept everything and pass it so
> SpamAssassin and to my corpus or not?

This is a question for the ruleqa list, not genpop.  Anyway, it probably
doesn't matter much.  There are already major spamtrappers etc. contributing
to ruleqa, and I think most of the "easy dialup" spam is seen there too.  Just
try to look for the hard-to-catch spam not ending up in the ham corpus.

Preferably optimize your setup for reuse like I described on the ruleqa list
some time ago (subject: Masscheck reuse).


Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 4/14/19 2:03 AM, Jari Fredriksson wrote:
> We have had some discussions of this in the past. But now I became 
> worried that all SA users do not have access to their border smtp and 
> are NOT configuring postfix with this: https://pastebin.com/LGkdi7NM

I can tell you for a fact that some SpamAssassin users /do/ have access 
to their border SMTP servers and /leverage/ that position as much as 
possible.

I know that not everybody is running Postfix.  I also question the 
settings in the linked file.

Why would you allow 7 days for a newline, greeting, non-smtp command, or 
pipelining?

That seems like a LOT of RBLs to me.

> Now, I am part of RuleQA. Should I accept everything and pass it so 
> SpamAssassin and to my corpus or not? Reindl Harald may have his say as 
> a corporate maintainer or something but the SpamAssassin user base is more.

What?  (I need more caffeine.)

> How can I best support SpamAssassin besides having a mass check 
> automation and mirrors for the sa-update?

I'd say use SpamAssassin to the fullest extent you can, test it, report 
bugs, and actively participate in the community.  I'm sure you can give 
monetary support somewhere too.  Somebody usually wants pizza and / or beer.



-- 
Grant. . . .
unix || die


Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by RW <rw...@googlemail.com>.
On Tue, 16 Apr 2019 15:16:59 +0300
Jari Fredriksson wrote:

> John Hardin wrote on 15.4.2019 1:33:
> > On Sun, 14 Apr 2019, Jari Fredriksson wrote:
> >   
> >> Now, I am part of RuleQA. Should I accept everything and pass it
> >> so SpamAssassin and to my corpus or not?  
> > 
> > I would suggest yes, you should accept everything that reaches your
> > spamtrap addresses and include it in your corpora. Don't worry about
> > that, worry about whether or not the messages get correctly
> > classified.  
> 
> Thanks. I might test next weekend about dropping the postscreen 
> scanning. 

Before you do that, I would suggest you read what Henrik K wrote:

  "There are already major spamtrappers etc contributing to ruleqa, I
  think most of the "easy dialup" spam is seen there too.  Just try to
  look for the hard to catch spam not ending up in ham corpus."

IMO the corpus should contain a bit of everything, but ideally it should
be dominated by the spam that would reach SA on a server following
best practice. A corpus generated from unfiltered spamtraps is very
heavily biased in the wrong direction.

Also, if someone is processing mail from an MTA they don't control, it
doesn't mean that there's no upstream MTA filtering. My experience is
that lighter (low FP) spam filtering is something you have to pay extra
for these days. 


Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by Jari Fredriksson <ja...@iki.fi>.
John Hardin wrote on 15.4.2019 1:33:
> On Sun, 14 Apr 2019, Jari Fredriksson wrote:
> 
>> Now, I am part of RuleQA. Should I accept everything and pass it so 
>> SpamAssassin and to my corpus or not?
> 
> I would suggest yes, you should accept everything that reaches your
> spamtrap addresses and include it in your corpora. Don't worry about
> that, worry about whether or not the messages get correctly
> classified.

Thanks. Next weekend I might test dropping the postscreen 
scanning. The spam volume has been bearable these days, and sorting the 
ham & spam into dedicated folders is not hard labor. I am subscribed to tons 
of domestic and foreign ham delivery sources just because of this hobby.

-- 
jarif@iki.fi

Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by John Hardin <jh...@impsec.org>.
On Sun, 14 Apr 2019, Jari Fredriksson wrote:

> Now, I am part of RuleQA. Should I accept everything and pass it so SpamAssassin and to my corpus or not?

I would suggest yes, you should accept everything that reaches your 
spamtrap addresses and include it in your corpora. Don't worry about that, 
worry about whether or not the messages get correctly classified.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Are you a mildly tech-literate politico horrified by the level of
   ignorance demonstrated by lawmakers gearing up to regulate online
   technology they don't even begin to grasp? Cool. Now you have a
   tiny glimpse into a day in the life of a gun owner.   -- Sean Davis
-----------------------------------------------------------------------
  Today: the 154th anniversary of Lincoln's assassination

Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by Jari Fredriksson <ja...@iki.fi>.
David Jones wrote on 14.4.2019 23:31:
> Once you get this type of platform setup, it can be used for other spam
> fighting techniques on the primary mail filters like:
> 
> - train your shared redis Bayes DB with the ham and spam folder

I have a similar system, including Redis and SpamCop, Pyzor and Razor2 
reporting, but the following is not (yet :D) implemented.

> - fail2ban - run a custom script on the secondary server to block IPs on
> the primary filters
> - swatch - run custom scripts when certain rules are hit:
> 	- add entries to your own private RBL to catch zero hour spam
> 	- auto release certain messages from quarantine as an attachment
> 	- etc...

Thanks for your post!

-- 
jarif@iki.fi

Re: Hive Mind: postfix prescreen and SA ruleqa

Posted by David Jones <dj...@ena.com>.
On 4/14/19 3:03 AM, Jari Fredriksson wrote:
> We have had some discussions of this in the past. But now I became 
> worried that all SA users do not have access to their border smtp and 
> are NOT configuring postfix with this: https://pastebin.com/LGkdi7NM
> 
> Now, I am part of RuleQA. Should I accept everything and pass it so 
> SpamAssassin and to my corpus or not? Reindl Harald may have his say as 
> a corporate maintainer or something but the SpamAssassin user base is more.
> 
> How can I best support SpamAssassin besides having a mass check 
> automation and mirrors for the sa-update?

I have a secondary iredmail server hosting a few decommissioned domains 
that receive both ham and spam, but mostly spam.  This iredmail server has 
been stripped down to very basic/stock Postfix and amavisd-new configs 
to allow SA to score everything.  I sync over my /etc/mail/spamassassin 
custom config files and point to the same Bayes DB in redis so it scores 
similarly.  Dovecot Sieve rules put mail into the ham and spam folders 
for my masscheck processing based on very low scores for ham and very 
high scores for spam.  Messages that score in the middle are checked and 
manually moved into the ham or spam folder.  I also have a Spamcop 
folder whose contents a script submits to Spamcop every 5 minutes.  If I am 
able to catch a compromised account quickly, then I release that email 
to spamcop+postmaster@sa.ena.net which automatically goes into the 
Spamcop folder.

Note that sa.ena.net is an internal-only domain that is used to route 
mail to my iredmail server for the SA masscheck corpus.
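
The score-based sorting described above can be sketched in Python as follows. This is only an illustration of the idea, not the Sieve rules actually in use. It assumes the score is available in an `X-Spam-Score` header, and the thresholds `HAM_MAX` and `SPAM_MIN` are made-up values standing in for "very low" and "very high".

```python
from email import message_from_string
from typing import Optional

# Hypothetical thresholds; the real cut-offs are a site-specific choice.
HAM_MAX = 0.0    # at or below this score, treat as ham
SPAM_MIN = 10.0  # at or above this score, treat as spam


def spam_score(raw_message: str) -> Optional[float]:
    """Read the score from an X-Spam-Score header, if present and numeric."""
    value = message_from_string(raw_message).get("X-Spam-Score")
    if value is None:
        return None
    try:
        return float(value.strip())
    except ValueError:
        return None


def target_folder(raw_message: str) -> str:
    """Pick 'ham', 'spam', or 'review' (manual check) for a message."""
    score = spam_score(raw_message)
    if score is None:
        return "review"  # no usable score: leave it for a human
    if score <= HAM_MAX:
        return "ham"
    if score >= SPAM_MIN:
        return "spam"
    return "review"      # middle scores are checked manually
```

Messages without a usable score fall into the manual-review bucket, so nothing is added to the ham or spam corpus on a guess.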

Once you get this type of platform set up, it can be used for other spam 
fighting techniques on the primary mail filters like:

- train your shared redis Bayes DB with the ham and spam folder
- fail2ban - run a custom script on the secondary server to block IPs on 
the primary filters
- swatch - run custom scripts when certain rules are hit:
	- add entries to your own private RBL to catch zero hour spam
	- auto release certain messages from quarantine as an attachment
	- etc...
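
For the private RBL item, DNS-based lists are conventionally queried by reversing the octets of the IPv4 address and appending the list's zone. A minimal Python sketch of building that query name follows. The zone `rbl.example.org` is a placeholder, and how entries are published depends on the DNS server in use, which is not shown here.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Example with a documentation-range address and a placeholder zone.
print(dnsbl_query_name("192.0.2.10", "rbl.example.org"))
```

A script triggered by swatch could use such a name when adding or checking an entry for a sending IP.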

-- 
David Jones