Posted to users@spamassassin.apache.org by Alex <my...@gmail.com> on 2017/07/13 01:04:37 UTC
"bout u" campaign
Hi all,
Has anyone else experienced a spam campaign with any one of the
following subjects:
- sometimes enjoy it wild, how bout you?
- sometimes like it ruff, what bout you?
- sumtimes enjoy it ruff, wat bout you?
The body contains something like "wild hukups" then a phone number.
https://pastebin.com/X5xNn9RZ
It comes from AOL and other freemails, but doesn't hit much, and hits
bayes50 or lower here.
Is this a snowshoe thing? Ideas on how to stop them? I've now trained
on them, but I thought someone might like to see them for their own
benefit, and perhaps has ideas on a more general way of blocking
these.
What is even the point of spam with a phone number?
The ones originating from AOL are all in the
204.29.186.0/24 block. None of them is on any meaningful blacklist,
and they all have a 90+ senderscore.
I'm sure the campaign will change soon, but I thought there was
something more general we could look for the next time...
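One possible starting point for a more general check is a Subject rule that tolerates the spelling variations. This is only a sketch based on the three subjects listed above: the rule name LOCAL_BOUT_YOU_SUBJ and the score are made up for illustration, and the pattern has not been tested against the real messages.

```
# Hypothetical local rule; name and score are assumptions, adjust after testing.
# Matches "sometimes"/"sumtimes", "enjoy"/"like", "wild"/"ruff" and
# "how"/"what"/"wat" followed by "bout you?".
header   LOCAL_BOUT_YOU_SUBJ  Subject =~ /\bs[ou]me?times (?:enjoy|like) it (?:wild|ruff), (?:how|wh?at) bout you\?/i
describe LOCAL_BOUT_YOU_SUBJ  Subject resembles the "bout you" hookup campaign
score    LOCAL_BOUT_YOU_SUBJ  1.0
```

A rule this specific will go stale as soon as the wording changes, so it is probably better used as one input to a meta rule than scored highly on its own.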
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
On Thu, Jul 13, 2017 at 12:01 PM, Dianne Skoll <df...@roaringpenguin.com> wrote:
> On Wed, 12 Jul 2017 21:04:37 -0400
> Alex <my...@gmail.com> wrote:
>
>> Has anyone else experienced a spam campaign with any one of the
>> following subjects:
>
>> - sometimes enjoy it wild, how bout you?
>> - sometimes like it ruff, what bout you?
>> - sumtimes enjoy it ruff, wat bout you?
>
> 144 hits, all of them except one on Tuesday, 11 July. All
> whacked very handily by Bayes.
Thank you. We also saw a variation with "irs collection" that was
subsequently caught by bayes.
Re: "bout u" campaign
Posted by Dianne Skoll <df...@roaringpenguin.com>.
On Wed, 12 Jul 2017 21:04:37 -0400
Alex <my...@gmail.com> wrote:
> Has anyone else experienced a spam campaign with any one of the
> following subjects:
> - sometimes enjoy it wild, how bout you?
> - sometimes like it ruff, what bout you?
> - sumtimes enjoy it ruff, wat bout you?
144 hits, all of them except one on Tuesday, 11 July. All
whacked very handily by Bayes.
Regards,
Dianne.
Re: "bout u" campaign
Posted by "Kevin A. McGrail" <ke...@mcgrail.com>.
On 7/12/2017 9:04 PM, Alex wrote:
> Has anyone else experienced a spam campaign with any one of the
> following subjects:
0 hits today on this, nothing that's gotten through for me on our servers.
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
As a follow up, it says how to do the DNS, just not how to list it in the .cf files. Maybe I can copy another blacklist's syntax?
Infinite Systems
Charles Amstutz | Systems Administrator
charlesa@infinitesys.com 402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508
-----Original Message-----
From: David Jones [mailto:djones@ena.com]
Sent: Thursday, July 13, 2017 8:17 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign
On 07/12/2017 09:50 PM, Alex wrote:
> Hi,
>
>> pretty high mainly due to DCC and BAYES_99.
>
> Are you paying for DCC? I think we're over their limit and they
> blacklisted us long ago, lol.
I have my own DCC server joined into the DCC network.
https://www.dcc-servers.net/dcc/
>
>> I guess I have well trained Bayes.
>
> I think you just don't have many one-liner emails as a regular course
> of business?
I am classifying about 10K ham and 8K spam each day, which I also use in the masscheck processing (currently on hold). Since I started doing this about a month or so ago, my BAYES scores seem to be more accurate. Maybe I wasn't training enough ham/spam before? I don't know for sure yet.
>
>> 1.2 RCVD_IN_LASHBACK RBL: Received is listed in Lashback
>> usb.unsubscore.com
>> [204.29.186.60 listed in
>> ubl.unsubscore.com]
>
> I forgot about this. I have it in postscreen (+1) but now also added it in SA.
>
>> 2.2 RCVD_IN_SORBS_SPAM RBL: SORBS: sender is a spam source
>
> We do have some in SORBS, but only score it 0.5. Do you really
> recommend scoring it so high?
Obviously I do because it's working well on my platform. I have other WL rules that subtract points to offset this one. If there are no other WL (e.g. list.dnswl.org) hits then this will stand out more.
Do some analysis of your emails that hit this rule and what the scores were. My threshold for blocking is 6.0 (default for MailScanner). If your threshold is 5.0 and your ham that hits this rule is scoring below 3.3 (5.0 - 1.7, where 1.7 is the increase from your 0.5 to 2.2), then you would be fine setting this to score 2.2.
>> 0.0 OS_UNKNOWN Relay runs on unknown OS
>
> That's an interesting one. Fingerprinting?
>
Yeh. I thought it might be a useful data point for making meta rules but it turns out to not be. I will probably leave this out when I rebuild my filters in the next couple of months on CentOS 7.
>> 1.2 FREEMAIL_FROM Sender email is commonly abused enduser mail
>
> This is also scored *much* lower here - we have many freemail senders.
> The default score is 0.001, so you must have changed it.
>
Yep. Again my block threshold is 6.0 in MailScanner and I have less default trust for FREEMAIL senders. I also have meta rules based on FREEMAIL and other hits that add to the score based on combinations I have seen over the years.
FREEMAIL senders are very difficult to accurately filter but I feel like my rules are pretty good. I have to postwhite exclude most freemail providers since they are listed on some RBLs which makes no sense to me.
You can't block the big ones like Yahoo, Hotmail, Comcast, etc. just because they are so large and there are many legit senders in the middle of the spammers.
>> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
>
> For 90_100, I think we're only subtracting -0.2.
>
For my mail flow, I have noticed that senders in the 90's are normally very trustworthy.
If you separate your rules into 2 main categories, then you can set up scores based on their category to balance out the other category.
1. IP and domain reputation
2. Message content
Good IP reputation can offset questionable message content and vice versa. I tend to go heavy on the reputation side at the MTA and in SA, which has served me well in the past several years. Before that, I was constantly adjusting content rule scores and writing custom rules to react to the latest spam campaign, where I was always behind.
I have a huge list of whitelist_auth based on domain reputation which allows me to crank up some content scores and not let Bayes block good reputation senders based on content.
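The two-category idea can be sketched with a pair of non-scoring sub-rules and a meta rule that fires only when both sides look bad. The stock rule names are ones mentioned in this thread; the LOCAL_ names and the score are hypothetical and would need tuning against your own mail.

```
# Sub-rules whose names start with "__" carry no score of their own.
# Reputation side: the relay is listed on a blocklist.
meta     __LOCAL_REP_BAD        (RCVD_IN_SORBS_SPAM || RCVD_IN_SORBS_WEB || RCVD_IN_BL_SPAMCOP_NET)
# Content side: traits of the message itself.
meta     __LOCAL_CONTENT_BAD    (FREEMAIL_FROM || MISSING_SUBJECT || MISSING_DATE)
# Add points only when a reputation hit AND a content hit coincide.
meta     LOCAL_REP_AND_CONTENT  (__LOCAL_REP_BAD && __LOCAL_CONTENT_BAD)
describe LOCAL_REP_AND_CONTENT  Bad reputation combined with suspicious content
score    LOCAL_REP_AND_CONTENT  1.0
```

Keeping the two groups in separate sub-rules makes it easier to reuse them in other meta rules later.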
>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
>> seen by the Internet (DCC, Pyzor, or Razor).
>
> The problem I always had with pyzor/dcc was that it works on very
> small blocks of text, no? Perhaps it works well for small messages,
> but isn't it problematic for larger messages?
>
I have no idea. I just analyzed my mail scoring and noticed combinations like DCC and FREEMAIL are common in my spam.
>> 1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers
>> listed in MSPIKE_H2 so add back points.
>> 0.0 ENA_BAD_SPAM Spam hitting really bad rules.
>> 2.2 ENA_BAD_SPAM_FREEMAIL Bad spam from freemail (hotmail, gmail, msn,
>> yahoo).
>
> These are interesting, but I suppose privileged...
>
The ENA_BAD_SPAM rule is a combination of 2 different types of rules (reputation and content) with an AND between them. For example (this is about one-third of the rule):
meta ENA_BAD_SPAM (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK || RAZOR2_CF_RANGE_E8_51_100 || BAYES_999 || BAYES_99 || BAYES_95 || RCVD_IN_BL_SPAMCOP_NET || RCVD_IN_SORBS_WEB || RCVD_IN_SENDERSCORE_60_69 || RCVD_IN_SENDERSCORE_50_59 || RCVD_IN_SENDERSCORE_30_49 || RCVD_IN_SENDERSCORE_0_29 || RCVD_IN_SORBS_SPAM) && (URI_PHISH || URIBL_IVMURI || FREEMAIL_FROM || FREEMAIL_REPLYTO || FREEMAIL_FORGED_REPLYTO || MISSING_SUBJECT || MISSING_DATE || KAM_REALLYHUGEIMGSRC || KAM_HUGEIMGSRC || KAM_MANYTO || HTML_FONT_LOW_CONTRAST || ADVANCE_FEE_2_NEW_MONEY || ADVANCE_FEE_2_NEW_FORM || ADVANCE_FEE_3_NEW || ADVANCE_FEE_3_NEW_MONEY || ADVANCE_FEE_3_NEW_FORM || ADVANCE_FEE_4_NEW || TVD_RCVD_SINGLE)
describe ENA_BAD_SPAM Spam hitting really bad rules.
score ENA_BAD_SPAM 0.001
/etc/mail/spamassassin/99_mailspike.cf
shortcircuit RCVD_IN_MSPIKE_H5 on
score RCVD_IN_MSPIKE_H4 -3.2
score RCVD_IN_MSPIKE_H3 -2.2
score RCVD_IN_MSPIKE_H2 -1.2
score RCVD_IN_MSPIKE_WL -0.82
score RCVD_IN_MSPIKE_BL 1.2
score RCVD_IN_MSPIKE_L2 0.2
score RCVD_IN_MSPIKE_L3 1.2
score RCVD_IN_MSPIKE_L4 2.2
score RCVD_IN_MSPIKE_L5 3.2
meta ENA_DIGEST_FREEMAIL FREEMAIL_FROM && (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK)
describe ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam seen by the Internet (DCC, Pyzor, or Razor).
score ENA_DIGEST_FREEMAIL 2.2
meta ENA_DIGEST_MULTIPLE_DNSWL_MED (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_DNSWL_MED
describe ENA_DIGEST_MULTIPLE_DNSWL_MED Dcc, Razor, or Pyzor hits from servers listed in DNSWL so add back points.
score ENA_DIGEST_MULTIPLE_DNSWL_MED 2.2
meta ENA_DIGEST_MULTIPLE_MSPIKE_H4 (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H4
describe ENA_DIGEST_MULTIPLE_MSPIKE_H4 Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H4 so add back points.
score ENA_DIGEST_MULTIPLE_MSPIKE_H4 3.2
meta ENA_DIGEST_MULTIPLE_MSPIKE_H3 (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H3
describe ENA_DIGEST_MULTIPLE_MSPIKE_H3 Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H3 so add back points.
score ENA_DIGEST_MULTIPLE_MSPIKE_H3 2.2
meta ENA_DIGEST_MULTIPLE_MSPIKE_H2 (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H2
describe ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H2 so add back points.
score ENA_DIGEST_MULTIPLE_MSPIKE_H2 1.2
Hope this is helpful.
--
David Jones
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
I'm starting mine out at 0.5 until I see what happens.
Infinite Systems
Charles Amstutz | Systems Administrator
charlesa@infinitesys.com 402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/13/2017 10:56 AM, RW wrote:
> On Thu, 13 Jul 2017 09:33:04 -0400
> Alex wrote:
>
>> On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz
>> <ch...@infinitesys.com> wrote:
>>> How do you use lashback? It says that it is free to use for
>>> commercial and non commercial use. How do I set it up?
>>
>> Drop this into your local.cf or similar:
>>
>> header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK',
>> 'ubl.unsubscore.com')
>
> I have it as lastexternal:
>
> header RCVD_IN_UNSUBBL eval:check_rbl('ubl-lastexternal', 'ubl.unsubscore.com')
>
> I've found there to be quite a lot of ISP pool addresses in it, so deep
> checks are probably unsafe.
>
I started mine with lastexternal and didn't find much added value over
other major RBLs, since my MTA was already blocking mostly with the IVM
and Spamhaus RBLs, which overlapped Lashback. I also wanted to check
outbound mail where the second or later hop was from an infected device,
most likely under botnet control. It would have helped with the OP's spam.
> I've also found it has quite a high FP rate of ~2%.
>
I am working with them to fix these FPs (they include major mail
providers like Comcast, Microsoft and Google, which is pointless) and
potentially get it included in the default SA rules. It's still a valuable
RBL to help with an overall score even with a ~2% FP rate. I wouldn't score
it as high as you can Spamhaus and IVM. I also have it at 1.2.
--
David Jones
RE: "bout u" campaign
Posted by John Hardin <jh...@impsec.org>.
On Thu, 13 Jul 2017, Charles Amstutz wrote:
> Hello,
>
> For the inexperienced, what is the difference between lashback and lastexternal?
"lashback" is a DNSBL
"lastexternal" is which MTA gets checked against that DNSBL. In this case,
the last MTA external to your network - the MTA that handed the message
over to your MTA. This is versus, for example, the MTA that first accepted
the message from an email program.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Microsoft is not a standards body.
-----------------------------------------------------------------------
3 days until the 72nd anniversary of the dawn of the Atomic Age
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/13/2017 11:00 AM, Charles Amstutz wrote:
> Hello,
>
> For the inexperienced, what is the difference between lashback and lastexternal?
>
>
> Infinite Systems
> Charles Amstutz | Systems Administrator
> charlesa@infinitesys.com 402.477.2474
> 134 S 13th Street, Suite 302 | Lincoln, NE 68508
>
>
>
Search for "lastexternal" on this page:
https://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html
Basically, you should have internal_networks set up with your
internal network space, plus trusted_networks set up with networks
that you trust to never send spam, or with your provider's servers if
you fetch mail from a provider via POP/IMAP. Then SA can determine the
lastexternal IP and check only that, not all hops.
For example, I do score.senderscore.com as lastexternal since I give
the rule such a high score:
header RCVD_IN_SENDERSCORE_0_29 eval:check_rbl('senderscore0-lastexternal','score.senderscore.com.','^127\.0\.4\.([1-2]?[0-9])$')
describe RCVD_IN_SENDERSCORE_0_29 Senderscore.org score of 0 to 29
score RCVD_IN_SENDERSCORE_0_29 5.2
tflags RCVD_IN_SENDERSCORE_0_29 net
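For reference, the network settings mentioned above might look like the following. This is a sketch only: the addresses are documentation placeholders (RFC 5737 ranges), not values from this thread, so substitute your own.

```
# Placeholder addresses; replace with your own ranges.
# Hosts that are part of your own mail infrastructure (your MXes and relays).
internal_networks 192.0.2.0/24
# trusted_networks must include everything listed in internal_networks,
# plus any outside relays you trust not to forge Received headers.
trusted_networks  192.0.2.0/24 198.51.100.25
```

With these in place, a rule whose set name ends in -lastexternal looks up only the relay that handed the message to the 192.0.2.0/24 hosts.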
--
David Jones
Re: "bout u" campaign
Posted by RW <rw...@googlemail.com>.
On Thu, 13 Jul 2017 16:00:14 +0000
Charles Amstutz Top-Posted:
> Hello,
>
> For the inexperienced, what is the difference between lashback and
> lastexternal?
lashback is just a label, the difference is between
eval:check_rbl('LASHBACK', ...
and
eval:check_rbl('LASHBACK-lastexternal', ...
The former checks all the IP addresses outside your trusted network,
the latter only checks the last-external IP address, i.e. the MX
handover.
If a dynamic IP address gets into a blocklist, it may get
reassigned to many other devices before it's delisted. Most blocklists
are lastexternal because a device with a dynamic address shouldn't be
delivering direct to MX in the first place.
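To compare the two behaviours on your own mail, both variants can run side by side at a negligible score while you watch which one hits. The rule names and set names below are hypothetical; the zone is the one discussed in this thread.

```
# Deep check: every untrusted relay in the Received chain is looked up.
header   LOCAL_UBL_DEEP  eval:check_rbl('ubldeep', 'ubl.unsubscore.com.')
describe LOCAL_UBL_DEEP  Any untrusted relay listed in ubl.unsubscore.com
tflags   LOCAL_UBL_DEEP  net
score    LOCAL_UBL_DEEP  0.001

# Last-external check: only the relay that delivered to your network.
header   LOCAL_UBL_LE    eval:check_rbl('ublle-lastexternal', 'ubl.unsubscore.com.')
describe LOCAL_UBL_LE    Last external relay listed in ubl.unsubscore.com
tflags   LOCAL_UBL_LE    net
score    LOCAL_UBL_LE    0.001
```

Messages that hit LOCAL_UBL_DEEP but not LOCAL_UBL_LE are the ones where only an earlier hop, such as a dynamic address, is listed.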
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
Hello,
For the inexperienced, what is the difference between lashback and lastexternal?
Infinite Systems
Charles Amstutz | Systems Administrator
charlesa@infinitesys.com 402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508
Re: "bout u" campaign
Posted by RW <rw...@googlemail.com>.
On Thu, 13 Jul 2017 09:33:04 -0400
Alex wrote:
> On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz
> <ch...@infinitesys.com> wrote:
> > How do you use lashback? It says that it is free to use for
> > commercial and non commercial use. How do I set it up?
>
> Drop this into your local.cf or similar:
>
> header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK',
> 'ubl.unsubscore.com')
I have it as lastexternal:
header RCVD_IN_UNSUBBL eval:check_rbl('ubl-lastexternal', 'ubl.unsubscore.com')
I've found there to be quite a lot of ISP pool addresses in it, so deep
checks are probably unsafe.
I've also found it has quite a high FP rate of ~2%.
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
Thanks
Infinite Systems
Charles Amstutz | Systems Administrator
charlesa@infinitesys.com 402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508
Re: "bout u" campaign
Posted by RW <rw...@googlemail.com>.
On Thu, 13 Jul 2017 15:56:59 +0000
Charles Amstutz wrote:
> Thanks,
>
> I was looking at the default RBL lists
>
> https://wiki.apache.org/spamassassin/DnsBlocklists
>
> But I was looking for other things that are free for commercial use. I
> found this one as a possibility:
>
> http://0spam.fusionzero.com/
>
> but I don't know if anyone has experience with it, or could make other
> recommendations.
You might try this one. It may be well suited to postscreen or outright
rejection as it is supposed to be quite conservative - I haven't had any
FPs.
header RCVD_IN_GBUDB eval:check_rbl('gbudb-lastexternal', 'truncate.gbudb.net.')
describe RCVD_IN_GBUDB Listed in truncate.gbudb.net
tflags RCVD_IN_GBUDB net
score RCVD_IN_GBUDB 1.0 # adjust after testing
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
Thanks,
I was looking at the default RBL lists
https://wiki.apache.org/spamassassin/DnsBlocklists
But I was looking for other things that are free for commercial use. I found this one as a possibility:
http://0spam.fusionzero.com/
but I don't know if anyone has experience with it, or could make other recommendations.
>Drop this into your local.cf or similar:
>header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK', 'ubl.unsubscore.com')
>describe RCVD_IN_LASHBACK LashBack Unsubscribe Blacklist
>tflags RCVD_IN_LASHBACK net
>score RCVD_IN_LASHBACK 1.2
> I've scored it at 1.2. You may wish to change that, perhaps lower for a while, while you see how it works in your organization.
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz
<ch...@infinitesys.com> wrote:
> How do you use lashback? It says that it is free to use for commercial and non commercial use. How do I set it up?
Drop this into your local.cf or similar:
header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK', 'ubl.unsubscore.com')
describe RCVD_IN_LASHBACK LashBack Unsubscribe Blacklist
tflags RCVD_IN_LASHBACK net
score RCVD_IN_LASHBACK 1.2
I've scored it at 1.2. You may wish to change that, perhaps lower for
a while, while you see how it works in your organization.
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
How do you use lashback? It says that it is free to use for commercial and non commercial use. How do I set it up?
Infinite Systems
Charles Amstutz | Systems Administrator
charlesa@infinitesys.com 402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/15/2017 09:42 AM, RW wrote:
> On Thu, 13 Jul 2017 18:26:54 -0400
> Alex wrote:
>
>> Hi,
>>
>>>> Are you paying for DCC? I think we're over their limit and they
>>>> blacklisted us long ago, lol.
>>>
>>> I have my own DCC server joined into the DCC network.
>>>
>>> https://www.dcc-servers.net/dcc/
>>
>> So you only provide spam services for your own users? Or do you pay?
>>
>>> I am classifying about 10K ham and 8K spam each day which I also
>>> use in the masscheck processing (currently on hold). Since I have
>>> started doing this
>>
>> Through autolearn?
>>
>> It is otherwise extremely time-intensive.
>>
>>> Yep. Again my block threshold is 6.0 in MailScanner and I have
>>> less default trust for FREEMAIL senders. I also have meta rules
>>> based on FREEMAIL and other hits that add to the score based on
>>> combinations I have seen over the years.
>>
>> Adjusting many of the default rules disrupts the score balance created
>> by masschecks, no?
>>
>> I want to avoid having to juggle scores around, in addition to already
>> worrying about writing rules that ultimately have the same effect as
>> existing metas.
>>
>>>>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message
>>>>> digest spam seen by the Internet (DCC, Pyzor, or Razor).
>>
>> Are you worried about overlap between the checksum systems?
>>
>> I've enabled DCC again today, and remembered what I don't like about
>> it. Do you have DCC_CHECK at its default 1.1 score? That's quite high
>> for something described as "bulk mail" when bulk mail is already
>> scored very close to 5.0.
>
> And with FREEMAIL_FROM plus DCC_CHECK (or any digest) you
> have
>
> 1.2 FREEMAIL_FROM
> 2.2 DCC_CHECK
> 2.2 ENA_DIGEST_FREEMAIL
> 0.0 ENA_BAD_SPAM
>
> which is 5.6 points. And judging by the name, at least in some cases,
> maybe all:
>
> 2.2 ENA_BAD_SPAM_FREEMAIL
>
> which makes 7.8 points. This is something that presumably works for
> him, but could cause problems in general.
>
I was trying to give high-level information on the difference between
reputation-based rules and content-based rules and how they can be used
in combination. For FREEMAIL, I have found that making the average
message score just below the threshold gives the maximum reliability.
Since my threshold for blocking is 6.0, I try to get the average
FREEMAIL message to score in the 3.0 to 5.0 range. With well-trained
BAYES and a few other rules that subtract (BAYES_00, good reputation,
etc.), this is working well. When FREEMAIL messages hit DCC and a few
other meta rules common in spam, then they will be over 6.0 like
mentioned above.
Each person has to examine their mail flow and scoring to determine what
will work in their environment, but the concepts should still apply.
1. Create a large list of whitelist_auth and whitelist_from_rcvd for
those senders that a) aren't FREEMAIL, b) aren't human mailboxes with
potentially compromised passwords, and c) have a valid unsubscribe
link/process.
Examples:
whitelist_auth *@*.wayfair.com
whitelist_auth *@*.dunkindonuts.com
whitelist_auth *@mktgdillards.com
whitelist_auth *@*.usaa.com
whitelist_auth *@*.citi.com
whitelist_auth *@*.sophos.com
whitelist_auth *@*.myfedloan.org
whitelist_auth *@*.hiltonhonors.com
whitelist_auth *@*.usatoday.com
whitelist_auth *@*.usbank.com
2. Enable SHORTCIRCUIT'ing:
shortcircuit USER_IN_WHITELIST on
priority USER_IN_WHITELIST -400
shortcircuit USER_IN_DEF_WHITELIST on
shortcircuit USER_IN_BLACKLIST on
shortcircuit USER_IN_DKIM_WHITELIST on
shortcircuit USER_IN_DEF_DKIM_WL on
shortcircuit USER_IN_SPF_WHITELIST on
shortcircuit USER_IN_DEF_SPF_WL on
shortcircuit RCVD_IN_RP_CERTIFIED on
shortcircuit RCVD_IN_RP_SAFE on
shortcircuit RCVD_IN_DNSWL_HI on
shortcircuit RCVD_IN_IADB_LISTED on
shortcircuit RCVD_IN_IADB_SPF on
shortcircuit RCVD_IN_IADB_DK on
shortcircuit RCVD_IN_IADB_RDNS on
shortcircuit RCVD_IN_IADB_SENDERID on
shortcircuit RCVD_IN_IADB_OPTIN on
3. Add in extra RBL rules that aren't included with SA. Test these with
low scores until you are comfortable with them. Candidates include
Lashback, senderscore.org, Mailspike, and IVM if you have a subscription.
Once you tweak the above lists to fit your mail flow, you should have the
reputation side well covered, which will allow content-based checks to
handle the rest of the spam: well-trained Bayes, ClamAV unofficial
signatures, DCC, Razor, Pyzor, KAM.cf, custom meta rules, etc. You won't
have to constantly react to the latest spam campaign. You will still have
to tweak and tune a little, but not nearly as much as before.
--
David Jones
Re: "bout u" campaign
Posted by RW <rw...@googlemail.com>.
On Thu, 13 Jul 2017 18:26:54 -0400
Alex wrote:
> Hi,
>
> >> Are you paying for DCC? I think we're over their limit and they
> >> blacklisted us long ago, lol.
> >
> > I have my own DCC server joined into the DCC network.
> >
> > https://www.dcc-servers.net/dcc/
>
> So you only provide spam services for your own users? Or do you pay?
>
> > I am classifying about 10K ham and 8K spam each day which I also
> > use in the masscheck processing (currently on hold). Since I have
> > started doing this
>
> Through autolearn?
>
> It is otherwise extremely time-intensive.
>
> > Yep. Again my block threshold is 6.0 in MailScanner and I have
> > less default trust for FREEMAIL senders. I also have meta rules
> > based on FREEMAIL and other hits that add to the score based on
> > combinations I have seen over the years.
>
> Adjusting many of the default rules disrupts the score balance created
> by masschecks, no?
>
> I want to avoid having to juggle scores around, in addition to already
> worrying about writing rules that ultimately have the same effect as
> existing metas.
>
> >>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message
> >>> digest spam seen by the Internet (DCC, Pyzor, or Razor).
>
> Are you worried about overlap between the checksum systems?
>
> I've enabled DCC again today, and remembered what I don't like about
> it. Do you have DCC_CHECK at its default 1.1 score? That's quite high
> for something described as "bulk mail" when bulk mail is already
> scored very close to 5.0.
And with FREEMAIL_FROM plus DCC_CHECK (or any digest) you
have
1.2 FREEMAIL_FROM
2.2 DCC_CHECK
2.2 ENA_DIGEST_FREEMAIL
0.0 ENA_BAD_SPAM
which is 5.6 points. And judging by the name, at least in some cases,
maybe all:
2.2 ENA_BAD_SPAM_FREEMAIL
which makes 7.8 points. This is something that presumably works for
him, but could cause problems in general.
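The arithmetic above can be checked with a short sketch. It assumes only the scores quoted in this message and the 6.0 block threshold David mentions elsewhere in the thread; the names `base_hits` and `total` are made up for illustration.

```python
# Scores as quoted above; 6.0 is the MailScanner block threshold
# mentioned elsewhere in the thread (an assumption for other sites).
BLOCK_THRESHOLD = 6.0

base_hits = {
    "FREEMAIL_FROM": 1.2,
    "DCC_CHECK": 2.2,
    "ENA_DIGEST_FREEMAIL": 2.2,
    "ENA_BAD_SPAM": 0.0,
}


def total(hits):
    """Sum rule scores, rounded to one decimal like an SA report."""
    return round(sum(hits.values()), 1)


without_bad_spam_freemail = total(base_hits)
with_bad_spam_freemail = total({**base_hits, "ENA_BAD_SPAM_FREEMAIL": 2.2})

# Print each total and whether it reaches the block threshold.
print(without_bad_spam_freemail, without_bad_spam_freemail >= BLOCK_THRESHOLD)
print(with_bad_spam_freemail, with_bad_spam_freemail >= BLOCK_THRESHOLD)
```

So a freemail message that hits DCC sits just under a 6.0 threshold on these four rules alone, and goes over it once ENA_BAD_SPAM_FREEMAIL also fires. With the stock 5.0 threshold the first total would already be over.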
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/14/2017 09:22 PM, Alex wrote:
> Hi,
>
>>>> The ENA_BAD_SPAM rule is a combination of 2 different types (reputation
>>>> and
>>>> content) rules with an AND between them. For example (this is about
>>>> one-third of the rule):
>>>
>>> Is it usable like this?
>>
>> Try it out with a score of 0.001 and see what you think. It should have
>> been valid. Just drop it in and run:
>>
>> spamassassin -D --lint 2>&1 | /bin/grep -Ei '(failed|undefined dependency|score set for non-existent rule)' | /bin/grep ENA_
>
> By "usable" I meant have you included enough of the rule for it to
> really be effective?
>
> I let it run for the day, and it's just not anchored well enough to
> provide any meaningful benefit. It's hitting on jcpenny, vresp.com,
> constantcontact, sendgrid, facebook, etc.
>
I have all of those senders in whitelist_auth entries. The ENA_BAD_SPAM
rule has a score of 0.001 just as a placeholder for other meta rules based
on it that have scores of 1.2 to 3.2.
Once you set up different tiers of senders and SHORTCIRCUIT all of the
trusted senders that usually score very low, you will be able to handle
regular and untrusted senders more aggressively.
As I have said before, I SHORTCIRCUIT as ham thousands of domains based
on their envelope-from domain as long as they have legit unsubscribe/opt
out processes/links. Now I don't have to worry about these being
falsely categorized as spam based on content. I don't SHORTCIRCUIT any
FREEMAIL domains or any domains that have user mailboxes that can be
compromised.
My MTA blocks the majority of the junk so what passes through SA is
mostly SHORTCIRCUIT'd as ham. Less than 5 percent is spam blocked by
SA. I only get the occasional report of spam from customers from
compromised accounts now which are very difficult to block based on
reputation. Content-based rules are really the only way since these
spammers are crafting zero-hour emails that are designed to get through
major mail filters.
--
David Jones
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
>>> The ENA_BAD_SPAM rule is a combination of 2 different types (reputation
>>> and
>>> content) rules with an AND between them. For example (this is about
>>> one-third of the rule):
>>
>> Is it usable like this?
>
> Try it out with a score of 0.001 and see what you think. It should have
> been valid. Just drop it in and run:
>
> spamassassin -D --lint 2>&1 | /bin/grep -Ei '(failed|undefined dependency|score set for non-existent rule)' | /bin/grep ENA_
By "usable" I meant have you included enough of the rule for it to
really be effective?
I let it run for the day, and it's just not anchored well enough to
provide any meaningful benefit. It's hitting on jcpenny, vresp.com,
constantcontact, sendgrid, facebook, etc.
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/13/2017 05:26 PM, Alex wrote:
> Hi,
>
>>> Are you paying for DCC? I think we're over their limit and they
>>> blacklisted us long ago, lol.
>>
>> I have my own DCC server joined into the DCC network.
>>
>> https://www.dcc-servers.net/dcc/
>
> So you only provide spam services for your own users? Or do you pay?
>
Our DCC server was set up 6+ years ago by a previous mail sysadmin before
I started working at my current job. We don't budget or pay anything
annually for DCC. We are peered with another DCC server in their
network. All I know is that we must keep our current IP address the
same to maintain the peering. I have one DCC server that I point my 8
mail filters to.
>> I am classifying about 10K ham and 8K spam each day which I also use in the
>> masscheck processing (currently on hold). Since I have started doing this
>
> Through autolearn?
>
> It is otherwise extremely time-intensive.
>
Actually I have found some rule combinations and score thresholds that
are definitely ham/spam. I have built an iRedMail VM with no RBLs,
postscreen, or other MTA optimizations and disabled some things in
amavisd-new so spam will get to SA. Ham comes from a subset of my
primary SA filters based on SHORTCIRCUIT rules and very low scoring
messages.
I set up Inbox rules to move certain messages into ham/spam folders. I
have to log in once a day and spend a few minutes quickly reviewing the
unread messages and marking them as read. My masscheck and SA learning
use the read folder (Maildir cur directory).
>> Yep. Again my block threshold is 6.0 in MailScanner and I have less default
>> trust for FREEMAIL senders. I also have meta rules based on FREEMAIL and
>> other hits that add to the score based on combinations I have seen over the
>> years.
>
> Adjusting many of the default rules disrupts the score balance created
> by masschecks, no?
>
Correct. Before I knew about the masscheck processing and what it does,
I used to adjust the scores on most of the rules, which was as time
consuming as reactively creating rules for new spam campaigns.
A few months ago I removed most of my custom scores on default SA rules,
and I now use meta rules to combine hits on certain rules to add some points.
> I want to avoid having to juggle scores around, in addition to already
> worrying about writing rules that ultimately have the same effect as
> existing metas.
>
>>>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
>>>> seen by the Internet (DCC, Pyzor, or Razor).
>
> Are you worried about overlap between the checksum systems?
>
> I've enabled DCC again today, and remembered what I don't like about
> it. Do you have DCC_CHECK at its default 1.1 score? That's quite high
> for something described as "bulk mail" when bulk mail is already
> scored very close to 5.0.
>
If you follow my logical separation of rules into reputation-based and
content-based then DCC, RAZOR, and PYZOR are going to fall into the
content side. You still have the reputation rules that will lower the
score and offset these DIGEST rules. Plus with many SHORTCIRCUIT'd
senders based on whitelist_auth and whitelist_from_rcvd, the
trusted/safe bulk senders with a valid unsubscribe process will pass
through fine.
> How much more effective do you find DCC than PYZOR? That's already
> scored at 1.4.
>
Haven't really had to worry about this with SHORTCIRCUIT'ing and
whitelist_auth on the envelope-from domain (SPF_PASS + non-human account
domains).
>> I have no idea. I just analyzed my mail scoring and noticed combinations
>> like DCC and FREEMAIL are common in my spam.
>
> That's a good combination.
>
>> The ENA_BAD_SPAM rule is a combination of 2 different types (reputation and
>> content) rules with an AND between them. For example (this is about
>> one-third of the rule):
>
> Is it usable like this?
>
Try it out with a score of 0.001 and see what you think. It should have
been valid. Just drop it in and run:
spamassassin -D --lint 2>&1 | /bin/grep -Ei '(failed|undefined dependency|score set for non-existent rule)' | /bin/grep ENA_
You can also run the first section and check for a zero return code. I
have a config distribution script that runs the first part above and
will not send it out if the return code is not zero.
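A distribution gate of the kind described above might look roughly like this. It is only a sketch, not the actual script: `config_is_clean` and `distribute_if_clean` are hypothetical names, and `push` stands in for whatever copies the config to the other filters. The only thing it relies on is that `spamassassin --lint` exits non-zero when the rules fail to lint.

```python
import subprocess


def config_is_clean(lint_cmd=("spamassassin", "--lint")):
    """Return True only if the lint command exits with status 0."""
    result = subprocess.run(
        lint_cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def distribute_if_clean(push, lint_cmd=("spamassassin", "--lint")):
    """Call the hypothetical push() callback only when lint passes.

    Returns True if the config was pushed, False if lint failed.
    """
    if not config_is_clean(lint_cmd):
        return False
    push()
    return True
```

Passing the lint command in as a parameter keeps the gate testable on a machine without SpamAssassin installed.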
>> /etc/mail/spamassassin/99_mailspike.cf
>> shortcircuit RCVD_IN_MSPIKE_H5 on
>>
>> score RCVD_IN_MSPIKE_H4 -3.2
>> score RCVD_IN_MSPIKE_H3 -2.2
>> score RCVD_IN_MSPIKE_H2 -1.2
>> score RCVD_IN_MSPIKE_WL -0.82
>> score RCVD_IN_MSPIKE_BL 1.2
>> score RCVD_IN_MSPIKE_L2 0.2
>> score RCVD_IN_MSPIKE_L3 1.2
>> score RCVD_IN_MSPIKE_L4 2.2
>> score RCVD_IN_MSPIKE_L5 3.2
>
> The default scores for these rules are all almost 0 when bayes and
> network tests are enabled. I've adjusted the L[2-5] rules from 0.2 to
> 1.2. Took a quick look at a handful of L5 mail and anything higher
> would be problematic.
>
>> Hope this is helpful.
>
> Thanks, as always.
>
>
>>
>> --
>> David Jones
--
David Jones
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
>> Are you paying for DCC? I think we're over their limit and they
>> blacklisted us long ago, lol.
>
> I have my own DCC server joined into the DCC network.
>
> https://www.dcc-servers.net/dcc/
So you only provide spam services for your own users? Or do you pay?
> I am classifying about 10K ham and 8K spam each day which I also use in the
> masscheck processing (currently on hold). Since I have started doing this
Through autolearn?
It is otherwise extremely time-intensive.
> Yep. Again my block threshold is 6.0 in MailScanner and I have less default
> trust for FREEMAIL senders. I also have meta rules based on FREEMAIL and
> other hits that add to the score based on combinations I have seen over the
> years.
Adjusting many of the default rules disrupts the score balance created
by masschecks, no?
I want to avoid having to juggle scores around, in addition to already
worrying about writing rules that ultimately have the same effect as
existing metas.
>>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
>>> seen by the Internet (DCC, Pyzor, or Razor).
Are you worried about overlap between the checksum systems?
I've enabled DCC again today, and remembered what I don't like about
it. Do you have DCC_CHECK at its default 1.1 score? That's quite high
for something described as "bulk mail" when bulk mail is already
scored very close to 5.0.
How much more effective do you find DCC than PYZOR? That's already
scored at 1.4.
> I have no idea. I just analyzed my mail scoring and noticed combinations
> like DCC and FREEMAIL are common in my spam.
That's a good combination.
> The ENA_BAD_SPAM rule is a combination of 2 different types (reputation and
> content) rules with an AND between them. For example (this is about
> one-third of the rule):
Is it usable like this?
> /etc/mail/spamassassin/99_mailspike.cf
> shortcircuit RCVD_IN_MSPIKE_H5 on
>
> score RCVD_IN_MSPIKE_H4 -3.2
> score RCVD_IN_MSPIKE_H3 -2.2
> score RCVD_IN_MSPIKE_H2 -1.2
> score RCVD_IN_MSPIKE_WL -0.82
> score RCVD_IN_MSPIKE_BL 1.2
> score RCVD_IN_MSPIKE_L2 0.2
> score RCVD_IN_MSPIKE_L3 1.2
> score RCVD_IN_MSPIKE_L4 2.2
> score RCVD_IN_MSPIKE_L5 3.2
The default scores for these rules are all almost 0 when bayes and
network tests are enabled. I've adjusted the L[2-5] rules from 0.2 to
1.2. Took a quick look at a handful of L5 mail and anything higher
would be problematic.
> Hope this is helpful.
Thanks, as always.
>
> --
> David Jones
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/12/2017 09:50 PM, Alex wrote:
> Hi,
>
>> pretty high mainly due to DCC and BAYES_99.
>
> Are you paying for DCC? I think we're over their limit and they
> blacklisted us long ago, lol.
I have my own DCC server joined into the DCC network.
https://www.dcc-servers.net/dcc/
>
>> I guess I have well trained Bayes.
>
> I think you just don't have many one-liner emails as a regular course
> of business?
I am classifying about 10K ham and 8K spam each day which I also use in
the masscheck processing (currently on hold). Since I have started
doing this about a month or so ago, my BAYES scores seem to be more
accurate. Maybe I wasn't training enough ham/spam before? I don't know
for sure yet.
>
>> 1.2 RCVD_IN_LASHBACK RBL: Received is listed in Lashback
>> usb.unsubscore.com
>> [204.29.186.60 listed in ubl.unsubscore.com]
>
> I forgot about this. I have it in postscreen (+1) but now also added it in SA.
>
>> 2.2 RCVD_IN_SORBS_SPAM RBL: SORBS: sender is a spam source
>
> We do have some in SORBS, but only score it 0.5. Do you really
> recommend scoring it so high?
Obviously I do because it's working well on my platform. I have other
WL rules that subtract points to offset this one. If there are no other
WL (i.e. list.dnswl.org) hits then this will stand out more.
Do some analysis of your emails that hit this rule and what the scores
were. My threshold for blocking is 6.0 (default for MailScanner). If
your threshold is 5.0 and your ham that hits this rule is scoring below
3.3 (5.0 - 1.7), then you would be fine setting this to score 2.2.
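The headroom check above, spelled out as a sketch. The 0.5 is the SORBS score Alex said he currently uses and 2.2 is the proposed one; the function names and the sample ham scores are invented for illustration.

```python
def max_safe_ham_score(threshold, current_rule_score, proposed_rule_score):
    """Highest score ham may have today and still stay under the
    threshold after the rule's score is raised."""
    increase = proposed_rule_score - current_rule_score
    return round(threshold - increase, 1)


def safe_to_raise(ham_scores, threshold, current_rule_score, proposed_rule_score):
    """True if every ham message hitting the rule would still score
    below the threshold with the proposed rule score."""
    limit = max_safe_ham_score(threshold, current_rule_score, proposed_rule_score)
    return all(score < limit for score in ham_scores)


# Raising SORBS from 0.5 to 2.2 adds 1.7, so with a 5.0 threshold
# ham must currently score below 3.3.
limit = max_safe_ham_score(5.0, 0.5, 2.2)

# Hypothetical current scores of ham that hit the rule.
sample_ham_scores = [0.4, 1.9, 2.8]
print(limit, safe_to_raise(sample_ham_scores, 5.0, 0.5, 2.2))
```

The same check works for any rule you are thinking of raising: pull the current scores of ham that hit it from your logs and compare them against the limit.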
>> 0.0 OS_UNKNOWN Relay runs on unknown OS
>
> That's an interesting one. Fingerprinting?
>
Yeh. I thought it might be a useful data point for making meta rules
but it turns out to not be. I will probably leave this out when I
rebuild my filters in the next couple of months on CentOS 7.
>> 1.2 FREEMAIL_FROM Sender email is commonly abused enduser mail
>
> This is also scored *much* lower here - we have many freemail senders.
> The default score is 0.001, so you must have changed it.
>
Yep. Again my block threshold is 6.0 in MailScanner and I have less
default trust for FREEMAIL senders. I also have meta rules based on
FREEMAIL and other hits that add to the score based on combinations I
have seen over the years.
FREEMAIL senders are very difficult to accurately filter but I feel like
my rules are pretty good. I have to postwhite exclude most freemail
providers since they are listed on some RBLs which makes no sense to me.
You can't block the big ones like Yahoo, Hotmail, Comcast, etc. just
because they are so large and there are many legit senders in the middle
of the spammers.
>> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
>
> For 90_100, I think we're only subtracting -0.2.
>
For my mail flow, I have noticed that senders in the 90's are normally
very trustworthy.
If you separate your rules into 2 main categories, then you can setup
scores based on their category to balance out the other category.
1. IP and domain reputation
2. Message content
Good IP reputation can offset questionable message content and vice
versa. I tend to go heavy on the reputation side at the MTA and in SA,
which has served me well in the past several years. Before that, I was
constantly adjusting content rule scores and writing custom rules to
react to the latest spam campaign, and I was always behind.
I have a huge list of whitelist_auth based on domain reputation which
allows me to crank up some content scores and not let Bayes block good
reputation senders based on content.
>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
>> seen by the Internet (DCC, Pyzor, or Razor).
>
> The problem I always had with pyzor/dcc was that it works on very
> small blocks of text, no? Perhaps it works well for small messages,
> but isn't it problematic for larger messages?
>
I have no idea. I just analyzed my mail scoring and noticed
combinations like DCC and FREEMAIL are common in my spam.
>> 1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers
>> listed in MSPIKE_H2 so add back points.
>> 0.0 ENA_BAD_SPAM Spam hitting really bad rules.
>> 2.2 ENA_BAD_SPAM_FREEMAIL Bad spam from freemail (hotmail, gmail, msn,
>> yahoo).
>
> These are interesting, but I suppose privileged...
>
The ENA_BAD_SPAM rule is a combination of 2 different types of rules
(reputation and content) with an AND between them. For example (this is
about one-third of the rule):
meta ENA_BAD_SPAM (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK || RAZOR2_CF_RANGE_E8_51_100 || BAYES_999 || BAYES_99 || BAYES_95 || RCVD_IN_BL_SPAMCOP_NET || RCVD_IN_SORBS_WEB || RCVD_IN_SENDERSCORE_60_69 || RCVD_IN_SENDERSCORE_50_59 || RCVD_IN_SENDERSCORE_30_49 || RCVD_IN_SENDERSCORE_0_29 || RCVD_IN_SORBS_SPAM) && (URI_PHISH || URIBL_IVMURI || FREEMAIL_FROM || FREEMAIL_REPLYTO || FREEMAIL_FORGED_REPLYTO || MISSING_SUBJECT || MISSING_DATE || KAM_REALLYHUGEIMGSRC || KAM_HUGEIMGSRC || KAM_MANYTO || HTML_FONT_LOW_CONTRAST || ADVANCE_FEE_2_NEW_MONEY || ADVANCE_FEE_2_NEW_FORM || ADVANCE_FEE_3_NEW || ADVANCE_FEE_3_NEW_MONEY || ADVANCE_FEE_3_NEW_FORM || ADVANCE_FEE_4_NEW || TVD_RCVD_SINGLE)
describe ENA_BAD_SPAM Spam hitting really bad rules.
score ENA_BAD_SPAM 0.001
/etc/mail/spamassassin/99_mailspike.cf
shortcircuit RCVD_IN_MSPIKE_H5 on
score RCVD_IN_MSPIKE_H4 -3.2
score RCVD_IN_MSPIKE_H3 -2.2
score RCVD_IN_MSPIKE_H2 -1.2
score RCVD_IN_MSPIKE_WL -0.82
score RCVD_IN_MSPIKE_BL 1.2
score RCVD_IN_MSPIKE_L2 0.2
score RCVD_IN_MSPIKE_L3 1.2
score RCVD_IN_MSPIKE_L4 2.2
score RCVD_IN_MSPIKE_L5 3.2
meta ENA_DIGEST_FREEMAIL FREEMAIL_FROM && (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK)
describe ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam seen by the Internet (DCC, Pyzor, or Razor).
score ENA_DIGEST_FREEMAIL 2.2
meta ENA_DIGEST_MULTIPLE_DNSWL_MED (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_DNSWL_MED
describe ENA_DIGEST_MULTIPLE_DNSWL_MED Dcc, Razor, or Pyzor hits from servers listed in DNSWL so add back points.
score ENA_DIGEST_MULTIPLE_DNSWL_MED 2.2
meta ENA_DIGEST_MULTIPLE_MSPIKE_H4 (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H4
describe ENA_DIGEST_MULTIPLE_MSPIKE_H4 Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H4 so add back points.
score ENA_DIGEST_MULTIPLE_MSPIKE_H4 3.2
meta ENA_DIGEST_MULTIPLE_MSPIKE_H3 (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H3
describe ENA_DIGEST_MULTIPLE_MSPIKE_H3 Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H3 so add back points.
score ENA_DIGEST_MULTIPLE_MSPIKE_H3 2.2
meta ENA_DIGEST_MULTIPLE_MSPIKE_H2 (DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H2
describe ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers listed in MSPIKE_H2 so add back points.
score ENA_DIGEST_MULTIPLE_MSPIKE_H2 1.2
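To see how the "add back points" metas interact with the Mailspike whitelist scores, here is a rough model of the boolean logic. It is not how SpamAssassin evaluates metas internally, just the same expressions written out; the function names are made up and the hit set is taken from the "bout u" sample scored earlier in the thread.

```python
DIGEST_RULES = {"DCC_CHECK", "PYZOR_CHECK", "RAZOR2_CHECK"}


def ena_digest_freemail(hits):
    """FREEMAIL_FROM && (DCC_CHECK || PYZOR_CHECK || RAZOR2_CHECK)"""
    return "FREEMAIL_FROM" in hits and bool(hits & DIGEST_RULES)


def ena_digest_multiple_mspike_h2(hits):
    """(DIGEST_MULTIPLE || ENA_DIGEST_FREEMAIL) && RCVD_IN_MSPIKE_H2"""
    digest = "DIGEST_MULTIPLE" in hits or ena_digest_freemail(hits)
    return digest and "RCVD_IN_MSPIKE_H2" in hits


# Relevant hits from the sample message: freemail sender, DCC hit,
# and an IP with Mailspike H2 reputation.
hits = {"FREEMAIL_FROM", "DCC_CHECK", "RCVD_IN_MSPIKE_H2"}

# Scores from the config above.
scores = {"RCVD_IN_MSPIKE_H2": -1.2, "ENA_DIGEST_MULTIPLE_MSPIKE_H2": 1.2}

fires = ena_digest_multiple_mspike_h2(hits)
net_mspike_effect = round(
    scores["RCVD_IN_MSPIKE_H2"]
    + (scores["ENA_DIGEST_MULTIPLE_MSPIKE_H2"] if fires else 0.0),
    1,
)
print(fires, net_mspike_effect)
```

In other words, when a digest hit shows up the whitelist credit from MSPIKE_H2 is cancelled, while ENA_DIGEST_FREEMAIL still adds its own 2.2.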
Hope this is helpful.
--
David Jones
Re: "bout u" campaign
Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 12.07.17 22:50, Alex wrote:
>> pretty high mainly due to DCC and BAYES_99.
>
>Are you paying for DCC? I think we're over their limit and they
>blacklisted us long ago, lol.
Configure your own DCC server and connect it to their network.
It is not a paid service (you only pay if you don't connect a server to
their network).
--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Fucking windows! Bring Bill Gates! (Southpark the movie)
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
> pretty high mainly due to DCC and BAYES_99.
Are you paying for DCC? I think we're over their limit and they
blacklisted us long ago, lol.
> I guess I have well trained Bayes.
I think you just don't have many one-liner emails as a regular course
of business?
> 1.2 RCVD_IN_LASHBACK RBL: Received is listed in Lashback
> usb.unsubscore.com
> [204.29.186.60 listed in ubl.unsubscore.com]
I forgot about this. I have it in postscreen (+1) but now also added it in SA.
> 2.2 RCVD_IN_SORBS_SPAM RBL: SORBS: sender is a spam source
We do have some in SORBS, but only score it 0.5. Do you really
recommend scoring it so high?
> 0.0 OS_UNKNOWN Relay runs on unknown OS
That's an interesting one. Fingerprinting?
> 1.2 FREEMAIL_FROM Sender email is commonly abused enduser mail
This is also scored *much* lower here - we have many freemail senders.
The default score is 0.001, so you must have changed it.
> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
For 90_100, I think we're only subtracting -0.2.
> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
> seen by the Internet (DCC, Pyzor, or Razor).
The problem I always had with pyzor/dcc was that it works on very
small blocks of text, no? Perhaps it works well for small messages,
but isn't it problematic for larger messages?
> 1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers
> listed in MSPIKE_H2 so add back points.
> 0.0 ENA_BAD_SPAM Spam hitting really bad rules.
> 2.2 ENA_BAD_SPAM_FREEMAIL Bad spam from freemail (hotmail, gmail, msn,
> yahoo).
These are interesting, but I suppose privileged...
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
> I SHORTCIRCUIT any trustworthy sender with a legit unsubscribe process to
> put control back in the hands/mouse of the end user. I also SHORTCIRCUIT
> with whitelist_auth any domains (primarily subdomains) that are
> system-generated and consistently score very low.
Just now received this one, and thought it was relevant given our
conversation today. Would you have (did you?) shortcircuited this?
https://pastebin.com/CXj0mgw1
This hit SENDERSCORE_90_100 as well as MSPIKE_H2 and BAYES_50. It also
didn't hit any FREEMAIL rules. It also hit LASHBACK.
I'm curious how your system would have blocked this. I've lowered
LASHBACK to 0.65 because it was hitting so much ham, but even at the
1.2+ we were discussing it would not have been marked as spam.
Thanks,
Alex
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/13/2017 12:56 PM, Dave Jones wrote:
> On 07/13/2017 12:39 PM, Alex wrote:
>> Hi,
>>
>>> header RCVD_IN_SENDERSCORE_0_29 eval:check_rbl('senderscore0-lastexternal','score.senderscore.com.','^127\.0\.4\.([1-2]?[0-9])$')
>>> describe RCVD_IN_SENDERSCORE_0_29 Senderscore.org score of 0 to 29
>>> score RCVD_IN_SENDERSCORE_0_29 5.2
>>> tflags RCVD_IN_SENDERSCORE_0_29 net
>>
>> At least in my environment, this one in particular would catch a ton
>> of legitimate mail. This also assumes a 6.0 score for you, correct?
>>
>
> Correct. My block threshold of 6.0 is the default in MailScanner.
>
> The legit email should be SHORTCIRCUIT'd with whitelist_auth entries.
>
> I SHORTCIRCUIT any trustworthy sender with a legit unsubscribe process
> to put control back in the hands/mouse of the end user. I also
> SHORTCIRCUIT with whitelist_auth any domains (primarily subdomains) that
> are system-generated and consistently score very low.
>
> From my own email analysis, the majority of my spam is from FREEMAIL
> senders and compromised accounts running zero-hour spam campaigns where
> the mail server is not yet on any RBLs. Botnet-controlled devices are
> another source of spam but they seem to be sending through compromised
> accounts these days. They will phish a password, sit on it for days or
> weeks, craft a zero-hour spam campaign to get through most mail filters,
> then blast as much spam as they can until RBLs and DCC catch up to it in
> about 30 minutes or so. These compromised accounts from normally
> trusted mail server IPs are the reason why some SA RBL rules need to go
> beyond the lastexternal hop.
>
Let me clarify a bit. Don't put any FREEMAIL or domains with human
accounts (potentially compromised) in your whitelist_auth unless you
have to for some reason. Some senders may not have SPF or DKIM set up
properly, so you may have to put some of them in whitelist_from_rcvd
to get the same result.
Doing this will separate out trustworthy senders from potential content
pitfalls. For example, legit eBay emails will get through while spoofed
emails with identical email content can be blocked by Bayes or other
content rules.
I am seeing a lot of email spoofing USAA insurance lately to phish
accounts. I whitelist_auth legit USAA emails then train the rest as
spam so Bayes and other rules can block the phishing.
--
David Jones
Re: "bout u" campaign
Posted by Dave Jones <da...@apache.org>.
On 07/13/2017 12:39 PM, Alex wrote:
> Hi,
>
>> header RCVD_IN_SENDERSCORE_0_29 eval:check_rbl('senderscore0-lastexternal','score.senderscore.com.','^127\.0\.4\.([1-2]?[0-9])$')
>> describe RCVD_IN_SENDERSCORE_0_29 Senderscore.org score of 0 to 29
>> score RCVD_IN_SENDERSCORE_0_29 5.2
>> tflags RCVD_IN_SENDERSCORE_0_29 net
>
> At least in my environment, this one in particular would catch a ton
> of legitimate mail. This also assumes a 6.0 score for you, correct?
>
Correct. My block threshold of 6.0 is the default in MailScanner.
The legit email should be SHORTCIRCUIT'd with whitelist_auth entries.
I SHORTCIRCUIT any trustworthy sender with a legit unsubscribe process
to put control back in the hands/mouse of the end user. I also
SHORTCIRCUIT with whitelist_auth any domains (primarily subdomains) that
are system-generated and consistently score very low.
From my own email analysis, the majority of my spam is from FREEMAIL
senders and compromised accounts running zero-hour spam campaigns where
the mail server is not yet on any RBLs. Botnet-controlled devices are
another source of spam but they seem to be sending through compromised
accounts these days. They will phish a password, sit on it for days or
weeks, craft a zero-hour spam campaign to get through most mail filters,
then blast as much spam as they can until RBLs and DCC catch up to it in
about 30 minutes or so. These compromised accounts from normally
trusted mail server IPs are the reason why some SA RBL rules need to go
beyond the lastexternal hop.
--
David Jones
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
> header RCVD_IN_SENDERSCORE_0_29 eval:check_rbl('senderscore0-lastexternal','score.senderscore.com.','^127\.0\.4\.([1-2]?[0-9])$')
> describe RCVD_IN_SENDERSCORE_0_29 Senderscore.org score of 0 to 29
> score RCVD_IN_SENDERSCORE_0_29 5.2
> tflags RCVD_IN_SENDERSCORE_0_29 net
At least in my environment, this one in particular would catch a ton
of legitimate mail. This also assumes a 6.0 score for you, correct?
Re: "bout u" campaign
Posted by Dave Jones <da...@apache.org>.
On 07/13/2017 12:03 PM, @lbutlr wrote:
> On Jul 12, 2017, at 8:18 PM, David Jones <dj...@ena.com> wrote:
>> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
>
> I haven’t seen that before (or not that I’ve noticed). Is it part of the base SA package or something that was added?
>
>
I posted something generic about score.senderscore.org a year or so ago
but here are the rules that I have been using now for a couple of years:
/etc/mail/spamassassin/99_senderscore.cf
ifplugin Mail::SpamAssassin::Plugin::DNSEval
header __RCVD_IN_SENDERSCORE_90_100 eval:check_rbl('senderscore90-lastexternal','score.senderscore.com.','^127\.0\.4\.(9[0-9]|100)$')
meta RCVD_IN_SENDERSCORE_90_100 SPF_PASS && __RCVD_IN_SENDERSCORE_90_100
describe RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
score RCVD_IN_SENDERSCORE_90_100 -2.2
tflags RCVD_IN_SENDERSCORE_90_100 net
header __RCVD_IN_SENDERSCORE_80_89 eval:check_rbl('senderscorer80-lastexternal','score.senderscore.com.','^127\.0\.4\.(8[0-9])$')
meta RCVD_IN_SENDERSCORE_80_89 SPF_PASS && __RCVD_IN_SENDERSCORE_80_89
describe RCVD_IN_SENDERSCORE_80_89 Senderscore.org score of 80 to 89
score RCVD_IN_SENDERSCORE_80_89 -0.2
tflags RCVD_IN_SENDERSCORE_80_89 net
header RCVD_IN_SENDERSCORE_70_79 eval:check_rbl('senderscorer70-lastexternal','score.senderscore.com.','^127\.0\.4\.(7[0-9])$')
describe RCVD_IN_SENDERSCORE_70_79 Senderscore.org score of 70 to 79
score RCVD_IN_SENDERSCORE_70_79 1.2
tflags RCVD_IN_SENDERSCORE_70_79 net
header RCVD_IN_SENDERSCORE_60_69 eval:check_rbl('senderscorer60-lastexternal','score.senderscore.com.','^127\.0\.4\.(6[0-9])$')
describe RCVD_IN_SENDERSCORE_60_69 Senderscore.org score of 60 to 69
score RCVD_IN_SENDERSCORE_60_69 2.2
tflags RCVD_IN_SENDERSCORE_60_69 net
header RCVD_IN_SENDERSCORE_50_59 eval:check_rbl('senderscorer50-lastexternal','score.senderscore.com.','^127\.0\.4\.(5[0-9])$')
describe RCVD_IN_SENDERSCORE_50_59 Senderscore.org score of 50 to 59
score RCVD_IN_SENDERSCORE_50_59 3.2
tflags RCVD_IN_SENDERSCORE_50_59 net
header RCVD_IN_SENDERSCORE_30_49 eval:check_rbl('senderscorer30-lastexternal','score.senderscore.com.','^127\.0\.4\.([3-4][0-9])$')
describe RCVD_IN_SENDERSCORE_30_49 Senderscore.org score of 30 to 49
score RCVD_IN_SENDERSCORE_30_49 4.2
tflags RCVD_IN_SENDERSCORE_30_49 net
header RCVD_IN_SENDERSCORE_0_29 eval:check_rbl('senderscore0-lastexternal','score.senderscore.com.','^127\.0\.4\.([1-2]?[0-9])$')
describe RCVD_IN_SENDERSCORE_0_29 Senderscore.org score of 0 to 29
score RCVD_IN_SENDERSCORE_0_29 5.2
tflags RCVD_IN_SENDERSCORE_0_29 net
endif
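If you adapt these bands, it is worth checking that the regexes still cover every score exactly once. The sketch below copies the patterns from the rules above and assumes, as the patterns themselves imply, that the lookup answers 127.0.4.N where N is the score from 0 to 100. `BANDS` and `bands_for` are illustrative names. Note that the 80_89 and 90_100 patterns belong to the `__` sub-rules, which additionally require SPF_PASS before the scored rule fires.

```python
import re

# Patterns copied from the check_rbl() calls above.
BANDS = {
    "RCVD_IN_SENDERSCORE_90_100": r"^127\.0\.4\.(9[0-9]|100)$",
    "RCVD_IN_SENDERSCORE_80_89": r"^127\.0\.4\.(8[0-9])$",
    "RCVD_IN_SENDERSCORE_70_79": r"^127\.0\.4\.(7[0-9])$",
    "RCVD_IN_SENDERSCORE_60_69": r"^127\.0\.4\.(6[0-9])$",
    "RCVD_IN_SENDERSCORE_50_59": r"^127\.0\.4\.(5[0-9])$",
    "RCVD_IN_SENDERSCORE_30_49": r"^127\.0\.4\.([3-4][0-9])$",
    "RCVD_IN_SENDERSCORE_0_29": r"^127\.0\.4\.([1-2]?[0-9])$",
}


def bands_for(score):
    """Return the names of every band whose regex matches this score."""
    answer = f"127.0.4.{score}"
    return [name for name, pattern in BANDS.items() if re.match(pattern, answer)]


coverage = {score: bands_for(score) for score in range(101)}
uncovered = [s for s, names in coverage.items() if not names]
overlapping = [s for s, names in coverage.items() if len(names) > 1]
print("uncovered:", uncovered, "overlapping:", overlapping)
```

Both lists come back empty for the patterns as posted, so no score falls through a gap or hits two bands.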
--
David Jones
Re: "bout u" campaign
Posted by "@lbutlr" <kr...@kreme.com>.
On Jul 12, 2017, at 8:18 PM, David Jones <dj...@ena.com> wrote:
> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
I haven’t seen that before (or not that I’ve noticed). Is it part of the base SA package or something that was added?
--
Apple broke AppleScripting signatures in Mail.app, so no random signatures.
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/12/2017 08:04 PM, Alex wrote:
> Hi all,
>
> Has anyone else experienced a spam campaign with any one of the
> following subjects:
>
> - sometimes enjoy it wild, how bout you?
> - sometimes like it ruff, what bout you?
> - sumtimes enjoy it ruff, wat bout you?
>
> The body contains something like "wild hukups" then a phone number.
>
> https://pastebin.com/X5xNn9RZ
>
> It comes from AOL and other freemails, but doesn't hit much, and hits
> bayes50 or lower here.
>
> Is this a snowshoe thing? Ideas on how to stop them? I've now trained
> them but I thought someone might like to see them for their own
> benefit, and perhaps had ideas on a more general way of blocking
> these.
>
> What is even the point of spam with a phone number?
>
> The IP range for the ones originating from AOL are all in the
> 204.29.186.0/24 block. None of them are in any meaningful blacklist
> and have a 90+ senderscore.
>
> I'm sure the campaign will change soon, but I thought there was
> something more general we could look for the next time...
>
Time has passed, so there could be more RBL hits by now, and DCC may hit
now when it did not earlier, but my SA scored it pretty high, mainly
due to DCC and BAYES_99. I guess I have well-trained Bayes. I have
some meta rules that add more points when FREEMAIL hits
things like KAM_URI, DIGEST_MULTIPLE and high BAYES. ENA_BAD_SPAM
is a huge list of combinations of bad rule hits, built over years, that
triggers other rules with points.
Content analysis details:   (14.1 points, 5.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.2 RCVD_IN_LASHBACK       RBL: Received is listed in Lashback
                            usb.unsubscore.com
                            [204.29.186.60 listed in ubl.unsubscore.com]
 3.5 BAYES_99               BODY: Bayes spam probability is 99 to 100%
                            [score: 0.9993]
 2.2 RCVD_IN_SORBS_SPAM     RBL: SORBS: sender is a spam source
                            [204.29.186.60 listed in dnsbl.sorbs.net]
-0.2 RCVD_IN_DNSWL_NONE     RBL: Sender listed at http://www.dnswl.org/, no
                            trust
                            [204.29.186.60 listed in list.dnswl.org]
-1.2 RCVD_IN_MSPIKE_H2      RBL: Average reputation (+2)
                            [204.29.186.60 listed in wl.mailspike.net]
-0.0 SPF_PASS               SPF: sender matches SPF record
 0.0 OS_UNKNOWN             Relay runs on unknown OS
 1.2 FREEMAIL_FROM          Sender email is commonly abused enduser
                            mail provider
                            (georgia32ce[at]aol.com)
 1.5 KAM_MXURI              URI: URI begins with a mail exchange
                            prefix, i.e. mx.[...]
 0.2 BAYES_999              BODY: Bayes spam probability is 99.9 to 100%
                            [score: 0.9993]
 0.0 HTML_MESSAGE           BODY: HTML included in message
 2.2 DCC_CHECK              Detected as bulk mail by DCC (dcc-servers.net)
 0.1 DKIM_SIGNED            Message has a DKIM or DK signature, not
                            necessarily valid
-2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
 0.0 T_DKIM_INVALID         DKIM-Signature header exists but is not valid
 2.2 ENA_DIGEST_FREEMAIL    Freemail account hitting message digest
                            spam seen by the Internet (DCC, Pyzor, or
                            Razor).
 1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or Pyzor hits from servers
                            listed in MSPIKE_H2 so add back points.
 0.0 ENA_BAD_SPAM           Spam hitting really bad rules.
 2.2 ENA_BAD_SPAM_FREEMAIL  Bad spam from freemail (hotmail, gmail, msn,
                            yahoo).
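The kind of meta rule described above can be sketched in a few lines. This is only a guess at the logic, based on the rule descriptions in the report; the real ENA_* definitions are not shown in this thread. RAZOR2_CHECK and PYZOR_CHECK are assumed here to be the Razor and Pyzor rule names in use, and both function names are hypothetical.

```python
# Some of the rule names that hit on the sample message (from the report).
HITS = {
    "RCVD_IN_LASHBACK", "BAYES_99", "BAYES_999", "RCVD_IN_SORBS_SPAM",
    "RCVD_IN_MSPIKE_H2", "FREEMAIL_FROM", "KAM_MXURI", "DCC_CHECK",
}

# Assumed names for the bulk-digest rules (DCC, Razor, Pyzor).
DIGEST_RULES = {"DCC_CHECK", "RAZOR2_CHECK", "PYZOR_CHECK"}


def digest_freemail(hits):
    """True when a freemail sender also has at least one digest hit."""
    return "FREEMAIL_FROM" in hits and bool(DIGEST_RULES & hits)


def digest_mspike_h2(hits):
    """True when a digest hit comes with an MSPIKE_H2 listing."""
    return "RCVD_IN_MSPIKE_H2" in hits and bool(DIGEST_RULES & hits)


if __name__ == "__main__":
    print("digest + freemail:", digest_freemail(HITS))
    print("digest + MSPIKE_H2:", digest_mspike_h2(HITS))
```

In rule syntax the first check would be roughly `meta LOCAL_DIGEST_FREEMAIL FREEMAIL_FROM && (DCC_CHECK || RAZOR2_CHECK || PYZOR_CHECK)`, where LOCAL_DIGEST_FREEMAIL is a made-up name.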
--
David Jones
Re: "bout u" campaign
Posted by Kevin Golding <kp...@caomhin.org>.
On Mon, 17 Jul 2017 18:38:24 +0100, David Jones <dj...@ena.com> wrote:
> It would be nice if there was a local tool that could be part of the SA
> project that would extend the masscheck processing and help build
> content and meta rules.
As John's already mentioned, there is a surprising array of tools already
included:
https://svn.apache.org/repos/asf/spamassassin/trunk/masses/rule-dev/
It's less about creating and more about refining.
Re: "bout u" campaign
Posted by David Jones <dj...@ena.com>.
On 07/17/2017 12:03 PM, Jesse Norell wrote:
> This description:
>
> On Thu, 2017-07-13 at 15:07 +0100, Martin Gregorie wrote:
>> I'm continuing to get good results from a multi-level approach:
>>
>> I use two or more subrules with low scores (0.01 or so) that are
>> combined by an AND relation in a meta-rule that triggers a suitably
>> spammy score when all subrules get hits.
>>
>> The subrules are typically automatically assembled lists of words or
>> phrases - automatically assembled because that makes maintenance
>> vastly
>> easier. The list contents are typically words and phrases found in
>> spam, e.g. one list might be selling phrases such as "get you rocks
>> off
>> with" that are unlikely to appear in personal or legit commercial mail
>> and another might be names or slang terms for less common
>> pharmaceuticals.
>
>
> and what David Jones has been describing in this thread of identifying
> specific combinations of rules (his based on reputation vs. content)
> both remind me of the description of Marc Perkel's "evolution filter",
> which from memory identified sets of rules which are very indicative of
> ham/spam. Both David and Martin are reporting good success, as did
> Marc - maybe worth looking into implementing in spamassassin?
>
> Does masscheck automate meta rule creation? (ie. not just generate
> scores) Not the full "evolution filter" idea which would have to run on
> the endpoint, but that would benefit everyone via rule updates.
>
>
I have been working on rebuilding the SA project's server the past four
months. The first priority was getting the spamassassin.org hidden DNS
master active again. This was pretty easy. The second priority was the
masscheck processing which turned out to be pretty time intensive and
still could have an open issue so SA updates are currently on hold.
From what I can tell, the masscheck is only meant to dynamically update
the rule scores in 72_scores.cf (manual scores are in 50_scores.cf) and
help validate new rules added by the SA developers. It doesn't create
new rules. It's not able to create new rules based on content since the
masscheck processing is run locally by each user. The email content is
not uploaded to the SA server. Only a special log file listing the
rules that each ham and spam message hit is sent to the SA server.
It would be nice if there was a local tool that could be part of the SA
project that would extend the masscheck processing and help build
content and meta rules. This would create more interest in masschecking
and get more people involved. (I use my masscheck ham/spam to also
train my Bayes DB or else it may not have been helpful enough for me to
set it up and understand the value of it.) I suspect that advanced SA
users, like Kevin with his KAM.cf rules and a few others on this list,
have something like this that they use to build custom rules in an
automated way. Thankfully Kevin publishes his KAM.cf and allows public
downloading.
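To make the idea concrete, here is a rough sketch of what such a local tool might do: count how often pairs of rules fire together in spam versus ham, and list pairs that never appear together in ham as candidate meta rules. The input format (a label plus a list of rule names per message) is a simplification assumed for this example, not the actual masscheck log layout, and `candidate_pairs` is a hypothetical name.

```python
from collections import Counter
from itertools import combinations


def candidate_pairs(messages, min_spam=2):
    """messages: iterable of (label, [rule names]); label is 'spam' or 'ham'.

    Returns rule pairs seen together in at least min_spam spam messages
    and in no ham messages, sorted by spam count (highest first).
    """
    spam_pairs, ham_pairs = Counter(), Counter()
    for label, rules in messages:
        target = spam_pairs if label == "spam" else ham_pairs
        # Sort so that each pair has a single canonical ordering.
        target.update(combinations(sorted(set(rules)), 2))
    found = [(pair, n) for pair, n in spam_pairs.items()
             if n >= min_spam and ham_pairs[pair] == 0]
    return sorted(found, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Tiny made-up corpus, only to show the shape of the input.
    sample = [
        ("spam", ["FREEMAIL_FROM", "DCC_CHECK", "HTML_MESSAGE"]),
        ("spam", ["FREEMAIL_FROM", "DCC_CHECK"]),
        ("ham", ["FREEMAIL_FROM", "HTML_MESSAGE"]),
        ("ham", ["DCC_CHECK", "HTML_MESSAGE"]),
    ]
    for pair, count in candidate_pairs(sample):
        print(count, " && ".join(pair))
```

A real tool would need a much larger corpus and some tolerance for ham hits, and any candidate pair would still need review before being turned into a scored meta rule.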
I know that Kevin has a desire to be able to speed up rule development
and SA updates (could take up to ~40 hours today if it weren't currently
on hold) to react faster to new spam but it will never be fast enough to
react to zero-hour spam like other technologies. The best thing you can
do is selective greylisting, rate limiting, DCC, Razor, Pyzor, and hope
the RBLs catch up quickly. I also have a local ruleset that I add
zero-hour spam to shortcircuit as spam based on content which does a
pretty good job at most new spam and phishing but some still get through
now and then from compromised accounts.
--
David Jones
Re: "bout u" campaign
Posted by John Hardin <jh...@impsec.org>.
On Mon, 17 Jul 2017, Jesse Norell wrote:
> Does masscheck automate meta rule creation? (ie. not just generate
> scores) Not the full "evolution filter" idea which would have to run on
> the endpoint, but that would benefit everyone via rule updates.
No, it does not.
There were a couple of rule generation experiments (the Sought and
Sought-Fraud rulesets) but they fell by the wayside. The code is there if
someone would like to start generating rulesets, and some of the corpus
contributors might be willing to provide classified corpora (I still have
a separate maintained 419 spams folder even though sought-fraud went
dark).
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Back in 1969 the technology to fake a Moon landing didn't exist,
but the technology to actually land there did.
Today, it is the opposite. -- unknown
-----------------------------------------------------------------------
3 days until the 48th anniversary of Apollo 11 landing on the Moon
Re: "bout u" campaign
Posted by Jesse Norell <je...@kci.net>.
This description:
On Thu, 2017-07-13 at 15:07 +0100, Martin Gregorie wrote:
> I'm continuing to get good results from a multi-level approach:
>
> I use two or more subrules with low scores (0.01 or so) that are
> combined by an AND relation in a meta-rule that triggers a suitably
> spammy score when all subrules get hits.
>
> The subrules are typically automatically assembled lists of words or
> phrases - automatically assembled because that makes maintenance
> vastly
> easier. The list contents are typically words and phrases found in
> spam, e.g. one list might be selling phrases such as "get you rocks
> off
> with" that are unlikely to appear in personal or legit commercial mail
> and another might be names or slang terms for less common
> pharmaceuticals.
and what David Jones has been describing in this thread of identifying
specific combinations of rules (his based on reputation vs. content)
both remind me of the description of Marc Perkel's "evolution filter",
which from memory identified sets of rules which are very indicative of
ham/spam. Both David and Martin are reporting good success, as did
Marc - maybe worth looking into implementing in spamassassin?
Does masscheck automate meta rule creation? (ie. not just generate
scores) Not the full "evolution filter" idea which would have to run on
the endpoint, but that would benefit everyone via rule updates.
--
Jesse Norell
Kentec Communications, Inc.
970-522-8107 - www.kci.net
Re: "bout u" campaign
Posted by Martin Gregorie <ma...@gregorie.org>.
On Thu, 2017-07-13 at 13:26 -0400, Alex wrote:
> Would you be willing to share a few examples?
>
You can download the script processor and documentation from here:
http://www.libelle-systems.com/free/
It's called 'portmanteau' and is a .tgz compressed tar archive.
Contact me offlist if you want copies of the definition files for my
MG_SALE and MG_PRODUCT rules.
Martin
Re: "bout u" campaign
Posted by Alex <my...@gmail.com>.
Hi,
On Thu, Jul 13, 2017 at 10:07 AM, Martin Gregorie <ma...@gregorie.org> wrote:
> On Thu, 2017-07-13 at 12:59 +0000, Charles Amstutz wrote:
>> I find it challenging to constantly keep up with campaign's. My
>> guess with the phone number is to try to make it seem more
>> legitimate.
>> More recent, I try to look for general characteristics and go for
>> that, in order to futureproof rules. However, there are always
>> legitimate emails being sent that would trigger a potential rule
>> (depending on what you are matching on)
>>
> I'm continuing to get good results from a multi-level approach:
>
> I use two or more subrules with low scores (0.01 or so) that are
> combined by an AND relation in a meta-rule that triggers a suitably
> spammy score when all subrules get hits.
>
> The subrules are typically automatically assembled lists of words or
> phrases - automatically assembled because that makes maintenance vastly
> easier. The list contents are typically words and phrases found in
> spam, e.g. one list might be selling phrases such as "get you rocks off
> with" that are unlikely to appear in personal or legit commercial mail
> and another might be names or slang terms for less common
> pharmaceuticals.
>
> The basis of this idea, which works surprisingly well in practise, is
> that a hit on one list may be accidental but a message hitting on both
> lists is more likely than not to be spam. A side benefit of this
> approach is that it will also hit combinations that weren't used in any
> of the spam analysed to create the lists, and that this will not
> generate false positives if the list contents are carefully chosen.
>
> I use an awk script to turn easily edited definition files into valid
> SA rules and hand-write the combining meta-rules.
We have a local blocklist that generates rules based on strings
identified in the body, subject and sender. I don't think it's quite
the same, however.
Would you be willing to share a few examples?
We also have a system where we use some of the address collection
rules combined with some of our own rules for catching "list" spam
("Sports enthusiasts", etc).
Re: "bout u" campaign
Posted by Martin Gregorie <ma...@gregorie.org>.
On Thu, 2017-07-13 at 12:59 +0000, Charles Amstutz wrote:
> I find it challenging to constantly keep up with campaign's. My
> guess with the phone number is to try to make it seem more
> legitimate.
> More recent, I try to look for general characteristics and go for
> that, in order to futureproof rules. However, there are always
> legitimate emails being sent that would trigger a potential rule
> (depending on what you are matching on)
>
I'm continuing to get good results from a multi-level approach:
I use two or more subrules with low scores (0.01 or so) that are
combined by an AND relation in a meta-rule that triggers a suitably
spammy score when all subrules get hits.
The subrules are typically automatically assembled lists of words or
phrases - automatically assembled because that makes maintenance vastly
easier. The list contents are typically words and phrases found in
spam, e.g. one list might be selling phrases such as "get you rocks off
with" that are unlikely to appear in personal or legit commercial mail
and another might be names or slang terms for less common
pharmaceuticals.
The basis of this idea, which works surprisingly well in practise, is
that a hit on one list may be accidental but a message hitting on both
lists is more likely than not to be spam. A side benefit of this
approach is that it will also hit combinations that weren't used in any
of the spam analysed to create the lists, and that this will not
generate false positives if the list contents are carefully chosen.
I use an awk script to turn easily edited definition files into valid
SA rules and hand-write the combining meta-rules.
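The approach described above can be sketched as follows. This is not the awk script itself, which is not shown in this thread, but a small Python stand-in that turns phrase lists into low-scoring body subrules and emits an AND meta rule. The phrase lists, the MG_SALE_PRODUCT name, and the 4.0 score are assumptions made for the example.

```python
import re


def escape(phrase):
    """Backslash-escape everything except word characters and whitespace."""
    return re.sub(r"([^\w\s])", r"\\\1", phrase)


def body_rule(name, phrases, score=0.01):
    """Build one low-scoring body subrule that matches any listed phrase."""
    alternatives = "|".join(escape(p) for p in phrases)
    return [
        f"body     {name} /\\b(?:{alternatives})\\b/i",
        f"score    {name} {score}",
    ]


def meta_rule(name, subrules, score, description):
    """Combine subrules with AND so the meta fires only if all of them hit."""
    return [
        f"meta     {name} ({' && '.join(subrules)})",
        f"describe {name} {description}",
        f"score    {name} {score}",
    ]


if __name__ == "__main__":
    # Placeholder phrase lists; real lists would come from definition files.
    lines = (
        body_rule("MG_SALE", ["lowest price", "order now"])
        + body_rule("MG_PRODUCT", ["miracle pills", "herbal remedy"])
        + meta_rule("MG_SALE_PRODUCT", ["MG_SALE", "MG_PRODUCT"], 4.0,
                    "Selling phrase and product term in the same message")
    )
    print("\n".join(lines))
```

Keeping the phrase lists in separate files and regenerating the rules means only the lists need editing when a campaign changes.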
Martin
RE: "bout u" campaign
Posted by Charles Amstutz <ch...@infinitesys.com>.
I find it challenging to constantly keep up with campaigns. My guess with the phone number is that it's meant to make the message seem more legitimate.
More recently, I try to look for general characteristics and go for those, in order to future-proof rules. However, there are always legitimate emails being sent that would trigger a potential rule (depending on what you are matching on).
>> What is even the point of spam with a phone number?