Posted to users@spamassassin.apache.org by Amir Caspi <ce...@3phase.com> on 2019/06/04 01:10:28 UTC

Re: Meta for bogus MIME with DKIM valid?

Hi Kevin,

Here are some spamples -- I've specifically chosen the ones that did NOT score enough through other means to get tagged, i.e., these are false negatives.  Note that many of them have valid DKIM and hit no other markers.  (The spample will NOT pass DKIM because headers have been modified for anonymity.)  If you run them through NOW you'll probably find they hit Razor and Pyzor and various other things... but they clearly didn't at the time of receipt.  Most of them score 4.6 unless they manage to have enough Bayes "poison" to score lower.  (And I STILL don't know how they keep hitting only BAYES_50...)

https://pastebin.com/BQH3JgWD
https://pastebin.com/nXtZtUdm
https://pastebin.com/tBQt1Raw
https://pastebin.com/wEGvcs73
https://pastebin.com/nuFJ48k0
https://pastebin.com/ykCuEPNQ
** This last one I received from two different servers within a minute of each other.  The first one got nailed by SPFBL so it got marked as spam, but only because the combo of SPFBL (2.2) and local BOGUS_MIME_VERSION (4.0) pushed it over threshold.  This spample, the second of the two, didn't get nailed because the relay wasn't in SPFBL, so BOGUS_MIME_VERSION wasn't enough by itself at a score of 4.0, although it WOULD have been enough at a score of 4.5.

I should also mention I've seen at least a few recent ones that hit MailScanner's "Eudora long-MIME-boundary attack" rule.  I'm not including those as spamples since they got sanitized by MailScanner and so aren't useful, but I figured it was worth mentioning.

My feeling is that BOGUS_MIME_VERSION is incredibly useful during the early hits of snowshoers, before the RBLs, URIBLs, and content hash DBs can catch up.  Since it would seem to be 100% spam and 0% ham, I think scoring it very highly (4+ points) would be both safe and useful -- it will help nix these early hits but won't hinder anything else.

From my experience and these spamples, where most of them are scoring 4.6 (with 4.0 of that from BOGUS_MIME_VERSION), an optimal score would be in the range of 4.5 to 4.9 ... that would push these 4.6s to 5.1 or higher.
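
A quick sketch of that arithmetic in Python, for anyone who wants to try other values. It assumes the stock required_score threshold of 5.0 and treats the remaining 0.6 as the sum of every other rule that fired (BAYES_50 and so on); the helper name total_with is made up for illustration.

```python
from decimal import Decimal

# Assumption: the stock SpamAssassin threshold (required_score) of 5.0.
REQUIRED_SCORE = Decimal("5.0")

# From the spamples: 4.6 total, of which 4.0 came from BOGUS_MIME_VERSION.
OBSERVED_TOTAL = Decimal("4.6")
CURRENT_RULE_SCORE = Decimal("4.0")
OTHER_RULES = OBSERVED_TOTAL - CURRENT_RULE_SCORE  # what everything else adds


def total_with(rule_score: Decimal) -> Decimal:
    """Total message score if BOGUS_MIME_VERSION were scored at rule_score."""
    return OTHER_RULES + rule_score


for candidate in ("4.0", "4.5", "4.9"):
    total = total_with(Decimal(candidate))
    verdict = "tagged" if total >= REQUIRED_SCORE else "missed"
    print(f"BOGUS_MIME_VERSION={candidate}: total {total} -> {verdict}")
```

Decimal is used rather than float so that 4.6 - 4.0 comes out as exactly 0.6 instead of a rounding artifact.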

I've got MANY other examples in the Junk folders on my server, and I would be happy to send them to you privately if needed.

Cheers.

--- Amir

On May 30, 2019, at 9:24 AM, Kevin A. McGrail <km...@apache.org> wrote:
> 
> Fair enough.  Happy to look at spamples but I've seen virtually nothing in the wild for this.

Re: Meta for bogus MIME with DKIM valid?

Posted by Amir Caspi <ce...@3phase.com>.

On Jul 8, 2019, at 2:15 PM, Joseph Brennan <br...@columbia.edu> wrote:
> 
> I am sorry to say that this spammer seems to have fixed the error. I have seen none at all for a few weeks. What I *have* seen are heavy spam barrages once a week that are from similar IP ranges that the spammer used but without the error. 125,000 today.

Indeed, I also have not gotten any of these in a while, which is unfortunate because this spammer's "product" usually doesn't hit ANY other content rule, including Bayes (WTF), so I'm getting a lot of FN spams with scores of 0.6 or so.  I'm still trying to nail down some other identifying characteristics that can be used for a rule, but I'm coming up empty at the moment.

--- Amir

Re: Meta for bogus MIME with DKIM valid?

Posted by John Hardin <jh...@impsec.org>.

On Mon, 8 Jul 2019, Joseph Brennan wrote:

> I am sorry to say that this spammer seems to have fixed the error. I have
> seen none at all for a few weeks. What I *have* seen are heavy spam
> barrages once a week that are from similar IP ranges that the spammer used
> but without the error. 125,000 today.

Depending on the IP ranges, it sounds like tarpitting would be a useful 
response.

> On Thu, Jun 13, 2019 at 4:17 PM Joseph Brennan <br...@columbia.edu> wrote:
>
>> Yes, replying to myself.
>>
>> It just occurred to me that we refuse mail from hosts in the Spamhaus
>> lists, so messages from those don't get analyzed by spamassassin. The
>> 50,000 I mentioned is how many were NOT caught that way. I wonder how many
>> there really are!

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   If the rock of doom requires a gentle nudge away from Gaia to
   prevent a very bad day for Earthlings, NASA won’t be riding to the
   rescue. These days, NASA does dodgy weather research and outreach
   programs, not stuff in actual space with rockets piloted by
   flinty-eyed men called Buzz.                       -- Daily Bayonet
-----------------------------------------------------------------------
  12 days until the 50th anniversary of Apollo 11 landing on the Moon

Re: Meta for bogus MIME with DKIM valid?

Posted by Joseph Brennan <br...@columbia.edu>.

I am sorry to say that this spammer seems to have fixed the error. I have
seen none at all for a few weeks. What I *have* seen are heavy spam
barrages once a week that are from similar IP ranges that the spammer used
but without the error. 125,000 today.

On Thu, Jun 13, 2019 at 4:17 PM Joseph Brennan <br...@columbia.edu> wrote:

> Yes, replying to myself.
>
> It just occurred to me that we refuse mail from hosts in the Spamhaus
> lists, so messages from those don't get analyzed by spamassassin. The
> 50,000 I mentioned is how many were NOT caught that way. I wonder how many
> there really are!
>
>
>
> --
> Joseph Brennan
> Lead, Email and Systems Applications
>
>
>

-- 
Joseph Brennan
Lead, Email and Systems Applications

Re: Meta for bogus MIME with DKIM valid?

Posted by Joseph Brennan <br...@columbia.edu>.

Yes, replying to myself.

It just occurred to me that we refuse mail from hosts in the Spamhaus
lists, so messages from those don't get analyzed by spamassassin. The
50,000 I mentioned is how many were NOT caught that way. I wonder how many
there really are!



-- 
Joseph Brennan
Lead, Email and Systems Applications

Re: Meta for bogus MIME with DKIM valid?

Posted by Joseph Brennan <br...@columbia.edu>.

On Thu, Jun 13, 2019 at 3:01 PM Antony Stone <Antony.Stone@spamassassin.open.source.it> wrote:

> On Thursday 13 June 2019 at 17:45:02, Joseph Brennan wrote:
>
> > We've been refusing mail based on this stupid error for a year and a half
> > (local rule) and no false positive has ever come to attention. The volume
> > averages about 50,000 a day here.
>
> What's that as a percentage of total inbound mail?
>

Ah yes, perspective -- that's out of about 1.5 million. But 50,000 is close
to the total number of students, faculty, and staff at the university.
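
To answer the percentage question directly, here is a tiny Python sketch using only the two round figures quoted in this thread. It assumes both numbers cover the same period; the variable names are illustrative.

```python
# Figures quoted in this thread (both approximate).
refused = 50_000        # messages refused for the bogus MIME-Version error
inbound = 1_500_000     # total inbound messages

share = refused / inbound
print(f"{share:.1%} of inbound mail")
```

That works out to a little over 3% of inbound mail.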

Sunday is the day of rest, right? Go to church, play with the kids, reboot
the spam engine...

Joe Brennan

Re: Meta for bogus MIME with DKIM valid?

Posted by Antony Stone <An...@spamassassin.open.source.it>.

On Thursday 13 June 2019 at 17:45:02, Joseph Brennan wrote:

> We've been refusing mail based on this stupid error for a year and a half
> (local rule) and no false positive has ever come to attention. The volume
> averages about 50,000 a day here.

What's that as a percentage of total inbound mail?

> Yesterday it was 72,000 from 69.16.199.0/24. It comes from 1 to 3 IP subnets
> each day, changing daily, except that the spammer does not send on Sundays.

That's not something I've ever come across - more spam during US daylight 
time, yes, but less spam on Sundays!?

Fascinating.


Antony.

-- 
Numerous psychological studies over the years have demonstrated that the 
majority of people genuinely believe they are not like the majority of people.

                                                   Please reply to the list;
                                                         please *don't* CC me.

Re: Meta for bogus MIME with DKIM valid?

Posted by Joseph Brennan <br...@columbia.edu>.

We've been refusing mail based on this stupid error for a year and a half
(local rule) and no false positive has ever come to attention. The volume
averages about 50,000 a day here. Yesterday it was 72,000 from
69.16.199.0/24. It comes from 1 to 3 IP subnets each day, changing daily,
except that the spammer does not send on Sundays. I agree that many of them
hit no other rule.
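
For anyone wanting to see the same per-subnet pattern in their own logs, a minimal Python sketch of tallying relay addresses by /24 follows. The addresses are made up (taken from documentation ranges), not real log data, and subnet_of is a hypothetical helper.

```python
from collections import Counter
from ipaddress import ip_network

# Made-up relay addresses from documentation ranges, not real log data.
relays = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "192.0.2.200"]


def subnet_of(addr: str, prefix: int = 24) -> str:
    """Return the enclosing IPv4 network of addr as a CIDR string."""
    # strict=False lets us pass a host address and get its network back.
    return str(ip_network(f"{addr}/{prefix}", strict=False))


per_subnet = Counter(subnet_of(a) for a in relays)
for net, count in per_subnet.most_common():
    print(net, count)
```

Feeding it the relay IP of each message that hit the rule should show whether a day's run is concentrated in a handful of subnets.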


-- 
Joseph Brennan
Lead, Email and Systems Applications

Re: Meta for bogus MIME with DKIM valid?

Posted by John Hardin <jh...@impsec.org>.

On Wed, 12 Jun 2019, Amir Caspi wrote:

> On Jun 4, 2019, at 2:11 PM, Amir Caspi <Ce...@3phase.com> wrote:
>>
>> Locally, I've got the score at 4.0, and will be increasing it to 4.5 shortly.  At least with my spamset (per the spamples I posted), a score of 4.5 seems to be the "magic" value that should catch almost all the FNs (at least the ones that hit BAYES_50 ... the ones that hit BAYES_00 might require more aggression).
>
> I'm getting a ton of zero-hour snowshoe spam today that's scoring BAYES_50 and hitting no other rules besides BOGUS_MIME_VERSION.  These all score 4.6 with BOGUS_MIME_VERSION = 4.0.  I'm going to increase locally to 4.5, and that should get rid of these for me... but I think we should really expedite deployment of this rule for production, I expect I'm not the only one this affects...

Looks like it's suddenly worthwhile in masscheck as well:

https://ruleqa.spamassassin.org/20190612-r1861099-n/__BOGUS_MIME_VER_01/detail
https://ruleqa.spamassassin.org/20190612-r1861099-n/__BOGUS_MIME_VER_02/detail

I'll add a scored rule.


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Are you a mildly tech-literate politico horrified by the level of
   ignorance demonstrated by lawmakers gearing up to regulate online
   technology they don't even begin to grasp? Cool. Now you have a
   tiny glimpse into a day in the life of a gun owner.   -- Sean Davis
-----------------------------------------------------------------------
  804 days since the first commercial re-flight of an orbital booster (SpaceX)

Re: Meta for bogus MIME with DKIM valid?

Posted by Amir Caspi <ce...@3phase.com>.

On Jun 4, 2019, at 2:11 PM, Amir Caspi <Ce...@3phase.com> wrote:
> 
> Locally, I've got the score at 4.0, and will be increasing it to 4.5 shortly.  At least with my spamset (per the spamples I posted), a score of 4.5 seems to be the "magic" value that should catch almost all the FNs (at least the ones that hit BAYES_50 ... the ones that hit BAYES_00 might require more aggression).

I'm getting a ton of zero-hour snowshoe spam today that's scoring BAYES_50 and hitting no other rules besides BOGUS_MIME_VERSION.  These all score 4.6 with BOGUS_MIME_VERSION = 4.0.  I'm going to increase locally to 4.5, and that should get rid of these for me... but I think we should really expedite deployment of this rule for production, since I expect I'm not the only one this affects...

Cheers.

--- Amir

Re: Meta for bogus MIME with DKIM valid?

Posted by Amir Caspi <ce...@3phase.com>.

On Jun 4, 2019, at 1:24 PM, Paul Stead <pa...@gmail.com> wrote:
> 
> Certainly worth letting QA do its thing and autoscore?

My worry about autoscore is that if it looks at network tests, particularly RBLs, then it may reduce the value of the rule.  The primary value of this rule is for early botnet runs before the relays and/or URIs are caught by the RBLs, and for content that doesn't hit any/many other rules (such as all of the spamples I posted).  After only a few minutes, the RBLs pick up these runs and the rule becomes relatively less important when considering the network tests... but it's a REALLY good spamminess indicator in isolation.  (The same argument applies with/without Bayes.)

So, if autoscore gives it a high value without network/bayes tests but a low value with network/bayes tests, then my strong recommendation would be to give it a single atomic score rather than network/non-network scoreset.
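
For readers less familiar with scoresets: as I understand the Mail::SpamAssassin::Conf documentation, a score line takes either one value (used always) or four values, one per score set (0 = no Bayes and no network tests, 1 = network only, 2 = Bayes only, 3 = both). The Python sketch below only models that selection; effective_score is a made-up helper, and the per-set numbers are hypothetical, chosen to show the effect described above.

```python
from typing import Sequence


def effective_score(scores: Sequence[float], bayes: bool, network: bool) -> float:
    """Pick the score that applies to a rule for a given configuration.

    One value: always used. Four values: indexed by score set, where
    set 0 = no Bayes/no network, 1 = network only, 2 = Bayes only,
    3 = Bayes and network.
    """
    if len(scores) == 1:
        return scores[0]
    if len(scores) != 4:
        raise ValueError("expected one score or four")
    score_set = (2 if bayes else 0) + (1 if network else 0)
    return scores[score_set]


# Hypothetical numbers, only to illustrate the concern.
per_set = [4.5, 1.0, 4.5, 1.0]  # high without network tests, low with them
atomic = [4.5]                  # one score for every configuration

print(effective_score(per_set, bayes=True, network=True))
print(effective_score(atomic, bayes=True, network=True))
```

With the per-set list, a site running network tests would get only the low value, which is exactly the zero-hour case where the rule matters most; the single atomic score avoids that.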

Locally, I've got the score at 4.0, and will be increasing it to 4.5 shortly.  At least with my spamset (per the spamples I posted), a score of 4.5 seems to be the "magic" value that should catch almost all the FNs (at least the ones that hit BAYES_50 ... the ones that hit BAYES_00 might require more aggression).

Cheers.

--- Amir

Re: Meta for bogus MIME with DKIM valid?

Posted by Paul Stead <pa...@gmail.com>.

The rules look to be performing better in masscheck after the updates to
the corpus checking:

https://ruleqa.spamassassin.org/20190604-r1860591-n/__BOGUS_MIME_VER_01/detail
https://ruleqa.spamassassin.org/20190604-r1860591-n/__BOGUS_MIME_VER_02/detail

Certainly worth letting QA do its thing and autoscore?

On Tue, 4 Jun 2019 at 02:10, Amir Caspi <ce...@3phase.com> wrote:

> Hi Kevin,
>
> Here are some spamples -- I've specifically chosen the ones that did NOT
> score enough through other means to get tagged, i.e., these are false
> negatives.  Note that many of them have valid DKIM and hit no other
> markers.  (The spample will NOT pass DKIM because headers have been
> modified for anonymity.)  If you run them through NOW you'll probably find
> they hit Razor and Pyzor and various other things... but they clearly
> didn't at the time of receipt.  Most of them score 4.6 unless they manage
> to have enough Bayes "poison" to score lower.  (And I STILL don't know how
> they keep hitting only BAYES_50...)
>
> https://pastebin.com/BQH3JgWD
> https://pastebin.com/nXtZtUdm
> https://pastebin.com/tBQt1Raw
> https://pastebin.com/wEGvcs73
> https://pastebin.com/nuFJ48k0
> https://pastebin.com/ykCuEPNQ
> ** This last one I received from two different servers within a minute of
> each other.  The first one got nailed by SPFBL so it got marked as spam,
> but only because the combo of SPFBL (2.2) and local BOGUS_MIME_VERSION
> (4.0) pushed it over threshold.  This spample, the second of the two,
> didn't get nailed because the relay wasn't in SPFBL, so BOGUS_MIME_VERSION
> wasn't enough by itself at a score of 4.0, although it WOULD have been
> enough at a score of 4.5.
>
> I should also mention I've seen at least a few recent ones that hit
> Mailscanner's "Eudora long-MIME-boundary attack" rule.  I'm not including
> those as spamples since they got sanitized by MailScanner so aren't useful,
> but I figured it was worth mentioning.
>
> My feeling is that BOGUS_MIME_VERSION is incredibly useful during the
> early hits of snowshoers, before the RBLs, URIBLs, and content hash DBs can
> catch up.  Since it would seem to be 100% spam and 0% ham, I think scoring
> it very highly (4+ points) would be both safe and useful -- it will help
> nix these early hits but won't hinder anything else.
>
> From my experience and these spamples, where most of them are scoring 4.6
> (with 4.0 of that from BOGUS_MIME_VERSION), an optimal score would be in
> the range of 4.5 to 4.9 ... that would push these 4.6s to 5.1 or higher.
>
> I've got MANY other examples in the Junk folders on my server, and I would
> be happy to send them to you privately if needed.
>
> Cheers.
>
> --- Amir
>
> On May 30, 2019, at 9:24 AM, Kevin A. McGrail <km...@apache.org> wrote:
>
>
> Fair enough.  Happy to look at spamples but I've seen virtually nothing in
> the wild for this.
>
>
>