Posted to users@spamassassin.apache.org by Amir Caspi <ce...@3phase.com> on 2018/12/20 22:11:50 UTC
Proposed rule for too many dots in From
John, would you mind sandboxing a rule?
Two or more dots in the From username seems to be rather spammy (and we've talked about it before on the list). Would you mind sandboxing this test rule to see if it would be helpful as a main rule? I get a lot of spam locally that hits this...
header AC_FROM_MANY_DOTS From =~ /<(?:\w+\.){2,}\w+@/
describe AC_FROM_MANY_DOTS Two or more periods in the From username
We could, of course, increase to three or more dots... maybe the three-dot version would score higher on its own, but the two-dot could be better in combo... not sure.
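For anyone who wants to try the pattern outside SpamAssassin, here is a minimal sketch in Python. It is only an approximation: the helper names and the `min_dots` parameter are made up for illustration, and `re.ASCII` is an assumption about how `\w` behaves when the rule is matched against the header.

```python
import re

def many_dots_pattern(min_dots: int = 2) -> re.Pattern:
    # Python translation of /<(?:\w+\.){2,}\w+@/ with a configurable minimum.
    # re.ASCII restricts \w to [A-Za-z0-9_] (an assumption, see above).
    return re.compile(r"<(?:\w+\.){%d,}\w+@" % min_dots, re.ASCII)

def hits(from_header: str, min_dots: int = 2) -> bool:
    # True if the From header contains <user@...> with at least min_dots dots
    # separating runs of word characters in the user part.
    return many_dots_pattern(min_dots).search(from_header) is not None

if __name__ == "__main__":
    samples = [
        '"Jane Public" <jane.public@example.com>',
        '"Jane Public" <jane.q.public@example.com>',
        '"Some Sender" <a.b.c.d@example.com>',
    ]
    for header in samples:
        print(header, "two-dot:", hits(header, 2), "three-dot:", hits(header, 3))
```

Note that the pattern only looks inside angle brackets, so a bare address without `<...>` would not hit in this sketch.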
Hopefully it's helpful...
Cheers.
--- Amir
Re: Proposed rule for too many dots in From
Posted by Paul Stead <pa...@gmail.com>.
Looks like it was hitting a fair amount of ham the last week or so.
https://ruleqa.spamassassin.org/20190607-r1860743-n/T_AC_FROM_MANY_DOTS/detail
The last few days have looked a bit better:
https://ruleqa.spamassassin.org/20190609-r1860879-n/T_AC_FROM_MANY_DOTS/detail
https://ruleqa.spamassassin.org/20190610-r1860930-n/T_AC_FROM_MANY_DOTS/detail
Three days of good performance on ruleqa leads to promotion, followed by a
scoring day.
On Mon, 10 Jun 2019 at 19:13, Amir Caspi <ce...@3phase.com> wrote:
> On Jan 26, 2019, at 10:27 AM, John Hardin <jh...@impsec.org> wrote:
> >
> > On Thu, 24 Jan 2019, Amir Caspi wrote:
> >
> >> On Jan 15, 2019, at 8:46 AM, John Hardin <jh...@impsec.org> wrote:
> >>>
> >>>> On Dec 20, 2018, at 6:16 PM, Amir Caspi <Ce...@3phase.com> wrote:
> >>>>>
> >>>>> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
> >>>
> >>> Argh. I lost track of that over the holidays. Thanks for the reminder, adding it now.
> >>
> >> Anything interesting with the results on sandboxing this rule?
> >
> > Not really, at least not by itself.
>
> It looks like this rule was still being tested last month (I saw it
> hitting a bunch of my spams), but now appears to be gone (it's not hitting
> on spams that it normally would). Did you decide it wasn't sufficiently
> useful, either alone or in meta?
>
> Cheers.
>
> --- Amir
>
>
Re: Proposed rule for too many dots in From
Posted by Amir Caspi <ce...@3phase.com>.
On Jan 26, 2019, at 10:27 AM, John Hardin <jh...@impsec.org> wrote:
>
> On Thu, 24 Jan 2019, Amir Caspi wrote:
>
>> On Jan 15, 2019, at 8:46 AM, John Hardin <jh...@impsec.org> wrote:
>>>
>>>> On Dec 20, 2018, at 6:16 PM, Amir Caspi <Ce...@3phase.com> wrote:
>>>>>
>>>>> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
>>>
>>> Argh. I lost track of that over the holidays. Thanks for the reminder, adding it now.
>>
>> Anything interesting with the results on sandboxing this rule?
>
> Not really, at least not by itself.
It looks like this rule was still being tested last month (I saw it hitting a bunch of my spams), but now appears to be gone (it's not hitting on spams that it normally would). Did you decide it wasn't sufficiently useful, either alone or in meta?
Cheers.
--- Amir
Re: Proposed rule for too many dots in From
Posted by John Hardin <jh...@impsec.org>.
On Thu, 24 Jan 2019, Amir Caspi wrote:
> On Jan 15, 2019, at 8:46 AM, John Hardin <jh...@impsec.org> wrote:
>>
>>> On Dec 20, 2018, at 6:16 PM, Amir Caspi <Ce...@3phase.com> wrote:
>>>>
>>>> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
>>
>> Argh. I lost track of that over the holidays. Thanks for the reminder, adding it now.
>
> Anything interesting with the results on sandboxing this rule?
Not really, at least not by itself.
https://ruleqa.spamassassin.org/20190125-r1852100-n/__AC_FROM_MANY_DOTS/detail
It hits low-scoring spam, so it *might* be worthwhile in a meta. I'll see
what I can do with it.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Tomorrow: Wolfgang Amadeus Mozart's 263rd Birthday
Re: Proposed rule for too many dots in From
Posted by Amir Caspi <ce...@3phase.com>.
On Jan 15, 2019, at 8:46 AM, John Hardin <jh...@impsec.org> wrote:
>
>> On Dec 20, 2018, at 6:16 PM, Amir Caspi <Ce...@3phase.com> wrote:
>>>
>>> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
>
> Argh. I lost track of that over the holidays. Thanks for the reminder, adding it now.
Anything interesting with the results on sandboxing this rule?
Thanks!
--- Amir
Re: Proposed rule for too many dots in From
Posted by John Hardin <jh...@impsec.org>.
On Mon, 14 Jan 2019, Amir Caspi wrote:
> On Dec 20, 2018, at 6:16 PM, Amir Caspi <Ce...@3phase.com> wrote:
>>
>> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
>>
>> John, could you update the sandbox rule to the above? That should whittle down FPs. I'd recommend leaving it as 2 letters, though, since a number of spammy addresses are things like john.at.amazon or some such like that.
>
> John, just curious on whether there are any results from sandboxing this rule, and whether it -- or some variant -- might be a good addition? (Likely within a meta, I'm sure.)
>
> Thanks and happy new year!
Argh. I lost track of that over the holidays. Thanks for the reminder,
adding it now.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
All I could think about was this bear is so close to me I can
see its teeth. I could have kissed it. I wished I had a gun.
-- Alyson Jones-Robinson
-----------------------------------------------------------------------
2 days until Benjamin Franklin's 313th Birthday
Re: Proposed rule for too many dots in From
Posted by Amir Caspi <ce...@3phase.com>.
On Dec 20, 2018, at 6:16 PM, Amir Caspi <Ce...@3phase.com> wrote:
>
> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
>
> John, could you update the sandbox rule to the above? That should whittle down FPs. I'd recommend leaving it as 2 letters, though, since a number of spammy addresses are things like john.at.amazon or some such like that.
John, just curious on whether there are any results from sandboxing this rule, and whether it -- or some variant -- might be a good addition? (Likely within a meta, I'm sure.)
Thanks and happy new year!
--- Amir
Re: Proposed rule for too many dots in From
Posted by RW <rw...@googlemail.com>.
On Thu, 20 Dec 2018 21:12:33 -0700
Grant Taylor wrote:
> On 12/20/18 8:34 PM, Grant Taylor wrote:
> > I'm going back through and analyzing how I'm extracting data and
> > trying to satisfactorily explain some oddities.
>
> Here's how the dots in the user parts of 16,528 unique addresses out
> of 244,921 messages break down:
>
> 13,277             (no dots 80.3%)
>  2,936 .           ( 1 dot  17.7%)
>    281 ..          ( 2 dots  1.7%)
>     29 ...         ( 3 dots  0.2%)
>      3 ....        ( 4 dots  0.0%)
>      1 .....       ( 5 dots  0.0%)
>      1 ........... (11 dots  0.0%)
>
> So, in light of this information, I would be willing to concede 3 or
> more dots is possibly an indicator of spam.
I think you are a bit premature there. Without separate figures
for spam and ham, you can't even say whether any of these are good spam
indicators, even in isolation.
> My previous log methodology
Isn't a sound method for scoring. For one thing it assumes that more
dots are more spammy. It could be that the S/O peaks at 4.
For another, scoring should be about the balance of extra TPs and FPs
that the rule creates. Sometimes the more spammy looking rule hits
higher scoring spam and warrants a lower score.
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/18 8:34 PM, Grant Taylor wrote:
> I'm going back through and analyzing how I'm extracting data and trying
> to satisfactorily explain some oddities.
Here's how the dots in the user parts of 16,528 unique addresses out of
244,921 messages break down:
13,277             (no dots 80.3%)
 2,936 .           ( 1 dot  17.7%)
   281 ..          ( 2 dots  1.7%)
    29 ...         ( 3 dots  0.2%)
     3 ....        ( 4 dots  0.0%)
     1 .....       ( 5 dots  0.0%)
     1 ........... (11 dots  0.0%)
So, in light of this information, I would be willing to concede 3 or
more dots is possibly an indicator of spam.
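A breakdown like this can be approximated with a short Python sketch. This is not the command actually used for the figures in this post; the function names are hypothetical, and it assumes the From: headers have already been extracted one per string.

```python
from collections import Counter
from email.utils import parseaddr

def unique_addresses(from_headers):
    # Pull the addr-spec out of each From: header and de-duplicate,
    # ignoring case (an assumption about what counts as "unique").
    addresses = {parseaddr(header)[1].lower() for header in from_headers}
    return {addr for addr in addresses if "@" in addr}

def dot_histogram(from_headers):
    # Map "number of dots in the user part" to "number of unique addresses".
    counts = Counter()
    for addr in unique_addresses(from_headers):
        local_part = addr.rpartition("@")[0]
        counts[local_part.count(".")] += 1
    return counts

if __name__ == "__main__":
    headers = [
        "Jane <jane@example.com>",
        "jane@example.com",
        "Jane Public <jane.public@example.com>",
        "Jane Q. Public <jane.q.public@example.com>",
    ]
    histogram = dot_histogram(headers)
    total = sum(histogram.values())
    for dots in sorted(histogram):
        share = 100 * histogram[dots] / total
        print(f"{histogram[dots]:>6} addresses with {dots} dot(s) ({share:.1f}%)")
```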
My previous log methodology would add the following spam score to
messages with 3 or more dots. (Assuming 3 dots is the number we start
adding to the spam score.)
3 dots = 1
4 dots = 1.26
5 dots = 1.46
11 dots = 2.18
Assuming 2 dots are allowed and is the number:
3 dots = 1.58
4 dots = 2.00
5 dots = 2.32
11 dots = 3.46
I think I would be comfortable blindly adding log$Base($numberOfDots)
(when numberOfDots > $Base) to the spam score. I don't even see a need
to mess with a meta rule.
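The two score tables above can be reproduced with a few lines of Python. This is a sketch only: `dot_score` and its `min_dots` parameter are names invented here, and the cut-off is set to 3 dots in both cases because that is where both tables start (the "numberOfDots > $Base" wording would exclude the "3 dots = 1" row for base 3).

```python
import math

def dot_score(num_dots: int, base: int, min_dots: int = 3) -> float:
    # Score is log_base(num_dots); below min_dots nothing is added.
    # min_dots is a hypothetical parameter, not part of the original proposal.
    if num_dots < min_dots:
        return 0.0
    return math.log(num_dots, base)

if __name__ == "__main__":
    for base in (3, 2):
        print(f"base {base}:")
        for dots in (3, 4, 5, 11):
            print(f"  {dots:>2} dots = {dot_score(dots, base):.2f}")
```

SpamAssassin rules carry fixed scores, so applying a computed value like this would need something beyond a plain header rule; the sketch only covers the arithmetic.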
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/18 7:54 PM, Amir Caspi wrote:
> Some of the ones with equal-signs look like bounce addresses from
> envelopes, that would not be in the From header.
I'm going back through and analyzing how I'm extracting data and trying
to satisfactorily explain some oddities. I don't think there will be
any significant change in the numbers. But it wouldn't be the first
time I was wrong about something, even today.
I have found that one of the IETF mailing lists that I subscribe to and
participate in seemingly encodes the sending address as original from
user part, =40 (hex for @), original from domain part, (actual) @,
mailing list, .ietf.org. This seems to especially be the case for
senders from domains with DMARC enabled.
So:
john.doe@example.com
Becomes:
john.doe=40example.com@list.ietf.org
This is the contents of the From: header.
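The rewriting described above can be sketched in Python as follows. The function names are hypothetical and this only mirrors the pattern seen in the example, not whatever the list software actually does. The second helper applies a Python translation of the proposed two-dot pattern (with `re.ASCII` as an assumption about `\w`) to the result.

```python
import re

def list_rewritten_from(original: str, list_domain: str) -> str:
    # Replace the original "@" with "=40" (hex 40 is "@") and append the
    # list's own domain, as in the john.doe example above.
    return original.replace("@", "=40") + "@" + list_domain

def hits_two_dot_rule(from_header: str) -> bool:
    # Python translation of /<(?:\w+\.){2,}\w+@/ from the proposed rule.
    return re.search(r"<(?:\w+\.){2,}\w+@", from_header, re.ASCII) is not None

if __name__ == "__main__":
    rewritten = list_rewritten_from("john.doe@example.com", "list.ietf.org")
    print(rewritten)
    print(hits_two_dot_rule(f'"John Doe" <{rewritten}>'))
```

In this sketch the rewritten address does not hit the pattern, because "=" is not a word character and so interrupts the run of dotted words before the real "@".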
I consider that to be a legitimate email address. Granted, it's
probably atypical, but nonetheless legitimate.
I'm also seeing email addresses use the (…) comments in From: headers.
From: "So and So" <jo...@example.com> (please no spam)
Again, legitimate email addresses.
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/18 7:54 PM, Amir Caspi wrote:
> Are these in the From: header or the envelope-from (Return-Path)?
These are all the From: header.
> Some of the ones with equal-signs look like bounce addresses from
> envelopes, that would not be in the From header. Or did you just look for
> any email address, so these could include Reply-To and emails in the body?
Nope. I explicitly looked for the From: header.
They may look like VERP. They may be VERP. But remember that there's
nothing that prohibits using VERP like addresses in the From: header.
This is particularly important when an email address tries to encode
another email address. The original "@" usually becomes another
character, frequently "=".
> In general it looks like you do have a bunch that would hit on even the
> multi-letter rule... but if they're envelopes or destinations (not From:)
> then they wouldn't be searched in this rule.
They are indeed addresses in the From: header.
I'm using formail to explicitly extract only the From: header.
formail -c -X From:
> Cheers and thanks.
You're welcome.
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by Amir Caspi <ce...@3phase.com>.
On Dec 20, 2018, at 7:49 PM, Grant Taylor <gt...@tnetconsulting.net> wrote:
>
> So here's the user parts (left hand side of the @) of emails.
Are these in the From: header or the envelope-from (Return-Path)? Some of the ones with equal-signs look like bounce addresses from envelopes, that would not be in the From header. Or did you just look for any email address, so these could include Reply-To and emails in the body?
In general it looks like you do have a bunch that would hit on even the multi-letter rule... but if they're envelopes or destinations (not From:) then they wouldn't be searched in this rule.
Cheers and thanks.
--- Amir
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/18 8:36 PM, Benny Pedersen wrote:
> and xxx is a real tld,
Yes.
> so you ddos maillist members now
How so?
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by Benny Pedersen <me...@junc.eu>.
Grant Taylor skrev den 2018-12-21 03:49:
> Note: These are what I considered legitimate enough to keep in my
> mail structure. I don't keep spam for very long. This corpus goes
> back to 2001.
and xxx is a real tld, so you ddos maillist members now
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/18 7:36 PM, Grant Taylor wrote:
> I don't know. I'm re-running the command to scan my mailbox extracting
> From: addresses. (I'm logging to a file this time.) I'll do some
> analysis and let you know.
I don't know what sort of characterization you may want. So here's the
user parts (left hand side of the @) of emails.
Note: These are what I considered legitimate enough to keep in my mail
structure. I don't keep spam for very long. This corpus goes back to 2001.
1 x.xxxx.x
1 xxxx.x.x.
1 x.x.x.x.xx.x
1 xxx.xx.xx
1 xxx.x.xxxx
1 xxxxx.x.xx
1 x.xxxxxxx.x
1 xxx.xxx.xxx
1 xxxxx.x.xxx
1 xxxxx.xx.xx
1 x.xx.xxxx.xxx
1 x.xxx.xxxxxx
1 xxx.x.xxxxxx
1 xxxx.xxxx.xx
1 x.xxxxx.xxxxx
1 xxxx.xx.xxxxx
1 xxx.x.xxxxxxxx
1 xxx.xxx.xxx.xxx
1 xxxx.xxxxxx.xx
1 xxxxxxxx.x.xxx
1 xxxxx.xx.xxxxxx
1 xxxxxxx.x.xxxxx
1 xxxxxxx.xxxxx.x
1 xxx.xxxx.xxxxxxx
1 xxx.xxxxxxxxx.xx
1 xxxx.xxxxx.xxxxx
1 xx.xxxxxx.xxxxxxx
1 xxxxx.xxxxxx.xxxx
1 xxxxxxx.xxxxxx.xx
1 xx.xxxxxx.xxxxxxxx
1 xxx.xx.xxx.xxxxxxxx
1 xxxxxx.xxx.xxxxxxx
1 xxxxxxxx.xxxx.xxxx
1 xxxxxx.xxxxxxx.xxxx
1 xxxxxxxxx.x.xxxxxxx
1 xxxxxxxxx.xxx.xxxxx
1 xxx.xxxx=xxxxxx.xxx.xx
1 xxxx.xxxxx=xxxxxx.xxx
1 xxxx.x.xxxxxxxxxxx.xxx
1 xxxx.xxxxxxxxx.xxxxxx
1 xxxxxxx.xxxxxx.xxxxx.x
1 xxx.xxxx.xxx.xxxxxxxxxx
1 xxxxxx.xxxxxxxx.xxxxxx
1 xxxxxxx.xxxxxx.xxxxxxx
1 xxxxxxxx.xx.xxxxxxxxxx
1 xxxxx-xxxx.xxxx.xxxxxxxx
1 xxxxxx.xxxxxx.xxxxxx+xxx
1 xxxxxxxxxxx.xxxx.xxxxxx
1 xxx-xxxxxxxx.xx.xxxxxxxxxx
1 xxxxxxxx+xxxxxx.xxxxxx.xxx
1 xxxxxxxxxxx.xxxxxxx.xxxxx
1 xxxxxx.xxxxxxxxxxxx.xxxxxx
1 xxxxxx.xxxxxxxx=xxxxxxxxxx.xx
1 xxxxxxxxx.xxxxxxx.xxxxx.xx.xxxxxxx
1 xxxxxxx-xxxxx-xxxxxxx=xxxxxxx.xxxxxxxxxxxxx.xxx
1 xxxxxxx-xxxxxxxxxxxx-xxxxxxx=xxxxxxx.xxxxxxxxxxxxx.xxx
1 xxxxxxx-xx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxx-xxx=xxxxxxx.xxxxxxxxxxxxxx.xxx
2 xx.xxxxx.xx
2 xxx.xxxx.xx
2 xxxxxxx.x.xxxx
2 xxxxxxx.xxxx.x
2 xxxx.xxxxx.xxxx
2 xxxx.xxxxxxx.xx
2 xx.xxxxxx.xxxxxx
2 xxxxxx.x.xxxxxxx
2 xxxxxx.x.xxxx.xxx
2 xxxxxx.xx.xxxxxx
2 xxxxxxx.xxxxxx.x
2 xxxxxxxx.x.xxxxx
2 xxx.xxxxx+xxxxx.xx
2 xxxx.x.xxxxxxxxxx
2 xxxx.xxxxxxxx.xxx
2 xxxxxxx.xx.xxxxxx
2 xxxxxx.x.xxxxxxxxx
2 xxxxxxx.xx.xxxxxxx
2 xxxxxxxx.x.xxxxxxx
2 x.xxxxxxxxxx.xxxxxx
2 xxx.xxxxx.xxxxxxxxx
2 xxxx.xxx.xxxxxxxxxx
2 xxxxx.xxxxxxx.xxxxx
2 xxx.xxxxx.xxxx.xxxxxx
2 xxxx.xxxxxxx.xxxxxxx
2 xxxxx.x.xxxxxxxx.xxxx
2 xxxxx.xxxx=xxxxxx.xxx
2 xxxxx.xxxx.xxxxxxxxx
2 xxxxxx.xx.xxxxxxxxxx
2 xxxx.xxxxxxxxxx.xxxxxxxx
2 xxxxxxx.xxxx=xxxxxxxxxx.xxx
2 xxx-xxxxxx.xxxxxxxx.xxxxxxxxxx
2 xxxxx.xxxx.xx.xx.xx.xxx.xxx.xx.xxx.xxx.xx.xx
3 xx.x.x
3 x.xx.x.x.x
3 xx.xxxxxx.xx
3 xxx.xxxx.xxx
3 xxx.x.xxxxxxx
3 xxxxx.xx.xxxxx
3 xxxxxx.xxxxx.x
3 x.xxxxxxxx.xxxx
3 xx-xxxx.xxxx.xxx
3 xxxxx.xxxxx.xxx
3 xxxxxx.xxxx.xxx
3 x.xxxx.xxxxxxxxx
3 xxx.xxxxxxx.xxxx
3 xxxxx.xxx.xxxxxxx
3 xxxxx.xxxxx.xxxxx
3 xxxxxxx.xxxxxxx.x
3 xxx.xxxxxxxx.xxxxx
3 xxxxxxx.x.xxxxx.xxx
3 xxxxxx.x.xxxxxxxxxx
3 xxxxxxxx.xxxxxxxx.xxxx
3 xxxx.xxxxxx.xxxxxxxxxxx
4 x.x.x.xxxxx
4 xxx.x.xxxxx
4 xxxx.xx.xxx
4 x.xxxxx.xxxx
4 x.xxxx.xxxxxxxx
4 xxx.xxxxxxxx.xxx
4 xxxx.xxxxxxx.xxxx
4 xxxxxx.xxxxxxx.xxx
4 xxxxx.xxxxxx.xxxxxxx
4 xxxxxx.xxxxxx.xxxxxxxx
4 xxxxxx.xxxxxxx.xxxxxxxxx
5 xx.xxx.xxxx
5 xxxxx.xxxxx.xx
5 xxxxxx.x.xxxxx
5 xxx.xxxx.xxxxxx
5 xxxx.x.xxxxxxxx
5 xxxxxxx.xx.xxxx
5 xxxx.xxxxxx.xxxx
5 xxxxxx.xxxxxxxx.x
5 xxxxxxxx.x.xxxxxx
5 xxxxxx.xx.xxxxx.xxx>
5 xxxxxxx.xxxxx.xxxxxxx
6 xx.x.xxxx
6 x.x.x.xxxxxxx
6 xxxx.xxxxxxxx.xx
6 xxxxx.xxx.xxxxxx
6 xxxxxx.x.xxxxxxxx
6 x.xxxxxxx.xxxxxxxx
6 xxxxxx.xxxxxxxx.xxx
6 xxxxx.xxxxxxxxx.xxxx
7 xx.x.x.x
7 x.x.xxxxxxxx
7 xxxxx.xx.xxxx
7 xxxxxxx.x.xxxx-x
7 xxxxxxxxx.x.xxxx
7 xxxx.xxxxxxxxx.xxxxx
7 xxxxxxxxx.xxxxx.xxxxxxxx
7 xxxxx.xxxxx=xxxxxxxxxxxxx.xx
8 xxxxxx.xxxxx.xxxxx
8 xxxx.xxxxxxxxx.xxxx
8 xxxxxx.xxxx.xx.xxxxxxxxx
9 xxxx.x.xxxxx
9 xxxxxx.x.xxxx
9 xxxxxxx.xxxxx.xx
9 xxxxx.x.xxxxxxx.xxx
9 x-xxxxxxx=xxxxxxx.xxxxxxxxxxxxxx.xxx-xxxxx
10 xxxx.xx.xxxx
10 xxxxxx.xxxxxx.xxx
11 xxxxx.x.xxxxx
11 xxxxxx..xxx.xxxx
11 xxxx.xxxxxx.xxxxx
11 xxxxxxx.xxxxxx.xxxxx
12 xxx.xxxx.xxxxx
12 xxx.xx.xxxxx.xxx
13 x.xxxxxxx.xxxxxx
13 xxxx.xxxx.xxxxxxxx
13 xxxx.xxxxxx.xxxxxx
14 xxxxx.xxxx.xxx
15 xxxx.xxxx.xxxxxx
15 x.x.xxxxxxxxxxxxxxx
15 xxxxxxxx.xxxxxxxx.xx
16 xxxxxx.xxx.xx
17 x.x.xxxxxxxxx
18 x.x.xxx
18 xx.xx.xxxxxx
19 xxxxxx.x.xxxxxx
20 xxxxxx.xxxx.xxxxxx
23 x.x.xxxxxxxxxx
23 xxxxxxx.x.xxxxxxxx
26 xxxxxx.xxxxxx.xxxxxxx
27 x.x.xxxxxx
28 xxxxx.xx.xxxxxxxxx
29 xxxxxxx.xxxx.xxx
29 xxxxx.xxxxxxx.xxxx
29 xxxxx.xxxxxx.xxxxxxxx
30 x.x.x
31 xxx.x.xxxxxxxxxxxx
33 x.xxxx.xxx
33 xxxx.x.xxxx
38 xxx.xxxxxxxxx.xx.xxxxx
39 xxxxx.xxxxxxxxxx.xx
43 xxxxx.x.xxxx
49 xxxx.x.xxxxxx
56 xxxxx.x.xxxxxx
56 xxxxx.xxxxx.xxxxxxx
57 x-x.xxxxxxxxxx.xxxxx.xxxx
58 xxxxxxx.x.xxxxxx
59 x.x.x.xxxxxx
61 xxxxxx+xxxx.xxxx.xxxxxxxx
73 xxx.xxx.xxxx.xxx.xxxxxxxxx
78 xxxxx.x.xxxxxxxxx
79 xxxxx.xxx.xxxx
82 xxxx.xxxx.xxxxxxxx.xxxx
142 xxxxxxxx.xxxxxxx.xx
147 xxxxx.x.xxxxxxx
149 xx.xxxx.xxxx
234 xxxx.x.xxxxxxx
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/18 6:16 PM, Amir Caspi wrote:
> I never intended for the rule to be applied on its own, but far more
> likely that it would become part of a meta rule with other spammy
> indicators.
Ah. That makes more sense.
That being said, it is your server and you're free to run it however you
want.
> That said, you're absolutely right -- I interact with a bunch of gov folks
> and forgot about the middle initial being commonplace in the address.
;-)
> Typically that middle part is just one letter for the initial, so one
> could change the rule to require at least two word characters between
> the dots. That is:
>
> header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
You could do something like that. But I think that you're making the
rule more complex (which is okay) but I'm not convinced that's
necessarily a good thing.
I think I'd be likely to have people pick a number of dots that they
think is reasonable (possibly with a default) and then take the log base
that number of the number of dots in the message. Then I'd add that
result to the spam score. If I could do such.
> Perhaps this is still too generic, and three dots should be the
> minimum... but that's what the sandboxing will hopefully tell us. And
> part of the sandboxing will also hopefully tell us if this works well as
> a meta -- I absolutely and wholeheartedly agree that the rule
> _by_itself_ is not a good spam indicator at all... but combined with
> other indicators, it might well be.
;-)
> Grant, how many of your legit emails would hit the above rule, requiring
> more than one letter (i.e., more than just a middle initial) between the
> dots?
I don't know. I'm re-running the command to scan my mailbox extracting
From: addresses. (I'm logging to a file this time.) I'll do some
analysis and let you know.
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by Amir Caspi <ce...@3phase.com>.
On Dec 20, 2018, at 5:13 PM, Noel Butler <no...@ausics.net> wrote:
> I have to agree with Grant: two dots is crazy low; you might as well score at one dot. A lot of email addresses are firstname.initial.surname, and even many government departments in this part of the world use the two-dot format.
>
I never intended for the rule to be applied on its own, but far more likely that it would become part of a meta rule with other spammy indicators. That said, you're absolutely right -- I interact with a bunch of gov folks and forgot about the middle initial being commonplace in the address. Typically that middle part is just one letter for the initial, so one could change the rule to require at least two word characters between the dots. That is:
header AC_FROM_MANY_DOTS From =~ /<(?:\w{2,}\.){2,}\w+@/
John, could you update the sandbox rule to the above? That should whittle down FPs. I'd recommend leaving it as 2 letters, though, since a number of spammy addresses are things like john.at.amazon or some such like that.
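To illustrate the difference between the two versions, here is a small Python sketch that runs both patterns against a few made-up From headers. The names are invented for illustration, and `re.ASCII` is an assumption about how `\w` is treated.

```python
import re

# Python translations of the two sandbox candidates.
ORIGINAL = re.compile(r"<(?:\w+\.){2,}\w+@", re.ASCII)      # any-length parts
REVISED = re.compile(r"<(?:\w{2,}\.){2,}\w+@", re.ASCII)    # 2+ chars before each dot

def compare(from_header: str):
    # Returns (original_hits, revised_hits) for one From header.
    return (
        ORIGINAL.search(from_header) is not None,
        REVISED.search(from_header) is not None,
    )

if __name__ == "__main__":
    samples = [
        "<jane.public@example.com>",      # one dot
        "<jane.q.public@example.gov>",    # middle initial
        "<john.at.amazon@example.com>",   # two-letter middle part
    ]
    for header in samples:
        original_hit, revised_hit = compare(header)
        print(f"{header:32} original={original_hit} revised={revised_hit}")
```

The middle-initial form stops hitting with the revised pattern, while the john.at.amazon style still does.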
Perhaps this is still too generic, and three dots should be the minimum... but that's what the sandboxing will hopefully tell us. And part of the sandboxing will also hopefully tell us if this works well as a meta -- I absolutely and wholeheartedly agree that the rule _by_itself_ is not a good spam indicator at all... but combined with other indicators, it might well be.
Grant, how many of your legit emails would hit the above rule, requiring more than one letter (i.e., more than just a middle initial) between the dots?
Thanks.
--- Amir
Re: Proposed rule for too many dots in From
Posted by Noel Butler <no...@ausics.net>.
On 21/12/2018 09:52, Grant Taylor wrote:
> On 12/20/2018 03:11 PM, Amir Caspi wrote:
>
>> Two or more dots in the From username seems to be rather spammy (and we've talked about it before on the list).
>
> I feel obligated to comment that my wife's email address (Gmail) has two dots in it. (Gmail is its own can of worms for dots, as they strip them, and there are other issues with Gmail.) As do a number of other people that I exchange email with.
I have to agree with Grant: two dots is crazy low; you might as well
score at one dot. A lot of email addresses are firstname.initial.surname,
and even many government departments in this part of the world use the
two-dot format.
--
Kind Regards,
Noel Butler
This Email, including any attachments, may contain legally privileged
information, therefore remains confidential and subject to copyright
protected under international law. You may not disseminate, discuss, or
reveal, any part, to anyone, without the authors express written
authority to do so. If you are not the intended recipient, please notify
the sender then delete all copies of this message including attachments,
immediately. Confidentiality, copyright, and legal privilege are not
waived or lost by reason of the mistaken delivery of this message. Only
PDF [1] and ODF [2] documents accepted, please do not send proprietary
formatted documents
Links:
------
[1] http://www.adobe.com/
[2] http://en.wikipedia.org/wiki/OpenDocument
Re: Proposed rule for too many dots in From
Posted by Grant Taylor <gt...@tnetconsulting.net>.
On 12/20/2018 03:11 PM, Amir Caspi wrote:
> Two or more dots in the From username seems to be rather spammy (and
> we've talked about it before on the list).
I feel obligated to comment that my wife's email address (Gmail) has two
dots in it. (Gmail is its own can of worms for dots, as they strip
them, and there are other issues with Gmail.) As do a number of other
people that I exchange email with.
> Would you mind sandboxing this test rule to see if it would be helpful
> as a main rule? I get a lot of spam locally that hits this...
>
> header AC_FROM_MANY_DOTS From =~ /<(?:\w+\.){2,}\w+@/
> describe AC_FROM_MANY_DOTS Two or more periods in the From username
>
> We could, of course, increase to three or more dots... maybe the three-dot
> version would score higher on its own, but the two-dot could be better
> in combo... not sure.
Can't SpamAssassin add something to the score for each dot?
I just checked and my 249,000+ message corpus has 2,600+ messages with
two or more dots in the user part of the email address (From:addr).
--
Grant. . . .
unix || die
Re: Proposed rule for too many dots in From
Posted by John Hardin <jh...@impsec.org>.
On Thu, 20 Dec 2018, Amir Caspi wrote:
> John, would you mind sandboxing a rule?
>
> Two or more dots in the From username seems to be rather spammy (and we've talked about it before on the list). Would you mind sandboxing this test rule to see if it would be helpful as a main rule? I get a lot of spam locally that hits this...
>
> header AC_FROM_MANY_DOTS From =~ /<(?:\w+\.){2,}\w+@/
> describe AC_FROM_MANY_DOTS Two or more periods in the From username
>
> We could, of course, increase to three or more dots... maybe the three-dot version would score higher on its own, but the two-dot could be better in combo... not sure.
>
> Hopefully it's helpful...
>
> Cheers.
>
> --- Amir
Can you also provide a spample? Thanks!
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
"Bother," said Pooh as he struggled with /etc/sendmail.cf, "it never
does quite what I want. I wish Christopher Robin was here."
-- Peter da Silva in a.s.r
-----------------------------------------------------------------------
5 days until Christmas