Posted to users@spamassassin.apache.org by Dominic Raferd <do...@timedicer.co.uk> on 2021/02/20 07:58:19 UTC

Catch subtly-different Reply-To domain

Is there a rule to catch cases where the domain of the Reply-To header 
is a subtle variant of that in the From header? Take this (real) example 
from a phishing email sent yesterday:

From: "Karen Howard" <ka...@interfacefm.com>
Reply-To: "Karen Howard" <ka...@intrefacefm.com>

I realise that other elements of the address can be different without 
being a reliable spam indicator but I think that interfacefm.com -> 
intrefacefm.com are so similar and yet different that they should be 
worth a few points. But I can't think how to write such a rule myself.


Re: Catch subtly-different Reply-To domain

Posted by John Hardin <jh...@impsec.org>.
On Mon, 22 Feb 2021, RW wrote:

> On Sun, 21 Feb 2021 16:32:01 -0800 (PST)
> John Hardin wrote:
>
>> On Sun, 21 Feb 2021, John Hardin wrote:
>>
>>> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>
>>>> Michael's suggestion is interesting. There is a github project
>>>> allowing Levenshtein numbers to be calculated and used in SA, I
>>>> will see if there is a way to apply it in this situation. Thanks
>>>> to all for their input.
>>>
>>> It would have to be a plugin, and there's a CPAN module for
>>> calculating Levenshtein numbers so most of the heavy lifting is
>>> already done.
>>
>> Sigh. Ignore that, that's exactly what it is. I need to stop replying
>> so quickly to stuff.
>
> I don't think there was anything wrong in pointing out that it's
> available from CPAN.
>
> There is also a Damerau–Levenshtein version which is probably a better
> choice as the transposition of two adjacent characters counts as 1
> difference rather than 2.

I was more sighing about: "allowing ... to be ... used in SA" "It would 
have to be a plugin"

:)

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org                         pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Today: George Washington's 289th Birthday

Re: Catch subtly-different Reply-To domain

Posted by Dominic Raferd <do...@timedicer.co.uk>.
On 22/02/2021 15:45, Dominic Raferd wrote:
> On 22/02/2021 15:05, RW wrote:
>>
>>>> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>>>>> Michael's suggestion is interesting. There is a github project
>>>>> allowing Levenshtein numbers to be calculated and used in SA, I
>>>>> will see if there is a way to apply it in this situation. Thanks
>>>>> to all for their input.
>>>>
>> There is also a Damerau–Levenshtein version which is probably a better
>> choice as the transposition of two adjacent characters counts as 1
>> difference rather than 2.
> That sounds better, but I don't know how to employ it to make a rule for
> SA. My idea is to compare the domain part of the 'From' and 'Reply-To'
> addresses, scoring for a close but not exact match (maybe
> Damerau–Levenshtein between 1 and 3). The same logic could also be used
> to compare the domain part of the 'From' to a list of domains that are
> prone to impersonation (and don't have DMARC policy with
> p=reject|quarantine).

I have now implemented this using the (updated) code at 
https://github.com/fmbla/spamassassin-levenshtein. This was super-easy 
as the new LEVENSHTEIN_REPLY rule does exactly what I need - I just 
added the 3 files to /etc/spamassassin and added 1 line to 
/etc/spamassassin/z_local.cf:

score LEVENSHTEIN_REPLY 4

My thanks to the coder! Now I need a real-world case to see it in action...



Re: Catch subtly-different Reply-To domain

Posted by Dominic Raferd <do...@timedicer.co.uk>.
On 22/02/2021 15:05, RW wrote:
> On Sun, 21 Feb 2021 16:32:01 -0800 (PST)
> John Hardin wrote:
>
>> On Sun, 21 Feb 2021, John Hardin wrote:
>>
>>> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>>>> Michael's suggestion is interesting. There is a github project
>>>> allowing Levenshtein numbers to be calculated and used in SA, I
>>>> will see if there is a way to apply it in this situation. Thanks
>>>> to all for their input.
>>> It would have to be a plugin, and there's a CPAN module for
>>> calculating Levenshtein numbers so most of the heavy lifting is
>>> already done.
>> Sigh. Ignore that, that's exactly what it is. I need to stop replying
>> so quickly to stuff.
> I don't think there was anything wrong in pointing out that it's
> available from CPAN.
>
> There is also a Damerau–Levenshtein version which is probably a better
> choice as the transposition of two adjacent characters counts as 1
> difference rather than 2.
That sounds better, but I don't know how to employ it to make a rule for 
SA. My idea is to compare the domain part of the 'From' and 'Reply-To' 
addresses, scoring for a close but not exact match (maybe 
Damerau–Levenshtein between 1 and 3). The same logic could also be used 
to compare the domain part of the 'From' to a list of domains that are 
prone to impersonation (and don't have DMARC policy with 
p=reject|quarantine).
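
The comparison described above could be sketched roughly as follows. This is an illustration in Python, not SpamAssassin plugin code: the helper names (osa_distance, domain_of, lookalike_reply_to), the local part "karen" and the default threshold of 3 are all assumptions made for the example.

```python
from email.utils import parseaddr


def osa_distance(a: str, b: str) -> int:
    """Edit distance where an adjacent transposition costs 1
    (the "optimal string alignment" form of Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def domain_of(header_value: str) -> str:
    """Return the lower-cased domain of an address header, or '' if none."""
    _, addr = parseaddr(header_value)
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def lookalike_reply_to(from_hdr: str, reply_to_hdr: str, max_distance: int = 3) -> bool:
    """True when the two domains are close but not identical."""
    from_dom = domain_of(from_hdr)
    reply_dom = domain_of(reply_to_hdr)
    if not from_dom or not reply_dom:
        return False
    return 1 <= osa_distance(from_dom, reply_dom) <= max_distance


# Hypothetical local part; the domains are the ones from the phishing example.
print(lookalike_reply_to('"Karen Howard" <karen@interfacefm.com>',
                         '"Karen Howard" <karen@intrefacefm.com>'))
```

A fixed threshold of 3 would also flag unrelated short domains (abc.com and xyz.com are 3 edits apart), so in practice the limit probably needs to scale with the length of the domain.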


Re: Catch subtly-different Reply-To domain

Posted by RW <rw...@googlemail.com>.
On Sun, 21 Feb 2021 16:32:01 -0800 (PST)
John Hardin wrote:

> On Sun, 21 Feb 2021, John Hardin wrote:
> 
> > On Sun, 21 Feb 2021, Dominic Raferd wrote:

> >> Michael's suggestion is interesting. There is a github project
> >> allowing Levenshtein numbers to be calculated and used in SA, I
> >> will see if there is a way to apply it in this situation. Thanks
> >> to all for their input.  
> >
> > It would have to be a plugin, and there's a CPAN module for
> > calculating Levenshtein numbers so most of the heavy lifting is
> > already done.  
> 
> Sigh. Ignore that, that's exactly what it is. I need to stop replying
> so quickly to stuff.

I don't think there was anything wrong in pointing out that it's
available from CPAN.

There is also a Damerau–Levenshtein version which is probably a better
choice as the transposition of two adjacent characters counts as 1
difference rather than 2.
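
To make the 1-versus-2 difference concrete, here is a small Python sketch of both distances applied to the two domains from the original example. The function names are made up for the illustration, and the second function is the restricted "optimal string alignment" form of Damerau–Levenshtein, which may not be the exact variant a given CPAN module implements.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Levenshtein plus swapping two adjacent characters as a single edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(levenshtein("interfacefm.com", "intrefacefm.com"))   # 2
print(osa_distance("interfacefm.com", "intrefacefm.com"))  # 1
```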




Re: Catch subtly-different Reply-To domain

Posted by John Hardin <jh...@impsec.org>.
On Sun, 21 Feb 2021, John Hardin wrote:

> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>
>> On 21/02/2021 20:09, Benny Pedersen wrote:
>>> On 2021-02-21 19:44, Dominic Raferd wrote:
>>> 
>>>>> Presumably interfacefm.com has been hacked, but not to the extent that
>>>>> they can intercept incoming replies.
>>>> 
>>>> I stand corrected; but as they specify p=none, the mail must still pass.
>>> 
>>> in what way should it pass?
>>> 
>>> dmarc tests spf and dkim, and opendmarc from github trunk validates arc 
>>> chains as well; there is no guarantee that anything passes
>>> 
>>> only sendgrid made that mistake, sorry sendgrid
>> 
>> p=none is an instruction from the domain controller *not* to reject emails 
>> from their domain even when they fail DMARC testing. So the end result is 
>> that this mail should pass through DMARC testing.
>> 
>> DMARC is a red herring here. My original question wouldn't be relevant if 
>> the sending domain had an enforced DMARC policy (p=quarantine|reject), but 
>> they don't.
>> 
>> Michael's suggestion is interesting. There is a github project allowing 
>> Levenshtein numbers to be calculated and used in SA, I will see if there is 
>> a way to apply it in this situation. Thanks to all for their input.
>
> It would have to be a plugin, and there's a CPAN module for calculating 
> Levenshtein numbers so most of the heavy lifting is already done.

Sigh. Ignore that, that's exactly what it is. I need to stop replying so 
quickly to stuff.

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org                         pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Avatar: the highest grossing Pocahontas remake ever. -- Chris Sauer
-----------------------------------------------------------------------
  Tomorrow: George Washington's 289th Birthday

Re: Catch subtly-different Reply-To domain

Posted by John Hardin <jh...@impsec.org>.
On Sun, 21 Feb 2021, Dominic Raferd wrote:

> On 21/02/2021 20:09, Benny Pedersen wrote:
>> On 2021-02-21 19:44, Dominic Raferd wrote:
>> 
>>>> Presumably interfacefm.com has been hacked, but not to the extent that
>>>> they can intercept incoming replies.
>>> 
>>> I stand corrected; but as they specify p=none, the mail must still pass.
>> 
>> in what way should it pass?
>> 
>> dmarc tests spf and dkim, and opendmarc from github trunk validates arc chains 
>> as well; there is no guarantee that anything passes
>> 
>> only sendgrid made that mistake, sorry sendgrid
>
> p=none is an instruction from the domain controller *not* to reject emails 
> from their domain even when they fail DMARC testing. So the end result is 
> that this mail should pass through DMARC testing.
>
> DMARC is a red herring here. My original question wouldn't be relevant if the 
> sending domain had an enforced DMARC policy (p=quarantine|reject), but they 
> don't.
>
> Michael's suggestion is interesting. There is a github project allowing 
> Levenshtein numbers to be calculated and used in SA, I will see if there is a 
> way to apply it in this situation. Thanks to all for their input.

It would have to be a plugin, and there's a CPAN module for calculating 
Levenshtein numbers so most of the heavy lifting is already done.



Re: Catch subtly-different Reply-To domain

Posted by Benny Pedersen <me...@junc.eu>.
On 2021-02-21 23:00, Dominic Raferd wrote:

> p=none is an instruction from the domain controller *not* to reject
> emails from their domain even when they fail DMARC testing. So the end
> result is that this mail should pass through DMARC testing.

remember dmarc can pass on an spf pass alone, even if dkim fails

this is probably not what many people expect
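
A minimal Python sketch of that either/or logic, under simplifying assumptions: the AuthResult type and function names are invented for the illustration, and only strict (exact-match) alignment is modelled, whereas real DMARC also allows relaxed alignment on the organizational domain.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AuthResult:
    passed: bool  # did the SPF or DKIM check itself succeed?
    domain: str   # SPF: envelope sender domain; DKIM: the d= signing domain


def aligned(auth_domain: str, from_domain: str) -> bool:
    # Simplification: strict alignment only (case-insensitive exact match).
    return auth_domain.lower() == from_domain.lower()


def dmarc_passes(from_domain: str, spf: AuthResult, dkim: Iterable[AuthResult]) -> bool:
    """DMARC passes if EITHER an aligned SPF pass OR an aligned DKIM pass exists."""
    spf_ok = spf.passed and aligned(spf.domain, from_domain)
    dkim_ok = any(sig.passed and aligned(sig.domain, from_domain) for sig in dkim)
    return spf_ok or dkim_ok


# SPF passes and is aligned; the only DKIM signature fails: DMARC still passes.
print(dmarc_passes("interfacefm.com",
                   AuthResult(True, "interfacefm.com"),
                   [AuthResult(False, "interfacefm.com")]))
```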

> DMARC is a red herring here. My original question wouldn't be relevant
> if the sending domain had an enforced DMARC policy
> (p=quarantine|reject), but they don't.

good, but there is still no guarantee that spf or dkim pass, or even 
that the arc seal gives a pass result

> Michael's suggestion is interesting. There is a github project
> allowing Levenshtein numbers to be calculated and used in SA, I will
> see if there is a way to apply it in this situation. Thanks to all for
> their input.

like spamassassin dkim_valid_ef is not dkim standard, want sid-milter 
results to be used in dmarc ?

there are some low-hanging fruits that should not be used at any time



Re: Catch subtly-different Reply-To domain

Posted by Dominic Raferd <do...@timedicer.co.uk>.
On 21/02/2021 20:09, Benny Pedersen wrote:
> On 2021-02-21 19:44, Dominic Raferd wrote:
>
>>> Presumably interfacefm.com has been hacked, but not to the extent that
>>> they can intercept incoming replies.
>>
>> I stand corrected; but as they specify p=none, the mail must still pass.
>
> in what way should it pass?
>
> dmarc tests spf and dkim, and opendmarc from github trunk validates arc 
> chains as well; there is no guarantee that anything passes
>
> only sendgrid made that mistake, sorry sendgrid

p=none is an instruction from the domain controller *not* to reject 
emails from their domain even when they fail DMARC testing. So the end 
result is that this mail should pass through DMARC testing.

DMARC is a red herring here. My original question wouldn't be relevant 
if the sending domain had an enforced DMARC policy 
(p=quarantine|reject), but they don't.

Michael's suggestion is interesting. There is a github project allowing 
Levenshtein numbers to be calculated and used in SA, I will see if there 
is a way to apply it in this situation. Thanks to all for their input.


Re: Catch subtly-different Reply-To domain

Posted by Benny Pedersen <me...@junc.eu>.
On 2021-02-21 19:44, Dominic Raferd wrote:

>> Presumably interfacefm.com has been hacked, but not to the extent that
>> they can intercept incoming replies.
> 
> I stand corrected; but as they specify p=none, the mail must still 
> pass.

in what way should it pass?

dmarc tests spf and dkim, and opendmarc from github trunk validates arc 
chains as well; there is no guarantee that anything passes

only sendgrid made that mistake, sorry sendgrid

Re: Catch subtly-different Reply-To domain

Posted by Dominic Raferd <do...@timedicer.co.uk>.
On 21/02/2021 17:37, RW wrote:
> On Sun, 21 Feb 2021 17:00:32 +0000
> Dominic Raferd wrote:
>
>> On 21/02/2021 16:20, Benny Pedersen wrote:
>>> On 2021-02-21 17:00, RW wrote:
>>>> On Sun, 21 Feb 2021 14:04:20 +0000
>>>> Dominic Raferd wrote:
>>>>   
>>>>> On 21/02/2021 13:56, RW wrote:
>>>>   
>>>>>>>> From: "Karen Howard" <ka...@interfacefm.com>
>>>>>>>> Reply-To: "Karen Howard" <ka...@intrefacefm.com>
>>>>   
>>>>> Yes this mail passed DMARC
>>>> How did it pass DMARC when it has the domain being spoofed in the
>>>> from header?
>>> both domains can have dmarc, but only from header is dmarc tested
>>>
>>> and dkim can sign reply-to
>> and interfacefm.com (like most domains) does not publish a DMARC
>> policy, so it must pass
> But it does:
>
> $ dig +short txt _dmarc.interfacefm.com
> "v=DMARC1; p=none; rua=mailto:postmaster@interfacefm.com"
>
> Presumably interfacefm.com has been hacked, but not to the extent that
> they can intercept incoming replies.

I stand corrected; but as they specify p=none, the mail must still pass.


Re: Catch subtly-different Reply-To domain

Posted by RW <rw...@googlemail.com>.
On Sun, 21 Feb 2021 17:00:32 +0000
Dominic Raferd wrote:

> On 21/02/2021 16:20, Benny Pedersen wrote:
> > On 2021-02-21 17:00, RW wrote:  
> >> On Sun, 21 Feb 2021 14:04:20 +0000
> >> Dominic Raferd wrote:
> >>  
> >>> On 21/02/2021 13:56, RW wrote:  
> >>  
> >>> >>> From: "Karen Howard" <ka...@interfacefm.com>
> >>> >>> Reply-To: "Karen Howard" <ka...@intrefacefm.com>  
> >>  
> >>> Yes this mail passed DMARC  
> >>
> >> How did it pass DMARC when it has the domain being spoofed in the
> >> from header?  
> >
> > both domains can have dmarc, but only from header is dmarc tested
> >
> > and dkim can sign reply-to  
> and interfacefm.com (like most domains) does not publish a DMARC
> policy, so it must pass


But it does:

$ dig +short txt _dmarc.interfacefm.com
"v=DMARC1; p=none; rua=mailto:postmaster@interfacefm.com"

Presumably interfacefm.com has been hacked, but not to the extent that
they can intercept incoming replies.
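
For anyone wanting to check programmatically whether a published policy is actually enforced, a DMARC TXT record like the one above can be split into its tags. This is only a sketch in Python: the function names are invented, and it ignores details such as the sp= and pct= tags.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a {tag: value} dictionary."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:  # skip empty or malformed fragments
            tags[name.strip().lower()] = value.strip()
    return tags


def is_enforced(record: str) -> bool:
    """True only for p=quarantine or p=reject; p=none asks for no action."""
    tags = parse_dmarc(record)
    return tags.get("v") == "DMARC1" and tags.get("p", "").lower() in {"quarantine", "reject"}


record = "v=DMARC1; p=none; rua=mailto:postmaster@interfacefm.com"
print(parse_dmarc(record)["p"])  # none
print(is_enforced(record))       # False
```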


Re: Catch subtly-different Reply-To domain

Posted by Dominic Raferd <do...@timedicer.co.uk>.
On 21/02/2021 16:20, Benny Pedersen wrote:
> On 2021-02-21 17:00, RW wrote:
>> On Sun, 21 Feb 2021 14:04:20 +0000
>> Dominic Raferd wrote:
>>
>>> On 21/02/2021 13:56, RW wrote:
>>
>>> >>> From: "Karen Howard" <ka...@interfacefm.com>
>>> >>> Reply-To: "Karen Howard" <ka...@intrefacefm.com>
>>
>>> Yes this mail passed DMARC
>>
>> How did it pass DMARC when it has the domain being spoofed in the from
>> header?
>
> both domains can have dmarc, but only from header is dmarc tested
>
> and dkim can sign reply-to
and interfacefm.com (like most domains) does not publish a DMARC policy, 
so it must pass

Re: Catch subtly-different Reply-To domain

Posted by Benny Pedersen <me...@junc.eu>.
On 2021-02-21 17:00, RW wrote:
> On Sun, 21 Feb 2021 14:04:20 +0000
> Dominic Raferd wrote:
> 
>> On 21/02/2021 13:56, RW wrote:
> 
>> >>> From: "Karen Howard" <ka...@interfacefm.com>
>> >>> Reply-To: "Karen Howard" <ka...@intrefacefm.com>
> 
>> Yes this mail passed DMARC
> 
> How did it pass DMARC when it has the domain being spoofed in the from
> header?

both domains can have dmarc, but only from header is dmarc tested

and dkim can sign reply-to

Re: Catch subtly-different Reply-To domain

Posted by RW <rw...@googlemail.com>.
On Sun, 21 Feb 2021 14:04:20 +0000
Dominic Raferd wrote:

> On 21/02/2021 13:56, RW wrote:

> >>> From: "Karen Howard" <ka...@interfacefm.com>
> >>> Reply-To: "Karen Howard" <ka...@intrefacefm.com>  
  
> Yes this mail passed DMARC

How did it pass DMARC when it has the domain being spoofed in the from
header?

Re: Catch subtly-different Reply-To domain

Posted by Dominic Raferd <do...@timedicer.co.uk>.
On 21/02/2021 13:56, RW wrote:
> On Sun, 21 Feb 2021 11:28:51 +0100
> Michael Storz wrote:
>
>> On 2021-02-20 08:58, Dominic Raferd wrote:
>>> Is there a rule to catch cases where the domain of the Reply-To
>>> header is a subtle variant of that in the From header? Take this
>>> (real) example from a phishing email sent yesterday:
>>>
>>> From: "Karen Howard" <ka...@interfacefm.com>
>>> Reply-To: "Karen Howard" <ka...@intrefacefm.com>
>> Use the "Damerau–Levenshtein distance" to calculate the similarity.
>> I have long wanted to try this, but never found the time.
> Did you have a particular use in mind for that? The example above doesn't
> seem all that useful as a phishing technique as it will fail DMARC.
>
> My suspicion is that they are trying to exploit mail systems that
> haven't yet adopted DMARC checking and that interfacefm.com was chosen
> for its SPF record:
>
> v=spf1 +a +mx +a:ns1.c57578.sgvps.net include:_spf.mailspamprotection.com
>
> There's no -all or ~all on the end.
Yes this mail passed DMARC and it is cases like this that I want to 
catch. 99% of domains have not implemented full DMARC with 
p=quarantine|reject, so one can't rely on it (although it has a valuable 
role).

Re: Catch subtly-different Reply-To domain

Posted by RW <rw...@googlemail.com>.
On Sun, 21 Feb 2021 11:28:51 +0100
Michael Storz wrote:

> On 2021-02-20 08:58, Dominic Raferd wrote:
> > Is there a rule to catch cases where the domain of the Reply-To
> > header is a subtle variant of that in the From header? Take this
> > (real) example from a phishing email sent yesterday:
> > 
> > From: "Karen Howard" <ka...@interfacefm.com>
> > Reply-To: "Karen Howard" <ka...@intrefacefm.com>

> Use the "Damerau–Levenshtein distance" to calculate the similarity. 
> I have long wanted to try this, but never found the time.

Did you have a particular use in mind for that? The example above doesn't
seem all that useful as a phishing technique as it will fail DMARC.

My suspicion is that they are trying to exploit mail systems that
haven't yet adopted DMARC checking and that interfacefm.com was chosen
for its SPF record:

v=spf1 +a +mx +a:ns1.c57578.sgvps.net include:_spf.mailspamprotection.com

There's no -all or ~all on the end.
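
The missing terminal mechanism can be spotted mechanically. The following Python sketch (the function name is an assumption, and it does not follow include: or redirect= targets) returns the qualifier on the record's "all" mechanism, or None when there is no "all" at all, in which case a sender that matches nothing gets a neutral result rather than a fail.

```python
def spf_all_qualifier(record: str):
    """Return '+', '-', '~' or '?' for the record's 'all' mechanism, else None."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    for term in terms[1:]:
        t = term.lower()
        qualifier = "+"  # the default qualifier when none is written
        if t[0] in "+-~?":
            qualifier, t = t[0], t[1:]
        if t == "all":
            return qualifier
    return None


record = "v=spf1 +a +mx +a:ns1.c57578.sgvps.net include:_spf.mailspamprotection.com"
print(spf_all_qualifier(record))            # None
print(spf_all_qualifier("v=spf1 mx -all"))  # -
```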


Re: Catch subtly-different Reply-To domain

Posted by Michael Storz <Mi...@lrz.de>.
On 2021-02-20 08:58, Dominic Raferd wrote:
> Is there a rule to catch cases where the domain of the Reply-To header
> is a subtle variant of that in the From header? Take this (real) example
> from a phishing email sent yesterday:
> 
> From: "Karen Howard" <ka...@interfacefm.com>
> Reply-To: "Karen Howard" <ka...@intrefacefm.com>
> 
> I realise that other elements of the address can be different without
> being a reliable spam indicator but I think that interfacefm.com ->
> intrefacefm.com are so similar and yet different that they should be
> worth a few points. But I can't think how to write such a rule myself.

Use the "Damerau–Levenshtein distance" to calculate the similarity. 
I have long wanted to try this, but never found the time.

Michael