Posted to users@spamassassin.apache.org by francis picabia <fp...@gmail.com> on 2014/10/27 19:01:50 UTC
Re: spamassassin rule to combat phishing
On Fri, Sep 19, 2014 at 2:59 PM, John Hardin <jh...@impsec.org> wrote:
> On Fri, 19 Sep 2014, francis picabia wrote:
>
> On Tue, Sep 16, 2014 at 5:27 PM, John Hardin <jh...@impsec.org> wrote:
>>
>> On Tue, 16 Sep 2014, francis picabia wrote:
>>>
>>> Hello,
>>>
>>>>
>>>> We just received the most authentic looking phishing I've seen. It was
>>>> professionally written, included a nice signature in the style used by
>>>> people at my workplace, and the target link was an exact replica of an
>>>> ezproxy website we run.
>>>>
>>>> The URL domain was only different by a few letters. I'm thinking we
>>>> will see more of these. So here is a question perhaps someone can
>>>> solve and many of us can benefit from...
>>>>
>>>> How can I make a uri rule which matches
>>>>
>>>> example.com.junk/
>>>> but does not match
>>>> example.com/
>>>>
>>>>
>>> uri URI_EXAMPLE_EXTRA m;^https?://(?:www\.)?example\.com[^/?];i
>>>
>>
>>
>> That's a great one liner. I'm glad I asked. Thank you for this.
>>
>
> Warning: I did not actually test it. Please test it before putting it into
> production.
>
>
Yes, understood. I did test and it seemed to work OK.
However another spoofed message was received today and the rule
did not capture it.
If I want to detect something in the form of:
random_server.example.com.junk
I need to wildcard the first bit. Would that be:
uri URI_EXAMPLE_EXTRA m;^https?://(?:.*\.)?example\.com[^/?];i
I don't understand what the question mark and colon do inside the ( ).
I thought a question mark followed an optional char or expression. Should
it be like this?
uri URI_EXAMPLE_EXTRA m;^https?://(.*\.)?example\.com[^/?];i
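On the (?: ) question, a small illustration may help. The sketch below uses Python's re module rather than Perl (an assumption made for convenience; the two behave the same for these constructs). It shows that both forms group, but only plain ( ) stores what it matched.

```python
import re

# Capturing group: the text matched by (...) is remembered, so \1 can
# refer back to it. This finds a chunk that repeats right after itself.
repeated = re.compile(r"(abc)\1")

# Non-capturing group: (?:...) only groups, so the trailing ? applies to
# the whole "www." chunk, but nothing is stored.
optional_www = re.compile(r"^https?://(?:www\.)?example\.com")

# Same pattern with a capturing group: it matches the same URLs, but the
# "www." text is also kept as group 1.
captured_www = re.compile(r"^https?://(www\.)?example\.com")

url = "http://www.example.com.junk/"
print(repeated.search("abcabcabcabc").group(0))  # abcabc
print(optional_www.match(url).groups())          # ()
print(captured_www.match(url).groups())          # ('www.',)
```

For a uri rule that only needs a yes/no answer, either form matches the same URLs; the non-capturing form just skips the bookkeeping.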
Re: spamassassin rule to combat phishing
Posted by francis picabia <fp...@gmail.com>.
On Wed, Oct 29, 2014 at 10:27 AM, francis picabia <fp...@gmail.com>
wrote:
> I've tested the rule:
>
> uri URI_MYDOMAIN_PHISH m;^https?://(?:[^./]+\.)*example\.com[^/?];i
>
>
> It is catching this sample newsletter link:
>
> Oct 29 09:38:50.368 [24608] dbg: rules: ran uri rule URI_MYDOMAIN_PHISH ======> got hit: "http://example.com&"
>
> Complete email body content in test of newsletter link:
>
> <a target="_blank"
> href="http://www.environmental-expert.com/redirectnewsletter_login.asp?UR=
> L=http://www.environmental-expert.com&loginemail=user@example.com&loginc=
> ode=123456&utm_source=Articles_Waste_Recycling_01112014&utm_medium=em=
> ail&utm_campaign=newsletters&utm_content=logoclick"><img
> src="http://www.environmental-expert.com/newsletter/images/logo_dark_smal=
> l.gif"
> width="200" height="83" border="0"></a>
>
>
> I wonder how the RE can be tweaked to not match this case?
> I still don't understand the ?: part.
>
I don't know if it is the best solution, but adding & to the negated
character class has helped with the false positive and still catches the
phishing example:
uri URI_MYDOMAIN_PHISH m;^https?://(?:[^./]+\.)*example\.com[^/?&];i
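A quick way to compare the two versions of the rule is to run both against the strings from this thread. This is a sketch in Python's re module, assumed here to behave like Perl's for this pattern. The port-number URL is a hypothetical extra case, not one reported in the thread.

```python
import re

# Rule before and after adding & to the negated character class.
before = re.compile(r"^https?://(?:[^./]+\.)*example\.com[^/?]", re.IGNORECASE)
after = re.compile(r"^https?://(?:[^./]+\.)*example\.com[^/?&]", re.IGNORECASE)

cases = [
    "http://example.com&",       # the hit reported in the debug trace
    "http://example.com.junk/",  # the spoofed form the rule should catch
    "http://example.com:8080/",  # hypothetical: legitimate host with a port
]

for uri in cases:
    hit_before = before.search(uri) is not None
    hit_after = after.search(uri) is not None
    print(f"{uri:28} before={hit_before} after={hit_after}")
```

The last case suggests the class may need other characters too (for example : or #) if such URLs show up in legitimate mail, so it is worth testing against a real corpus.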
Re: spamassassin rule to combat phishing
Posted by francis picabia <fp...@gmail.com>.
I've tested the rule:
uri URI_MYDOMAIN_PHISH m;^https?://(?:[^./]+\.)*example\.com[^/?];i
It is catching this sample newsletter link:
Oct 29 09:38:50.368 [24608] dbg: rules: ran uri rule URI_MYDOMAIN_PHISH ======> got hit: "http://example.com&"
Complete email body content in test of newsletter link:
<a target="_blank"
href="http://www.environmental-expert.com/redirectnewsletter_login.asp?UR=
L=http://www.environmental-expert.com&loginemail=user@example.com&loginc=
ode=123456&utm_source=Articles_Waste_Recycling_01112014&utm_medium=em=
ail&utm_campaign=newsletters&utm_content=logoclick"><img
src="http://www.environmental-expert.com/newsletter/images/logo_dark_smal=
l.gif"
width="200" height="83" border="0"></a>
I wonder how the RE can be tweaked to not match this case?
I still don't understand the ?: part.
Re: spamassassin rule to combat phishing
Posted by francis picabia <fp...@gmail.com>.
On Tue, Oct 28, 2014 at 11:47 AM, francis picabia <fp...@gmail.com>
wrote:
>
>
> On Mon, Oct 27, 2014 at 4:55 PM, John Hardin <jh...@impsec.org> wrote:
>
>> On Mon, 27 Oct 2014, francis picabia wrote:
>>
>> uri URI_EXAMPLE_EXTRA m;^https?://(?:www\.)?example\.com[^/?];i
>>>>>>
>>>>>
>>> However another spoofed message was received today and the rule
>>> did not capture it.
>>>
>>> If I want to detect something in the form of:
>>> random_server.example.com.junk
>>> I need to wildcard the first bit. Would that be:
>>>
>>> uri URI_EXAMPLE_EXTRA m;^https?://(?:.*\.)?example\.com[^/?];i
>>>
>>> I don't understand what the question mark and colon does inside the ( )
>>> I thought it followed an optional char or expression. Should it be
>>> like this?
>>>
>>> uri URI_EXAMPLE_EXTRA m;^https?://(.*\.)?example\.com[^/?];i
>>>
>>
>> (?:) means "group, don't remember the match". () remembers what's matched
>> for future use in the RE (e.g. to check for repeated strings like
>> "abcabcabcabc").
>>
>> Try this:
>>
>> uri URI_EXAMPLE_EXTRA m;^https?://(?:[^./]+\.)*example\.com[^/?];i
>>
>>
> Once again, thanks for the RE coding.
>
> I found a false positive it captured with my attempt at this :
>
> <a href="
> http://www.newslettersite.com/redirectnewsletter_login.asp?URL=http://www.secondsite.com/PYB/contact_us.asp&loginemail=user@example.com&logincode=123456&utm_source=Articles_Air_01112014&utm_medium=email&utm_campaign=newsletter&utm_content=contactus
> "
>
> I've tested your rule with that and it does not tag for the above.
> Great. Hopefully useful to others facing domain spoofs in phishing.
>
I thought this was a representative test case, but apparently
there is something triggering a false positive when the
email is a newsletter which embeds a user's email within URLs.
In the sample I've seen, there are 34 such possible links which may have
triggered the issue, but I don't know which.
I ran the quarantined sample through spamassassin -D and it shows:
Oct 28 16:24:01.391 [28945] dbg: rules: ran uri rule URI_MYDOMAIN_PHISH ======> got hit: "http://example.com&"
On prior lines in the trace I see other uri rules getting hits, but it
seems to be about different URLs. The entire body of the email is base64
encoded. Extracting that part and running base64 -d I am not finding
the hit described by SA trace.
This is my method:
zcat spam-jUVZBDml0wS5.gz | grep 'http://example.com'
So the URL is not in the non-base64 part.
zcat spam-jUVZBDml0wS5.gz > /tmp/spamfull
cp /tmp/spamfull /tmp/spam64
vi /tmp/spam64 (to remove headers)
base64 -d /tmp/spam64 | grep 'http://example.com'
(no matches)
Double checked with:
spamassassin -D -lint < /tmp/spamfull 2>&1 | grep http://example.com
nothing is output except the line above with URI_MYDOMAIN_PHISH.
Is there any suggestion on how to nail down where the match is happening?
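One way to narrow this down is to let a MIME parser undo the transfer encoding of every text part, then search the decoded text for the bare domain rather than the full 'http://example.com' string. The follow-up in this thread traced the hit to a link containing loginemail=user@example.com&..., where the domain is not preceded by http://. The sketch below uses Python's standard email package. The sample message and the name find_in_decoded_parts are made up for illustration.

```python
import base64
import re
from email import message_from_bytes, policy


def find_in_decoded_parts(raw_message, needle, context=40):
    """Yield (content_type, snippet) for each hit of needle in decoded text parts."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        # get_content() reverses base64 / quoted-printable and decodes the charset.
        text = part.get_content()
        for m in re.finditer(re.escape(needle), text, flags=re.IGNORECASE):
            start = max(m.start() - context, 0)
            yield part.get_content_type(), text[start:m.end() + context]


# Hypothetical stand-in for the quarantined message: one base64 HTML part.
html = ('<a href="http://news.invalid/r.asp?URL=http://news.invalid'
        '&loginemail=user@example.com&logincode=1">x</a>')
raw = (
    b"From: sender@news.invalid\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/html; charset="utf-8"\n'
    b"Content-Transfer-Encoding: base64\n"
    b"\n" + base64.encodebytes(html.encode("utf-8"))
)

hits = list(find_in_decoded_parts(raw, "example.com"))
for content_type, snippet in hits:
    print(content_type, "...", snippet)
```

To use it on a real message, read the uncompressed file in binary mode and pass the bytes in place of raw. The surrounding context in each snippet should show which link carries the domain.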
Re: spamassassin rule to combat phishing
Posted by francis picabia <fp...@gmail.com>.
On Mon, Oct 27, 2014 at 4:55 PM, John Hardin <jh...@impsec.org> wrote:
> On Mon, 27 Oct 2014, francis picabia wrote:
>
> uri URI_EXAMPLE_EXTRA m;^https?://(?:www\.)?example\.com[^/?];i
>>>>>
>>>>
>> However another spoofed message was received today and the rule
>> did not capture it.
>>
>> If I want to detect something in the form of:
>> random_server.example.com.junk
>> I need to wildcard the first bit. Would that be:
>>
>> uri URI_EXAMPLE_EXTRA m;^https?://(?:.*\.)?example\.com[^/?];i
>>
>> I don't understand what the question mark and colon does inside the ( )
>> I thought it followed an optional char or expression. Should it be
>> like this?
>>
>> uri URI_EXAMPLE_EXTRA m;^https?://(.*\.)?example\.com[^/?];i
>>
>
> (?:) means "group, don't remember the match". () remembers what's matched
> for future use in the RE (e.g. to check for repeated strings like
> "abcabcabcabc").
>
> Try this:
>
> uri URI_EXAMPLE_EXTRA m;^https?://(?:[^./]+\.)*example\.com[^/?];i
>
>
Once again, thanks for the RE coding.
I found a false positive that my own attempt at this captured:
<a href="
http://www.newslettersite.com/redirectnewsletter_login.asp?URL=http://www.secondsite.com/PYB/contact_us.asp&loginemail=user@example.com&logincode=123456&utm_source=Articles_Air_01112014&utm_medium=email&utm_campaign=newsletter&utm_content=contactus
"
I've tested your rule with that and it does not tag for the above.
Great. Hopefully useful to others facing domain spoofs in phishing.
Re: spamassassin rule to combat phishing
Posted by John Hardin <jh...@impsec.org>.
On Mon, 27 Oct 2014, francis picabia wrote:
>>>> uri URI_EXAMPLE_EXTRA m;^https?://(?:www\.)?example\.com[^/?];i
>
> However another spoofed message was received today and the rule
> did not capture it.
>
> If I want to detect something in the form of:
> random_server.example.com.junk
> I need to wildcard the first bit. Would that be:
>
> uri URI_EXAMPLE_EXTRA m;^https?://(?:.*\.)?example\.com[^/?];i
>
> I don't understand what the question mark and colon does inside the ( )
> I thought it followed an optional char or expression. Should it be
> like this?
>
> uri URI_EXAMPLE_EXTRA m;^https?://(.*\.)?example\.com[^/?];i
(?:) means "group, don't remember the match". () remembers what's matched
for future use in the RE (e.g. to check for repeated strings like
"abcabcabcabc").
Try this:
uri URI_EXAMPLE_EXTRA m;^https?://(?:[^./]+\.)*example\.com[^/?];i
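As a sanity check before putting the rule into production, the pattern can be tried against a handful of URLs. The sketch below translates it to Python's re module (an assumption; SpamAssassin itself evaluates Perl regexes against the URIs it extracts) and compares it with the earlier .* attempt. The last sample URL is hypothetical.

```python
import re

# Python translation of the suggested rule.
RULE = re.compile(r"^https?://(?:[^./]+\.)*example\.com[^/?]", re.IGNORECASE)
# The earlier wildcard attempt, for comparison.
WILDCARD = re.compile(r"^https?://(?:.*\.)?example\.com[^/?]", re.IGNORECASE)

samples = [
    "http://example.com/",                      # legitimate
    "http://www.example.com/login",             # legitimate
    "http://example.com.junk/",                 # spoof
    "http://random_server.example.com.junk/",   # spoof with an extra host label
    "http://other.invalid/a.example.com.junk",  # hypothetical: lookalike only in the path
]

for url in samples:
    rule_hit = bool(RULE.search(url))
    wildcard_hit = bool(WILDCARD.search(url))
    print(f"{url:42} rule={rule_hit} wildcard={wildcard_hit}")
```

Because [^./]+ cannot cross a slash, the repeated group stays inside the host name. The .* version can run past the host into the path, which is why it also fires on the last sample.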
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
...the Fates notice those who buy chainsaws...
-- www.darwinawards.com
-----------------------------------------------------------------------
4 days until Halloween