Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/06/03 13:21:58 UTC

[Bug 4386] New: New rule suggestion: detect mismatched URIs and onMouseOver

http://bugzilla.spamassassin.org/show_bug.cgi?id=4386

           Summary: New rule suggestion: detect mismatched URIs and
                    onMouseOver
           Product: Spamassassin
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Rules
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: jifl-sabugzilla@jifvik.org


Many spams are widely propagated phishing attempts. One common characteristic of
these is that they not infrequently contain a link whose real URI differs from
the URI it purports to point to, and/or contain things to disguise the URI.

So for example (based on a real one, but not verbatim):
<a href="http://12.98.176.54/billing.ebay.com"
onMouseOver="status='https://billing.ebay.com/'; return
true">http://billing.ebay.com/</a>

In this example SA could pick up on two things: SA could detect that the link
contents are themselves in the form of a URI ("http(s?)://" would do), and then
that the href in the link refers to a URL that differs from that URI. Secondly
the use of onMouseOver in a mail is probably a good indicator of a suspicious
e-mail, thus warranting some extra score.
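
For illustration only, the two checks described above can be sketched in Python
(SA itself is written in Perl, so this is not SA code). LinkAudit and
looks_mismatched are hypothetical names, and the sketch assumes the standard
library's html.parser is an acceptable stand-in for SA's own HTML handling.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkAudit(HTMLParser):
    """Collect href, anchor text and onmouseover presence for every <a>."""

    def __init__(self):
        super().__init__()
        self.links = []       # finished anchors
        self._current = None  # anchor currently being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)  # attribute names arrive lowercased
            self._current = {
                "href": attr.get("href") or "",
                "text": "",
                "onmouseover": "onmouseover" in attr,
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.links.append(self._current)
            self._current = None


def looks_mismatched(link):
    """True when the anchor text is itself a URI naming a different host."""
    text = link["text"].strip()
    if not text.lower().startswith(("http://", "https://")):
        return False
    return urlsplit(text).hostname != urlsplit(link["href"]).hostname


sample = ('<a href="http://12.98.176.54/billing.ebay.com" '
          'onMouseOver="status=\'https://billing.ebay.com/\'; return true">'
          'http://billing.ebay.com/</a>')
audit = LinkAudit()
audit.feed(sample)
audit.close()
for link in audit.links:
    print(link["href"], looks_mismatched(link), link["onmouseover"])
```

The sketch compares hosts only; whether that is strict enough, or too strict for
real ham, is exactly the kind of thing that would need testing against a corpus.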

Sometimes attempts are made to add unnecessary escape chars to obscure the URIs,
which SA does already have a rule to detect, but if this rule were added, it
would want to be able to operate on the decoded URIs.
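
As a rough sketch of what "operate on the decoded URIs" could mean, here is a
Python fragment that undoes percent-escapes before looking at the host.
decoded_host is a hypothetical helper and evil.example is a made-up host; this
is not how SA decodes URIs internally.

```python
from urllib.parse import unquote, urlsplit


def decoded_host(uri):
    """Host of a URI after undoing percent-escapes, lowercased."""
    # Decode first so that escaped letters in the host cannot hide it.
    # Caveat: decoding the whole URI can also turn %2F into "/", so a
    # stricter version would decode each component separately.
    host = urlsplit(unquote(uri)).hostname
    return host or ""


obscured = "http://%65%76%69%6c.example/login"
print(decoded_host(obscured))
```

Any mismatch test would then compare the decoded hosts rather than the raw
strings.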

So I suggest that, since SA can grok URIs, perhaps this could now be detected?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4386


dpoon@ocf.berkeley.edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dpoon@ocf.berkeley.edu







[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4386





------- Additional Comments From felicity@apache.org  2005-06-03 08:03 -------
Subject: Re:   New: New rule suggestion: detect mismatched URIs and onMouseOver

On Fri, Jun 03, 2005 at 04:21:58AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> So for example (based on a real one, but not verbatim):
> <a href="http://12.98.176.54/billing.ebay.com"
> onMouseOver="status='https://billing.ebay.com/'; return
> true">http://billing.ebay.com/</a>

3.1 already has a rule to flag this specific type of href setup.

> In this example SA could pick up on two things: SA could detect that the link
> contents are themselves in the form of a URI ("http(s?)://" would do), and then
> that the href in the link refers to a URL that differs from that URI. Secondly

It's not that simple.  This has been discussed numerous times already
on users@ and other tickets.  In short, testing shows that assuming
the anchor text URI and the href URI match in ham but not in spam is
completely not valid and FPs wildly.

> the use of onMouseOver in a mail is probably a good indicator of a suspicious
> e-mail, thus warranting some extra score.

Perhaps. I haven't tested any OMO (onMouseOver) rules.






[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4386





------- Additional Comments From jifl-sabugzilla@jifvik.org  2005-06-03 09:25 -------
> That's actually what I was talking about.  The FP rate is horrible.

I'm surprised, but I'll defer to real testing!

> I have 0 hits for my personal ham, but several FPs on my hamtrap mails:
> CNN, go.com, Simon & Schuster, etc.

Only 1 for me for 20000 personal and "commercial" ham mails over the past year -
which was an advert to renew virus scanner support which, albeit legitimate, was
wholly unsurprisingly classified as spam if you could see its content.

Maybe what you say might mean that the regexp should match an onMouseOver option
that also mentions "status". But I'm not sure that's enough. I see some spams
that obfuscate with calling javascript subroutines instead.
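
To make the difference concrete, here is a small Python sketch with two
hypothetical patterns (neither is taken from SA's rule set): one that fires on
any onmouseover attribute, and one that also requires "status" inside the
attribute value. The sample markup and the show() function name are invented.

```python
import re

# Hypothetical patterns, not taken from SpamAssassin's rules.
ANY_OMO = re.compile(r'\bonmouseover\s*=', re.IGNORECASE)
OMO_STATUS = re.compile(
    r'''\bonmouseover\s*=\s*(?:"[^"]*\bstatus\b[^"]*"|'[^']*\bstatus\b[^']*')''',
    re.IGNORECASE,
)

direct = ('<a href="http://x.example/" '
          'onMouseOver="status=\'https://billing.ebay.com/\'; return true">')
indirect = '<a href="http://x.example/" onmouseover="return show()">'

for name, html in [("direct", direct), ("indirect", indirect)]:
    print(name, bool(ANY_OMO.search(html)), bool(OMO_STATUS.search(html)))
```

The second sample shows the weakness: once the status change is hidden behind a
subroutine call, only the plain onmouseover match still fires.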

But even with just plain matching of onmouseover, it would probably be able to
confer some modest contribution to the score. After all, mails get 0.1 just for
being HTML! Just like many rules that risk FPs. So the goal must surely be low
FP, not 0 FP.

OOI by contrast, an archive of spams over just 5 months resulted in over 200
hits for "onmouseover".

But of course, more usefully, this would give some protection to the gullible.





[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4386





------- Additional Comments From dpoon@ocf.berkeley.edu  2005-06-03 06:05 -------
I'd like to suggest this regex as a starting point for a rule:

/\<\s*a\b[^>]*\bhref\s*=\s*"?http(s?:\/\/[^\s">]+)[^\s">]*\>\s*(?:<(?!\/a\b)[^>]*>)*\s*http(?!\1)/i
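
For anyone who wants to try the pattern outside SA, here is a Python port. One
observation while porting: as written, the pattern expects the closing ">" to
follow the href value directly, so an href with a closing quote or with further
attributes after it does not match. RELAXED below is a hypothetical variant, not
from this thread, that tolerates both; the sample strings are made up apart from
the one taken from the original report.

```python
import re

# The pattern above, ported as-is (Python needs no escaping of / < >).
AS_POSTED = re.compile(
    r'<\s*a\b[^>]*\bhref\s*=\s*"?http(s?://[^\s">]+)[^\s">]*>'
    r'\s*(?:<(?!/a\b)[^>]*>)*\s*http(?!\1)',
    re.IGNORECASE,
)

# Hypothetical relaxation: accept either quote style and let any further
# attributes follow the href value.
RELAXED = re.compile(
    r'''<\s*a\b[^>]*\bhref\s*=\s*["']?http(s?://[^\s"'>]+)[^>]*>'''
    r'''\s*(?:<(?!/a\b)[^>]*>)*\s*http(?!\1)''',
    re.IGNORECASE,
)

unquoted = '<a href=http://12.98.176.54/x>http://billing.ebay.com/</a>'
quoted = ('<a href="http://12.98.176.54/billing.ebay.com" '
          'onMouseOver="status=\'https://billing.ebay.com/\'; return true">'
          'http://billing.ebay.com/</a>')
honest = '<a href="http://example.com/">http://example.com/</a>'

for name, html in [("unquoted", unquoted), ("quoted", quoted),
                   ("honest", honest)]:
    print(name, bool(AS_POSTED.search(html)), bool(RELAXED.search(html)))
```

Both versions treat an href that is longer than the displayed URI as a
mismatch, since the lookahead compares against the whole captured href.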


The onmouseover status is harder to catch.  I've also seen phish attempts of the form
  <form action="http://realurl">
    <input type="submit" value="http://displayurl">
  </form>

To catch that case, I propose this rule:

/\<\s*form\b[^>]*\baction="?http(s?:\/\/[^\s">]+)[^\s">]*\>\s*(?:<(?!\/form\b)[^>]*>)*\s*\<\s*input\b[^>]*value\s*=\s*"?http(?!\1)/i
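
As an alternative to a regex, the same form check can be sketched with an HTML
parser. This is a different technique from the rule proposed above, written in
Python for illustration; FormAudit and host are hypothetical names, and the
hosts are the placeholders from the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def host(uri):
    """Hostname of a URI, or an empty string if it has none."""
    return urlsplit(uri).hostname or ""


class FormAudit(HTMLParser):
    """Flag forms whose submit button displays a URI on another host."""

    def __init__(self):
        super().__init__()
        self._action = None   # action of the form being read, if any
        self.suspicious = []  # (action, displayed value) pairs

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # attribute names arrive lowercased
        if tag == "form":
            self._action = attr.get("action") or ""
        elif tag == "input" and self._action is not None:
            value = (attr.get("value") or "").strip()
            is_submit = (attr.get("type") or "").lower() == "submit"
            shows_uri = value.lower().startswith(("http://", "https://"))
            if is_submit and shows_uri and host(value) != host(self._action):
                self.suspicious.append((self._action, value))

    def handle_endtag(self, tag):
        if tag == "form":
            self._action = None


sample = ('<form action="http://realurl">'
          '<input type="submit" value="http://displayurl">'
          '</form>')
audit = FormAudit()
audit.feed(sample)
audit.close()
print(audit.suspicious)
```

A parser copes with quoting and attribute order for free, at the cost of not
being expressible as a plain SA regex rule.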




[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4386





------- Additional Comments From tech2@i-is.com  2006-04-25 20:26 -------
Can someone put this in a sandbox for testing?




[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4386





------- Additional Comments From felicity@apache.org  2005-06-03 08:40 -------
Subject: Re:  New rule suggestion: detect mismatched URIs and onMouseOver

On Fri, Jun 03, 2005 at 08:22:39AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> but checking just the protocol/host part of the URI would probably be
> sufficient. The chance of FPs then would seem much smaller for legitimate ham.

That's actually what I was talking about.  The FP rate is horrible.

> I don't think it would be required to match the whole extract with the use of
> status etc. I think the presence of onmouseover anywhere is probably a
> sufficient indicator that this mail is irregular.

I have 0 hits for my personal ham, but several FPs on my hamtrap mails:
CNN, go.com, Simon & Schuster, etc.






[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4386





------- Additional Comments From kjetilho@ifi.uio.no  2006-03-10 01:17 -------
Created an attachment (id=3407)
 --> (http://issues.apache.org/SpamAssassin/attachment.cgi?id=3407&action=view)
plugin to test for URIs with mismatched anchor text and HREF

Perhaps this plugin is useful? It allows some local customisation, at least.




[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4386





------- Additional Comments From jm@jmason.org  2006-04-25 21:02 -------
fwiw, I'm not entirely sure if the sandbox stuff supports testing plugins right
now :(

it *may* do, since I added that "tryplugin" change, but I'm not sure.




[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4386





------- Additional Comments From schulz@adi.com  2006-03-10 14:07 -------
With the possible exception of the submitted plugin, this bug seems to be
a duplicate of bug 4255.




[Bug 4386] New rule suggestion: detect mismatched URIs and onMouseOver

Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4386





------- Additional Comments From jifl-sabugzilla@jifvik.org  2005-06-03 08:22 -------
Theo wrote:
> Jifl wrote:
> > So for example (based on a real one, but not verbatim):
> > <a href="http://12.98.176.54/billing.ebay.com"
> > onMouseOver="status='https://billing.ebay.com/'; return
> > true">http://billing.ebay.com/</a>
>
> 3.1 already has a rule to flag this specific type of href setup.

Cool! I'm "only" using the latest released version.

>> In this example SA could pick up on two things: SA could detect that the link
>> contents are themselves in the form of a URI ("http(s?)://" would do), and then
>> that the href in the link refers to a URL that differs from that URI. Secondly
> It's not that simple.  This has been discussed numerous times already
> on users@ and other tickets.  In short, testing shows that assuming
> the anchor text URI and the href URI match in ham but not in spam is
> completely not valid and FPs wildly.

I wasn't able to find any similar tickets in an earlier query before I submitted
this, and am not on the users list, sorry. So apologies if I'm repeating something,
but checking just the protocol/host part of the URI would probably be
sufficient. The chance of FPs then would seem much smaller for legitimate ham.
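
A minimal sketch of the protocol/host-only comparison, in Python for
illustration. origin and same_origin are hypothetical names, and
www.example.com is a made-up host.

```python
from urllib.parse import urlsplit


def origin(uri):
    """(scheme, host) of a URI, lowercased; path and query are ignored."""
    parts = urlsplit(uri.strip())
    return parts.scheme.lower(), (parts.hostname or "").lower()


def same_origin(href, anchor_text):
    """True when both URIs share scheme and host, whatever their paths."""
    return origin(href) == origin(anchor_text)


# Differing paths on the same host are not treated as a mismatch.
print(same_origin("http://www.example.com/news/1234?src=mail",
                  "http://www.example.com/"))
# The example from the report differs in host, so it is a mismatch.
print(same_origin("http://12.98.176.54/billing.ebay.com",
                  "http://billing.ebay.com/"))
```

Because the scheme is part of the comparison, an http href shown as https also
counts as a mismatch in this sketch.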

Derek wrote:
> The onmouseover status is harder to catch. 

I don't think it would be required to match the whole extract with the use of
status etc. I think the presence of onmouseover anywhere is probably a
sufficient indicator that this mail is irregular.

And thanks for the forms suggestion and regexps. It's difficult to imagine ham
that wanted to legitimately use such a construct.




