Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/02/24 06:24:31 UTC

[Bug 4150] New: Provide some form of associative list of URI and anchor text

http://bugzilla.spamassassin.org/show_bug.cgi?id=4150

           Summary: Provide some form of associative list of URI and anchor
                    text
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: enhancement
          Priority: P3
         Component: spamassassin
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: lwilton@earthlink.net


It would be helpful to be able to write rules (or evals, or whatever) that
compare a URI with its associated anchor text.  For instance, phish spam
commonly pairs a URI of http://<dotquad> with anchor text of
https://some.secure.site/Login.  Simply checking for http: in the URI against
https: in the anchor is a pretty good spam flag.  Other useful tests become
possible once the URI can be compared directly with the anchor.
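
As a rough illustration of the kind of test described above, here is a
minimal Python sketch.  It is not SpamAssassin code: the function name
compare_link and the returned flag names are hypothetical, chosen only for
this example, and a real rule would have to deal with obfuscated or
malformed URIs that this ignores.

```python
import re
from urllib.parse import urlsplit

# Matches a bare dotted-quad host such as 10.1.2.3 (no range checking).
DOTQUAD = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}$")


def compare_link(href, anchor_text):
    """Compare a link target with its visible anchor text.

    Hypothetical helper for illustration only. Returns a dict of flags;
    all flags are False when the anchor text does not itself look like
    an http(s) URL, since there is then nothing to compare.
    """
    flags = {"scheme_downgrade": False,
             "host_mismatch": False,
             "dotquad_target": False}

    target = urlsplit(href)
    shown = urlsplit(anchor_text.strip())
    if shown.scheme not in ("http", "https") or not shown.hostname:
        return flags

    # Anchor promises https, but the real target is plain http.
    flags["scheme_downgrade"] = (shown.scheme == "https"
                                 and target.scheme == "http")
    # Anchor shows one host, the link goes to another.
    flags["host_mismatch"] = target.hostname != shown.hostname
    # The real target is a raw IP address.
    flags["dotquad_target"] = bool(target.hostname
                                   and DOTQUAD.match(target.hostname))
    return flags


result = compare_link("http://10.1.2.3/login",
                      "https://some.secure.site/Login")
```

For the phish pattern from the report, all three flags in result come back
True; an anchor such as "Click here" yields all False because it is not a
URL.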

Currently this can (sometimes) be accomplished using rawbody and full tests.  
But having a dedicated method of comparing a known uri to the anchor text (with 
choice of raw or rendered) could likely improve both accuracy and efficiency.

According to Theo in bug 3976:

Yeah, we make both available separately right now, but there's no
correlation between the two pieces of data.  However, this suggestion
would be better serviced via another RFE ticket since it's not related
to the current one.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 4150] Provide some form of associative list of URI and anchor text



felicity@kluge.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Future                      |3.1.0




------- Additional Comments From felicity@kluge.net  2005-02-23 21:43 -------
I think this would be pretty trivial to implement for plugins/eval code.
Unless I'm missing something, it's about 3 lines of perl in HTML.pm. ;)




[Bug 4150] Provide some form of associative list of URI and anchor text



felicity@kluge.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED




------- Additional Comments From felicity@kluge.net  2005-02-24 11:25 -------
OK, code committed: r155227.

In the per-message HTML metadata, "anchor" holds the anchor text of each
link, in order.  "uri_anchor_index" is now a hash keyed by URI whose values
are arrays of indexes into "anchor", i.e.:

<a href="http://foo.com/">foo</a>
<a href="http://bar.com/">foo</a>
<a href="http://bar.com/">bar</a>

"anchor" is:

0: foo
1: foo
2: bar

uri_anchor_index is:

http://foo.com/ => [ 0 ],
http://bar.com/ => [ 1, 2 ]
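
To make the shape of the two structures concrete, the following Python
sketch rebuilds them from the three links above.  It only models the data
layout described in this comment; it is not the Perl committed in r155227,
and the class name AnchorIndexer is hypothetical.

```python
from html.parser import HTMLParser


class AnchorIndexer(HTMLParser):
    """Collect anchor texts and a URI -> anchor-index map (illustration only)."""

    def __init__(self):
        super().__init__()
        self.anchor = []            # anchor text of each link, in order
        self.uri_anchor_index = {}  # URI -> list of indexes into self.anchor
        self._href = None           # href of the <a> currently open, if any
        self._text = []             # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only text inside an <a href=...> element counts as anchor text.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchor.append("".join(self._text).strip())
            index = len(self.anchor) - 1
            self.uri_anchor_index.setdefault(self._href, []).append(index)
            self._href = None


sample = (
    '<a href="http://foo.com/">foo</a>\n'
    '<a href="http://bar.com/">foo</a>\n'
    '<a href="http://bar.com/">bar</a>\n'
)

indexer = AnchorIndexer()
indexer.feed(sample)
indexer.close()

# Look up every anchor text that points at one URI.
bar_texts = [indexer.anchor[i]
             for i in indexer.uri_anchor_index["http://bar.com/"]]
```

Here bar_texts ends up as ["foo", "bar"], which is the lookup a rule or eval
would perform to compare one URI against all of its anchor texts.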


