Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2005/02/24 06:24:31 UTC
[Bug 4150] New: Provide some form of associative list of URI and anchor text
http://bugzilla.spamassassin.org/show_bug.cgi?id=4150
Summary: Provide some form of associative list of URI and anchor text
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: Other
OS/Version: other
Status: NEW
Severity: enhancement
Priority: P3
Component: spamassassin
AssignedTo: dev@spamassassin.apache.org
ReportedBy: lwilton@earthlink.net
It would be helpful to be able to write rules (or evals, or whatever) that
could compare a URI with its associated anchor text. For instance, it is
common in phishing spam to see a URI of http://<dotquad> with an associated
anchor of https://some.secure.site/Login. Simply comparing the http: in the URI
to the https: in the anchor text is a pretty good spam flag. There are other
interesting tests that can be performed by checking the URI against the anchor.
Currently this can (sometimes) be accomplished using rawbody and full tests.
But having a dedicated method of comparing a known uri to the anchor text (with
choice of raw or rendered) could likely improve both accuracy and efficiency.
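As a rough illustration of the http:/https: check described above, here is a minimal Python sketch. It is not SpamAssassin code (rules and evals there are written in Perl), and the function name scheme_mismatch is hypothetical. It assumes the URI and its anchor text are already available as a pair, which is exactly what this ticket asks for.

```python
from urllib.parse import urlsplit


def scheme_mismatch(href: str, anchor_text: str) -> bool:
    """Return True when the anchor text is itself an http(s) URL whose
    scheme differs from the scheme of the real link target."""
    href_scheme = urlsplit(href.strip()).scheme.lower()
    text_scheme = urlsplit(anchor_text.strip()).scheme.lower()
    # Ordinary anchor text ("Click here") has no http(s) scheme, so
    # there is nothing to compare against.
    if text_scheme not in ("http", "https"):
        return False
    return href_scheme != text_scheme


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address standing in for <dotquad>.
    print(scheme_mismatch("http://192.0.2.1/", "https://some.secure.site/Login"))
```

A real rule would likely also compare hostnames, but the scheme comparison alone covers the case in the report.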
According to Theo in Bug=3976:
    Yeah, we make both available separately right now, but there's no
    correlation between the two pieces of data. However, this suggestion
    would be better serviced via another RFE ticket since it's not related
    to the current one.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
[Bug 4150] Provide some form of associative list of URI and anchor text
Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4150
felicity@kluge.net changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
Target Milestone|Future                      |3.1.0
------- Additional Comments From felicity@kluge.net 2005-02-23 21:43 -------
I think this would be pretty trivial to implement for plugins/eval code.
Unless I'm missing something, it's about 3 lines of perl in HTML.pm. ;)
[Bug 4150] Provide some form of associative list of URI and anchor text
Posted by bu...@bugzilla.spamassassin.org.
http://bugzilla.spamassassin.org/show_bug.cgi?id=4150
felicity@kluge.net changed:
            What|Removed                     |Added
----------------------------------------------------------------------------
          Status|NEW                         |RESOLVED
      Resolution|                            |FIXED
------- Additional Comments From felicity@kluge.net 2005-02-24 11:25 -------
ok, code committed. r155227
There's HTML metadata per message: "anchor" is a list of the anchor text, one
entry per link. "uri_anchor_index" is now a hash keyed by URI, whose value is
an array of indexes into "anchor". i.e.:
<a href="http://foo.com/">foo</a>
<a href="http://bar.com/">foo</a>
<a href="http://bar.com/">bar</a>
"anchor" is:
0: foo
1: foo
2: bar
uri_anchor_index is:
http://foo.com/ => [ 0 ],
http://bar.com/ => [ 1, 2 ]
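To make the shape of the two structures concrete, here is a small Python sketch that builds an equivalent "anchor" list and "uri_anchor_index" mapping from the three links above. This is only an illustration of the data layout, not the code committed in r155227 (that lives in SpamAssassin's Perl HTML.pm); the names AnchorIndexer and index_anchors are made up for the example.

```python
from html.parser import HTMLParser


class AnchorIndexer(HTMLParser):
    """Collect anchor text and map each href to indexes into that list."""

    def __init__(self):
        super().__init__()
        self.anchor = []            # anchor text, one entry per <a href=...>
        self.uri_anchor_index = {}  # href -> list of indexes into self.anchor
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        self.anchor.append("")
        index = len(self.anchor) - 1
        self.uri_anchor_index.setdefault(href, []).append(index)
        self._in_anchor = True

    def handle_data(self, data):
        # Accumulate text only while inside an <a href=...> element.
        if self._in_anchor:
            self.anchor[-1] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False


def index_anchors(html):
    parser = AnchorIndexer()
    parser.feed(html)
    parser.close()
    return parser.anchor, parser.uri_anchor_index


if __name__ == "__main__":
    sample = (
        '<a href="http://foo.com/">foo</a>\n'
        '<a href="http://bar.com/">foo</a>\n'
        '<a href="http://bar.com/">bar</a>\n'
    )
    anchor, uri_anchor_index = index_anchors(sample)
    # Look up every anchor text associated with one URI.
    print([anchor[i] for i in uri_anchor_index["http://bar.com/"]])
```

Keeping the anchor text in one list and storing only indexes per URI means a URI that appears several times still maps to all of its anchor texts without duplicating the keys.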