Posted to dev@spamassassin.apache.org by Marc Perkel <ma...@perkel.com> on 2004/04/04 17:34:25 UTC

Improving SpamCopURI

I've gotten a few false positives on this rule, though not many. And - I 
can't tell what link produced the false positive and what to do about it 
- which is a separate issue to address.

The false positives are political in nature - mostly anti-war or 
anti-Bush stuff.

However - many of these messages link to a variety of sites. From what I 
understand - the way the current rule works is if ANY link matches then 
the rule is triggered. Makes me wonder if there's a way to look at a 
situation where if most of the links are not positive - the rule isn't 
tripped.

Or - any other ideas or thoughts?

Wondering about a white link version of this. No specific ideas yet 
though. But pushing for more white rules.

Just trying to make a really good rule better.


Re: Improving SpamCopURI

Posted by Daniel Quinlan <qu...@pathname.com>.
Jeff Chan <je...@surbl.org> writes:

> At this point the best solution is probably for you to send me
> the FP domains to manually whitelist at:  whitelist at surbl dot org
> Please send any you find if you get a chance.

Note that political spam is not uncommon.  Some people may be
legitimately subscribed (or want the mail), but many people get spams
from political organizations, candidates, etc.
 
>> However - many of these messages link to a variety of sites. From what I 
>> understand - the way the current rule works is if ANY link matches then 
>> the rule is triggered. Makes me wonder if there's a way to look at a 
>> situation where if most of the links are not positive - the rule isn't 
>> tripped.

Spammers are already inserting fake and innocent links to throw off
checkers.

Daniel

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting

Re: Improving SpamCopURI

Posted by Jeff Chan <je...@surbl.org>.
On Monday, April 5, 2004, 7:36:47 AM, Marc Perkel wrote:
> Also - is there a way to feed back to the system new URIs for the list?
> A URI reporting system?

There is no way to report URIs directly to SURBL currently.  The
best way is to report them in spams to SpamCop.  It's indirect
but does the right thing if enough people do likewise and report
the same domain a few more times.

That said, I'm reworking the thresholding and retention system,
probably to make the threshold much lower for known spam domain
IPs and name servers, as Daniel Quinlan suggested.

After watching the data for a while I think a longer general
retention of say 10 days might be a good idea to catch reports
over more than a week.  For known spam gang domains/name
servers/IPs we could make the retention a whole lot longer.
And domains that get dozens to hundreds of reports should
probably also be watched a lot longer using a longer retention.
The domains that get reported the most probably deserve the most
attention through longer retention and perhaps a lower
inclusion threshold.

We would get external "known bad guys" data from other RBLs in
order to adjust thresholds and expirations, but the inclusion of
a domain in SURBL would still be triggered by SpamCop URI
reports.  But the trigger point would be lower for "bad guys".
This was a good suggestion from Daniel.
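
A minimal sketch of the threshold-and-retention idea described above,
in Python. Only the 10-day general retention comes from this message;
the other numbers, the policy names and the function name
domain_is_listed are assumptions made up for illustration, not SURBL's
actual settings or code.

```python
from datetime import datetime, timedelta

# Hypothetical policies. The 10-day general retention is the figure
# floated in the message; every other number is an assumption.
GENERAL = {"threshold": 10, "retention_days": 10}
KNOWN_BAD = {"threshold": 2, "retention_days": 60}


def domain_is_listed(report_times, now, known_bad=False):
    """Decide whether a domain is listed, given its report timestamps.

    Reports older than the retention window are ignored; the domain is
    listed when the remaining count reaches the inclusion threshold.
    """
    policy = KNOWN_BAD if known_bad else GENERAL
    cutoff = now - timedelta(days=policy["retention_days"])
    recent = [t for t in report_times if t >= cutoff]
    return len(recent) >= policy["threshold"]


now = datetime(2004, 4, 5)
# Three reports: 1, 8 and 20 days before "now".
reports = [now - timedelta(days=d) for d in (1, 8, 20)]

# General policy: only 2 reports fall inside 10 days, below 10.
print(domain_is_listed(reports, now))                  # False
# Known-bad policy: all 3 reports fall inside 60 days, at least 2.
print(domain_is_listed(reports, now, known_bad=True))  # True
```

The same reports list a domain under the "bad guys" policy but not under
the general one, which is the effect of a lower trigger point combined
with longer retention.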

Are there any RBLs that are widely regarded as good indicators of
spam gang/spamhaus IPs other than SBLs?

Also, can anyone help us set up (or know where we can set up)
a discussion forum for SURBL?  We'd like to use it as a "star
chamber" for anti-spam veterans to join us in judging incoming
spam domains reaching the threshold to decide whether they belong
to spammers or are a false alarm and should be whitelisted.
We could also have blacklist recommendations and other discussion
there.  At this point we may need the help a community could
bring to help run things with SURBL.

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org-nospam
http://www.surbl.org/


Re: Improving SpamCopURI

Posted by Loren Wilton <lw...@earthlink.net>.
> Also - is there a way to feed back to the system new URIs for the list? 
> A URI reporting system?

Sure - SpamCop!  If it makes it through it must need to be reported, nie?

        Loren


Re: Improving SpamCopURI

Posted by Marc Perkel <ma...@perkel.com>.
One thing that would be nice would be the ability for us to add our own 
URLs to our own private list. That way when something sneaks through we 
can add that and they go away - permanently.
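
One way such a private list could work, sketched in Python. This is not
an existing SpamCopURI or SURBL feature; the one-domain-per-line file
format, the function names and the example domains are all assumptions
for illustration.

```python
import io


def load_private_list(lines):
    """Parse one domain per line; skip blank lines and '#' comments."""
    domains = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip().lower()
        if entry:
            domains.add(entry)
    return domains


def is_blocked(domain, shared, private):
    """A domain is blocked if either the shared or the private list has it."""
    domain = domain.lower()
    return domain in shared or domain in private


# Stand-in for a local file; the contents are made up.
private_file = io.StringIO(
    "# domains that slipped through\n"
    "sneaky-example.test\n"
)
private = load_private_list(private_file)
shared = {"spam-example.test"}  # hypothetical shared list

print(is_blocked("sneaky-example.test", shared, private))  # True
print(is_blocked("example.org", shared, private))          # False
```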

Also - is there a way to feed back to the system new URIs for the list? 
A URI reporting system?


Re: Improving SpamCopURI

Posted by Jeff Chan <je...@surbl.org>.
On Sunday, April 4, 2004, 8:34:25 AM, Marc Perkel wrote:
> I've gotten a few false positives on this rule, though not many. And - I 
> can't tell what link produced the false positive and what to do about it 
> - which is a separate issue to address.

> The false positives are political in nature - mostly anti-war or 
> anti-bush stuff.

At this point the best solution is probably for you to send me
the FP domains to manually whitelist at:  whitelist at surbl dot org
Please send any you find if you get a chance.

> However - many of these messages link to a variety of sites. From what I 
> understand - the way the current rule works is if ANY link matches then 
> the rule is triggered. Makes me wonder if there's a way to look at a 
> situation where if most of the links are not positive - the rule isn't 
> tripped.

Yes, any one URI matched against a spam domain will trigger
the rule for a given message.  One could implement your suggested
solution by scoring links within a message, requiring a certain
threshold or making the final score an accumulation or average
of the individual link scores within a single message having
multiple links.

One issue with that might be that an average could reward Joe Jobs
or invisible links.  In other words, a spammer could beat an
averaging rule by having say 20 invisible links to legitimate
sites and one visible link to their spam site.   I'm sure other
people have other ideas on this....
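
The difference between the any-match rule and an averaging rule, and
the decoy-link weakness just described, can be sketched in Python. The
blocklist contents, the 0.5 threshold and the function names are
assumptions for illustration only; this is not how SpamCopURI itself is
implemented.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; not real SURBL data.
BLOCKLIST = {"spam-example.test"}


def host(uri):
    """Return the lowercased hostname of a URI, or '' if there is none."""
    return (urlparse(uri).hostname or "").lower()


def is_listed(uri, blocklist=BLOCKLIST):
    """True if the URI's host is a listed domain or a subdomain of one."""
    h = host(uri)
    return any(h == d or h.endswith("." + d) for d in blocklist)


def any_match(uris):
    """Rule as described in the thread: one listed URI triggers it."""
    return any(is_listed(u) for u in uris)


def listed_fraction(uris):
    """Fraction of the message's URIs that are listed (0.0 if none)."""
    if not uris:
        return 0.0
    return sum(is_listed(u) for u in uris) / len(uris)


def average_rule(uris, threshold=0.5):
    """Alternative: trigger only if enough of the links are listed."""
    return listed_fraction(uris) >= threshold


# 20 decoy links to legitimate sites plus one link to the spam site.
decoys = [f"http://legit{i}.example.org/" for i in range(20)]
message_uris = decoys + ["http://www.spam-example.test/buy"]

print(any_match(message_uris))     # True
print(average_rule(message_uris))  # False
```

With 20 decoys, only 1 of 21 links is listed, so the averaging rule
stays silent while the any-match rule still fires.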

Regarding white rules, the best one I can think of is whether
a message is signed by someone on your public key ring....
That should get the message an automatic final score of 0.
(Not sure why public key encryption has not taken off for mail.
It seems entirely logical and useful to me....)

Jeff C.
-- 
Jeff Chan
mailto:jeffc@surbl.org-nospam
http://www.surbl.org/