Posted to users@spamassassin.apache.org by Marc Kool <M....@vioro.nl> on 2004/07/20 15:09:47 UTC

Re: sex.surbl.org

David Hooton wrote:

snip

>>>How does anyone else feel about turning this list of sex sites
>>>into a SURBL?
>>>
>>>Jeff C.
>>
>>
>>+1 for me. Keeping it a separate list is a great idea. Gives admins the
>>choice. Especially ISPs who may only wish to tag spam, but allow customers
>>to look at the occasional naughty boom boom kissy kissy :)
>>
>>--Chris
> 
> 
> Has anyone got any FP stats on this data while using it in Squidguard?  
> 
> It looks like very useful data, but how is it managed?
> 
> Could be very interesting data to have a trial of at least.

Fabrice Prigent of the University of Toulouse maintains the database and told 
me that he has an automated mechanism to verify contributed domains and that
he verifies contributed domains himself in case of any doubt.

I am a contributor to the database, and my weekly scan for adult sites
produces anywhere between 500 and 5000 domains.  The set of scripts
that I wrote has been tuned for 18 months, and I have stopped verifying the
list of domains that it produces, since I have not seen false positives for a long time.
The scripts use a scoring method, and by checking the medium-score domains
I usually find a bunch of false negatives (adult sites not rated as adult by the scripts),
which are also contributed to the database.
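
To illustrate the idea of score-based triage (this is not the actual scripts, only
a minimal sketch): the keyword weights, the two thresholds and the function names
below are all assumptions made up for the example. High scores are accepted
automatically, medium scores go to a manual check, and the rest are dropped.

```python
# Hypothetical keyword weights; the real scripts' features are not described here.
WEIGHTS = {"xxx": 5, "porn": 5, "sex": 3, "adult": 2}

HIGH = 8    # assumed threshold: at or above this, contribute automatically
MEDIUM = 3  # assumed threshold: at or above this (but below HIGH), check by hand


def score(text):
    """Sum the weights of all listed keywords found in the page text."""
    return sum(WEIGHTS.get(word, 0) for word in text.lower().split())


def triage(pages):
    """Split {domain: page_text} into (accepted, review, rejected) domain lists."""
    accepted, review, rejected = [], [], []
    for domain, text in sorted(pages.items()):
        s = score(text)
        if s >= HIGH:
            accepted.append(domain)
        elif s >= MEDIUM:
            review.append(domain)   # medium band: where false negatives turn up
        else:
            rejected.append(domain)
    return accepted, review, rejected
```

Calling triage() on a dict of domain names and page texts returns the three lists;
only the middle list needs a human to look at it.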

In short: I believe the quality is very high, and new versions can be downloaded daily by FTP.

Off topic: those of you who have a mother language other than English or Dutch, and who have
computing power (1 recent Intel CPU), Unix and bandwidth (1 Mbit), can receive the scripts
to add your mother language and find adult sites in that language.

-Marc