Posted to users@spamassassin.apache.org by Giampaolo Tomassoni <g....@libero.it> on 2007/02/25 14:41:55 UTC

Using levenshtein distance to report spam

Dear all,

I'm currently using SA (through amavisd-new) + postfix + postgrey, and users' accounts are stored in a PostgreSQL database. I'm also using the DCC, Pyzor, Razor and SpamCop plugins.

I see that some spam is addressed to nonexistent mailboxes, perhaps in the hope that the MX is configured with catch-alls. I was thinking it could be interesting to detect mail going to nonexistent mailboxes and have it reported to Bayes, DCC, Pyzor, Razor and SpamCop.

Since even legitimate mail may be misdirected (e.g. when one of your users misconfigures the reply address in his/her MUA, or communicates a mailbox name to someone by voice), I wouldn't like to take the blunt approach of treating every e-mail addressed to a nonexistent recipient as spam. I would prefer a more heuristic approach: for example, handling as spam only e-mails to recipients whose mailbox name is at least a given Levenshtein distance (say 3?) from every alias name in that domain.
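To make the idea concrete, here is a minimal Python sketch of the distance check. It is only an illustration under my own assumptions: the function names `levenshtein` and `looks_like_spamtrap_hit`, the case-insensitive comparison, and the default threshold of 3 are hypothetical and not part of SA or amavisd-new.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def looks_like_spamtrap_hit(local_part, known_aliases, min_distance=3):
    """True when local_part is at least min_distance edits away from every alias.

    Assumption: names are compared case-insensitively. With an empty alias
    list every recipient counts as a hit.
    """
    local = local_part.lower()
    return all(levenshtein(local, alias.lower()) >= min_distance
               for alias in known_aliases)
```

With aliases `["giampaolo", "info"]`, a near miss such as `gianpaolo` (distance 1) would not be treated as spam, while a mailbox name sharing no characters with either alias would be.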

That said:

	1) has anybody set up anything like this?

	2) what might the unwanted side effects be?

	3) do you think this kind of feature would be useful at all?

Thanks,

-----------------------------------
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100

MAI inviare una e-mail a:
NEVER send an e-mail to:
 rainbowl@tomassoni.eu


RE: Using levenshtein distance to report spam

Posted by Giampaolo Tomassoni <g....@libero.it>.
From: Raul Dias [mailto:raul@dias.com.br]
> 
> You need a lot more bandwidth/memory/CPU power and disk I/O to do it.

Maybe I need more bandwidth, but why memory and CPU? I can spool the spam detected this way, and my spamtrap script can then despool and report it. Only one spamtrap script will ever be running.
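A rough Python sketch of that spool/despool arrangement follows. Everything in it is an assumption made for illustration: the names `spool_message` and `despool`, the `msg-*.eml` file naming, and the `report` callable, which stands in for whatever does the actual reporting (for instance something that pipes the message to `spamassassin -r`).

```python
import os
import tempfile
from pathlib import Path


def spool_message(spool_dir: Path, raw_message: bytes) -> Path:
    """Write one message under a temporary name, then rename it into place."""
    spool_dir.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=spool_dir, prefix="tmp-")
    with os.fdopen(fd, "wb") as handle:
        handle.write(raw_message)
    tmp_path = Path(tmp_name)
    final = tmp_path.with_name(tmp_path.name.replace("tmp-", "msg-", 1) + ".eml")
    os.rename(tmp_path, final)  # the despooler never sees half-written files
    return final


def despool(spool_dir: Path, report) -> int:
    """Pass each spooled message to report(), delete it, and return the count."""
    handled = 0
    for path in sorted(spool_dir.glob("msg-*.eml")):
        report(path.read_bytes())  # hypothetical reporting hook
        path.unlink()
        handled += 1
    return handled
```

Since a single despooler works through the messages one at a time, reporting never runs in parallel no matter how quickly messages are spooled.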

giampaolo


> 
> -Raul Dias
> 


Re: Using levenshtein distance to report spam

Posted by Raul Dias <ra...@dias.com.br>.
You need a lot more bandwidth/memory/CPU power and disk I/O to do it.

-Raul Dias