Posted to users@spamassassin.apache.org by Mike Spamassassin <sa...@ernstoff.net> on 2005/03/15 16:00:51 UTC

Is there such a test?

I have just received spam from <Esmeralda Bouchard> vgguvqpnvj@france.com.
Is there a test which identifies that the description (Esmeralda
Bouchard) bears no resemblance to the given sender's address?
Similarly I sometimes receive spam mail to my email address but with a
completely unrecognisable description.

Are there any tests to identify these discrepancies between the email
addresses and their descriptions?


Re: Is there such a test?

Posted by Yang Xiao <yx...@gmail.com>.
Explain how "Mike Spamassassin" describes sa@ernstoff.net. What's
the resemblance there?

Yang


On Tue, 15 Mar 2005 15:00:51 -0000 (GMT), Mike Spamassassin
<sa...@ernstoff.net> wrote:
> I have just received spam from <Esmeralda Bouchard> vgguvqpnvj@france.com.
> Is there a test which identifies that the description (Esmeralda
> Bouchard) bears no resemblance to the given sender's address?
> Similarly I sometimes receive spam mail to my email address but with a
> completely unrecognisable description.
> 
> Are there any tests to identify these discrepancies between the email
> addresses and their descriptions?
> 
>

Re: Is there such a test?

Posted by Kai Schaetzl <ma...@conactive.com>.
Mike Spamassassin wrote on Tue, 15 Mar 2005 16:24:20 -0000 (GMT):

> Point taken, but I still think it would be a valid test. 
> Like all SpamAssassin tests it should only be one of many indicators. 
> In particular all the ones that I receive I would expect to have "Mike" or 
> "Michael" in the description of my email address. 
> I would also like to be able to pick out those from "Microsoft Support" 
> which are not from microsoft.com and other typical phishing mails.
>

You can just create your own rules to do that. In general I doubt that this 
rule would do any good. Rules like "no real name" already exist, and when I 
look at them I see that they would generate a lot of false positives if 
treated as spammy.

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de & http://msie.winware.org




Re: Is there such a test?

Posted by Loren Wilton <lw...@earthlink.net>.
> I would also like to be able to pick out those from "Microsoft Support"
> which are not from microsoft.com and other typical phishing mails.

Now there you are on easier ground.  SARE has several rules to catch phish
that are based on this sort of thing.

        Loren
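
The "Microsoft Support" check discussed in this thread can be sketched outside
SpamAssassin. The Python below only illustrates the idea; it is not a SARE rule
and not SpamAssassin rule syntax. The BRAND_DOMAINS table, the function name
and the example.net address are assumptions made up for the sketch.

```python
from email.utils import parseaddr

# Hypothetical table: brand keyword -> domains allowed to use that keyword.
BRAND_DOMAINS = {"microsoft": {"microsoft.com"}}


def brand_mismatch(from_header):
    """Return True if the description names a brand but the address
    does not belong to one of that brand's domains."""
    name, addr = parseaddr(from_header)
    domain = addr.rpartition("@")[2].lower()
    name = name.lower()
    for brand, domains in BRAND_DOMAINS.items():
        if brand in name:
            # Accept the listed domain itself or any subdomain of it.
            if not any(domain == d or domain.endswith("." + d) for d in domains):
                return True
    return False


print(brand_mismatch('"Microsoft Support" <help@example.net>'))    # True
print(brand_mismatch('"Microsoft Support" <help@microsoft.com>'))  # False
```

A check like this only fires when a known brand appears in the description, so
it says nothing about ordinary senders such as "Esmeralda Bouchard".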


Re: Is there such a test?

Posted by Yang Xiao <yx...@gmail.com>.
Alright, I'm developing such a test.
For American/Anglo-Saxon names, it will do a random comparison with the
Webster Dictionary for FLast, FirstL, First.Last, Last.F, Last.First
and spell check them all.
For Indian names, it will search the Yahoo movie Database.
For French Names, we will append "Freedom Fries" then do search number 1.
For Chinese names we will address-rewrite everybody to
Bruce.Lee@Kungfu.com and pipe to /dev/null
.....


Problem solved!

Yang

Re: Is there such a test?

Posted by Keith Ivey <kc...@cpcug.org>.
Mike Spamassassin wrote:

> I'd take that bet.
> While you are almost certainly correct with the likes of those who
> subscribe to this group, who often have multiple email addresses,
> out there in name@isp land, and hotmail world, most people have a single
> email address strongly related to their name.

I'll kick in another $5 for the bet.  The Hotmail, AOL, Yahoo 
world is, if anything, more likely to have addresses that have 
nothing to do with names, because the username spaces are so 
full that people are forced to be creative and give themselves 
addresses related to their hobbies or favorite sports teams.

I think the person you're responding to was generous in 
expecting the test might hit 80% spam and 20% ham.  My bet is 
that it would be closer to 50%, assuming you're able to come up 
with a definition of "address is related to name".

-- 
Keith C. Ivey <kc...@cpcug.org>
Washington, DC

Re: Is there such a test?

Posted by Matt Kettler <mk...@evi-inc.com>.
Mike Spamassassin wrote:

>I'd take that bet.
>While you are almost certainly correct with the likes of those who
>subscribe to this group, who often have multiple email addresses,
>out there in name@isp land, and hotmail world, most people have a single
>email address strongly related to their name.
>  
>
Really? My yahoo address is shwalker.geo@yahoo.com. Not very well 
related to "Matt Kettler".

Most people *try* to have one strongly related to their name, but often 
those addresses are not available, so they fall back to something related 
to their interests. One friend of mine who is a skiing buff uses 
"doubleblack" as an address.  Others intentionally choose something 
unrelated to their name in order to gain privacy. Another friend 
intentionally uses the all-text spelling of his jersey number from high 
school sports as his email, i.e. "thirtyone".

I'd say this is actually much more common in the hotmail world than in 
the admin world. There's too much name collision for easy things like 
"mkettler" to be available on hotmail.

>Back to the original question:
>Regardless of whether anyone thinks it is a good test or not, has anyone
>yet created such a test?
>  
>

I doubt it. It's not exactly a straightforward thing to do. You'd need 
to have some kind of fuzzy-match algorithm, because there are so many 
different ways to convert a name to an email.

I use descriptors like:
"Matt" "Matt Kettler" "Matthew Kettler" and "Matthew E Kettler"

I use emails like:
matt@, mattk@, mkettler@, mekettler@, mattkettler@, mkettler73@, kettlerm@

Trying to map all of the above combinations to each other is kind of 
tough, particularly mappings like "Matt" to kettlerm@.
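
To make the difficulty concrete, here is a minimal Python sketch of one possible
approach: generate a handful of plausible usernames from the description and
see whether the address matches any of them. The pattern list and the function
names are assumptions chosen for illustration; this is not an existing
SpamAssassin test and it is far from a complete fuzzy matcher.

```python
import re


def candidates(display_name):
    """Generate plausible usernames from a description (illustrative patterns only)."""
    tokens = re.findall(r"[a-z]+", display_name.lower())
    out = set()
    if not tokens:
        return out
    first, last = tokens[0], tokens[-1]
    out.add(first)
    if len(tokens) > 1:
        out.add(last)
        # Initials of everything before the surname, e.g. "me" for "Matthew E".
        initials = "".join(t[0] for t in tokens[:-1])
        for sep in ("", ".", "_"):
            out.update({
                first + sep + last,      # mattkettler
                last + sep + first,      # kettlermatt
                first[0] + sep + last,   # mkettler
                first + sep + last[0],   # mattk
                last + sep + first[0],   # kettlerm
                initials + sep + last,   # mekettler
            })
    return out


def local_part_matches(display_name, address):
    """Return True if the part before '@' is one of the generated candidates."""
    local = address.partition("@")[0].lower()
    local = re.sub(r"\d+$", "", local)  # ignore trailing digits such as "73"
    return local in candidates(display_name)


print(local_part_matches("Matthew E Kettler", "mekettler@"))        # True
print(local_part_matches("Matt Kettler", "mkettler73@"))            # True
print(local_part_matches("Matt", "kettlerm@"))                      # False
print(local_part_matches("Matt Kettler", "shwalker.geo@yahoo.com")) # False
```

The last two lines show the limits: a description of just "Matt" cannot be
mapped to kettlerm@, and a legitimate address built from a hobby never matches
at all, so a rule based on this would hit real mail.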


Re: Is there such a test?

Posted by Mike Spamassassin <sa...@ernstoff.net>.
I'd take that bet.
While you are almost certainly correct with the likes of those who
subscribe to this group, who often have multiple email addresses,
out there in name@isp land, and hotmail world, most people have a single
email address strongly related to their name.

Back to the original question:
Regardless of whether anyone thinks it is a good test or not, has anyone
yet created such a test?

> Mike Spamassassin wrote:
>
>>Point taken, but I still think it would be a valid test.
>>Like all SpamAssassin tests it should only be one of many indicators.
>>
>
> No, not really. There's a minimum useful S/O ratio for spam rules.
>
> I'd bet $5.00 that this rule would have an S/O under 0.80 in the
> corpus (i.e. no more than 80% of its hits were spam, and at least 20% were ham).
>
>
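
For readers unfamiliar with the term, the S/O figure in the bet above can be
read as spam hits divided by all hits of a rule. The Python sketch below uses
that simple count-based reading, which is an assumption taken from the
parenthetical in the quoted message; the hit counts are made-up numbers, not
measurements from any corpus.

```python
def s_o_ratio(spam_hits, ham_hits):
    """Fraction of a rule's hits that landed on spam (spam / overall)."""
    total = spam_hits + ham_hits
    if total == 0:
        raise ValueError("rule never fired, S/O is undefined")
    return spam_hits / total


# Hypothetical counts: 80 spam hits and 20 ham hits.
print(s_o_ratio(80, 20))  # 0.8
# Hypothetical counts for a rule that hits spam and ham equally often.
print(s_o_ratio(50, 50))  # 0.5
```

A rule at 0.5 fires on ham as often as on spam, so it carries no useful signal
on its own.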



Re: Is there such a test?

Posted by Mike Spamassassin <sa...@ernstoff.net>.
Point taken, but I still think it would be a valid test.
Like all SpamAssassin tests it should only be one of many indicators.
In particular all the ones that I receive I would expect to have "Mike" or
"Michael" in the description of my email address.
I would also like to be able to pick out those from "Microsoft Support"
which are not from microsoft.com and other typical phishing mails.

> At 10:00 AM 3/15/2005, Mike Spamassassin wrote:
>>I have just received spam from <Esmeralda Bouchard>
>> vgguvqpnvj@france.com.
>>Is there a test which identifies that the description (Esmeralda
>>Bouchard) bears no resemblance to the given sender's address?
>
> No.. It's quite common for normal people to have that.
>
> For example, take a look at Theo Van Dinter's email address. The only
> letters in common between his name and his email username are t, i, and e.
> (The username part is "felicity", and the domain has no resemblance to his
> name either: "kludge".)
>
> And what about Paul Shupak, who uses "List Mail User" as a description,
> and "track" as a username?
>
> Or these other combinations from this mailing list (domains removed to
> reduce harvesting problems):
>
> "Ben Wylie"        sasssin@
> "Kai Schaetzl"     maillists@
> "Matt Yackley"     sare@
> "Matthias Keller"  linux@
>
>
>



Re: Is there such a test?

Posted by Kai Schaetzl <ma...@conactive.com>.
Oh, my second of fame :-)

Kai





Re: Is there such a test?

Posted by Matt Kettler <mk...@comcast.net>.
At 10:00 AM 3/15/2005, Mike Spamassassin wrote:
>I have just received spam from <Esmeralda Bouchard> vgguvqpnvj@france.com.
>Is there a test which identifies that the description (Esmeralda
>Bouchard) bears no resemblance to the given sender's address?

No.. It's quite common for normal people to have that.

For example, take a look at Theo Van Dinter's email address. The only 
letters in common between his name and his email username are t, i, and e. 
(The username part is "felicity", and the domain has no resemblance to his 
name either: "kludge".)

And what about Paul Shupak, who uses "List Mail User" as a description, and 
"track" as a username?

Or these other combinations from this mailing list (domains removed to 
reduce harvesting problems):

"Ben Wylie"        sasssin@
"Kai Schaetzl"     maillists@
"Matt Yackley"     sare@
"Matthias Keller"  linux@
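
The letter comparison in the Theo Van Dinter example can be reproduced with a
few lines of Python. This is only a sketch of the counting used in that
example; the function names are assumptions, and the names and usernames are
the ones quoted in this message.

```python
import re


def letters(text):
    """Set of distinct a-z letters in text, ignoring case, spaces and punctuation."""
    return set(re.sub(r"[^a-z]", "", text.lower()))


def shared_letters(display_name, username):
    """Letters that occur in both the description and the username."""
    return letters(display_name) & letters(username)


print(sorted(shared_letters("Theo Van Dinter", "felicity")))  # ['e', 'i', 't']
print(sorted(shared_letters("Kai Schaetzl", "maillists")))    # ['a', 'i', 'l', 's', 't']
```

Note that "Kai Schaetzl" and "maillists" share five letters even though the
username has nothing to do with the name, so raw letter overlap says little
either way.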



Re: Is there such a test?

Posted by Loren Wilton <lw...@earthlink.net>.
> I have just received spam from <Esmeralda Bouchard> vgguvqpnvj@france.com.
> Is there a test which identifies that the description (Esmeralda
> Bouchard) bears no resemblance to the given sender's address?

No.  Because there is no possible way of knowing that rat993@flub.net really
isn't "Johnny P. Spammer".

> Similarly I sometimes receive spam mail to my email address but with a
> completely unrecognisable description.

This one can be done on an individual basis, sometimes.  It relies on you
having a standard format for the stuff in quotes.  You have to allow for
friends that will put the stuff in quotes in a somewhat different form.  But
it can't be done (at this time, at least) as a standard test.

        Loren
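
The per-user variant described above can be sketched as a check on the To:
header. The Python below is an illustration only, not SpamAssassin rule syntax.
The EXPECTED_WORDS set comes from the "Mike" or "Michael" expectation stated
elsewhere in the thread; the function name and the user@example.org address are
placeholders invented for the sketch.

```python
import re
from email.utils import parseaddr

# Words the recipient expects to see in the description of his own address.
EXPECTED_WORDS = {"mike", "michael"}


def unexpected_recipient_name(to_header):
    """Return True if the To: description contains none of the expected words."""
    name, _addr = parseaddr(to_header)
    if not name:
        return False  # no description at all, so there is nothing to compare
    words = set(re.findall(r"[a-z]+", name.lower()))
    return not (words & EXPECTED_WORDS)


print(unexpected_recipient_name('"Mike E" <user@example.org>'))              # False
print(unexpected_recipient_name('"Esmeralda Bouchard" <user@example.org>'))  # True
print(unexpected_recipient_name('user@example.org'))                         # False
```

Matching on single words rather than on one fixed string is a small allowance
for friends who write the description in a somewhat different form.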