Posted to users@spamassassin.apache.org by Steve Prior <sp...@geekster.com> on 2004/02/27 21:53:47 UTC

How about "Confidentiality assured"

I just got a spam about diplomas which didn't get caught and
noticed the phrase "Confidentiality assured" in the email.
It seems to me like that phrase should be worth something,
I can't think of a lot of reasons why that would be in ham, but
sounds like something that would be found in drug, financial, and
educational spam.

Any thoughts?
Steve


Re: How about "Confidentiality assured"

Posted by Daniel Quinlan <qu...@pathname.com>.
Rich Puhek <rp...@etnsystems.com> writes:

> You may be on to something there:
> 
> body	T_CONFIDENTIALITY1	/Confidentiality\sassured/i
> body	T_CONFIDENTIALITY2	/\bConfidentiality\b/i

Seems like Bayes territory to me.

0.997         15          0 1077783028  Confidence
0.995          9          0 1077726419  confidant
0.989          4          0 1077930310  confide
0.989          4          0 1077919123  confidante
0.978          2          0 1077245378  CONFIDENCE
0.970         51          5 1077917889  Confidentiality
0.958          1          0 1077931164  self-confidence
0.958          1          0 1077931160  confident!
0.936          5          1 1077861218  Confidential
0.928        273         70 1077937447  confidentiality
0.889        309        127 1077939518  confidence
0.853        120         68 1077933499  confident
0.752         12         13 1077679452  CONFIDENTIAL
0.686        378        572 1077935350  confidential
0.400          1          5 1077704032  confidently
0.026          0          2 1077731615  CONFIDENTIALITY
0.007          0          8 1077824775  confidentially

The numbers are good for some, but not quite what you'd expect for all.

It does remind me that we should look into using 2+ token sequences
(which is on the developer wishlist).
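
To make the 2+ token idea concrete, here is a minimal Python sketch that emits single-word tokens plus adjacent two-word sequences, so that "Confidentiality assured" becomes a token in its own right. This is not SpamAssassin's tokenizer; the function name, the word pattern, and the sample text are all made up for illustration.

```python
import re

def tokens_with_bigrams(text):
    """Return single-word tokens followed by adjacent two-word sequences.

    Case is preserved, since the token dump above treats
    "Confidentiality" and "confidentiality" as different tokens.
    """
    # Hypothetical word pattern: letters, apostrophes, and hyphens only.
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", text)
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams

# Hypothetical sample text, not taken from a real message.
sample = "Confidentiality assured. Order your diploma today"
for token in tokens_with_bigrams(sample):
    print(token)
```

With tokens like these, a Bayes-style classifier could learn that the pair is spammy even when each word alone is ambiguous.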

Daniel

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting

Re: How about "Confidentiality assured"

Posted by Rich Puhek <rp...@etnsystems.com>.
Steve Prior wrote:

> I just got a spam about diplomas which didn't get caught and
> noticed the phrase "Confidentiality assured" in the email.
> It seems to me like that phrase should be worth something,
> I can't think of a lot of reasons why that would be in ham, but
> sounds like something that would be found in drug, financial, and
> educational spam.
> 
> Any thoughts?
> Steve

You may be on to something there:

body	T_CONFIDENTIALITY1	/Confidentiality\sassured/i
body	T_CONFIDENTIALITY2	/\bConfidentiality\b/i

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
   0.947   1.0302   0.0000    1.000   0.96    0.01  T_CONFIDENTIALITY2
   0.063   0.0687   0.0000    1.000   0.96    0.01  T_CONFIDENTIALITY1
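
For anyone reading the S/O column: as far as I understand it, it is the spam hit rate divided by the sum of the spam and ham hit rates, so a rule with no ham hits at all comes out at 1.000. A rough Python sketch under that assumption (the function name is made up; the percentages are the ones in the table above):

```python
def spam_over_overall(spam_pct, ham_pct):
    """S/O ratio, assuming S/O = SPAM% / (SPAM% + HAM%)."""
    total = spam_pct + ham_pct
    if total == 0:
        return None  # the rule hit nothing, so the ratio is undefined
    return spam_pct / total

# (name, SPAM%, HAM%) copied from the table above.
rows = [
    ("T_CONFIDENTIALITY2", 1.0302, 0.0000),
    ("T_CONFIDENTIALITY1", 0.0687, 0.0000),
]
for name, spam_pct, ham_pct in rows:
    print(f"{spam_over_overall(spam_pct, ham_pct):.3f}  {name}")
```

Note that an S/O of 1.000 says nothing about how often the rule fires: T_CONFIDENTIALITY1 is just as clean as T_CONFIDENTIALITY2 but hits far less spam.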

--Rich



Re[2]: How about "Confidentiality assured"

Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Loren, others,

Saturday, February 28, 2004, 12:20:37 AM, you wrote:

>> > I just got a spam about diplomas which didn't get caught and
>> > noticed the phrase "Confidentiality assured" in the email.

>> Well,  I can think of several reasons why confidential and
>> confidentiality etc may be used in legit emails.
>>
>> I receive many emails with confidentiality disclaimers attached
>> at the bottom. I'm sure you may have seen a few. Disclaimers
>> that say the contents of this email are confidential blah blah blah.

>> I know a rule based on those phrases would give me a huge
>> number of false positives.

LW> A rule looking for "confidential" would certainly cause problems.  I think I
LW> have yet to see one of those that includes "Confidentiality assured", or
LW> even "Confidentiality".  A rule for the first one is unlikely to cause a
LW> great deal of problem for anyone other than spammers, and even the second
LW> one could be helpful if given a fairly low score.  I believe someone ran
LW> those both through a corpus test, and they got very low, if any, ham hits.

Don't know if mine are the results you're talking about, but,
a) my "confidential" phrase rule does hit ham, and so I've had to be
careful with its score (1.584 of 9, equivalent to 0.880 of 5). I found NO
ham matching "confidentiality assured", and so can score that twice as
high.
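
The "1.584 of 9, equivalent to 0.880 of 5" conversion is a simple proportional rescale. The sketch below assumes that "of 9" and "of 5" refer to the required-score threshold (9 at this site, 5 being the stock default); the function name is made up for illustration.

```python
def rescale_score(score, from_threshold, to_threshold):
    """Scale a rule score proportionally from one spam threshold to another."""
    return score * to_threshold / from_threshold

# 1.584 on a threshold of 9 expressed on a threshold of 5
print(f"{rescale_score(1.584, 9, 5):.3f}")  # prints 0.880
```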

body      RM_bpn_Confidential    /(?:total(?:ly)?|VERY|strictly|high(?:est|ly)?|utmost) CONFIDEN(?:ce|T(?:AI|IA)L)/i
describe  RM_bpn_Confidential    says this is very confidential
score     RM_bpn_Confidential    1.584  # 409s/6h of 97268 corpus (79437s/17831h) 01/24/04
                                        # ham: membership list, survey confidentiality, 
body      RM_bpn_Confidential2   /\bconfidential(?:ity)? assured/i
describe  RM_bpn_Confidential2   says confidentiality is assured
score     RM_bpn_Confidential2   3.000  # 616s/0h of 106556 corpus (87320s/19236h) 02/27/04
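
A quick way to sanity-check the two patterns outside SpamAssassin is to run them through Python's re module, which accepts the same syntax for these particular constructs. This is only a sketch: the sample strings are invented, and real body rules run against SpamAssassin's rendered message text, which this does not reproduce.

```python
import re

# The two patterns from the rules above, with /i mapped to re.IGNORECASE.
CONFIDENTIAL = re.compile(
    r"(?:total(?:ly)?|VERY|strictly|high(?:est|ly)?|utmost) "
    r"CONFIDEN(?:ce|T(?:AI|IA)L)",
    re.IGNORECASE,
)
CONFIDENTIAL2 = re.compile(r"\bconfidential(?:ity)? assured", re.IGNORECASE)

# Hypothetical sample lines, not taken from any corpus.
samples = [
    "Get your diploma now. Confidentiality assured.",
    "All orders handled in strictly confidential manner",
    "The contents of this email are confidential.",
]
for text in samples:
    hit1 = bool(CONFIDENTIAL.search(text))
    hit2 = bool(CONFIDENTIAL2.search(text))
    print(f"Confidential={hit1!s:5}  Confidential2={hit2!s:5}  {text}")
```

The third sample stands in for a typical legal disclaimer; neither pattern fires on it because the first needs an intensifier such as "strictly" and the second needs "assured".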

Bob Menschel




Re: How about "Confidentiality assured"

Posted by Steve Prior <sp...@geekster.com>.
Yeah, I don't think either of these words by themselves would
indicate anything, but "Confidentiality assured" seems to be
a sure sign.

Steve

Loren Wilton wrote:

>>>I just got a spam about diplomas which didn't get caught and
>>>noticed the phrase "Confidentiality assured" in the email.
> 
> 
>>Well,  I can think of several reasons why confidential and
>>confidentiality etc may be used in legit emails.
>>
>>I receive many emails with confidentiality disclaimers attached
>>at the bottom. I'm sure you may have seen a few. Disclaimers
>>that say the contents of this email are confidential blah blah blah.
> 
> 
>>I know a rule based on those phrases would give me a huge
>>number of false positives.
> 
> 
> A rule looking for "confidential" would certainly cause problems.  I think I
> have yet to see one of those that includes "Confidentiality assured", or
> even "Confidentiality".  A rule for the first one is unlikely to cause a
> great deal of problem for anyone other than spammers, and even the second
> one could be helpful if given a fairly low score.  I believe someone ran
> those both through a corpus test, and they got very low, if any, ham hits.
> 
>         Loren


Re: How about "Confidentiality assured"

Posted by Loren Wilton <lw...@earthlink.net>.
> > I just got a spam about diplomas which didn't get caught and
> > noticed the phrase "Confidentiality assured" in the email.

> Well,  I can think of several reasons why confidential and
> confidentiality etc may be used in legit emails.
>
> I receive many emails with confidentiality disclaimers attached
> at the bottom. I'm sure you may have seen a few. Disclaimers
> that say the contents of this email are confidential blah blah blah.

> I know a rule based on those phrases would give me a huge
> number of false positives.

A rule looking for "confidential" would certainly cause problems.  I think I
have yet to see one of those that includes "Confidentiality assured", or
even "Confidentiality".  A rule for the first one is unlikely to cause a
great deal of problem for anyone other than spammers, and even the second
one could be helpful if given a fairly low score.  I believe someone ran
those both through a corpus test, and they got very low, if any, ham hits.

        Loren


Re: How about "Confidentiality assured"

Posted by John Hardin <jo...@aproposretail.com>.
On Fri, 2004-02-27 at 23:49, Mike McMullen wrote:
> I receive many emails with confidentiality disclaimers attached
> at the bottom. I'm sure you may have seen a few. Disclaimers
> that say the contents of this email are confidential blah blah blah.

...as if email were in any way a confidential communications medium.

Sheesh.

--
John Hardin  KA7OHZ                           
Internal Systems Administrator/Guru               voice: (425) 672-1304
Apropos Retail Management Systems, Inc.             fax: (425) 672-0192
-----------------------------------------------------------------------
  Failure to plan ahead on someone else's part does not constitute an
  emergency on my part.
                                  - David W. Barts in a.s.r
-----------------------------------------------------------------------
 Tomorrow: ICQ Corp goes away - have you installed Jabber yet?


Re: How about "Confidentiality assured"

Posted by Mike McMullen <ml...@loanprocessing.net>.

> I just got a spam about diplomas which didn't get caught and
> noticed the phrase "Confidentiality assured" in the email.
> It seems to me like that phrase should be worth something,
> I can't think of a lot of reasons why that would be in ham, but
> sounds like something that would be found in drug, financial, and
> educational spam.
> 
> Any thoughts?
> Steve
>

Well,  I can think of several reasons why confidential and 
confidentiality etc may be used in legit emails. 

I receive many emails with confidentiality disclaimers attached
at the bottom. I'm sure you may have seen a few. Disclaimers
that say the contents of this email are confidential blah blah blah.

When you deal with financial institutions such as retail and 
wholesale mortgage lenders and legal firms you see those types
of phrases all the time. 

I know a rule based on those phrases would give me a huge
number of false positives.

Of course, your mileage may vary. ;-)

Mike