Posted to users@spamassassin.apache.org by Gary Smith <ga...@primeexalia.com> on 2004/07/15 16:07:23 UTC

Spam oddity. How does this impact bayes.

This email seemed to trigger the spam filter the first time so I'll try
it again.

 

-----Original Message-----
From: Gary Smith [mailto:gary@primeexalia.com] 
Sent: Wednesday, July 14, 2004 11:56 PM
To: spamassassin-users@incubator.apache.org
Subject: [Suspected SPAM] Adult entertainment spam oddity. How does this
impact bayes.

 

While looking at some of the spam I have been receiving, I noticed that
jokes are being included in the messages.  What caught my attention is
that I first received this particular joke years ago.  My general
question is: how will this impact bayes?  The bulk of the language is
straightforward, and this message scored almost enough to be
auto-learned into bayes as spam.  Is it possible that this might lead to
some corruption/poisoning in bayes?

*  Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: auto-learn?
ham=0.1, spam=12, body-hits=11.273, head-hits=2.419
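
To make the numbers in that log line concrete, here is a minimal sketch
of the threshold check as I understand it from the debug output.  The
function name and the exact comparison operators are my assumptions, not
SpamAssassin's actual code; the thresholds (ham=0.1, spam=12) and the
recomputed score (11.82) are taken from the log further down.

```python
# Hypothetical helper: mirrors the "auto-learn?" decision seen in the log.
# Assumption: a message is learned as ham below the ham threshold, as spam
# at or above the spam threshold, and not learned at all in between.
def autolearn_decision(score, ham_threshold=0.1, spam_threshold=12.0):
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return None  # "inside auto-learn thresholds": nothing is learned


# The recomputed (non-bayes scoreset) score from the log was 11.82.
print(autolearn_decision(11.82))
```

With these values the message falls just short of the spam threshold,
so it is not auto-learned, which matches the "auto-learn? no" line in
the log.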

If you want the original email I'll forward it to you on request.

<SNIP>

A blonde began a job as an elementary school counselor and she was eager
to help. One day during recess she noticed a girl standing by herself on
one side of a playing field while the rest of the kids enjoyed a game of
soccer at the other. The blonde approached and asked if she was all
right. 

A little while later, however, Sandy noticed the girl was in the same
spot, still by herself. Approaching again, Sandy offered, "Would you
like me to be your friend?" 

The girl hesitated, then said, "Okay," looking at the woman
suspiciously. Feeling she was making progress, the blonde then asked,
"Why are you standing here all alone?" 

"Because," the little girl said with great exasperation, "I'm the
goalie!"

</SNIP>

Here is the long-winded debug output from maillog for this email:

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: logmsg: info: setuid to
filter succeeded 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: info: setuid to filter
succeeded 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug:
read_scoreonly_config: cannot open "/dev/nul/.spamassassin/user_prefs":
No such file or directory 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug: user has changed 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug: bayes: 12777
untie-ing 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug: bayes: 12777 tie-ing
to DB file R/O /etc/mail/spamassassin/bayes/bayes_toks 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug: bayes: 12777 tie-ing
to DB file R/O /etc/mail/spamassassin/bayes/bayes_seen 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug: bayes: found bayes db
version 2 

Jul 14 16:01:31 vjo-lxutil-07 spamd[12777]: debug: Score set 3 chosen. 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: is Net::DNS::Resolver
available? yes 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: running header regexp
tests; score so far=0 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: running body-text
per-line regexp tests; score so far=0.646 

Jul 14 16:01:32 vjo-lxutil-07 postfix/smtpd[12461]: disconnect from
unknown[202.104.237.157]

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes corpus size:
nspam = 16361, nham = 5214 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: uri tests: Done uriRE


Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for *p = "U*ThreeFootSquirter D*cibertig.com D*com" 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for *F = "U*ThreeFootSquirter D*cibertig.com D*com" 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for To = "U*gary.smith D*primeexalia.com D*com" 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for *x = "Mailer Software (rev. 01/15/2004)" 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for *m = " rqqakrazqscrxpce mail cibertig com " 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for MIME-version = "1.0" 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: tokenize: header
tokens for Content-type = "multipart/alternative;
boundary="opfnpwcmrkendsex"" 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'N:index_NN.gif' => 0.999359223300971 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'N:index_NN.jpg' => 0.999321585903084 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'ape' =>
0.999059063136456 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'grid' =>
0.998952380952381 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'index_02.jpg' => 0.998720221606648 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'index_01.gif' => 0.998720221606648 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'N:gtNNNN' => 0.998514469453376 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'gmi' =>
0.998514469453376 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'qkmn' =>
0.998514469453376 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'usbutton.gif' => 0.99846511627907 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'soccer'
=> 0.997810426540284 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'approached' => 0.996723404255319 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'sk:nlx1zro' => 0.996473282442748 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token
'N:sk:nlxNzro' => 0.996473282442748 

Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'p.gif'
=> 0.994296296296296 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'recess'
=> 0.993492957746479 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'H*x:2004' => 0.989768564317507 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'UD:jpg'
=> 0.970517578569698 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'HMIME-version:1.0' => 0.029744669657406 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'N:HMIME-version:N.N' => 0.029744669657406 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'UD:smith' => 0.968539175770508 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'Gushing'
=> 0.958 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'gary.smith' => 0.949626083151851 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'noticed'
=> 0.0515134762371545 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'UD:php'
=> 0.946967488045025 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'amazing'
=> 0.93664435426729 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'H*x:Mailer' => 0.931095311383256 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'UD:gif'
=> 0.925587029391136 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token
'elementary' => 0.912801322235964 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'eager'
=> 0.897984435510395 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'enjoyed'
=> 0.894223381225676 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'I'm' =>
0.126906076836342 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'Okay' =>
0.15036323315329 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes: score =
0.999937039949426 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes: 12777
untie-ing 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes: 12777
untie-ing db_toks 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes: 12777
untie-ing db_seen 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: Razor2 is not
available 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: running raw-body-text
per-line regexp tests; score so far=1.346 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: running uri tests;
score so far=3.119 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: uri tests: Done uriRE


Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: running full-text
regexp tests; score so far=11.919 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: Razor2 is not
available 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: Pyzor is not
available: pyzor not found 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: DCCifd is not
available: no r/w dccifd socket found. 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: DCC is not available:
no executable dccproc found. 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: all '*To' addrs:
gary.smith@primeexalia.com 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: DNS MX records found:
1 

Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: RBL: success for 1 of
1 queries 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: running meta tests;
score so far=11.919 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: auto-learn? ham=0.1,
spam=12, body-hits=11.273, head-hits=2.419 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: auto-learn: currently
using scoreset 3.  recomputing score based on scoreset 1. 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: Score set 1 chosen. 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: auto-learn: original
score: 11.919, recomputed score: 11.82 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: Score set 3 chosen. 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: auto-learn? no:
inside auto-learn thresholds 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: debug: is spam? score=17.319
required=4.8
tests=BAYES_99,HTML_MESSAGE,J_CHICKENPOX_45,SARE_BOUNDARY_LC,SARE_HTML_GIF_SHORT,SARE_HTML_P_BREAKcb,SARE_HTML_USL_A,SPAMCOP_URI_RBL,WS_URI_RBL

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: logmsg: identified spam
(17.3/4.8) for filter:120 in 2.0 seconds, 3337 bytes. 

Jul 14 16:01:34 vjo-lxutil-07 spamd[12777]: identified spam (17.3/4.8)
for filter:120 in 2.0 seconds, 3337 bytes.
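
For anyone who wants to see which words from the joke already carry a
high spam probability, the "bayes token" lines above can be pulled out
of the log with a short script.  This is only a rough sketch: the
function names and the 0.9 cutoff are my own choices, and it assumes the
log text is wrapped the way it appears in this email.

```python
import re

# Matches: bayes token '<token>' => <probability>
# The non-greedy group lets tokens that contain an apostrophe (e.g. I'm) through.
TOKEN_RE = re.compile(r"bayes token '(.+?)'\s*=>\s*([0-9.]+)")


def parse_bayes_tokens(log_text):
    """Return (token, probability) pairs from spamd debug output."""
    # Collapse all whitespace so entries wrapped across lines become one line.
    flat = " ".join(log_text.split())
    return [(tok, float(prob)) for tok, prob in TOKEN_RE.findall(flat)]


def tokens_above(tokens, words, cutoff=0.9):
    """Keep only the listed words whose probability is at or above the cutoff."""
    wanted = set(words)
    return [(t, p) for t, p in tokens if t in wanted and p >= cutoff]


# A few entries copied from the log above, including wrapped ones.
sample = """Jul 14 16:01:32 vjo-lxutil-07 spamd[12777]: debug: bayes token 'soccer'
=> 0.997810426540284
Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'noticed'
=> 0.0515134762371545
Jul 14 16:01:33 vjo-lxutil-07 spamd[12777]: debug: bayes token 'I'm' =>
0.126906076836342
"""

tokens = parse_bayes_tokens(sample)
for tok, prob in tokens_above(tokens, ["soccer", "recess", "noticed"]):
    print(tok, prob)
```

Run against the full log, a list of joke words such as soccer,
approached and recess would show which ordinary words the database
already treats as strong spam indicators.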