Posted to users@spamassassin.apache.org by Matías López Bergero <ml...@udesa.edu.ar> on 2005/02/11 17:48:18 UTC
different rules hitting with different users
Hi,
I just got a spam message in my inbox without it being flagged by SA, and I
find it very curious because it was a clear spam message. When I examined
the headers to see the score assigned by SA, I got even more surprised:
____
Content analysis details: (0.4 points, 5.0 required)
____
 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.2 HTML_MESSAGE           BODY: HTML included in message
 0.2 HTML_FONT_BIG          BODY: HTML tag for a big font size
WTF?
I copied the message and fed it to SA:
X-Spam-Status: No, score=2.4 required=5.0 tests=BAYES_99,HTML_80_90,
HTML_FONT_BIG,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MIME_QP_LONG_LINE
autolearn=no version=3.0.2
OK, here it hit the rules that it was supposed to hit, but it is still
getting a low score.
What can be causing these different scores?
I thought that maybe the user running SA was the problem.
That run was as root, so next I su'd to the user who originally received the
mail, and guess what...
X-Spam-Status: No, score=0.4 required=5.0 tests=HTML_FONT_BIG,HTML_MESSAGE
autolearn=no version=3.0.2
WTH?
I ran it as another user and got different results again:
X-Spam-Status: No, score=1.8 required=5.0 tests=DCC_CHECK,HTML_FONT_BIG,
HTML_MESSAGE autolearn=no version=3.0.2
What is going on here? Why are the matching rules different for each
user? Any ideas?
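One way to see exactly which rules differ between the runs is to treat each tests= list as a set and subtract. What follows is only a sketch in Python, not part of SpamAssassin: the helper name parse_tests is hypothetical, and the three headers are the ones quoted above.

```python
import re

def parse_tests(header):
    """Return the rule names listed after tests= in an X-Spam-Status header."""
    # The list may wrap across folded header lines, so allow whitespace after commas.
    match = re.search(r"tests=((?:\w+,\s*)*\w+)", header)
    if match is None:
        return set()
    names = {name.strip() for name in match.group(1).split(",")}
    # Treat a literal tests=none as an empty list.
    names.discard("none")
    return names

as_root = parse_tests("""X-Spam-Status: No, score=2.4 required=5.0 tests=BAYES_99,HTML_80_90,
 HTML_FONT_BIG,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MIME_QP_LONG_LINE
 autolearn=no version=3.0.2""")
as_recipient = parse_tests("""X-Spam-Status: No, score=0.4 required=5.0 tests=HTML_FONT_BIG,HTML_MESSAGE
 autolearn=no version=3.0.2""")
as_other_user = parse_tests("""X-Spam-Status: No, score=1.8 required=5.0 tests=DCC_CHECK,HTML_FONT_BIG,
 HTML_MESSAGE autolearn=no version=3.0.2""")

# Rules that fired for root but not for the recipient.
print(sorted(as_root - as_recipient))
# Rules that fired for the other user but not for the recipient.
print(sorted(as_other_user - as_recipient))
```

Seen this way, the recipient's run is missing the Bayes hit and three HTML/MIME rules compared with root, and the DCC network check compared with the other user.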
BR,
Matías.
Re: different rules hitting with different users
Posted by Matias Lopez Bergero <ml...@udesa.edu.ar>.
Robert Menschel wrote:
> Hello Matías,
>
> Friday, February 11, 2005, 1:35:19 PM, you wrote:
>
>
>>>It's also hitting BAYES_99, which means your global Bayes database
>>>(since you're running as root) thinks for sure this is NOT spam.
>>>You've got a problem with the emails you've been learning in the past.
>
>
> MLB> This is bad news.
> MLB> What could have happened? Can this only be caused by bad training?
> MLB> I have stored all the mail that I fed to sa-learn.
> MLB> What can I do to debug and solve this problem?
>
> The problem here is that you use the central Bayes database as root,
> but not as the user, so root's run of spamassassin is able to see
> it, but the user's is not.
I'm going to configure Bayes site-wide and see if I get a better
result.
> As for the other differences, I don't know why you're getting the
> different rule hits.
>
> If you save the "spamassassin -D" output from each of the three users
> to files, and use diff to compare them, what differences are there?
That's exactly what I did.
The only differences I got belong to the output of Razor or DCC.
[root@anubis tmp]# spamassassin -D < /tmp/raro 2>&1 | grep .cf > rootsacheck
[root@anubis tmp]# su - mlopezb
-bash-2.05b$ spamassassin -D < /tmp/raro 2>&1 | grep .cf > mlbsacheck
-bash-2.05b$ logout
[root@anubis tmp]# diff /tmp/rootsacheck ~mlopezb/mlbsacheck
43c43,44
< Feb 14 12:24:49.585575 check[31720]: [ 6] shock.cloudmark.com is a
Catalogue Server srl 5060; computed min_cf=6, Server se: C8
---
> Feb 14 12:25:10.727032 check[31797]: [ 6] pride.cloudmark.com is a
Catalogue Server srl 5060; computed min_cf=6, Server se: C8
> Feb 14 12:25:11.234550 check[31797]: [ 6] pride.cloudmark.com is a
Catalogue Server srl 5060; computed min_cf=6, debug: Using results from
Razor v2.67
49d49
< Feb 14 12:24:50.109628 check[31720]: [ 6] shock.cloudmark.com is a
Catalogue Server srl 5060; computed min_cf=6, Server se: C8
51c51
< Feb 14 12:24:50.713196 check[31720]: [ 6] mail 1.0 e=4
sig=5Zkyq938lxN0VuKPy1icYZcRSLMA: Is spam: cf 40 >= min_cf 6
---
> Feb 14 12:25:11.844318 check[31797]: [ 6] mail 1.0 e=4
sig=5Zkyq938lxN0VuKPy1icYZcRSLMA: Is spam: cf 40 >= min_cf 6
[root@anubis tmp]#
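Most of that diff is noise: every line carries its own timestamp and process ID, and the unquoted grep .cf pattern also matches strings such as min_cf, which is how the Razor lines got in. A rough Python sketch of stripping the timestamp and PID before comparing follows. The function names normalize and meaningful_diff are made up for illustration, and the prefix pattern is only assumed from the lines above.

```python
import difflib
import re

# Prefix such as "Feb 14 12:24:49.585575 check[31720]:" (format assumed from the output above).
PREFIX = re.compile(r"^[A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2}\.\d+ (\w+)\[\d+\]:")

def normalize(line):
    """Replace the timestamp and PID with just the program name."""
    return PREFIX.sub(r"\1:", line.rstrip("\n"))

def meaningful_diff(a_lines, b_lines):
    """Return the +/- lines that still differ once timestamps and PIDs are removed."""
    a = [normalize(line) for line in a_lines]
    b = [normalize(line) for line in b_lines]
    diff = list(difflib.unified_diff(a, b, lineterm="", n=0))
    # Skip the two file-header lines and the @@ hunk markers.
    return [d for d in diff[2:] if not d.startswith("@@")]

root_lines = [
    "Feb 14 12:24:49.585575 check[31720]: [ 6] shock.cloudmark.com is a Catalogue Server srl 5060; computed min_cf=6, Server se: C8",
    "Feb 14 12:24:50.713196 check[31720]: [ 6] mail 1.0 e=4 sig=5Zkyq938lxN0VuKPy1icYZcRSLMA: Is spam: cf 40 >= min_cf 6",
]
user_lines = [
    "Feb 14 12:25:10.727032 check[31797]: [ 6] pride.cloudmark.com is a Catalogue Server srl 5060; computed min_cf=6, Server se: C8",
    "Feb 14 12:25:11.844318 check[31797]: [ 6] mail 1.0 e=4 sig=5Zkyq938lxN0VuKPy1icYZcRSLMA: Is spam: cf 40 >= min_cf 6",
]

for line in meaningful_diff(root_lines, user_lines):
    print(line)
```

On these two pairs of lines, only the Razor server name (shock vs. pride) survives, which is consistent with both users loading the same rule files.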
Thanks a lot for your help Bob.
BR,
Matías.
Re[2]: different rules hitting with different users
Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Matías,
Friday, February 11, 2005, 1:35:19 PM, you wrote:
>> It's also hitting BAYES_99, which means your global Bayes database
>> (since you're running as root) thinks for sure this is NOT spam.
>> You've got a problem with the emails you've been learning in the past.
MLB> This is bad news.
MLB> What could have happened? Can this only be caused by bad training?
MLB> I have stored all the mail that I fed to sa-learn.
MLB> What can I do to debug and solve this problem?
Sorry -- I've been working too hard. BAYES_99 says "most definitely
this IS spam." Your Bayes database here is OK.
The problem here is that you use the central Bayes database as root,
but not as the user, so root's run of spamassassin is able to see
it, but the user's is not.
As for the other differences, I don't know why you're getting the
different rule hits.
If you save the "spamassassin -D" output from each of the three users
to files, and use diff to compare them, what differences are there?
Bob Menschel
Re: different rules hitting with different users
Posted by Matías López Bergero <ml...@udesa.edu.ar>.
Robert Menschel wrote:
> Hello Matías,
>
> Friday, February 11, 2005, 8:48:18 AM, you wrote:
>
> MLB> I just got a spam message in my inbox without it being flagged by SA, and I
> MLB> find it very curious because it was a clear spam message. When I examined
> MLB> the headers to see the score assigned by SA, I got even more surprised:
>
> MLB> Content analysis details: (0.4 points, 5.0 required)
> MLB> 0.2 HTML_MESSAGE BODY: HTML included in message
> MLB> 0.2 HTML_FONT_BIG BODY: HTML tag for a big font size
>
> MLB> I copied the message and fed it to SA:
> MLB> X-Spam-Status: No, score=2.4 required=5.0 tests=BAYES_99,HTML_80_90,
> MLB> HTML_FONT_BIG,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MIME_QP_LONG_LINE
> MLB> autolearn=no version=3.0.2
>
> MLB> OK, here it hit the rules that it was supposed to hit, but it is still
> MLB> getting a low score.
>
> It's also hitting BAYES_99, which means your global Bayes database
> (since you're running as root) thinks for sure this is NOT spam.
> You've got a problem with the emails you've been learning in the past.
This is bad news.
What could have happened? Can this only be caused by bad training?
I have stored all the mail that I fed to sa-learn.
What can I do to debug and solve this problem?
> MLB> What can be causing these different scores?
>
> MLB> I thought that maybe the user running SA was the problem.
> MLB> That run was as root, so next I su'd to the user who originally received the
> MLB> mail, and guess what...
>
> MLB> X-Spam-Status: No, score=0.4 required=5.0
> MLB> tests=HTML_FONT_BIG,HTML_MESSAGE
> MLB> autolearn=no version=3.0.2
>
> This suggests that maybe the user's user_prefs file is turning off
> some rules, or maybe you've got a configuration problem such that the
> user's emails aren't going against the same rules files.
>
> If you run spamassassin -D on this email from root, and from your two
> users, are the lists of *.cf rules files used the same?
Yes, exactly the same.
I have never touched those user preferences.
What could be happening?
I moved the SARE rules from the /usr/share/spamassassin dir to the
/etc/mail/spamassassin dir this morning, but I restarted spamd and the
checks were run after that, so if one user is affected, the other
one should be too. Besides, the spamassassin -D output shows
the same .cf files.
Thanks for your help Bob.
BR,
Matías.
Re: different rules hitting with different users
Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Matías,
Friday, February 11, 2005, 8:48:18 AM, you wrote:
MLB> I just got a spam message in my inbox without it being flagged by SA, and I
MLB> find it very curious because it was a clear spam message. When I examined
MLB> the headers to see the score assigned by SA, I got even more surprised:
MLB> Content analysis details: (0.4 points, 5.0 required)
MLB> 0.2 HTML_MESSAGE BODY: HTML included in message
MLB> 0.2 HTML_FONT_BIG BODY: HTML tag for a big font size
MLB> I copied the message and fed it to SA:
MLB> X-Spam-Status: No, score=2.4 required=5.0 tests=BAYES_99,HTML_80_90,
MLB> HTML_FONT_BIG,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MIME_QP_LONG_LINE
MLB> autolearn=no version=3.0.2
MLB> OK, here it hit the rules that it was supposed to hit, but it is still
MLB> getting a low score.
It's also hitting BAYES_99, which means your global Bayes database
(since you're running as root) thinks for sure this is NOT spam.
You've got a problem with the emails you've been learning in the past.
MLB> What can be causing these different scores?
MLB> I thought that maybe the user running SA was the problem.
MLB> That run was as root, so next I su'd to the user who originally received the
MLB> mail, and guess what...
MLB> X-Spam-Status: No, score=0.4 required=5.0
MLB> tests=HTML_FONT_BIG,HTML_MESSAGE
MLB> autolearn=no version=3.0.2
This suggests that maybe the user's user_prefs file is turning off
some rules, or maybe you've got a configuration problem such that the
user's emails aren't going against the same rules files.
If you run spamassassin -D on this email from root, and from your two
users, are the lists of *.cf rules files used the same?
Bob Menschel
Re: different rules hitting with different users
Posted by Matías López Bergero <ml...@udesa.edu.ar>.
Matt Kettler wrote:
> At 11:48 AM 2/11/2005, Matías López Bergero wrote:
>
>> I copied the message and fed it to SA:
>>
>> X-Spam-Status: No, score=2.4 required=5.0 tests=BAYES_99,HTML_80_90,
>> HTML_FONT_BIG,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MIME_QP_LONG_LINE
>> autolearn=no version=3.0.2
>>
>> OK, here it hit the rules that it was supposed to hit, but it is still
>> getting a low score.
>>
>> What can be causing these different scores?
>
>
> First, the big difference is BAYES_99 hitting for the one user, and no
> Bayes at all for the other. That's a pretty significant difference.
I think this is happening because of some access-denied problem with the db :-P
I'm seeing lots of error messages like this one in my mail log file:
spamd[848]: bayes: bayes db version 0 is not able to be used, aborting!
at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore/DBM.pm
line 160, <GEN1564> line 69.
Michael Parker thinks that this is a pretty good indication that SA
couldn't get a lock on the bayes db files.
http://article.gmane.org/gmane.mail.spam.spamassassin.general/63264
I'm running spamd as root and training Bayes as root, so the db in
root's home must be the problem.
I'm currently reading the SA wiki page on setting up site-wide Bayesian
filtering. I guess that is going to be my next move.
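Before switching to a site-wide database, it can help to confirm which accounts actually have a Bayes database they can read. Below is a small Python sketch under two assumptions that are not stated in this thread: that the DBM backend keeps its files as bayes_toks and bayes_seen under ~/.spamassassin (the usual default when bayes_path is not set), and that read access is a fair proxy for what SA can open. The function name bayes_files_readable is hypothetical.

```python
import os

# Assumed default file names of the DBM Bayes store; change these if bayes_path is set.
BAYES_FILES = ("bayes_toks", "bayes_seen")

def bayes_files_readable(home):
    """Map each assumed Bayes file under home/.spamassassin to True if the current user can read it."""
    state_dir = os.path.join(home, ".spamassassin")
    # os.access returns False both for missing files and for permission problems.
    return {name: os.access(os.path.join(state_dir, name), os.R_OK)
            for name in BAYES_FILES}

if __name__ == "__main__":
    # Check the invoking user's own home directory.
    for name, ok in bayes_files_readable(os.path.expanduser("~")).items():
        print(name, "readable" if ok else "missing or not readable")
```

Run as root and then as each user. A user whose files are missing has no trained data of their own, which would be consistent with a run that shows no BAYES_* hit at all.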
> The other hits would seem to indicate the messages are formatted
> differently. In particular MIME_QP_LONG_LINE suggests the message has
> been re-encoded.
>
> Define, exactly, what "copy the message and feed SA with it" means.
> Specifically, where are you copying from, how are you copying, and where
> are you copying to? Has the message been touched by any other servers,
> or a mail client in between?
I got the email via POP using the Mozilla mail client. It stores
messages in mbox format, so I copied the mail to a separate "folder"
in my Mozilla inbox, and copied that folder to my mail server using scp.
$ scp .mozilla/[...]/Inbox.sbd/SPAM.sbd/raro matias@mail:~
Then, on the mail server, I scanned the file as root and as the other
users, as I said.
$ spamassassin --mbox < raro
The differences in the scanning results appear on the mail
server, after the file had already been copied, so it's not a
truncated-message problem IMHO.
BR,
Matías.
Re: different rules hitting with different users
Posted by Matt Kettler <mk...@evi-inc.com>.
At 11:48 AM 2/11/2005, Matías López Bergero wrote:
>I copied the message and fed it to SA:
>
>X-Spam-Status: No, score=2.4 required=5.0 tests=BAYES_99,HTML_80_90,
> HTML_FONT_BIG,HTML_IMAGE_RATIO_08,HTML_MESSAGE,MIME_QP_LONG_LINE
> autolearn=no version=3.0.2
>
>OK, here it hit the rules that it was supposed to hit, but it is still
>getting a low score.
>
>What can be causing these different scores?
First, the big difference is BAYES_99 hitting for the one user, and no
Bayes at all for the other. That's a pretty significant difference.
The other hits would seem to indicate the messages are formatted
differently. In particular MIME_QP_LONG_LINE suggests the message has been
re-encoded.
Define, exactly, what "copy the message and feed SA with it" means.
Specifically, where are you copying from, how are you copying, and where
are you copying to? Has the message been touched by any other servers, or a
mail client in between?