Posted to users@spamassassin.apache.org by Jorge Valdes <jv...@intercom.com.sv> on 2007/03/28 19:01:44 UTC

Feature Request

Hi,

I have been using SA since ~2.60 and because I work for an ISP, I need 
to be more tolerant than most with regards to handling email. With this 
in mind I have made a few modifications to the STOCK version and have to 
manually patch with every upgrade, so here are some of the modifications 
I have made (patch included), in hopes they make it into the newer 
versions or help others:

1.- Mail::SpamAssassin::Client

Originally this really useful class only allowed for connection via TCP 
sockets. I have (borrowing code :)) modified it so that it also handles 
connection via Unix sockets.
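
A minimal sketch of the idea in Python (not the actual Perl patch): the client picks the socket family based on whether a Unix socket path was given. The function name and the example socket path are hypothetical; 783 is the usual spamd TCP port.

```python
import socket

def open_spamd_socket(unix_path=None, host="127.0.0.1", port=783):
    """Return an unconnected socket and the address it should connect to.

    If unix_path is given, a Unix domain socket is used; otherwise TCP.
    """
    if unix_path is not None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        return sock, unix_path
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    return sock, (host, port)

# Hypothetical usage; the path below is only an example:
# sock, addr = open_spamd_socket(unix_path="/var/run/spamd.sock")
# sock.connect(addr)
```

The rest of the client code can stay the same, since both kinds of socket are read and written the same way once connected.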

2.- Mail::SpamAssassin::Config

++ Added Option 'report_score'.

This option gives user prefs more flexibility in how SPAM is handled. 
SpamAssassin now decides whether to rewrite the message by checking the 
message score against 'report_score' instead of 'required_score', which 
gives more room for handling false positives/negatives, as illustrated 
with these settings:

rewrite_header Subject [SPAM][_SCORE_]
report_safe 1 (default)
required_score 5.0
report_score 7.0

Message with score < 5.0, message is ham:
 - normal processing
Message with 5.0 <= score < 7.0, message is spam:
 - only subject is modified to indicate this fact
Message with score >= 7.0, message is spam:
 - subject is modified to indicate this fact
 - message is rewritten as specified by report_safe
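
The two thresholds above can be sketched in Python as follows. This is only an illustration of the decision logic, not the patched Perl code; the function name and return shape are made up here, and it assumes a score exactly equal to report_score gets the full rewrite.

```python
def classify(score, required_score=5.0, report_score=7.0):
    """Return (is_spam, tag_subject, rewrite_message) for a message score."""
    is_spam = score >= required_score
    # Assumption: the message body is only rewritten for spam that also
    # reaches report_score.
    is_report = is_spam and score >= report_score
    return is_spam, is_spam, is_report

print(classify(3.2))   # ham: normal processing
print(classify(6.1))   # spam: only the subject is tagged
print(classify(9.4))   # spam: subject tagged and message rewritten
```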

3.- Mail::SpamAssassin::PerMsgStatus

++ Added Method is_report().

This can be used in the same manner as is_spam() method.

++ Added Method get_scores_of_tests_hit().

This can help with debugging by seeing the point score associated with 
the tests that hit.

++ Added Method get_report_score().

To retrieve the configured report_score setting.

++ Modified Precision

Generally, scores have three digit precision, but when reporting, 
sometimes the score is rounded to an integer or shown with only one 
digit precision. This can sometimes lead to confusion and rounding 
errors, so I modified reports to show three digit precision as the norm 
and use two digit precision when scores > 10, and integers only when 
scores > 100 (whitelist or GTUBE).
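
As a rough Python sketch of that rule (not the actual patch): it assumes "digit precision" means digits after the decimal point, and that the thresholds apply to the magnitude of the score so that large negative whitelist scores are treated like large positive ones. The function name is hypothetical.

```python
def format_score(score):
    """Format a score with fewer decimals as its magnitude grows."""
    magnitude = abs(score)  # assumption: negative scores follow the same rule
    if magnitude > 100:
        return f"{score:.0f}"   # integers only
    if magnitude > 10:
        return f"{score:.2f}"   # two decimals
    return f"{score:.3f}"       # three decimals as the norm

for s in (4.599, 12.3456, 1000.0):
    print(format_score(s))
```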

4.- spamd/spamc

Allowing user preferences in an ISP environment can be troublesome, 
especially when you have virtual users: there is no place to store each 
user's preferences unless you go the SQL route.  One of the most common 
changes users make is to raise/lower the thresholds for spam detection, 
so I have modified the source to allow the following additional options 
to spamc:

++  -m value   Use value as required_score instead of default.
++  -M value   Use value as report_score instead of default.

By allowing spamc to pass these values to a modified spamd that can 
understand and apply these configuration options on a per scan basis, 
any user (even virtual ones) can get treated differently with regards to 
required_score and report_score, without the need to read a 
configuration file. If specified, these values are passed to spamd as 
headers. This extends the current spamd protocol: the modified spamd 
understands these headers and modifies the child's SpamAssassin object 
accordingly.
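
To illustrate what such a request might look like on the wire, here is a small Python sketch that builds a spamd PROCESS request with two extra headers. The header names X-Required-Score and X-Report-Score are hypothetical placeholders (the names used by the patch are not given here), and a stock spamd would not act on them.

```python
def build_request(message, user=None, required_score=None, report_score=None):
    """Build a spamd PROCESS request as bytes, with optional score headers."""
    lines = ["PROCESS SPAMC/1.5", f"Content-length: {len(message)}"]
    if user is not None:
        lines.append(f"User: {user}")
    # Hypothetical header names, for illustration only.
    if required_score is not None:
        lines.append(f"X-Required-Score: {required_score}")
    if report_score is not None:
        lines.append(f"X-Report-Score: {report_score}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + message

msg = b"Subject: test\r\n\r\nhello\r\n"
print(build_request(msg, user="alice", required_score=4.0, report_score=8.0))
```

Headers that are not specified are simply left out, so the server would fall back to its configured defaults.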

How to modify calls to spamd so they use the correct spamc arguments is 
left as an exercise to the user...

5.- spamd

I also noticed that for those OSes that allow it, $0 is changed for each 
child in order to differentiate children from their parent.  In order to 
better monitor what each child is doing, I have modified spamd to also 
place the number of scans processed by each child, as well as its status, 
in $0, so that when monitored by 'top' we can see what each child is doing:

  PID PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7785 17   0 95932  80m 2844 S   20  8.5   0:43.73 spamd child:  42/75 processing a.b.c.d
 7787 16   0 86004  71m 2832 S    0  7.5   0:22.65 spamd child:  19/75 done
32447 16   0 83832  69m 2832 S    0  7.4   0:19.21 spamd child:  20/75 done
 8385 16   0 75108  60m 2776 S    0  6.4   0:03.07 spamd child:   5/75 done
 9100 19   0 74348  58m 2412 S    0  6.2   0:00.91 spamd child:   0/75 initialized


where a.b.c.d is the IP address of the machine that sent the message we 
need to "check"; it will be 127.0.0.1 when done via Unix socket.
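
The status string in the COMMAND column could be built along these lines. This Python sketch only formats the string; in spamd itself the result would be assigned to $0. The function name is hypothetical, and it assumes the number after the slash is the per-child scan limit.

```python
def child_title(scans_done, max_scans, status):
    """Format a per-child status line like the ones in the 'top' output."""
    # The scan count is right-aligned to three characters so columns line up.
    return f"spamd child: {scans_done:3d}/{max_scans} {status}"

print(child_title(42, 75, "processing a.b.c.d"))
print(child_title(0, 75, "initialized"))
```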

-- 
Jorge Valdes
jvaldes@intercom.com.sv



sa-learn question

Posted by "J." <sw...@yahoo.com>.
After about a year of running sa-learn as a user that isn't the user
that spamd is running under, I've changed things to allow login to the
spamd user account (qscand). So, will running these commands now work for
helping train my bayes files?

sa-learn --showdots --mbox --spam /home/domainmail/mail/Spam
sa-learn --showdots --mbox --ham /home/domainmail/mail/Ham

All mail for our domain is funneled into one user account and I'll move
false positives and false negs into the right pine folders every few
days.

Thanks.


 

Re: Feature Request

Posted by Loren Wilton <lw...@earthlink.net>.
> I have been using SA since ~2.60 and because I work for an ISP, I need
> to be more tolerant than most with regards to handling email. With this
> in mind I have made a few modifications to the STOCK version and have to
> manually patch with every upgrade, so here are some of the modifications
> I have made (patch included), in hopes they make it into the newer
> versions or help others:

Suggest you open one or more Bugzilla enhancement tickets for these patches. 
They will probably get lost in the noise here.  In Bugzilla they will at 
least be in the system.

        Loren