Posted to users@spamassassin.apache.org by Kyle Quillen <kq...@wifi7.com> on 2007/01/03 17:50:27 UTC

White Listing

Hello all,

I am looking for an easy way for my spamassassin to relearn messages
marked as spam that users would like to get.  Would it be safe, and avoid
bayesian poisoning, if I were to set up a mailbox such as
nonspam@domain.com, have users forward nonspam emails to this
address, and then learn them as ham?

Thanks 
Q 


RE: White Listing

Posted by Dan Horne <da...@taisweb.net>.
Below is a link to archived posts of mine explaining how we do this.
Basically, users forward the message as an attachment to an address that
feeds a script, which strips out the attachment and stores it.  A
separate cron job runs sa-learn on the stored messages.  The main script
could probably be modified to feed sa-learn directly, cutting out the
need for the cron job.

http://www.nabble.com/sa-learn-and-POP3-accounts-tf2424315.html#a6783285

Also, here are some notes I had regarding this:

> 1) user forwards spam message AS ATTACHMENT to a pre-defined email 
> address
> 

I tell my users to forward as attachment to report-spam@example.com.

> 2) postfix pipes emails to this address to the modified script via 
> local alias
> 

I am using virtual users.  I had to make sure that postfix knows how to
handle local aliases.  From my main.cf:

alias_maps = hash:/etc/aliases

...pointing to the local aliases file.  Then within that file, I set up
a local alias to pipe all input to the script.  From my /etc/aliases:

spam-bayes:     "| /etc/scripts/strip_attached_messages.pl"

... Be sure to run the command 'newaliases' after updating the aliases
file.  Then you use virtual_alias_maps to set the
"report-spam@example.com" address to forward to the alias you set up.  I
use MySQL for my virtual_alias_maps, but if you use a file it would have
something like:

report-spam@example.com		spam-bayes

That will forward all emails sent to report-spam@example.com to the
spam-bayes alias, which will in turn pipe them into your script.

> 3) the script strips out all attachments defined as content-type:
> message/*
> 
<...>
> 5) a separate cron script then runs on a schedule to pipe all messages
> in /tmp/spam into sa-learn and delete them afterwards
> 
> Need to setup the crontab to call this script

My cron script:
------------------------------------
#!/bin/sh

/usr/local/bin/sa-learn --spam --username=vscan /tmp/spam/
/bin/rm /tmp/spam/*
------------------------------------

I pass --username=vscan because I am using a single bayes database for
all mail, rather than individual bayes db's for each user.  This method
wouldn't work for individual bayes setups.  My crontab line:

53      1       *       *       *       root    /etc/scripts/train-bayes.sh

... To run it once per day at 1:53 am.  I get a nice email every morning
to root which says:

Learned tokens from 102 message(s) (102 message(s) examined)

The only thing to configure in the script is the path where you want the
attached messages stored until your sa-learn script runs.  I save mine
to /tmp/spam/, and that's where the train-bayes.sh script looks for
them.
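
For the original question (learning ham as well as spam), the same cron
approach could be extended along these lines.  This is only a sketch:
the /tmp/ham directory, the learn_dir function and the SA_LEARN variable
are my own placeholders, not part of the setup described above, and it
assumes sa-learn exits non-zero when learning fails.

```shell
#!/bin/sh
# Hypothetical variant of train-bayes.sh: learn spam and ham from separate
# spool directories and delete the stored messages only if sa-learn succeeds.
# SA_LEARN, SPAM_DIR and HAM_DIR are assumptions; adjust them to your setup.
SA_LEARN="${SA_LEARN:-/usr/local/bin/sa-learn}"
SPAM_DIR="${SPAM_DIR:-/tmp/spam}"
HAM_DIR="${HAM_DIR:-/tmp/ham}"

learn_dir() {
    # $1 = --spam or --ham, $2 = directory holding the stored .eml files
    kind="$1"
    dir="$2"
    # Nothing to do if the directory is missing or holds no messages.
    [ -d "$dir" ] || return 0
    set -- "$dir"/*.eml
    [ -e "$1" ] || return 0
    if "$SA_LEARN" "$kind" --username=vscan "$dir"; then
        rm -f "$dir"/*.eml
    else
        echo "sa-learn $kind failed for $dir; messages kept" >&2
        return 1
    fi
}

learn_dir --spam "$SPAM_DIR"
learn_dir --ham "$HAM_DIR"
```

Keeping the messages when sa-learn fails means a broken run can simply be
retried the next night.  A second alias and script copy writing to the
ham directory would be needed to fill it.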

Hope this helps.  It has been working very well for me so far.

-DH
 





Re: White Listing

Posted by Alexander Veit <li...@nezwerg.de>.
Nigel Frankcom wrote:
> Forwarding is not a good idea, it adds and or changes the headers in
> the mail.

Forward as attachment(s) could be a solution, since the original mail
headers are kept intact. I asked a similar question on this list some
days ago, but nobody could say whether there is a common practice for
feeding such messages into spamassassin on the server.

> There have been several systems discussed in the last few months using
> IMAP, it may be worth digging through the archives for them.

Sounds like misusing IMAP ;-)

-- 
Cheers,
Alex


Re: White Listing

Posted by Nigel Frankcom <ni...@blue-canoe.net>.
On Wed, 03 Jan 2007 11:50:27 -0500, Kyle Quillen <kq...@wifi7.com>
wrote:

>Hello all,
>
>I am looking for an easy way for my spamassassin to relearn messages
>marked as spam that users would like to get.  Would it be safe and avoid
>bayesian poisoning if I were to setup an email box such as
>nonspam@domain.com and have users forward nonspam emails to this email
>address and then learn it as ham?
>
>Thanks 
>Q 

Forwarding is not a good idea; it adds and/or changes the headers in
the mail.

There have been several systems discussed in the last few months using
IMAP; it may be worth digging through the archives for them.

There are specific methods of whitelisting particular addresses or
domains within SA.

http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#whitelist_and_blacklist_options

KR

Nigel

Re: White Listing

Posted by Theo Van Dinter <fe...@apache.org>.
On Wed, Jan 03, 2007 at 12:51:09PM -0800, Bret Miller wrote:
> There was a script posted a while back as an example of how you could
[...]
> my @message = <STDIN>;
[...]
> my $msg = Mail::SpamAssassin::Message->new(
>      {
>        'message' => \@message,
>      }

fwiw, Message will read from STDIN by default, so you can just call
Message->new() and it'll DTRT for you. :)

-- 
Randomly Selected Tagline:
"She taught me Cuban, which is a lot like Spanish only without as many
 words for luxury items." - Emo Philips

Re: White Listing

Posted by maillist <ma...@emailacs.com>.
Bret Miller wrote:
>> I am looking for an easy way for my spamassassin to relearn messages
>> marked as spam that users would like to get.  Would it be 
>> safe and avoid
>> bayesian poisoning if I were to setup an email box such as
>> nonspam@domain.com and have users forward nonspam emails to this email
>> address and then learn it as ham?
>>     
>
> There was a script posted a while back as an example of how you could
> detach "forward as attachment" messages into a folder for learning. I
> don't remember the author, but I'm reposting the script since it could
> be useful here. 
>
> WARNING: lines may wrap
> _________________________
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> my @message = <STDIN>;
> my $path = "/tmp/spam/";
>
> use Mail::SpamAssassin::Message;
> use Data::UUID;
>
> my $msg = Mail::SpamAssassin::Message->new(
>      {
>        'message' => \@message,
>      }
> ) || die "Message error?";
>
> foreach my $p ($msg->find_parts(qr/^message\b/i, 0)) {
>      eval {
>             no warnings ;
>             my $type = $p->{'type'};
>             my $ug = new Data::UUID;
>             my $uuid1 = $ug->create_str();
>             my $attachname = $path . $uuid1 . ".eml";
>             open(OUT, ">", $attachname) or die "Can't write file $attachname: $!";
>             binmode OUT;
>             print OUT $p->decode();
>      };
> }
> __END__
> ________________________________
>
>
>
>
>   
There is a script that ships with spamassassin called "mboxsplit"; it is
in the tools directory.  It breaks an mbox into files named 1, 2, 3, 4,
5, and so on.  It rocks.

-=Aubrey=-


RE: White Listing

Posted by Bret Miller <br...@wcg.org>.
> I am looking for an easy way for my spamassassin to relearn messages
> marked as spam that users would like to get.  Would it be 
> safe and avoid
> bayesian poisoning if I were to setup an email box such as
> nonspam@domain.com and have users forward nonspam emails to this email
> address and then learn it as ham?

There was a script posted a while back as an example of how you could
detach "forward as attachment" messages into a folder for learning. I
don't remember the author, but I'm reposting the script since it could
be useful here. 

WARNING: lines may wrap
_________________________

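#!/usr/bin/perl

use strict;
use warnings;

use Mail::SpamAssassin::Message;
use Data::UUID;

# Directory where detached messages wait for sa-learn.
my $path = "/tmp/spam/";

# Read the forwarded message from STDIN and parse it.
my @message = <STDIN>;
my $msg = Mail::SpamAssassin::Message->new(
     {
       'message' => \@message,
     }
) || die "Message error?";

my $ug = Data::UUID->new;

# Save every attached message (content-type message/*) under a unique name.
foreach my $p ($msg->find_parts(qr/^message\b/i, 0)) {
     eval {
            no warnings ;
            my $attachname = $path . $ug->create_str() . ".eml";
            # "or" rather than "||", so that die fires when open fails
            open(my $out, ">", $attachname)
                or die "Can't write file $attachname: $!";
            binmode $out;
            print $out $p->decode();
            close($out) or die "Can't close file $attachname: $!";
     };
     # Report a failed part instead of silently dropping it.
     warn $@ if $@;
}
__END__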
________________________________
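
One way to exercise the script above without a mail client is to build a
"forward as attachment" message by hand and pipe it in.  The helper
below is only a sketch: make_forward, the boundary string and the
addresses are made up for illustration, and the wrapper is a minimal
multipart/mixed message rather than what any particular mail client
produces.

```shell
#!/bin/sh
# make_forward FILE: print a multipart/mixed message that carries FILE as a
# message/rfc822 attachment, mimicking "forward as attachment".
# The addresses and the boundary string are placeholders.
make_forward() {
    boundary="=_forward_test_boundary"
    cat <<EOF
From: user@example.com
To: report-spam@example.com
Subject: Fwd: test
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="$boundary"

--$boundary
Content-Type: text/plain

Forwarded message attached.

--$boundary
Content-Type: message/rfc822
Content-Disposition: attachment

EOF
    # The original message goes in untouched, headers included.
    cat "$1"
    printf '\n--%s--\n' "$boundary"
}
```

Piping the output of "make_forward sample.eml" into the Perl script
(wherever you saved it) should leave one new .eml file in /tmp/spam/
whose headers match sample.eml.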