Posted to users@spamassassin.apache.org by Dan Horne <da...@taisweb.net> on 2006/10/11 17:41:25 UTC

sa-learn and POP3 accounts

I have a working SA install using Bayes.  My webmail users can "report
spam" that makes it into their inboxes, and those .eml files get copied
into a mailbox that gets regular scans by sa-learn.  No problem so far.
 
However most of my users are POP3 users.  I know that if they just
forward false-negatives to a spam mailbox it can potentially confuse the
Bayes db.  I've read that even forwarding as an attachment is no good.
Is there a solution available by which a POP3 user can report a message
as spam by some means and get that message properly into Bayes?  

Dan Horne
Web Services Administrator
TAIS / Wilcox Travel Agency
support@taisweb.net 

 

CONFIDENTIALITY NOTICE:
This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
 
SPAM-FREE 1.0(2476)

RE: sa-learn and POP3 accounts

Posted by Dan Horne <da...@taisweb.net>.

> -----Original Message-----
> From: Dan Horne [mailto:dan@taisweb.net] 
> Sent: Thursday, October 12, 2006 3:29 PM
> To: John DeYoung
> Cc: users@spamassassin.apache.org
> Subject: RE: sa-learn and POP3 accounts
> 
> Actually, based on another recent thread here ("Parsing 
> Email") I was able to take the perl script that Vincent Li 
> posted and modify it to work the way I want:

Oh yeah, before I forget, thanks to Vincent Li for posting the original
perl script that I (lightly) modified.  It was exactly what I needed.


RE: sa-learn and POP3 accounts

Posted by Dan Horne <da...@taisweb.net>.

Actually, based on another recent thread here ("Parsing Email") I was
able to take the perl script that Vincent Li posted and modify it to
work the way I want:

1) user forwards spam message AS ATTACHMENT to a pre-defined email
address
2) postfix pipes emails to this address to the modified script via local
alias
3) the script extracts all attachments with content-type:
message/*
4) it then generates a UUID for each one and saves the attached messages
to /tmp/spam as {UUID}.eml (each message gets a unique file name)
5) a separate cron script then runs on a schedule to pipe all messages
in /tmp/spam into sa-learn and delete them afterwards

This method has a couple points that I really like:
 1) it intelligently handles multiple attached messages
 2) it ignores (discards) messages that don't have any messages attached

I'm posting it here.  It uses perl and Mail::SpamAssassin::Message, as
Vincent Li's original script did, and Data::UUID to generate the unique
file names.  All you need to do is make sure those perl modules are
installed and change the $path variable to point to the location where
you want the attachments saved (and make sure everything has the
correct permissions, of course).

I'm not a perl guru, so I didn't want to take the chance of trying to
run sa-learn directly from this script, though it'd probably be pretty
easy to do (any hints?).  If anyone sees any problems with what I have
set up, I'd appreciate a nudge in the right direction.

WARNING: lines may wrap
_________________________

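#!/usr/bin/perl

use strict;
use warnings;

use Mail::SpamAssassin::Message;
use Data::UUID;

# Directory where the extracted messages are saved; change to suit.
my $path = "/tmp/spam/";

# Read the forwarded message from STDIN (piped in by the postfix alias).
my @message = <STDIN>;

my $msg = Mail::SpamAssassin::Message->new(
     {
       'message' => \@message,
     }
) || die "Message error?";

# One UUID generator is enough for all attachments.
my $ug = Data::UUID->new;

# Save every attached message (content-type message/*) under a unique name.
foreach my $p ($msg->find_parts(qr/^message\b/i, 0)) {
     eval {
            my $attachname = $path . $ug->create_str() . ".eml";
            # "or" rather than "||", so die actually fires if open fails
            open(my $out, ">", $attachname)
                or die "Can't write file $attachname: $!";
            binmode $out;
            print $out $p->decode();
            close($out) or die "Can't close file $attachname: $!";
            1;
     } or warn $@;   # report the failure, then move on to the next part
}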
__END__
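
The cron script from step 5 isn't included in this post.  A minimal
sketch of what it might look like, assuming a POSIX shell and sa-learn
on the PATH.  The function name learn_spam_dir and the SA_LEARN
override are made up for illustration (they are not part of
SpamAssassin), and /tmp/spam matches $path in the perl script:

```shell
#!/bin/sh
# Sketch of the step 5 cron job: feed each saved message to sa-learn as
# spam, then delete it. learn_spam_dir and SA_LEARN are hypothetical
# names used only in this example.

learn_spam_dir() {
    dir=$1
    # Allow the learner command to be overridden (handy for dry runs).
    learn_cmd=${SA_LEARN:-sa-learn}

    for f in "$dir"/*.eml; do
        # With no matches the glob stays literal; skip anything that
        # is not a regular file.
        [ -f "$f" ] || continue
        # Delete only after the learner exits successfully, so a failed
        # run leaves the message in place for the next pass.
        if "$learn_cmd" --spam "$f" >/dev/null; then
            rm -f -- "$f"
        fi
    done
}

# Directory may be given as the first argument; defaults to /tmp/spam.
learn_spam_dir "${1:-/tmp/spam}"
```

It would probably need to run from the crontab of whichever user owns
the Bayes database, so that sa-learn updates the same db that SA reads
when scanning.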
________________________________

	From: John DeYoung [mailto:john@techsuperpowers.com] 
	Sent: Thursday, October 12, 2006 1:16 PM
	To: Dan Horne
	Cc: users@spamassassin.apache.org
	Subject: Re: sa-learn and POP3 accounts
	
	we're in the same boat - our solution is to recommend that our
users store copies of messages on the server for 1-7 days; then, when a
message isn't caught, it's still sitting in their inbox, which they can
noodle with via webmail.

	we run per-user, and use an IMAP folder for junkmail, so it's
never seen by POP3 users, but it's always there via webmail.  it's
probably pretty common, since i managed to think of it on my own...

	i suppose if you're running site-wide, you could have users
redirect to whichever account you've set up to receive low-scored spam.
someone who knows more might point out some flaw in there somewhere, but
it seems plausible to me, at least.

	best,
	-john.
	-- 

	John DeYoung

	Tech Superpowers, Inc.

	



Re: sa-learn and POP3 accounts

Posted by John DeYoung <jo...@techsuperpowers.com>.

On Oct 11, 2006, at 11:41 AM, Dan Horne wrote:

> I have a working SA install using Bayes.  My webmail users can  
> "report spam" that makes it into their inboxes, and those .eml  
> files get copied into a mailbox that gets regular scans by sa- 
> learn.  No problem so far.
>
> However most of my users are POP3 users.  I know that if they just  
> forward false-negatives to a spam mailbox it can potentially  
> confuse the Bayes db.  I've read that even forwarding as an  
> attachment is no good.  Is there a solution available by which a  
> POP3 user can report a message as spam by some means and get that  
> message properly into Bayes?

we're in the same boat - our solution is to recommend that our users  
store copies of messages on the server for 1-7 days; then, when a  
message isn't caught, it's still sitting in their inbox, which they  
can noodle with via webmail.

we run per-user, and use an IMAP folder for junkmail, so it's never  
seen by POP3 users, but it's always there via webmail.  it's probably  
pretty common, since i managed to think of it on my own...

i suppose if you're running site-wide, you could have users redirect  
to whichever account you've set up to receive low-scored spam.   
someone who knows more might point out some flaw in there somewhere,  
but it seems plausible to me, at least.

best,
-john.
-- 
John DeYoung
Tech Superpowers, Inc.