Posted to users@spamassassin.apache.org by Robert Nicholson <ro...@elastica.com> on 2006/07/30 18:46:44 UTC

Retagging false positives?

So what is the best strategy to retag a false positive when the
server uses IMAP Maildir folders?


Re: Retagging false positives?

Posted by John Andersen <js...@pen.homeip.net>.
On Sunday 30 July 2006 22:37, Logan Shaw wrote:
> On Sun, 30 Jul 2006, Loren Wilton wrote:
> > If you know how to run SA to relearn the message, why not just use SA to
> > strip the headers off the message?  It certainly knows how to do that,
> > and I'm pretty sure it will output the clean file.
>
> Because if I am understanding this right (not certain of that
> at all), his goal is to clean up the mess that is made when
> a message is tagged as a false positive and the problem he is
> facing in achieving that goal is finding the message once it's
> further down in the chain (i.e. after it has been delivered
> to an IMAP folder).
>
>    - Logan

Well, if he knows it's a false positive, it must be because the user
is already looking at the message.  That is why everyone here
suggested he move it to another folder and use that folder as
input to the undo process, whatever that process may be.
All the sa-whatever scripts work on entire folders as well as
individual messages.

-- 
_____________________________________
John Andersen

Re: Retagging false positives?

Posted by Robert Nicholson <ro...@elastica.com>.
Yes, that's correct. When you relearn a message for Bayes you don't
need to write a new message at all. You just update the db based on
the contents of the message as HAM or SPAM. That's easy because it's
easy to fetch the message back using IMAP and the Message-ID. But I
don't think IMAP will easily let you rewrite the message. I say
"don't think" because I honestly haven't looked too hard at the IMAP
APIs yet. I use Mail::Audit to store the message as it arrives in a
Maildir folder, and I use Mail::IMAPClient to read the message to
relearn based on its Message-ID.
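
Base IMAP has no command for editing a stored message in place; the usual workaround is to APPEND a cleaned copy and delete the original. Below is a minimal Python sketch of that idea, not a tested production tool. It assumes the only SpamAssassin markup is the set of `X-Spam-*` headers (a rewritten Subject or a `report_safe` wrapper is not undone) and that `conn` behaves like an authenticated `imaplib.IMAP4` connection. The function names are hypothetical.

```python
import email
import imaplib  # conn below is expected to be an imaplib.IMAP4 / IMAP4_SSL object


def strip_sa_headers(raw: bytes) -> bytes:
    """Return the message with every X-Spam-* header removed."""
    msg = email.message_from_bytes(raw)
    for name in list(msg.keys()):
        if name.lower().startswith("x-spam-"):
            del msg[name]  # removes all occurrences of that header
    return msg.as_bytes()


def replace_without_sa_headers(conn, folder: str, message_id: str) -> int:
    """Append a cleaned copy of each matching message, then delete the original.

    Returns the number of messages replaced. Flags and the internal date of
    the original are NOT carried over in this sketch.
    """
    conn.select(folder)
    # imaplib does not quote arguments itself, so quote the Message-ID here
    # (assumes the ID contains no double quotes).
    typ, data = conn.search(None, "HEADER", "Message-ID", '"%s"' % message_id)
    if typ != "OK" or not data or not data[0]:
        return 0
    replaced = 0
    for num in data[0].split():
        typ, parts = conn.fetch(num, "(RFC822)")
        if typ != "OK":
            continue
        # The raw message is the second element of the tuple in the response.
        raw = next(p[1] for p in parts if isinstance(p, tuple))
        conn.append(folder, None, None, strip_sa_headers(raw))
        conn.store(num, "+FLAGS", "\\Deleted")
        replaced += 1
    if replaced:
        conn.expunge()
    return replaced
```

A call would look like `replace_without_sa_headers(conn, "INBOX", "<some-id@example.org>")`, where the folder name and Message-ID are placeholders.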

On Jul 31, 2006, at 1:37 AM, Logan Shaw wrote:

> On Sun, 30 Jul 2006, Loren Wilton wrote:
>> If you know how to run SA to relearn the message, why not just use  
>> SA to strip the headers off the message?  It certainly knows how  
>> to do that, and I'm pretty sure it will output the clean file.
>
> Because if I am understanding this right (not certain of that
> at all), his goal is to clean up the mess that is made when
> a message is tagged as a false positive and the problem he is
> facing in achieving that goal is finding the message once it's
> further down in the chain (i.e. after it has been delivered
> to an IMAP folder).
>
>   - Logan

Re: Retagging false positives?

Posted by Logan Shaw <ls...@emitinc.com>.
On Sun, 30 Jul 2006, Loren Wilton wrote:
> If you know how to run SA to relearn the message, why not just use SA to 
> strip the headers off the message?  It certainly knows how to do that, and 
> I'm pretty sure it will output the clean file.

Because if I am understanding this right (not certain of that
at all), his goal is to clean up the mess that is made when
a message is tagged as a false positive and the problem he is
facing in achieving that goal is finding the message once it's
further down in the chain (i.e. after it has been delivered
to an IMAP folder).

   - Logan

Re: Retagging false positives?

Posted by Loren Wilton <lw...@earthlink.net>.
If you know how to run SA to relearn the message, why not just use SA to 
strip the headers off the message?  It certainly knows how to do that, and 
I'm pretty sure it will output the clean file.

        Loren


Re: Retagging false positives?

Posted by Robert Nicholson <ro...@elastica.com>.
Yeah, I have a method of correcting Bayes whereby I send myself
the Message-ID of the message; it then uses that over IMAP to
fetch the message and relearns it using SA's APIs at the Perl
level. But I'm just looking for something that can actually strip the
SA headers from the message. When you relearn you don't actually need to
touch the contents of the actual message, but I have rules on the
client that colorize SPAM, so I also want the SA headers to be
removed, preferably without needing to run a script that knows the
actual filename itself. I suppose I could write something that hunts
down the message by Message-ID, strips out the headers, and
rewrites the file, but I was hoping for a more elegant solution than that.
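
The "hunt it down by Message-ID" approach can be sketched with Python's standard `mailbox` module, which hides the physical filenames. This is only an illustration under assumptions: it treats `X-Spam-*` headers as the only SA markup, it scans one Maildir folder (a directory containing `cur/`, `new/` and `tmp/`), and the function name is hypothetical.

```python
import mailbox


def strip_sa_headers_in_maildir(maildir_path: str, message_id: str) -> int:
    """Remove X-Spam-* headers from messages whose Message-ID matches.

    Returns the number of messages rewritten.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    rewritten = 0
    try:
        for key in box.keys():
            msg = box[key]
            if (msg.get("Message-ID") or "").strip() != message_id:
                continue
            spam_headers = [n for n in msg.keys()
                            if n.lower().startswith("x-spam-")]
            if not spam_headers:
                continue
            for name in spam_headers:
                del msg[name]
            # Assigning back writes a new file under the same key and keeps
            # the Maildir flags of the message.
            box[key] = msg
            rewritten += 1
    finally:
        box.close()
    return rewritten
```

Two caveats. Some servers encode the message size in the Maildir filename or keep their own index files, so check how yours reacts to a file whose content changes underneath it. And SpamAssassin's own `spamassassin -d` (`--remove-markup`) undoes more of the markup than this header filter does.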

On Jul 30, 2006, at 5:08 PM, mouss wrote:

> Robert Nicholson wrote:
>> The issue for me is that I need to strip out the SA headers from  
>> the message after it's tagged as SPAM. But the problem is always  
>> that you cannot get the physical filename from the actual contents  
>> of the message itself.
>
> Use a special folder for false positives  
> (.Filter.Error, .Junk.Error or whatever), then you don't care for  
> filenames...
>
> Then proceed as follows:
>
> 1- mv all FPs from the FP folder to an intermediary directory
> 2- run sa-learn on the intermediary directory
> 3- modify the headers in each file in the intermediary directory  
> (formail/reformail may help)
> 4- redeliver the modified files (maildrop/procmail)
>
> After step 2, you may optionally rerun spamassassin -t (or  
> spamc...) to check if SA will still tag the message as spam or not,  
> and if it's still an FP, you may want to log the rules that are  
> triggered so that you change them if they catch many FPs on your  
> site. This however requires more work than it may appear.
>

Re: Retagging false positives?

Posted by mouss <us...@free.fr>.
Robert Nicholson wrote:
> The issue for me is that I need to strip out the SA headers from the 
> message after it's tagged as SPAM. But the problem is always that you 
> cannot get the physical filename from the actual contents of the 
> message itself.

Use a special folder for false positives (.Filter.Error, .Junk.Error or 
whatever), then you don't need to care about filenames...

Then proceed as follows:

1- mv all FPs from the FP folder to an intermediary directory
2- run sa-learn on the intermediary directory
3- modify the headers in each file in the intermediary directory 
(formail/reformail may help)
4- redeliver the modified files (maildrop/procmail)

After step 2, you may optionally rerun spamassassin -t (or spamc...) to 
check whether SA still tags the message as spam. If it's still 
an FP, you may want to log the rules that are triggered so that you can 
change them if they catch many FPs on your site. This, however, requires 
more work than it may appear.
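
The four steps above could be scripted roughly as follows. This is a sketch under assumptions, not a drop-in tool: it uses Python's `mailbox` module in place of formail/reformail and maildrop/procmail, it treats `X-Spam-*` headers as the only markup to remove, and every path and function name is hypothetical. The `learn` step shells out to `sa-learn --ham <dir>` by default and can be swapped out for testing.

```python
import email
import mailbox
import os
import subprocess


def learn_as_ham(directory: str) -> None:
    """Step 2: run sa-learn on a directory (needs SpamAssassin on PATH)."""
    subprocess.run(["sa-learn", "--ham", directory], check=True)


def process_false_positives(fp_dir, staging_dir, deliver_dir, learn=learn_as_ham):
    """Move, learn, strip and redeliver false positives. Returns the count."""
    fp_box = mailbox.Maildir(fp_dir, create=False)
    target = mailbox.Maildir(deliver_dir, create=False)
    os.makedirs(staging_dir, exist_ok=True)

    # Step 1: move every message from the FP folder to the staging directory.
    staged = []
    for key in fp_box.keys():
        path = os.path.join(staging_dir, key)
        with open(path, "wb") as fh:
            fh.write(fp_box.get_bytes(key))
        fp_box.remove(key)
        staged.append(path)
    if not staged:
        return 0

    # Step 2: teach Bayes that these messages are ham.
    learn(staging_dir)

    # Steps 3 and 4: drop the X-Spam-* headers, then redeliver. The message
    # lands in new/ of the target Maildir, so it shows up as unread.
    for path in staged:
        with open(path, "rb") as fh:
            msg = email.message_from_bytes(fh.read())
        for name in list(msg.keys()):
            if name.lower().startswith("x-spam-"):
                del msg[name]
        target.add(msg)
        os.remove(path)
    return len(staged)
```

Run from cron, this would look something like `process_false_positives("/home/user/Maildir/.Junk.Error", "/tmp/fp-staging", "/home/user/Maildir")`, where all three paths are placeholders.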


Re: Retagging false positives?

Posted by John Andersen <js...@pen.homeip.net>.
On Sunday 30 July 2006 09:22, Robert Nicholson wrote:
> The issue for me is that I need to strip out the SA headers from the
> message after it's tagged as SPAM. But the problem is always that you
> cannot get the physical filename from the actual contents of the
> message itself.

Why not move all false positives to a directory reserved for 
that purpose and run 

sa-learn --ham <insert dir name here>

-- 
_____________________________________
John Andersen

Re: Retagging false positives?

Posted by Robert Nicholson <ro...@elastica.com>.
The issue for me is that I need to strip out the SA headers from the  
message after it's tagged as SPAM. But the problem is always that you  
cannot get the physical filename from the actual contents of the  
message itself.

On Jul 30, 2006, at 11:46 AM, Robert Nicholson wrote:

> So what is the best strategy to retag a false positive when the  
> server uses imap Maildir folders.
>