Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2007/04/17 10:40:53 UTC

Re: sa-learn: have i seen this before?

Faisal -- could you open an enhancement request on the SpamAssassin
bugzilla?  This would be a useful feature.

--j.

Faisal N Jawdat writes:
> On Apr 16, 2007, at 9:34 PM, Matt Kettler wrote:
> > Try to learn it, if it comes back with something to the effect of:
> > "learned from 0 messages, processed 1.." then it's already been  
> > learned.
> 
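
The check Matt describes can be scripted. A minimal sketch, assuming
sa-learn's summary line looks like "Learned tokens from 0 message(s)
(1 message(s) examined)"; the exact wording may differ between versions,
so treat the regex as an assumption. The helper names are made up for
illustration.

```python
import re

# Assumed shape of sa-learn's summary line; verify against your version.
SUMMARY_RE = re.compile(
    r"Learned tokens from (\d+) message\(s\) \((\d+) message\(s\) examined\)"
)


def parse_learn_summary(output):
    """Return (learned, examined) from sa-learn output, or None if absent."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def already_learned(output):
    """True when messages were examined but none were newly learned."""
    counts = parse_learn_summary(output)
    if counts is None:
        raise ValueError("no sa-learn summary line found")
    learned, examined = counts
    return examined > 0 and learned == 0
```

The input would be the captured stdout of an "sa-learn --spam" or
"sa-learn --ham" run on a single message. Note that this is not a
read-only check: if the message was not learned before, it is now.
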
> this seems to be the common suggestion.
> 
> it has a couple drawbacks, as i see it:
> 
> 1.  it's relatively cpu-intensive if i want to do it all the time  
> (e.g. scan my spam folder to learn only the messages which haven't  
> already been learned)
> 
> 2.  which way do i learn it.
> 
> to step back a bit, my final goal is to be able to figure out which  
> messages in a folder haven't been learned, and learn only those.  in  
> the ideal situation i can also figure out (ahead of time), whether a  
> learned message was learned as ham or spam.
> 
> this may be semi-impossible.
> 
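
Until something like this exists in SpamAssassin itself, one workaround
is to keep your own record of what you have fed to sa-learn. A rough
sketch, assuming a JSON file as the store and a SHA-256 of the raw
message as the key; none of this is a SpamAssassin feature, and all the
function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def message_key(raw_bytes):
    """Stable key for a message: SHA-256 hex digest of its raw bytes."""
    return hashlib.sha256(raw_bytes).hexdigest()


def load_ledger(path):
    """Load the {key: "ham" | "spam"} mapping, or start with an empty one."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_ledger(path, ledger):
    """Write the ledger back to disk as JSON."""
    Path(path).write_text(json.dumps(ledger, indent=2, sort_keys=True))


def needs_learning(ledger, raw_bytes, wanted):
    """True if the message is unrecorded or was recorded as the other class."""
    return ledger.get(message_key(raw_bytes)) != wanted


def record(ledger, raw_bytes, learned_as):
    """Remember that this message was learned as "ham" or "spam"."""
    ledger[message_key(raw_bytes)] = learned_as
```

The cron job would call needs_learning() before invoking sa-learn and
record() after a successful run. Hashing the raw bytes only works if
nothing rewrites the stored message; keying on Message-ID instead would
survive rewrites but breaks on missing or duplicated IDs.
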
> on the other hand, what can i learn from the headers?
> 
> e.g. it looks like autolearn=[something] will tell me about the  
> autolearner, but is there anything for manual learns?
> 
> where i'm going with all this:
> 
> i can run a cron job to learn the contents of different mailboxes on  
> a regular basis.  what i do now is have a TrainSpam and TrainHam  
> mailbox, and when something gets misfiled (in Spam or any ham folder)  
> i just move it in there.  every 5 minutes a cron job goes through and  
> scans things appropriately.
> <http://www.faisal.com/software/sa-harvest/quicktrain.html>
> 
> first, i'd like to be able to do that within the mailboxes rather  
> than using special mailboxes.
> 
> second, i'd like to be able to key off junk mail flags set by the  
> client (thunderbird, apple mail).  i'm using dovecot, so it's a  
> fairly simple matter of parsing Maildir filenames, but to do it right  
> i need to combine the knowledge with what spamassassin thinks.
> 
> i might just go write a dovecot plugin to do this in real-time, but  
> i'm not feeling the motivation to break the mail server with a  
> misplaced pointer.
> 
> -faisal
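
On the Maildir side, the filename parsing could look roughly like the
sketch below. It assumes the usual ":2,<flags>" suffix and Dovecot's
convention of storing keywords as lowercase flag letters, with a
dovecot-keywords file mapping index 0 to "a", 1 to "b", and so on. The
keyword names in the example ("$Junk", "NonJunk") are assumptions; check
what your clients actually write.

```python
def maildir_flags(filename):
    """Return the set of flag letters after the ':2,' marker, if any."""
    _, sep, info = filename.rpartition(":2,")
    if not sep:
        return set()
    # Ignore anything after a further comma, in case of extra fields.
    return set(info.split(",")[0])


def parse_keywords(text):
    """Map flag letters to keyword names from dovecot-keywords content.

    Each line is expected to look like '0 $Junk' (index, space, name).
    """
    mapping = {}
    for line in text.splitlines():
        index, _, name = line.strip().partition(" ")
        if index.isdigit() and name and int(index) < 26:
            mapping[chr(ord("a") + int(index))] = name
    return mapping


def keywords_for(filename, mapping):
    """Return the keyword names set on a message, given its filename."""
    return {mapping[f] for f in maildir_flags(filename) if f in mapping}
```

For example, with a dovecot-keywords file containing "0 $Junk", a
filename ending in ":2,Sa" would report the "$Junk" keyword, which could
then be compared against what SpamAssassin thinks of the message.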