Posted to users@spamassassin.apache.org by itdelany <it...@delany.com.ar> on 2006/11/02 14:08:34 UTC

Processed Spam, what to do?

Hi :)

I successfully processed ham and spam emails with sa-learn, through spam
and ham mail accounts. Now I will wait for users to send me new spam
messages to enrich the Bayesian filter.
What is the best thing to do with the old, already-processed spam messages?
Delete them, or re-run the learning on them together with the new messages?

Thanks
-- 
View this message in context: http://www.nabble.com/Processed-Spam%2C-what-to-do--tf2559659.html#a7133188
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Re: Processed Spam, what to do?

Posted by Matt Kettler <mk...@verizon.net>.
itdelany wrote:
> Hi :)
>
> I successfully processed ham and spam emails with sa-learn, through spam
> and ham mail accounts. Now I will wait for users to send me new spam
> messages to enrich the Bayesian filter.
> What is the best thing to do with the old, already-processed spam messages?
> Delete them, or re-run the learning on them together with the new messages?
>

Delete them. sa-learn will ignore them if you try to train them again.


Re: Processed Spam, what to do?

Posted by "John D. Hardin" <jh...@impsec.org>.
On Thu, 2 Nov 2006, itdelany wrote:

> To back up the learning files, I only have to copy bayes_seen and
> bayes_toks, right?

I was speaking of backing up the original messages.

Backing up the bayes_* files would let you restore the database to a
particular point in time, which is useful if you know that it went bad
at a particular point in time. That would save you re-learning from
scratch up to that point in time. You'd restore the old bayes_* files,
examine the corpora (saved original messages) past that point to correct
erroneous classifications (e.g. a user dropped a bunch of spams in the
ham folder), and then re-learn from that point forward to bring it
current.
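
As a rough illustration of the snapshot-and-restore idea above, here is a
minimal Python sketch. It assumes the Bayes database is stored as plain
bayes_* files in a single directory and that nothing (sa-learn, spamd) is
writing to them while the copy runs. The function names and the
timestamped directory layout are hypothetical, not something from this
thread or from SpamAssassin itself.

```python
import shutil
import time
from pathlib import Path


def backup_bayes(db_dir, backup_root, stamp=None):
    """Copy every bayes_* file in db_dir into backup_root/<stamp>/.

    Returns the snapshot directory. The timestamped layout is an
    assumption made for this sketch.
    """
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / stamp
    # Fail rather than silently overwrite an existing snapshot.
    dest.mkdir(parents=True, exist_ok=False)
    for src in sorted(Path(db_dir).glob("bayes_*")):
        if src.is_file():
            shutil.copy2(src, dest / src.name)  # copy2 keeps timestamps
    return dest


def restore_bayes(snapshot_dir, db_dir):
    """Copy the bayes_* files of one snapshot back over the live database."""
    for src in sorted(Path(snapshot_dir).glob("bayes_*")):
        if src.is_file():
            shutil.copy2(src, Path(db_dir) / src.name)
```

After a restore, the saved messages newer than the snapshot would still
need to be checked and re-learned, as described above. sa-learn also has
its own --backup and --restore options, which dump and reload the
database as text and may be a safer choice than copying files by hand.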

Does that make sense?

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 jhardin@impsec.org    FALaholic #11174    pgpk -a jhardin@impsec.org
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  The first time I saw a bagpipe, I thought the player was torturing
  an octopus. I was amazed they could scream so loudly.
                                        -- cat_herder_5263 on Y! SCOX
-----------------------------------------------------------------------
 5 days until the campaign ads stop


Re: Processed Spam, what to do?

Posted by itdelany <it...@delany.com.ar>.
I already deleted them based on Matt's answer, but your point is a good one.
I'll keep some of them after the second round of learning.

To back up the learning files, I only have to copy bayes_seen and
bayes_toks, right?

thanks


John D. Hardin wrote:
>
> It depends on the size and whether you are doing purely manual
> training.
> 
> I believe in keeping them around (though aged or saved in an archive
> directory, so that it doesn't try to re-learn them every time) in case
> I need to retrain from scratch for some reason.
> 
> My nightly learning script (posted here, check the archives) ignores
> message files that haven't been modified in the last three days, and
> I rotate the files where users save messages-to-be-learned monthly, so
> that sa-learn examines at most one month of messages per user,
> regardless of how large the corpus gets.
> 
> 'course, I only have four users...



Re: Processed Spam, what to do?

Posted by Giampaolo Tomassoni <g....@libero.it>.
> From: John D. Hardin [mailto:jhardin@impsec.org]
> 
> 'course, I only have four users...
> 

Wow! I thought I was the tiniest here: I've got around 60 (can't get the exact number: there are too many :) ).

-----------------------------------
Giampaolo Tomassoni - IT Consultant
Piazza VIII Aprile 1948, 4
I-53044 Chiusi (SI) - Italy
Ph: +39-0578-21100

NEVER send an e-mail to:
 rainbowl@tomassoni.eu


Re: Processed Spam, what to do?

Posted by "John D. Hardin" <jh...@impsec.org>.
On Thu, 2 Nov 2006, itdelany wrote:

> I successfully processed ham and spam emails with sa-learn, through spam
> and ham mail accounts. Now I will wait for users to send me new spam
> messages to enrich the Bayesian filter.
> What is the best thing to do with the old, already-processed spam messages?
> Delete them, or re-run the learning on them together with the new messages?

It depends on the size and whether you are doing purely manual
training.

I believe in keeping them around (though aged or saved in an archive
directory, so that it doesn't try to re-learn them every time) in case
I need to retrain from scratch for some reason.

My nightly learning script (posted here, check the archives) ignores
message files that haven't been modified in the last three days, and
I rotate the files where users save messages-to-be-learned monthly, so
that sa-learn examines at most one month of messages per user,
regardless of how large the corpus gets.
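
The "skip files not modified in the last three days" selection can be
sketched roughly as follows. This is not the script from the archives,
only a minimal Python illustration that assumes one message per file in
a folder. The function names are hypothetical, and the command is built
but not executed.

```python
import time
from pathlib import Path


def recent_messages(folder, max_age_days=3, now=None):
    """Return files in `folder` modified within the last `max_age_days` days.

    Older files are skipped, on the assumption that they were already
    learned on an earlier run.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.stat().st_mtime >= cutoff
    )


def learn_command(kind, files):
    """Build (but do not run) an sa-learn command line for the given files.

    `kind` is "spam" or "ham", matching sa-learn's --spam and --ham options.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", f"--{kind}", *map(str, files)]
```

A nightly job could pass the resulting list to subprocess.run() whenever
recent_messages() returns at least one file. If the users' folders are
mbox files rather than one message per file, sa-learn would also need
its --mbox option.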

'course, I only have four users...
