Posted to users@spamassassin.apache.org by Daniel Cañas <da...@unity.ncsu.edu> on 2005/02/14 20:50:42 UTC

sa-learn ham from my emails

I have over 2000 emails that I know are ham and would like to feed to
sa-learn.

The emails are all mine (that is, they are addressed to me). Is this a
problem for sa-learn?

Will it learn the headers and mark my email address as a token for
ham, causing Bayes to not work correctly for my address?

I have confirmed spam that I want to learn, but I am afraid to do it if I
don't have a corresponding number of ham messages.

I guess the question is:
Is feeding a bunch of emails addressed to a single person into sa-learn 
a good thing to do?

Re: sa-learn ham from my emails

Posted by Daniel Cañas <da...@unity.ncsu.edu>.

On Feb 14, 2005, at 2:54 PM, Jim Maul wrote:

> Daniel Cañas wrote:
>> I have over 2000 emails that I know are ham and would like to feed to
>> sa-learn.
>> The emails are all mine (that is, they are addressed to me). Is this a
>> problem for sa-learn?
>> Will it learn the headers and mark my email address as a token for
>> ham, causing Bayes to not work correctly for my address?
>> I have confirmed spam that I want to learn, but I am afraid to do it if I
>> don't have a corresponding number of ham messages.
>> I guess the question is:
>> Is feeding a bunch of emails addressed to a single person into 
>> sa-learn a good thing to do?
>
> I don't believe this is an issue.  Feeding the same number of ham
> and spam messages is not necessary either.  Many people have
> Bayes databases with very different spam/ham counts.
>

Cool, thanks.

> -Jim
>
>
>

Re: sa-learn ham from my emails

Posted by Jim Maul <jm...@elih.org>.

Daniel Cañas wrote:
> I have over 2000 emails that I know are ham and would like to feed to
> sa-learn.
>
> The emails are all mine (that is, they are addressed to me). Is this a
> problem for sa-learn?
>
> Will it learn the headers and mark my email address as a token for
> ham, causing Bayes to not work correctly for my address?
>
> I have confirmed spam that I want to learn, but I am afraid to do it if I
> don't have a corresponding number of ham messages.
> 
> I guess the question is:
> Is feeding a bunch of emails addressed to a single person into sa-learn 
> a good thing to do?

I don't believe this is an issue.  Feeding the same number of ham
and spam messages is not necessary either.  Many people have Bayes
databases with very different spam/ham counts.
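
One way to see why unequal corpus sizes need not skew the result is the
following minimal sketch. It assumes a simplified per-token estimate that
divides each hit count by the size of its own corpus; the function name and
the counts are hypothetical, and SpamAssassin's actual Bayes code is more
involved than this.

```python
def token_spam_probability(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified estimate of P(spam | token).

    Each hit count is divided by the size of its own corpus, so the
    estimate depends on rates, not on raw message counts.
    """
    spam_freq = spam_hits / n_spam
    ham_freq = ham_hits / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_freq / (spam_freq + ham_freq)


# Balanced training set (hypothetical counts): 1000 spam, 1000 ham.
# The token appears in 50% of spam and 1% of ham.
balanced = token_spam_probability(500, 10, 1000, 1000)

# Unbalanced training set, 40 spam per ham, with the same per-corpus rates.
unbalanced = token_spam_probability(20000, 10, 40000, 1000)

print(f"balanced:   {balanced:.4f}")
print(f"unbalanced: {unbalanced:.4f}")
```

Under this assumption both calls give the same estimate, because only the
per-corpus rates enter the formula.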

-Jim

Re: sa-learn ham from my emails

Posted by Thomas Arend <ml...@arend-whv.info>.

On Monday, 14 February 2005 23:13, Daniel Cañas wrote:
> On Feb 14, 2005, at 3:34 PM, Thomas Arend wrote:
> > On Monday, 14 February 2005 20:50, Daniel Cañas wrote:
> >> I have over 2000 emails that I know are ham and would like to feed to
> >> sa-learn.
> >
[..]

> >> I have confirmed spam that I want to learn, but I am afraid to do it if I
> >> don't have a corresponding number of ham messages.
> >
> > In my opinion and experience, this is bullshit.
>
> Cool. This is good to know, as I can collect tons of spam.

I have a ratio of 1:40 and Bayes works fine.


Thomas
-- 
icq:133073900
http://www.t-arend.de

Re: sa-learn ham from my emails

Posted by Daniel Cañas <da...@unity.ncsu.edu>.

On Feb 14, 2005, at 3:34 PM, Thomas Arend wrote:

> On Monday, 14 February 2005 20:50, Daniel Cañas wrote:
>> I have over 2000 emails that I know are ham and would like to feed to
>> sa-learn.
>
> You should train them as ham.

That is my plan

>
>>
>> The emails are all mine (that is, they are addressed to me). Is this a
>> problem for sa-learn?
>
> Where is the problem? If they are not for you, why did you get them?
>>
>> Will it learn the headers and mark my email address as a token for
>> ham, causing Bayes to not work correctly for my address?
>
> The address will be one token. If you feed spam to sa-learn, your
> address will also be a token for spam. But Bayes does not rely on
> a single token.
>
>>
>> I have confirmed spam that I want to learn, but I am afraid to do it if I
>> don't have a corresponding number of ham messages.
>
> In my opinion and experience, this is bullshit.

Cool. This is good to know, as I can collect tons of spam.

>
>> I guess the question is:
>> Is feeding a bunch of emails addressed to a single person into 
>> sa-learn
>> a good thing to do?
>
> Why not? I run SpamAssassin on a single-user system. You can have an
> individual database for every user or a common database for all users.
> In the latter case, you should train on spam addressed to more than
> one user.
>

I just switched to site-wide Bayes, and the spam I train on is addressed to
different users, mostly non-existent users on my system whose mail is
forwarded to the admin account.

>
> Thomas
> -- 
> icq:133073900
> http://www.t-arend.de

Re: sa-learn ham from my emails

Posted by Thomas Arend <ml...@arend-whv.info>.

On Monday, 14 February 2005 20:50, Daniel Cañas wrote:
> I have over 2000 emails that I know are ham and would like to feed to
> sa-learn.

You should train them as ham.

>
> The emails are all mine (that is, they are addressed to me). Is this a
> problem for sa-learn?

Where is the problem? If they are not for you, why did you get them?

>
> Will it learn the headers and mark my email address as a token for
> ham, causing Bayes to not work correctly for my address?

The address will be one token. If you feed spam to sa-learn, your address will
also be a token for spam. But Bayes does not rely on a single token.
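
To illustrate the point, here is a minimal sketch of a token that shows up
in every ham and every spam message, as a recipient address would. It
assumes a textbook naive-Bayes combination in log-odds form; the function
names and all counts are hypothetical, and this is not SpamAssassin's own
combining method.

```python
import math


def token_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified P(spam | token) from per-corpus frequencies."""
    s = spam_hits / n_spam
    h = ham_hits / n_ham
    return 0.5 if s + h == 0 else s / (s + h)


def combine(probs):
    """Combine per-token probabilities by summing their log-odds."""
    log_odds = 0.0
    for p in probs:
        p = min(max(p, 0.01), 0.99)  # clamp to avoid log(0)
        log_odds += math.log(p) - math.log(1 - p)
    return 1 / (1 + math.exp(-log_odds))


N_SPAM, N_HAM = 5000, 2000  # hypothetical corpus sizes

# The recipient address appears in every trained message of both kinds.
address = token_prob(5000, 2000, N_SPAM, N_HAM)

# Two hypothetical content tokens that are far more common in spam.
content = [token_prob(900, 2, N_SPAM, N_HAM),
           token_prob(400, 1, N_SPAM, N_HAM)]

score_without_address = combine(content)
score_with_address = combine(content + [address])

print(f"address token probability: {address}")
print(f"score without address: {score_without_address:.6f}")
print(f"score with address:    {score_with_address:.6f}")
```

In this sketch the address token lands at 0.5, contributes zero log-odds,
and leaves the combined score unchanged; the other tokens decide the
result.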

>
> I have confirmed spam that I want to learn, but I am afraid to do it if I
> don't have a corresponding number of ham messages.

In my opinion and experience, this is bullshit.

> I guess the question is:
> Is feeding a bunch of emails addressed to a single person into sa-learn
> a good thing to do?

Why not? I run SpamAssassin on a single-user system. You can have an
individual database for every user or a common database for all users. In the
latter case, you should train on spam addressed to more than one user.


Thomas
-- 
icq:133073900
http://www.t-arend.de