
Posted to users@spamassassin.apache.org by Philipp Ewald <ph...@digionline.de> on 2019/10/22 10:21:45 UTC

Question about Bayes implementation

Hi folks,

At this point I split all my spam mail to extract the attachments and
build a hash table (but that is not my point).

It is also possible to split my spam into text/html, text/plain and
headers, too.
Debian package: ripmime

Now I ask myself:
If I train SpamAssassin with my mail, should I train on the whole mail, or
can I split it and train on only the text/plain part? Or which part would
be "the best" to train on?

thanks for help

kind regards
-- 
Philipp Ewald
Administrator

Re: Autolearn HAM with spamscore 996

Posted by John Hardin <jh...@impsec.org>.

On Tue, 22 Oct 2019, RW wrote:

> If you are in a position to train manually, I think it's best to
> turn off auto-learning.

+1

Auto-learn is primarily for large sites with a diverse user base (e.g. an 
ISP).


-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   The third basic rule of firearms safety:
   Keep your booger hook off the bang switch!
-----------------------------------------------------------------------
  936 days since the first commercial re-flight of an orbital booster (SpaceX)

Re: Autolearn HAM with spamscore 996

Posted by RW <rw...@googlemail.com>.

On Tue, 22 Oct 2019 18:31:18 +0200
Philipp Ewald wrote:

> First, thanks for the help; I will train with the current mail.
> 
> My amavis configuration detected the attachment and scored the message
> with a spam score of 999, but auto-learn ignores this...
> 
...

> Did I miss something? Can someone help me?
> 
> Searching Google for "auto learn amavis spamassassin", it is really
> tricky to find something helpful.

There are some sanity checks on auto-learning. In this case the relevant
rule is that a message must score at least 3 points from header-based
rules and 3 points from body-based rules before it is auto-learned as spam.

If you are in a position to train manually, I think it's best to
turn off auto-learning.
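
The sanity checks described above can be sketched roughly as follows. This
is a simplified illustration, not SpamAssassin's actual code: the function
name and structure are hypothetical, the two score thresholds are the
documented defaults for bayes_auto_learn_threshold_nonspam (0.1) and
bayes_auto_learn_threshold_spam (12.0), and the real implementation also
excludes some rules from the auto-learn score. The example calls assume that
the 999 points from the amavis virus-name map are added by amavis outside
SpamAssassin, so SpamAssassin itself only sees RCVD_IN_DNSWL_MED=-2.3.

```python
# Simplified sketch of the auto-learn decision; all names are hypothetical.
NONSPAM_THRESHOLD = 0.1   # default bayes_auto_learn_threshold_nonspam
SPAM_THRESHOLD = 12.0     # default bayes_auto_learn_threshold_spam
MIN_HEADER_POINTS = 3.0   # sanity check: points required from header rules
MIN_BODY_POINTS = 3.0     # sanity check: points required from body rules


def autolearn_decision(header_points, body_points):
    """Return 'ham', 'spam' or 'no' for the points SpamAssassin itself scored."""
    score = header_points + body_points
    if score < NONSPAM_THRESHOLD:
        return "ham"
    if score >= SPAM_THRESHOLD:
        # A high score alone is not enough: both rule classes must contribute.
        if header_points >= MIN_HEADER_POINTS and body_points >= MIN_BODY_POINTS:
            return "spam"
        return "no"
    return "no"  # score lies between the two thresholds


# Only the -2.3 header points are visible to SpamAssassin (assumption above).
print(autolearn_decision(-2.3, 0.0))
# A high score made up of header points only fails the sanity check.
print(autolearn_decision(20.0, 0.0))
```

Under these assumptions the first call yields "ham", which would be
consistent with the autolearn=ham result reported in this thread.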

Autolearn HAM with spamscore 996

Posted by Philipp Ewald <ph...@digionline.de>.

First, thanks for the help; I will train with the current mail.

My amavis configuration detected the attachment and scored the message
with a spam score of 999, but auto-learn ignores this...

X-Spam-Flag: YES
X-Spam-Score: 996.7
X-Spam-Level: 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
X-Spam-Status: Yes, score=996.7 tagged_above=-9999 required=5
	tests=[AV:NSFW.UNOFFICIAL=999, RCVD_IN_DNSWL_MED=-2.3]
	autolearn=ham autolearn_force=no

Test with GTUBE:

X-Spam-Flag: YES
X-Spam-Score: 997.7
X-Spam-Level: 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
X-Spam-Status: Yes, score=997.7 tagged_above=-9999 required=5
	tests=[GTUBE=1000, RCVD_IN_DNSWL_MED=-2.3]
	autolearn=no autolearn_force=no

Amavis config:
/etc/amavis/conf.d/50-user

@virus_name_to_spam_score_maps =
   (new_RE(  # the order matters!
     [ qr'NSFW.UNOFFICIAL' => 999],
));

Did I miss something? Can someone help me?

Searching Google for "auto learn amavis spamassassin", it is really
tricky to find something helpful.

kind regards
Philipp

On 22.10.19 15:56, RW wrote:
> 
> Train on the actual email.
> 

-- 
Philipp Ewald
Administrator

Re: Question about Bayes implementation

Posted by RW <rw...@googlemail.com>.

On Tue, 22 Oct 2019 12:21:45 +0200
Philipp Ewald wrote:

> Hi folks,
> 
> At this point I split all my spam mail to extract the attachments and
> build a hash table (but that is not my point).
> 
> It is also possible to split my spam into text/html, text/plain and
> headers, too.
> Debian package: ripmime
> 
> Now I ask myself:
> If I train SpamAssassin with my mail, should I train on the whole mail,
> or can I split it and train on only the text/plain part? Or which part
> would be "the best" to train on?

Train on the actual email.
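
Training on the actual email means handing the complete, unsplit message
files to sa-learn. A minimal sketch, assuming sa-learn is on the PATH and
that spam and ham are kept as whole messages in separate locations; the
helper names and the paths in the example are hypothetical, while --spam and
--ham are sa-learn's own options.

```python
import subprocess


def sa_learn_command(kind, path):
    """Build the sa-learn command line for a file or directory of whole mails."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, path]


def learn(kind, path):
    """Run sa-learn on complete messages; raises CalledProcessError on failure."""
    return subprocess.run(sa_learn_command(kind, path), check=True)


# Hypothetical location; the command is only built here, not executed.
print(sa_learn_command("spam", "/var/mail/training/spam"))
```

Calling learn("spam", ...) and learn("ham", ...) on the original messages
keeps headers, HTML and plain-text parts together, which is what the advice
above asks for.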