Posted to commits@spamassassin.apache.org by sp...@incubator.apache.org on 2004/04/09 21:21:39 UTC

[SpamAssassin Wiki] Updated: BayesInSpamAssassin

   Date: 2004-04-09T12:21:39
   Editor: 128.252.242.1 <>
   Wiki: SpamAssassin Wiki
   Page: BayesInSpamAssassin
   URL: http://wiki.apache.org/spamassassin/BayesInSpamAssassin

   no comment

Change Log:

------------------------------------------------------------------------------
@@ -27,3 +27,10 @@
 ''spamassassin -r'' can only be invoked on a single file at a time. This is fine for "mbox" spam mailboxes, which keep all messages in one file. For "maildir" directories, however, you will need to run ''spamassassin -r'' on each message individually. If you are not sure which format you have, look at your mail directory. If you see one or more files, each containing one or more messages, you have "mbox" format. If you see directories containing files, where each file name is a long string of numbers, letters, and punctuation and each file contains one email message, you have "maildir" format.
 
 If you have "maildir" mailboxes, running ''spamassassin -r'' once per message is tedious for large amounts of spam, so you can use this ["report_spam.pl"] script to run it for you. The script is written in Perl. Save it to the computer that runs SpamAssassin, then run it as ''report_spam.pl your_spam_directory''. Each message in your_spam_directory will then be learned by Bayes '''and''' reported to the checksum services.
+
+= Possible Future Directions =
+(InSanity)
+
+Bayesian+Dictionary analysis
+
+It may be worthwhile to develop a Bayesian checker that measures the proportion of dictionary words to non-dictionary words. This could help quickly identify messages that use Bayesian avoidance techniques, such as punctuation or spacing interspersed through commonly identified words. It could serve as an adjunct to the current methods.
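
The dictionary-proportion idea proposed above can be illustrated with a minimal sketch. This is not SpamAssassin code and not part of the wiki change. The function name dictionary_ratio, the tiny DICTIONARY set, and the tokenization rule are all assumptions made for illustration. A real checker would load a full word list and would still need a threshold or a way to feed the ratio into the Bayes tokens.

```python
import re

# Hypothetical stand-in word list; a real checker would load a full dictionary.
DICTIONARY = {"buy", "cheap", "pills", "now", "the", "meeting", "is", "at", "noon"}

# Tokens are runs of non-whitespace characters (an assumption of this sketch).
TOKEN_RE = re.compile(r"\S+")


def dictionary_ratio(text, dictionary=DICTIONARY):
    """Return the fraction of tokens in `text` that are dictionary words."""
    tokens = TOKEN_RE.findall(text.lower())
    if not tokens:
        return 0.0
    # Strip only leading/trailing punctuation, so punctuation placed
    # inside a word ("b.u.y") leaves the token outside the dictionary.
    hits = sum(1 for t in tokens if t.strip(".,;:!?\"'()") in dictionary)
    return hits / len(tokens)


if __name__ == "__main__":
    print(dictionary_ratio("The meeting is at noon."))
    print(dictionary_ratio("b.u.y ch-eap p i l l s now"))
```

Under these assumptions, an ordinary sentence scores close to 1, while text with punctuation or spaces inserted into words scores much lower, because the broken-up fragments are no longer dictionary words.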