Posted to commits@spamassassin.apache.org by jm...@apache.org on 2007/07/25 14:52:48 UTC
svn commit: r559437 [5/13] - in /spamassassin/site/full/3.2.x: ./ doc/
Added: spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf.txt
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf.txt?view=auto&rev=559437
==============================================================================
--- spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf.txt (added)
+++ spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf.txt Wed Jul 25 05:52:42 2007
@@ -0,0 +1,1667 @@
+NAME
+ Mail::SpamAssassin::Conf - SpamAssassin configuration file
+
+SYNOPSIS
+ # a comment
+
+ rewrite_header Subject *****SPAM*****
+
+ full PARA_A_2_C_OF_1618 /Paragraph .a.{0,10}2.{0,10}C. of S. 1618/i
+ describe PARA_A_2_C_OF_1618 Claims compliance with senate bill 1618
+
+ header FROM_HAS_MIXED_NUMS From =~ /\d+[a-z]+\d+\S*@/i
+ describe FROM_HAS_MIXED_NUMS From: contains numbers mixed in with letters
+
+ score A_HREF_TO_REMOVE 2.0
+
+ lang es describe FROM_FORGED_HOTMAIL Forzado From: simula ser de hotmail.com
+
+ lang pt_BR report O programa detetor de Spam ZOE [...]
+
+DESCRIPTION
+ SpamAssassin is configured using traditional UNIX-style configuration
+ files, loaded from the "/usr/share/spamassassin" and
+ "/etc/mail/spamassassin" directories.
+
+ The following web page lists the most important configuration settings
+ used to configure SpamAssassin; novices are encouraged to read it first:
+
+ http://wiki.apache.org/spamassassin/ImportantInitialConfigItems
+
+FILE FORMAT
+ The "#" character starts a comment, which continues until end of line.
+ NOTE: if the "#" character is to be used as part of a rule or
+ configuration option, it must be escaped with a backslash. i.e.: "\#"
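+
+ As an illustration (a sketch only; the rule name "SUBJ_HAS_ITEM_NUM"
+ is made up for this example), a header rule whose pattern contains a
+ literal "#" would escape it like this:
+
+ header SUBJ_HAS_ITEM_NUM Subject =~ /item \#\d+/i
+ describe SUBJ_HAS_ITEM_NUM Subject mentions a numbered item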
+
+ Whitespace in the files is not significant, but please note that
+ starting a line with whitespace is deprecated, as we reserve its use for
+ multi-line rule definitions, at some point in the future.
+
+ Currently, each rule or configuration setting must fit on one line;
+ multi-line settings are not supported yet.
+
+ File and directory paths can use "~" to refer to the user's home
+ directory, but no other shell-style path extensions such as globbing or
+ "~user/" are supported.
+
+ Where appropriate below, default values are listed in parentheses.
+
+USER PREFERENCES
+ The following options can be used in both site-wide ("local.cf") and
+ user-specific ("user_prefs") configuration files to customize how
+ SpamAssassin handles incoming email messages.
+
+ SCORING OPTIONS
+ required_score n.nn (default: 5)
+ Set the score required before a mail is considered spam. "n.nn" can
+ be an integer or a real number. 5.0 is the default setting, and is
+ quite aggressive; it would be suitable for a single-user setup, but
+ if you're an ISP installing SpamAssassin, you should probably set
+ the default to be more conservative, like 8.0 or 10.0. It is not
+ recommended to automatically delete or discard messages marked as
+ spam, as your users will complain, but if you choose to do so, only
+ delete messages with an exceptionally high score such as 15.0 or
+ higher. This option was previously known as "required_hits" and that
+ name is still accepted, but is deprecated.
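+
+ e.g., a site-wide "local.cf" for an ISP that wants the more
+ conservative threshold suggested above might contain (illustrative
+ value only):
+
+ required_score 8.0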
+
+ score SYMBOLIC_TEST_NAME n.nn [ n.nn n.nn n.nn ]
+ Assign scores (the number of points for a hit) to a given test.
+ Scores can be positive or negative real numbers or integers.
+ "SYMBOLIC_TEST_NAME" is the symbolic name used by SpamAssassin for
+ that test; for example, 'FROM_ENDS_IN_NUMS'.
+
+ If only one valid score is listed, then that score is always used
+ for a test.
+
+ If four valid scores are listed, then the score that is used depends
+ on how SpamAssassin is being used. The first score is used when both
+ Bayes and network tests are disabled (score set 0). The second score
+ is used when Bayes is disabled, but network tests are enabled (score
+ set 1). The third score is used when Bayes is enabled and network
+ tests are disabled (score set 2). The fourth score is used when
+ Bayes is enabled and network tests are enabled (score set 3).
+
+ Setting a rule's score to 0 will disable that rule from running.
+
+ If any of the score values are surrounded by parentheses '()', then
+ all of the scores in the line are considered to be relative to the
+ already set score. For example, '(3)' means increase the score for
+ this rule by 3 points in all score sets. '(3) (0) (3) (0)' means
+ increase the score for this rule by 3 in score sets 0 and 2 only.
+
+ If no score is given for a test by the end of the configuration, a
+ default score is assigned: a score of 1.0 is used for all tests,
+ except those whose names begin with 'T_' (this is used to indicate a
+ rule in testing) which receive 0.01.
+
+ Note that test names which begin with '__' are indirect rules used
+ to compose meta-match rules and can also act as prerequisites to
+ other rules. They are not scored or listed in the 'tests hit'
+ reports, but assigning a score of 0 to an indirect rule will disable
+ it from running.
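+
+ For instance, the forms described above could be written as follows.
+ Each line shows one form on its own, and the values are arbitrary,
+ chosen only to show the syntax rather than as recommendations:
+
+ score FROM_ENDS_IN_NUMS 2.0
+ score FROM_ENDS_IN_NUMS 1.0 1.5 0.8 1.2
+ score FROM_ENDS_IN_NUMS (3) (0) (3) (0)
+ score FROM_ENDS_IN_NUMS 0
+
+ The first line uses 2.0 in every score set, the second gives one
+ score per score set (0 through 3), the third raises the existing
+ score by 3 in score sets 0 and 2 only, and the last disables the
+ rule.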
+
+ WHITELIST AND BLACKLIST OPTIONS
+ whitelist_from add@ress.com
+ Used to whitelist sender addresses which send mail that is often
+ tagged (incorrectly) as spam.
+
+ Use of this setting is not recommended, since it blindly trusts
+ the message, which is routinely and easily forged by spammers
+ and phish senders. The recommended solution is to instead use
+ "whitelist_auth" or other authenticated whitelisting methods, or
+ "whitelist_from_rcvd".
+
+ Whitelist and blacklist addresses are now file-glob-style
+ patterns, so "friend@somewhere.com", "*@isp.com", or
+ "*.domain.net" will all work. Specifically, "*" and "?" are
+ allowed, but all other metacharacters are not. Regular
+ expressions are not used for security reasons.
+
+ Multiple addresses per line, separated by spaces, are OK.
+ Multiple "whitelist_from" lines are also OK.
+
+ The headers checked for whitelist addresses are as follows: if
+ "Resent-From" is set, use that; otherwise check all addresses
+ taken from the following set of headers:
+
+ Envelope-Sender
+ Resent-Sender
+ X-Envelope-From
+ From
+
+ In addition, the "envelope sender" data, taken from the SMTP
+ envelope data where this is available, is looked up. See
+ "envelope_sender_header".
+
+ e.g.
+
+ whitelist_from joe@example.com fred@example.com
+ whitelist_from *@example.com
+
+ unwhitelist_from add@ress.com
+ Used to override a default whitelist_from entry, so for example
+ a distribution whitelist_from can be overridden in a local.cf
+ file, or an individual user can override a whitelist_from entry
+ in their own "user_prefs" file. The specified email address has
+ to match exactly the address previously used in a whitelist_from
+ line.
+
+ e.g.
+
+ unwhitelist_from joe@example.com fred@example.com
+ unwhitelist_from *@example.com
+
+ whitelist_from_rcvd addr@lists.sourceforge.net sourceforge.net
+ Use this to supplement the whitelist_from addresses with a check
+ against the Received headers. The first parameter is the address
+ to whitelist, and the second is a string to match the relay's
+ rDNS.
+
+ This string is matched against the reverse DNS lookup used
+ during the handover from the internet to your internal network's
+ mail exchangers. It can either be the full hostname, or the
+ domain component of that hostname. In other words, if the host
+ that connected to your MX had an IP address that mapped to
+ 'sendinghost.spamassassin.org', you should specify
+ "sendinghost.spamassassin.org" or just "spamassassin.org" here.
+
+ Note that this requires that "internal_networks" be correct. For
+ simple cases, it will be, but for a complex network you may get
+ better results by setting that parameter.
+
+ e.g.
+
+ whitelist_from_rcvd joe@example.com example.com
+ whitelist_from_rcvd *@axkit.org sergeant.org
+
+ def_whitelist_from_rcvd addr@lists.sourceforge.net sourceforge.net
+ Same as "whitelist_from_rcvd", but used for the default
+ whitelist entries in the SpamAssassin distribution. The
+ whitelist score is lower, because these are often targets for
+ spammer spoofing.
+
+ whitelist_allows_relays add@ress.com
+ Specify addresses which are in "whitelist_from_rcvd" that
+ sometimes send through a mail relay other than the listed ones.
+ By default mail with a From address that is in
+ "whitelist_from_rcvd" that does not match the relay will trigger
+ a forgery rule. Including the address in
+ "whitelist_allows_relays" prevents that.
+
+ Whitelist and blacklist addresses are now file-glob-style
+ patterns, so "friend@somewhere.com", "*@isp.com", or
+ "*.domain.net" will all work. Specifically, "*" and "?" are
+ allowed, but all other metacharacters are not. Regular
+ expressions are not used for security reasons.
+
+ Multiple addresses per line, separated by spaces, are OK.
+ Multiple "whitelist_allows_relays" lines are also OK.
+
+ The specified email address does not have to match exactly the
+ address previously used in a whitelist_from_rcvd line as it is
+ compared to the address in the header.
+
+ e.g.
+
+ whitelist_allows_relays joe@example.com fred@example.com
+ whitelist_allows_relays *@example.com
+
+ unwhitelist_from_rcvd add@ress.com
+ Used to override a default whitelist_from_rcvd entry, so for
+ example a distribution whitelist_from_rcvd can be overridden in
+ a local.cf file, or an individual user can override a
+ whitelist_from_rcvd entry in their own "user_prefs" file.
+
+ The specified email address has to match exactly the address
+ previously used in a whitelist_from_rcvd line.
+
+ e.g.
+
+ unwhitelist_from_rcvd joe@example.com fred@example.com
+ unwhitelist_from_rcvd *@axkit.org
+
+ blacklist_from add@ress.com
+ Used to specify addresses which send mail that is often tagged
+ (incorrectly) as non-spam, but which the user doesn't want. Same
+ format as "whitelist_from".
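+
+ e.g. (placeholder addresses; "*" and "?" are the only wildcards
+ allowed)
+
+ blacklist_from *@spammer.com
+ blacklist_from bulk?@example.com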
+
+ unblacklist_from add@ress.com
+ Used to override a default blacklist_from entry, so for example
+ a distribution blacklist_from can be overridden in a local.cf
+ file, or an individual user can override a blacklist_from entry
+ in their own "user_prefs" file. The specified email address has
+ to match exactly the address previously used in a blacklist_from
+ line.
+
+ e.g.
+
+ unblacklist_from joe@example.com fred@example.com
+ unblacklist_from *@spammer.com
+
+ whitelist_to add@ress.com
+ If the given address appears as a recipient in the message
+ headers (Resent-To, To, Cc, obvious envelope recipient, etc.)
+ the mail will be whitelisted. Useful if you're deploying
+ SpamAssassin system-wide, and don't want some users to have
+ their mail filtered. Same format as "whitelist_from".
+
+ There are three levels of To-whitelisting, "whitelist_to",
+ "more_spam_to" and "all_spam_to". Users in the first level may
+ still get some spammish mails blocked, but users in
+ "all_spam_to" should never get mail blocked.
+
+ The headers checked for whitelist addresses are as follows: if
+ "Resent-To" or "Resent-Cc" are set, use those; otherwise check
+ all addresses taken from the following set of headers:
+
+ To
+ Cc
+ Apparently-To
+ Delivered-To
+ Envelope-Recipients
+ Apparently-Resent-To
+ X-Envelope-To
+ Envelope-To
+ X-Delivered-To
+ X-Original-To
+ X-Rcpt-To
+ X-Real-To
+
+ more_spam_to add@ress.com
+ See above.
+
+ all_spam_to add@ress.com
+ See above.
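+
+ A sketch of how the three levels might be combined; the addresses
+ are placeholders:
+
+ whitelist_to lists@example.com
+ more_spam_to helpdesk@example.com
+ all_spam_to abuse@example.com postmaster@example.com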
+
+ blacklist_to add@ress.com
+ If the given address appears as a recipient in the message
+ headers (Resent-To, To, Cc, obvious envelope recipient, etc.)
+ the mail will be blacklisted. Same format as "blacklist_from".
+
+ whitelist_auth add@ress.com
+ Used to specify addresses which send mail that is often tagged
+ (incorrectly) as spam. This is different from "whitelist_from"
+ and "whitelist_from_rcvd" in that it first verifies that the
+ message was sent by an authorized sender for the address, before
+ whitelisting.
+
+ Authorization is performed using one of the installed
+ sender-authorization schemes: SPF (using
+ "Mail::SpamAssassin::Plugin::SPF"), DomainKeys (using
+ "Mail::SpamAssassin::Plugin::DomainKeys"), or DKIM (using
+ "Mail::SpamAssassin::Plugin::DKIM"). Note that those plugins
+ must be active, and working, for this to operate.
+
+ Using "whitelist_auth" is roughly equivalent to specifying
+ duplicate "whitelist_from_spf", "whitelist_from_dk", and
+ "whitelist_from_dkim" lines for each of the addresses specified.
+
+ e.g.
+
+ whitelist_auth joe@example.com fred@example.com
+ whitelist_auth *@example.com
+
+ def_whitelist_auth add@ress.com
+ Same as "whitelist_auth", but used for the default whitelist
+ entries in the SpamAssassin distribution. The whitelist score is
+ lower, because these are often targets for spammer spoofing.
+
+ unwhitelist_auth add@ress.com
+ Used to override a "whitelist_auth" entry. The specified email
+ address has to match exactly the address previously used in a
+ "whitelist_auth" line.
+
+ e.g.
+
+ unwhitelist_auth joe@example.com fred@example.com
+ unwhitelist_auth *@example.com
+
+ BASIC MESSAGE TAGGING OPTIONS
+ rewrite_header { subject | from | to } STRING
+ By default, suspected spam messages will not have the "Subject",
+ "From" or "To" lines tagged to indicate spam. By setting this
+ option, the header will be tagged with "STRING" to indicate that
+ a message is spam. For the From or To headers, this will take
+ the form of an RFC 2822 comment following the address in
+ parentheses. For the Subject header, this will be prepended to
+ the original subject. Note that you should only use the _REQD_
+ and _SCORE_ tags when rewriting the Subject header if
+ "report_safe" is 0. Otherwise, you may not be able to remove the
+ SpamAssassin markup via the normal methods. More information
+ about tags is explained below in the TEMPLATE TAGS section.
+
+ Parentheses are not permitted in STRING if rewriting the From or
+ To headers. (They will be converted to square brackets.)
+
+ If "rewrite_header subject" is used, but the message being
+ rewritten does not already contain a "Subject" header, one will
+ be created.
+
+ A null value for "STRING" will remove any existing rewrite for
+ the specified header.
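+
+ e.g., assuming a site that only wants the Subject line tagged (the
+ tag text is arbitrary, and the second form relies on "report_safe"
+ being 0, as noted above):
+
+ rewrite_header Subject *****SPAM*****
+
+ report_safe 0
+ rewrite_header Subject ****SPAM(_SCORE_)****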
+
+ add_header { spam | ham | all } header_name string
+ Customized headers can be added to the specified type of
+ messages (spam, ham, or "all" to add to either). All headers
+ begin with "X-Spam-" (so a "header_name" Foo will generate a
+ header called X-Spam-Foo). header_name is restricted to the
+ character set [A-Za-z0-9_-].
+
+ "string" can contain tags as explained below in the TEMPLATE
+ TAGS section. You can also use "\n" and "\t" in the header to
+ add newlines and tabulators as desired. A backslash has to be
+ written as \\, any other escaped chars will be silently removed.
+
+ All headers will be folded if fold_headers is set to 1. Note:
+ Manually adding newlines via "\n" disables any further automatic
+ wrapping (i.e., long header lines are possible). The lines will
+ still be properly folded (marked as continuing) though.
+
+ You can customize existing headers with add_header (only the
+ specified subset of messages will be changed).
+
+ See also "clear_headers" for removing headers.
+
+ Here are some examples (these are the defaults, note that
+ Checker-Version cannot be changed or removed):
+
+ add_header spam Flag _YESNOCAPS_
+ add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
+ add_header all Level _STARS(*)_
+ add_header all Checker-Version SpamAssassin _VERSION_ (_SUBVERSION_) on _HOSTNAME_
+
+ remove_header { spam | ham | all } header_name
+ Headers can be removed from the specified type of messages
+ (spam, ham, or "all" to remove from either). All headers begin
+ with "X-Spam-" (so "header_name" will be appended to "X-Spam-").
+
+ See also "clear_headers" for removing all the headers at once.
+
+ Note that X-Spam-Checker-Version is not removable because the
+ version information is needed by mail administrators and
+ developers to debug problems. Without at least one header, it
+ might not even be possible to determine that SpamAssassin is
+ running.
+
+ clear_headers
+ Clear the list of headers to be added to messages. You may use
+ this before any add_header options to prevent the default
+ headers from being added to the message.
+
+ Note that X-Spam-Checker-Version is not removable because the
+ version information is needed by mail administrators and
+ developers to debug problems. Without at least one header, it
+ might not even be possible to determine that SpamAssassin is
+ running.
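+
+ A minimal sketch of replacing the default headers with a reduced
+ set (the choice of headers here is only an illustration):
+
+ clear_headers
+ add_header spam Flag _YESNOCAPS_
+ add_header all Status _YESNO_, score=_SCORE_ required=_REQD_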
+
+ report_safe ( 0 | 1 | 2 ) (default: 1)
+ If this option is set to 1 and an incoming message is tagged as
+ spam, instead of modifying the original message, SpamAssassin
+ will create a new report message and attach the original message
+ as a message/rfc822 MIME part (ensuring the original message is
+ completely preserved, not easily opened, and easier to recover).
+
+ If this option is set to 2, then original messages will be
+ attached with a content type of text/plain instead of
+ message/rfc822. This setting may be required for safety reasons
+ on certain broken mail clients that automatically load
+ attachments without any action by the user. This setting may
+ also make it somewhat more difficult to extract or view the
+ original message.
+
+ If this option is set to 0, incoming spam is only modified by
+ adding some "X-Spam-" headers and no changes will be made to the
+ body. In addition, a header named X-Spam-Report will be added to
+ spam. You can use the remove_header option to remove that header
+ after setting report_safe to 0.
+
+ See report_safe_copy_headers if you want to copy headers from
+ the original mail into tagged messages.
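+
+ e.g., to leave message bodies untouched and also drop the
+ X-Spam-Report header, as described above:
+
+ report_safe 0
+ remove_header spam Report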
+
+ LANGUAGE OPTIONS
+ ok_locales xx [ yy zz ... ] (default: all)
+ This option is used to specify which locales are considered OK
+ for incoming mail. Mail using the character sets that are
+ allowed by this option will not be marked as possibly being spam
+ in a foreign language.
+
+ If you receive lots of spam in foreign languages, and never get
+ any non-spam in these languages, this may help. Note that all
+ ISO-8859-* character sets, and Windows code page character sets,
+ are always permitted by default.
+
+ Set this to "all" to allow all character sets. This is the
+ default.
+
+ The rules "CHARSET_FARAWAY", "CHARSET_FARAWAY_BODY", and
+ "CHARSET_FARAWAY_HEADERS" are triggered based on how this is
+ set.
+
+ Examples:
+
+ ok_locales all (allow all locales)
+ ok_locales en (only allow English)
+ ok_locales en ja zh (allow English, Japanese, and Chinese)
+
+ Note: if there are multiple ok_locales lines, only the last one
+ is used.
+
+ Select the locales to allow from the list below:
+
+ en - Western character sets in general
+ ja - Japanese character sets
+ ko - Korean character sets
+ ru - Cyrillic character sets
+ th - Thai character sets
+ zh - Chinese (both simplified and traditional) character sets
+
+ normalize_charset ( 0 | 1 ) (default: 0)
+ Whether to detect character sets and normalize message content
+ to Unicode. Requires the Encode::Detect module, HTML::Parser
+ version 3.46 or later, and Perl 5.8.5 or later.
+
+ NETWORK TEST OPTIONS
+ trusted_networks ip.add.re.ss[/mask] ... (default: none)
+ What networks or hosts are 'trusted' in your setup. Trusted in
+ this case means that relay hosts on these networks are
+ considered to not be potentially operated by spammers, open
+ relays, or open proxies. A trusted host could conceivably relay
+ spam, but will not originate it, and will not forge header data.
+ DNS blacklist checks will never query for hosts on these
+ networks.
+
+ See "http://wiki.apache.org/spamassassin/TrustPath" for more
+ information.
+
+ MXes for your domain(s) and internal relays should also be
+ specified using the "internal_networks" setting. When there are
+ 'trusted' hosts that are not MXes or internal relays for your
+ domain(s) they should only be specified in "trusted_networks".
+
+ If a "/mask" is specified, it's considered a CIDR-style
+ 'netmask', specified in bits. If it is not specified, but fewer
+ than 4 octets are specified with a trailing dot, that's
+ considered a mask to allow all addresses in the remaining
+ octets. If a mask is not specified, and there is no trailing
+ dot, then just the single IP address specified is used, as if
+ the mask was "/32".
+
+ If a network or host address is prefaced by a "!" the network or
+ host will be excluded (or included) in a first listed match
+ fashion.
+
+ Note: 127/8 is always included in trusted_networks, regardless
+ of your config.
+
+ Examples:
+
+ trusted_networks 192.168/16 # all in 192.168.*.*
+ trusted_networks 212.17.35.15 # just that host
+ trusted_networks !10.0.1.5 10.0.1/24 # all in 10.0.1.* but not 10.0.1.5
+
+ This operates additively, so a "trusted_networks" line after
+ another one will result in all those networks becoming trusted.
+ To clear out the existing entries, use "clear_trusted_networks".
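+
+ e.g., to discard entries inherited from earlier configuration
+ files and start again (the address is illustrative; the second
+ line uses the trailing-dot form described above, which should
+ cover the same range as 192.168/16):
+
+ clear_trusted_networks
+ trusted_networks 192.168.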
+
+ If "trusted_networks" is not set and "internal_networks" is, the
+ value of "internal_networks" will be used for this parameter.
+
+ If neither "trusted_networks" nor "internal_networks" is set, a
+ basic inference algorithm is applied. This works as follows:
+
+ * If the 'from' host has an IP address in a private (RFC 1918)
+ network range, then it's trusted
+
+ * If there are authentication tokens in the received header,
+ and the previous host was trusted, then this host is also
+ trusted
+
+ * Otherwise this host, and all further hosts, are considered
+ untrusted.
+
+ clear_trusted_networks
+ Empty the list of trusted networks.
+
+ internal_networks ip.add.re.ss[/mask] ... (default: none)
+ What networks or hosts are 'internal' in your setup. Internal
+ means that relay hosts on these networks are considered to be
+ MXes for your domain(s), or internal relays. This uses the same
+ format as "trusted_networks", above.
+
+ This value is used when checking 'dial-up' or dynamic IP address
+ blocklists, in order to detect direct-to-MX spamming.
+
+ Trusted relays that accept mail directly from dial-up
+ connections should not be listed in "internal_networks". List
+ them only in "trusted_networks".
+
+ If "trusted_networks" is set and "internal_networks" is not, the
+ value of "trusted_networks" will be used for this parameter.
+
+ If neither "trusted_networks" nor "internal_networks" is set, no
+ addresses will be considered local; in other words, any relays
+ past the machine where SpamAssassin is running will be
+ considered external.
+
+ Every entry in "internal_networks" must appear in
+ "trusted_networks"; in other words, "internal_networks" is
+ always a subset of the trusted set.
+
+ Note: 127/8 is always included in internal_networks, regardless
+ of your config.
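+
+ A sketch for a site whose MXes and internal relays live in
+ 192.168.*.* and which also trusts one outside relay (all addresses
+ are placeholders). Note that every internal entry also appears in
+ the trusted list:
+
+ trusted_networks 192.168/16 212.17.35.15
+ internal_networks 192.168/16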
+
+ clear_internal_networks
+ Empty the list of internal networks.
+
+ msa_networks ip.add.re.ss[/mask] ... (default: none)
+ The networks or hosts that act as MSAs (mail submission agents)
+ in your setup. MSA means that the relay hosts on these networks
+ accept mail from your own users and authenticate them
+ appropriately. These relays will never accept mail from hosts
+ that aren't authenticated in some way. Examples of
+ authentication include IP lists, SMTP AUTH, POP-before-SMTP, etc.
+
+ All relays found in the message headers after the MSA relay will
+ take on the same trusted and internal classifications as the MSA
+ relay itself, as defined by your *trusted_networks* and
+ *internal_networks* configuration.
+
+ For example, if the MSA relay is trusted and internal, so are
+ all of the relays that precede it.
+
+ When using msa_networks to identify an MSA it is recommended
+ that you treat that MSA as both trusted and internal. When an
+ MSA is not included in msa_networks you should treat the MSA as
+ trusted but not internal, however if the MSA is also acting as
+ an MX or intermediate relay you must always treat it as both
+ trusted and internal and ensure that the MSA includes visible
+ auth tokens in its Received header to identify submission
+ clients.
+
+ Warning: Never include an MSA that also acts as an MX (or is
+ also an intermediate relay for an MX) or otherwise accepts mail
+ from non-authenticated users in msa_networks. Doing so will
+ result in unknown external relays being trusted.
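+
+ e.g., assuming a dedicated submission host at 192.168.1.25 that is
+ neither an MX nor an intermediate relay (placeholder addresses):
+
+ trusted_networks 192.168/16
+ internal_networks 192.168/16
+ msa_networks 192.168.1.25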
+
+ clear_msa_networks
+ Empty the list of msa networks.
+
+ always_trust_envelope_sender ( 0 | 1 ) (default: 0)
+ Trust the envelope sender even if the message has been passed
+ through one or more trusted relays. See also
+ "envelope_sender_header".
+
+ skip_rbl_checks ( 0 | 1 ) (default: 0)
+ By default, SpamAssassin will run RBL checks. If your ISP
+ already does this for you, set this to 1.
+
+ dns_available { yes | test[: name1 name2...] | no } (default: test)
+ By default, SpamAssassin will query some default hosts on the
+ internet to attempt to check if DNS is working or not. The
+ problem is that it can introduce some delay if your network
+ connection is down, and in some cases it can wrongly guess that
+ DNS is unavailable because the test connections failed.
+ SpamAssassin includes a default set of 13 servers, among which 3
+ are picked randomly.
+
+ You can however specify your own list by specifying
+
+ dns_available test: domain1.tld domain2.tld domain3.tld
+
+ Please note, the DNS test queries for NS records.
+
+ SpamAssassin's network rules are run in parallel. This can cause
+ overhead in terms of the number of file descriptors required; it
+ is recommended that the minimum limit on file descriptors be
+ raised to at least 256 for safety.
+
+ dns_test_interval n (default: 600 seconds)
+ If dns_available is set to 'test' (which is the default), the
+ dns_test_interval time in number of seconds will tell
+ SpamAssassin how often to retest for working DNS.
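+
+ e.g., to retest every five minutes instead of every ten
+ (illustrative value):
+
+ dns_test_interval 300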
+
+ LEARNING OPTIONS
+ use_bayes ( 0 | 1 ) (default: 1)
+ Whether to use the naive-Bayesian-style classifier built into
+ SpamAssassin. This is a master on/off switch for all
+ Bayes-related operations.
+
+ use_bayes_rules ( 0 | 1 ) (default: 1)
+ Whether to use rules using the naive-Bayesian-style classifier
+ built into SpamAssassin. This allows you to disable the rules
+ while leaving auto and manual learning enabled.
+
+ bayes_auto_learn ( 0 | 1 ) (default: 1)
+ Whether SpamAssassin should automatically feed high-scoring
+ mails (or low-scoring mails, for non-spam) into its learning
+ systems. The only learning system supported currently is a
+ naive-Bayesian-style classifier.
+
+ See the documentation for the
+ "Mail::SpamAssassin::Plugin::AutoLearnThreshold" plugin module
+ for details on how Bayes auto-learning is implemented by
+ default.
+
+ bayes_ignore_header header_name
+ If you receive mail filtered by upstream mail systems, like a
+ spam-filtering ISP or mailing list, and that service adds new
+ headers (as most of them do), these headers may provide
+ inappropriate cues to the Bayesian classifier, allowing it to
+ take a "short cut". To avoid this, list the headers using this
+ setting. Example:
+
+ bayes_ignore_header X-Upstream-Spamfilter
+ bayes_ignore_header X-Upstream-SomethingElse
+
+ bayes_ignore_from add@ress.com
+ Bayesian classification and autolearning will not be performed
+ on mail from the listed addresses. Program "sa-learn" will also
+ ignore the listed addresses if it is invoked using the
+ "--use-ignores" option. One or more addresses can be listed, see
+ "whitelist_from".
+
+ Spam messages from certain senders may contain many words that
+ frequently occur in ham. For example, one might read messages
+ from a preferred bookstore but also get unwanted spam messages
+ from other bookstores. If the unwanted messages are learned as
+ spam then any messages discussing books, including the preferred
+ bookstore and antiquarian messages would be in danger of being
+ marked as spam. The addresses of the annoying bookstores would
+ be listed. (Assuming they were halfway legitimate and didn't
+ send you mail through myriad affiliates.)
+
+ Those who have pieces of spam in legitimate messages or
+ otherwise receive ham messages containing potentially spammy
+ words might fear that some spam messages might be in danger of
+ being marked as ham. The addresses of the spam mailing lists,
+ correspondents, etc. would be listed.
+
+ bayes_ignore_to add@ress.com
+ Bayesian classification and autolearning will not be performed
+ on mail to the listed addresses. See "bayes_ignore_from" for
+ details.
+
+ bayes_min_ham_num (default: 200)
+ bayes_min_spam_num (default: 200)
+ To be accurate, the Bayes system does not activate until a
+ certain number of ham (non-spam) and spam have been learned. The
+ default is 200 each of ham and spam, but you can tune these up
+ or down with these two settings.
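+
+ e.g., to require a larger training corpus before Bayes activates
+ (illustrative values):
+
+ bayes_min_ham_num 500
+ bayes_min_spam_num 500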
+
+ bayes_learn_during_report (default: 1)
+ The Bayes system will, by default, learn any reported messages
+ ("spamassassin -r") as spam. If you do not want this to happen,
+ set this option to 0.
+
+ bayes_sql_override_username
+ Used by BayesStore::SQL storage implementation.
+
+ If this option is set, the BayesStore::SQL module will override
+ the set username with the value given. This could be useful for
+ implementing global or group bayes databases.
+
+ bayes_use_hapaxes (default: 1)
+ Should the Bayesian classifier use hapaxes (words/tokens that
+ occur only once) when classifying? This produces significantly
+ better hit-rates, but increases database size by about a factor
+ of 8 to 10.
+
+ bayes_journal_max_size (default: 102400)
+ SpamAssassin will opportunistically sync the journal and the
+ database. It will do so once a day, but will sync more often if
+ the journal file size goes above this setting, in bytes. If set
+ to 0, opportunistic syncing will not occur.
+
+ bayes_expiry_max_db_size (default: 150000)
+ What should be the maximum size of the Bayes tokens database?
+ When expiry occurs, the Bayes system will keep either 75% of the
+ maximum value, or 100,000 tokens, whichever has a larger value.
+ 150,000 tokens is roughly equivalent to an 8Mb database file.
+
+ bayes_auto_expire (default: 1)
+ If enabled, the Bayes system will try to automatically expire
+ old tokens from the database. Auto-expiry occurs when the number
+ of tokens in the database surpasses the bayes_expiry_max_db_size
+ value.
+
+ bayes_learn_to_journal (default: 0)
+ If this option is set, whenever SpamAssassin does Bayes
+ learning, it will put the information into the journal instead
+ of directly into the database. This lowers contention for
+ locking the database to execute an update, but will also cause
+ more access to the journal and cause a delay before the updates
+ are actually committed to the Bayes database.
+
+ MISCELLANEOUS OPTIONS
+ lock_method type
+ Select the file-locking method used to protect database files
+ on-disk. By default, SpamAssassin uses an NFS-safe locking
+ method on UNIX; however, if you are sure that the database files
+ you'll be using for Bayes and AWL storage will never be accessed
+ over NFS, a non-NFS-safe locking system can be selected.
+
+ This will be quite a bit faster, but may risk file corruption if
+ the files are ever accessed by multiple clients at once, and one
+ or more of them is accessing them through an NFS filesystem.
+
+ Note that different platforms require different locking systems.
+
+ The supported locking systems for "type" are as follows:
+
+ *nfssafe* - an NFS-safe locking system
+ *flock* - simple UNIX "flock()" locking
+ *win32* - Win32 locking using "sysopen (..., O_CREAT|O_EXCL)".
+
+ nfssafe and flock are only available on UNIX, and win32 is only
+ available on Windows. By default, SpamAssassin will choose
+ either nfssafe or win32 depending on the platform in use.
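+
+ e.g., on a UNIX host where the Bayes and AWL files are known to
+ live on a local disk and are never reached over NFS:
+
+ lock_method flock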
+
+ fold_headers ( 0 | 1 ) (default: 1)
+ By default, headers added by SpamAssassin will be whitespace
+ folded. In other words, they will be broken up into multiple
+ lines instead of one very long one, and each continuation line
+ will have a tab character prepended to mark it as a continuation
+ of the preceding one.
+
+ The automatic wrapping can be disabled here. Note that this can
+ generate very long lines.
+
+ report_safe_copy_headers header_name ...
+ If using "report_safe", a few of the headers from the original
+ message are copied into the wrapper header (From, To, Cc,
+ Subject, Date, etc.). If you want to have other headers copied as
+ well, you can add them using this option. You can specify
+ multiple headers on the same line, separated by spaces, or you
+ can just use multiple lines.
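+
+ A short sketch of both forms; the header names are only
+ examples chosen for illustration:
+
+ ```
+ # several headers on one line, separated by spaces
+ report_safe_copy_headers X-Mailing-List List-Id
+ # or one option line per header
+ report_safe_copy_headers X-Original-To
+ ```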
+
+ envelope_sender_header Name-Of-Header
+ SpamAssassin will attempt to discover the address used in the
+ 'MAIL FROM:' phase of the SMTP transaction that delivered this
+ message, if this data has been made available by the SMTP
+ server. This is used in the "EnvelopeFrom" pseudo-header, and
+ for various rules such as SPF checking.
+
+ By default, various MTAs will use different headers, such as the
+ following:
+
+ X-Envelope-From
+ Envelope-Sender
+ X-Sender
+ Return-Path
+
+ SpamAssassin will attempt to use these, if some heuristics (such
+ as the header placement in the message, or the absence of
+ fetchmail signatures) appear to indicate that they are safe to
+ use. However, it may choose the wrong headers in some mailserver
+ configurations. (More discussion of this can be found in bug
+ 2142 and bug 4747 in the SpamAssassin BugZilla.)
+
+ To avoid this heuristic failure, the "envelope_sender_header"
+ setting may be helpful. Name the header that your MTA adds to
+ messages containing the address used at the MAIL FROM step of
+ the SMTP transaction.
+
+ If the header in question contains "<" or ">" characters at the
+ start and end of the email address in the right-hand side, as in
+ the SMTP transaction, these will be stripped.
+
+ If the header is not found in a message, or if its value does
+ not contain an "@" sign, SpamAssassin will issue a warning in
+ the logs and fall back to its default heuristics.
+
+ (Note for MTA developers: we would prefer that the use of a single
+ header be avoided in future, since that precludes 'downstream'
+ spam scanning.
+ "http://wiki.apache.org/spamassassin/EnvelopeSenderInReceived"
+ details a better proposal, storing the envelope sender at each
+ hop in the "Received" header.)
+
+ example:
+
+ envelope_sender_header X-SA-Exim-Mail-From
+
+ describe SYMBOLIC_TEST_NAME description ...
+ Used to describe a test. This text is shown to users in the
+ detailed report.
+
+ Note that test names which begin with '__' are reserved for
+ meta-match sub-rules, and are not scored or listed in the 'tests
+ hit' reports.
+
+ Also note that by convention, rule descriptions should be
+ limited in length to no more than 50 characters.
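+
+ For example, a description is normally paired with the rule it
+ documents. The rule name below is hypothetical, and the pattern
+ is borrowed from the SYNOPSIS:
+
+ ```
+ header LOCAL_FROM_NUMS From =~ /\d+[a-z]+\d+\S*@/i
+ # description kept under the conventional 50 characters
+ describe LOCAL_FROM_NUMS From: has digits mixed in with letters
+ ```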
+
+ report_charset CHARSET (default: unset)
+ Set the MIME Content-Type charset used for the text/plain report
+ which is attached to spam mail messages.
+
+ report ...some text for a report...
+ Set the report template which is attached to spam mail messages.
+ See the "10_default_prefs.cf" configuration file in
+ "/usr/share/spamassassin" for an example.
+
+ If you change this, try to keep it under 78 columns. Each
+ "report" line appends to the existing template, so use
+ "clear_report_template" to restart.
+
+ Tags can be included as explained above.
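+
+ A minimal sketch of replacing the template from scratch. The
+ wording is illustrative and is not the shipped default text:
+
+ ```
+ # discard whatever template has been built up so far
+ clear_report_template
+ # each "report" line appends one line to the template
+ report Spam detection software, running on the system "_HOSTNAME_",
+ report has identified this incoming email as possible spam.
+ report Score: _SCORE_ points, _REQD_ required.
+ report For questions, contact _CONTACTADDRESS_.
+ ```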
+
+ clear_report_template
+ Clear the report template.
+
+ report_contact ...text of contact address...
+ Set what _CONTACTADDRESS_ is replaced with in the above report
+ text. By default, this is 'the administrator of that system',
+ since the hostname of the system the scanner is running on is
+ also included.
+
+ report_hostname ...hostname to use...
+ Set what _HOSTNAME_ is replaced with in the above report text.
+ By default, this is determined dynamically as whatever the host
+ running SpamAssassin calls itself.
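+
+ For instance, the two substitutions could be fixed as follows;
+ the address and hostname are placeholders:
+
+ ```
+ # value substituted for _CONTACTADDRESS_
+ report_contact postmaster@example.com
+ # value substituted for _HOSTNAME_
+ report_hostname mail.example.com
+ ```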
+
+ unsafe_report ...some text for a report...
+ Set the report template which is attached to spam mail messages
+ which contain a non-text/plain part. See the
+ "10_default_prefs.cf" configuration file in
+ "/usr/share/spamassassin" for an example.
+
+ Each "unsafe_report" line appends to the existing template, so
+ use "clear_unsafe_report_template" to restart.
+
+ Tags can be used in this template (see above for details).
+
+ clear_unsafe_report_template
+ Clear the unsafe_report template.
+
+RULE DEFINITIONS AND PRIVILEGED SETTINGS
+ These settings differ from the ones above, in that they are
+ considered 'privileged'. Only users running "spamassassin" from
+ their procmailrc's or forward files, or sysadmins editing a file in
+ "/etc/mail/spamassassin", can use them. "spamd" users cannot use
+ them in their "user_prefs" files, for security and efficiency
+ reasons, unless "allow_user_rules" is enabled (and then, they may
+ only add rules from below).
+
+ allow_user_rules ( 0 | 1 ) (default: 0)
+ This setting allows users to create rules (and only rules) in
+ their "user_prefs" files for use with "spamd". It defaults to
+ off, because this could be a severe security hole. It may be
+ possible for users to gain root level access if "spamd" is run
+ as root. It is NOT a good idea, unless you have some other way
+ of ensuring that users' tests are safe. Don't use this unless
+ you are certain you know what you are doing. Furthermore, this
+ option causes spamassassin to recompile all the tests each time
+ it processes a message for a user with a rule in his/her
+ "user_prefs" file, which could have a significant effect on
+ server load. It is not recommended.
+
+ Note that it is not currently possible to use "allow_user_rules"
+ to modify an existing system rule from a "user_prefs" file with
+ "spamd".
+
+ redirector_pattern /pattern/modifiers
+ A regex pattern that matches both the redirector site portion,
+ and the target site portion of a URI.
+
+ Note: The target URI portion must be surrounded in parentheses
+ and no other part of the pattern may create a backreference.
+
+ Example:
+ http://chkpt.zdnet.com/chkpt/whatever/spammer.domain/yo/dude
+
+ redirector_pattern /^https?:\/\/(?:opt\.)?chkpt\.zdnet\.com\/chkpt\/\w+\/(.*)$/i
+
+ header SYMBOLIC_TEST_NAME header op /pattern/modifiers [if-unset:
+ STRING]
+ Define a test. "SYMBOLIC_TEST_NAME" is a symbolic test name,
+ such as 'FROM_ENDS_IN_NUMS'. "header" is the name of a mail
+ header, such as 'Subject', 'To', etc.
+
+ Appending ":raw" to the header name will inhibit decoding of
+ quoted-printable or base-64 encoded strings.
+
+ Appending ":addr" to the header name will cause everything
+ except the first email address to be removed from the header.
+ For example, all of the following will result in "example@foo":
+
+ example@foo
+ example@foo (Foo Blah)
+ example@foo, example@bar
+ display: example@foo (Foo Blah), example@bar ;
+ Foo Blah <ex...@foo>
+ "Foo Blah" <ex...@foo>
+ "'Foo Blah'" <ex...@foo>
+
+ Appending ":name" to the header name will cause everything
+ except the first real name to be removed from the header. For
+ example, all of the following will result in "Foo Blah":
+
+ example@foo (Foo Blah)
+ example@foo (Foo Blah), example@bar
+ display: example@foo (Foo Blah), example@bar ;
+ Foo Blah <ex...@foo>
+ "Foo Blah" <ex...@foo>
+ "'Foo Blah'" <ex...@foo>
+
+ There are several special pseudo-headers that can be specified:
+
+ "ALL" can be used to mean the text of all the message's headers.
+ "ToCc" can be used to mean the contents of both the 'To' and
+ 'Cc' headers.
+ "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of
+ the SMTP transaction that delivered this message, if this data
+ has been made available by the SMTP server. See
+ "envelope_sender_header" for more information on how to set
+ this.
+ "MESSAGEID" is a symbol meaning all Message-Id's found in the
+ message; some mailing list software moves the real 'Message-Id'
+ to 'Resent-Message-Id' or 'X-Message-Id', then uses its own one
+ in the 'Message-Id' header. The value returned for this symbol
+ is the text from all 3 headers, separated by newlines.
+ "X-Spam-Relays-Untrusted", "X-Spam-Relays-Trusted",
+ "X-Spam-Relays-Internal" and "X-Spam-Relays-External" represent
+ a portable, pre-parsed representation of the message's network
+ path, as recorded in the Received headers, divided into
+ 'trusted' vs 'untrusted' and 'internal' vs 'external' sets. See
+ "http://wiki.apache.org/spamassassin/TrustedRelays" for more
+ details.
+
+ "op" is either "=~" (contains regular expression) or "!~" (does
+ not contain regular expression), and "pattern" is a valid Perl
+ regular expression, with "modifiers" as regexp modifiers in the
+ usual style. Note that multi-line rules are not supported, even
+ if you use "x" as a modifier. Also note that the "#" character
+ must be escaped ("\#") or else it will be considered to be the
+ start of a comment and not part of the regexp.
+
+ If the "[if-unset: STRING]" tag is present, then "STRING" will
+ be used if the header is not found in the mail message.
+
+ Test names must not start with a number, and must contain only
+ alphanumerics and underscores. It is suggested that lower-case
+ characters not be used, and names have a length of no more than
+ 22 characters, as an informal convention. Dashes are not
+ allowed.
+
+ Note that test names which begin with '__' are reserved for
+ meta-match sub-rules, and are not scored or listed in the 'tests
+ hit' reports. Test names which begin with 'T_' are reserved for
+ tests which are undergoing QA, and these are given a very low
+ score.
+
+ If you add or modify a test, please be sure to run a sanity
+ check afterwards by running "spamassassin --lint". This will
+ avoid confusing error messages, or other tests being skipped as
+ a side-effect.
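+
+ The features described above can be sketched as follows. All
+ rule names here are hypothetical, and the patterns are only
+ examples:
+
+ ```
+ # ":addr" keeps only the first address found in the From header
+ header LOCAL_FROM_EXAMPLE From:addr =~ /\@example\.com$/i
+ describe LOCAL_FROM_EXAMPLE From address is at example.com
+
+ # a literal "#" inside the pattern must be written as "\#"
+ header LOCAL_SUBJ_TICKET Subject =~ /ticket \#\d+/i
+
+ # "if-unset" supplies a value to match when the header is missing
+ header LOCAL_NO_REPLY_TO Reply-To =~ /^UNSET$/ [if-unset: UNSET]
+ ```
+
+ After adding rules like these, "spamassassin --lint" should
+ report no errors.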
+
+ header SYMBOLIC_TEST_NAME exists:name_of_header
+ Define a header existence test. "name_of_header" is the name of
+ a header to test for existence. This is just a very simple
+ version of the above header tests.
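+
+ A one-line sketch, using a made-up rule name and header name:
+
+ ```
+ # hits whenever the message carries an X-Local-Marker header
+ header LOCAL_HAS_MARKER exists:X-Local-Marker
+ ```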
+
+ header SYMBOLIC_TEST_NAME eval:name_of_eval_method([arguments])
+ Define a header eval test. "name_of_eval_method" is the name of
+ a method on the "Mail::SpamAssassin::EvalTests" object.
+ "arguments" are optional arguments to the function call.
+
+ header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone' [,
+ 'sub-test'])
+ Check a DNSBL (a DNS blacklist or whitelist). This will retrieve
+ Received: headers from the message, extract the IP addresses,
+ select which ones are 'untrusted' based on the
+ "trusted_networks" logic, and query that DNSBL zone. There are a
+ few things to note:
+
+ duplicated or private IPs
+ Duplicated IPs are only queried once and reserved IPs are
+ not queried. Private IPs are those listed in
+ <http://www.iana.org/assignments/ipv4-address-space>,
+ <http://duxcw.com/faq/network/privip.htm>,
+ <http://duxcw.com/faq/network/autoip.htm>, or
+ <ftp://ftp.rfc-editor.org/in-notes/rfc3330.txt> as private.
+
+ the 'set' argument
+ This is used as a 'zone ID'. If you want to look up a
+ multiple-meaning zone like NJABL or SORBS, you can then
+ query the results from that zone using it; but all
+ check_rbl_sub() calls must use that zone ID.
+
+ Also, if more than one IP address gets a DNSBL hit for a
+ particular rule, it does not affect the score because rules
+ only trigger once per message.
+
+ the 'zone' argument
+ This is the root zone of the DNSBL, ending in a period.
+
+ the 'sub-test' argument
+ This optional argument behaves the same as the sub-test
+ argument in "check_rbl_sub()" below.
+
+ selecting all IPs except for the originating one
+ This is accomplished by placing '-notfirsthop' at the end of
+ the set name. This is useful for querying against DNS lists
+ which list dialup IP addresses; the first hop may be a
+ dialup, but as long as there is at least one more hop, via
+ their outgoing SMTP server, that's legitimate, and so should
+ not gain points. If there is only one hop, that will be
+ queried anyway, as it should be relaying via its outgoing
+ SMTP server instead of sending directly to your MX (mail
+ exchange).
+
+ selecting IPs by whether they are trusted
+ When checking a 'nice' DNSBL (a DNS whitelist), you cannot
+ trust the IP addresses in Received headers that were not
+ added by trusted relays. To test the first IP address that
+ can be trusted, place '-firsttrusted' at the end of the set
+ name. That should test the IP address of the relay that
+ connected to the most remote trusted relay.
+
+ Note that this requires that SpamAssassin know which relays
+ are trusted. For simple cases, SpamAssassin can make a good
+ estimate. For complex cases, you may get better results by
+ setting "trusted_networks" manually.
+
+ In addition, you can test all untrusted IP addresses by
+ placing '-untrusted' at the end of the set name. Important
+ note -- this does NOT include the IP address from the most
+ recent 'untrusted line', as used in '-firsttrusted' above.
+ That's because we're talking about the trustworthiness of
+ the IP address data, not the source header line, here; and
+ in the case of the most recent header (the 'firsttrusted'),
+ that data can be trusted. See the Wiki page at
+ "http://wiki.apache.org/spamassassin/TrustedRelays" for more
+ information on this.
+
+ selecting just the last external IP
+ By using '-lastexternal' at the end of the set name, you can
+ select only the external host that connected to your
+ internal network, or at least the last external host with a
+ public IP.
+
+ header SYMBOLIC_TEST_NAME eval:check_rbl_txt('set', 'zone')
+ Same as check_rbl(), except querying using IN TXT instead of IN
+ A records. If the zone supports it, it will result in a line of
+ text describing why the IP is listed, typically a hyperlink to a
+ database entry.
+
+ header SYMBOLIC_TEST_NAME eval:check_rbl_sub('set', 'sub-test')
+ Create a sub-test for 'set'. If you want to look up a
+ multi-meaning zone like relays.osirusoft.com, you can then query
+ the results from that zone using the zone ID from the original
+ query. The sub-test may be one of the following: an IPv4 dotted
+ address, for RBLs that return multiple A records; a non-negative
+ decimal number specifying a bitmask, for RBLs that return a
+ single A record containing a bitmask of results; a SenderBase
+ test beginning with "sb:"; or (if none of the preceding options
+ seem to fit) a regular expression.
+
+ Note: the set name must be exactly the same as for the main
+ query rule, including selections like '-notfirsthop' appearing
+ at the end of the set name.
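+
+ A sketch of how a main query and a sub-test fit together. The
+ zone "dnsbl.example.org." and the meaning attached to 127.0.0.2
+ are assumptions made for illustration, as are the rule names:
+
+ ```
+ # query the zone once, for the last external relay only
+ header __LOCAL_RBL_EX eval:check_rbl('exdnsbl-lastexternal', 'dnsbl.example.org.')
+ tflags __LOCAL_RBL_EX net
+
+ # sub-test on one A record; the set name matches the one above exactly
+ header LOCAL_RBL_EX_SPAM eval:check_rbl_sub('exdnsbl-lastexternal', '127.0.0.2')
+ describe LOCAL_RBL_EX_SPAM Relay listed in example DNSBL (spam source)
+ tflags LOCAL_RBL_EX_SPAM net
+ ```
+
+ Naming the main query with a leading '__' keeps it out of the
+ score and the report, so only the sub-test is visible.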
+
+ body SYMBOLIC_TEST_NAME /pattern/modifiers
+ Define a body pattern test. "pattern" is a Perl regular
+ expression. Note: as per the header tests, "#" must be escaped
+ ("\#") or else it is considered the beginning of a comment.
+
+ The 'body' in this case is the textual parts of the message
+ body; any non-text MIME parts are stripped, and the message
+ decoded from Quoted-Printable or Base-64-encoded format if
+ necessary. The message Subject header is considered part of the
+ body and becomes the first paragraph when running the rules. All
+ HTML tags and line breaks will be removed before matching.
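+
+ For example (hypothetical rule name and phrase):
+
+ ```
+ # matched against decoded text with HTML tags and line breaks removed
+ body LOCAL_BODY_PHRASE /\bclick here to unsubscribe\b/i
+ describe LOCAL_BODY_PHRASE Body contains an unsubscribe phrase
+ ```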
+
+ body SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
+ Define a body eval test. See above.
+
+ uri SYMBOLIC_TEST_NAME /pattern/modifiers
+ Define a uri pattern test. "pattern" is a Perl regular
+ expression. Note: as per the header tests, "#" must be escaped
+ ("\#") or else it is considered the beginning of a comment.
+
+ The 'uri' in this case is a list of all the URIs in the body of
+ the email, and the test will be run on each and every one of
+ those URIs, adjusting the score if a match is found. Use this
+ test instead of one of the body tests when you need to match a
+ URI, as it is more accurately bound to the start/end points of
+ the URI, and will also be faster.
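+
+ A small sketch, assuming you want to match links that point at
+ any host under example.com (the rule name is made up):
+
+ ```
+ # anchored at the start of each URI found in the body
+ uri LOCAL_URI_EXAMPLE /^https?:\/\/[^\/]*\.example\.com\//i
+ ```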
+
+ rawbody SYMBOLIC_TEST_NAME /pattern/modifiers
+ Define a raw-body pattern test. "pattern" is a Perl regular
+ expression. Note: as per the header tests, "#" must be escaped
+ ("\#") or else it is considered the beginning of a comment.
+
+ The 'raw body' of a message is the raw data inside all textual
+ parts. The text will be decoded from base64 or quoted-printable
+ encoding, but HTML tags and line breaks will still be present.
+ The pattern will be applied line-by-line.
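+
+ Because HTML tags are still present, a raw-body rule can match
+ markup directly. A sketch with a hypothetical rule name:
+
+ ```
+ # looks for a zero-size font tag within a single line of the raw text
+ rawbody LOCAL_RAW_FONT0 /<font\s+size=0/i
+ ```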
+
+ rawbody SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
+ Define a raw-body eval test. See above.
+
+ full SYMBOLIC_TEST_NAME /pattern/modifiers
+ Define a full message pattern test. "pattern" is a Perl regular
+ expression. Note: as per the header tests, "#" must be escaped
+ ("\#") or else it is considered the beginning of a comment.
+
+ The full message is the pristine message headers plus the
+ pristine message body, including all MIME data such as images,
+ other attachments, MIME boundaries, etc.
+
+ full SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
+ Define a full message eval test. See above.
+
+ meta SYMBOLIC_TEST_NAME boolean expression
+ Define a boolean expression test in terms of other tests that
+ have been hit or not hit. For example:
+
+ meta META1 TEST1 && !(TEST2 || TEST3)
+
+ Note that English language operators ("and", "or") will be
+ treated as rule names, and that there is no "XOR" operator.
+
+ meta SYMBOLIC_TEST_NAME boolean arithmetic expression
+ A meta test can also be a boolean arithmetic expression in terms
+ of other tests, with an unhit test having the value "0" and a hit
+ test having a nonzero value. The value of a hit meta test is
+ that of its arithmetic expression. The value of a hit eval test
+ is that returned by its method. The value of a hit header, body,
+ rawbody, uri, or full test which has the "multiple" tflag is the
+ number of times the test hit. The value of any other type of hit
+ test is "1".
+
+ For example:
+
+ meta META2 (3 * TEST1 - 2 * TEST2) > 0
+
+ Note that Perl builtins and functions, like "abs()", can't be
+ used, and will be treated as rule names.
+
+ If you want to define a meta-rule, but do not want its
+ individual sub-rules to count towards the final score unless the
+ entire meta-rule matches, give the sub-rules names that start
+ with '__' (two underscores). SpamAssassin will ignore these for
+ scoring.
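+
+ Putting these pieces together, a sketch with hypothetical rule
+ names and patterns:
+
+ ```
+ # sub-rules: the '__' prefix keeps them out of the score and report
+ header __LOCAL_SUBJ_OFFER Subject =~ /\bspecial offer\b/i
+ body __LOCAL_BODY_UNSUB /\bunsubscribe\b/i
+
+ # boolean form: offer in the subject but no unsubscribe text
+ meta LOCAL_OFFER_NO_UNSUB __LOCAL_SUBJ_OFFER && !__LOCAL_BODY_UNSUB
+ score LOCAL_OFFER_NO_UNSUB 1.5
+
+ # arithmetic form: hits when at least one sub-rule hit
+ meta LOCAL_OFFER_ANY (__LOCAL_SUBJ_OFFER + __LOCAL_BODY_UNSUB) > 0
+ ```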
+
+ tflags SYMBOLIC_TEST_NAME [
+ {net|nice|learn|userconf|noautolearn|multiple} ]
+ Used to set flags on a test. These flags are used in the
+ score-determination back end system for details of the test's
+ behaviour. Please see "bayes_auto_learn" for more information
+ about tflag interaction with those systems. The following flags
+ can be set:
+
+ net The test is a network test, and will not be run in the mass
+ checking system or if -L is used, therefore its score should
+ not be modified.
+
+ nice
+ The test is intended to compensate for common false
+ positives, and should be assigned a negative score.
+
+ userconf
+ The test requires user configuration before it can be used
+ (like language-specific tests).
+
+ learn
+ The test requires training before it can be used.
+
+ noautolearn
+ The test will explicitly be ignored when calculating the
+ score for learning systems.
+
+ multiple
+ The test will be evaluated multiple times, for use with meta
+ rules. Only affects header, body, rawbody, uri, and full
+ tests.
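+
+ As a sketch of the "multiple" flag feeding a meta rule (rule
+ names and the threshold are illustrative):
+
+ ```
+ # count every hit of the pattern instead of stopping at the first
+ body __LOCAL_FREE /\bfree\b/i
+ tflags __LOCAL_FREE multiple
+
+ # hits when the sub-rule matched more than three times
+ meta LOCAL_MANY_FREE __LOCAL_FREE > 3
+ ```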
+
+ priority SYMBOLIC_TEST_NAME n
+ Assign a specific priority to a test. All tests, except for DNS
+ and Meta tests, are run in increasing priority value order
+ (negative priority values are run before positive priority
+ values). The default test priority is 0 (zero).
+
+ The values <-99999999999999> and <-99999999999998> have a
+ special meaning internally, and should not be used.
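+
+ For instance, to have a (hypothetical) rule evaluated ahead of
+ rules that keep the default priority of 0:
+
+ ```
+ header LOCAL_EARLY_CHECK exists:X-Local-Marker
+ # lower values run first, so -100 runs before the default 0
+ priority LOCAL_EARLY_CHECK -100
+ ```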
+
+ADMINISTRATOR SETTINGS
+ These settings differ from the ones above, in that they are
+ considered 'more privileged' -- even more than the ones in the
+ PRIVILEGED SETTINGS section. No matter what "allow_user_rules" is
+ set to, these can never be set from a user's "user_prefs" file when
+ spamc/spamd is being used. However, all settings can be used by
+ local programs run directly by the user.
+
+ version_tag string
+ This tag is appended to the SA version in the X-Spam-Status
+ header. You should include it when you modify your ruleset,
+ especially if you plan to distribute it. A good choice for
+ *string* is your last name or your initials followed by a number
+ which you increase with each change.
+
+ The version_tag will be lowercased, and any non-alphanumeric or
+ period character will be replaced by an underscore.
+
+ e.g.
+
+ version_tag myrules1 # version=2.41-myrules1
+
+ test SYMBOLIC_TEST_NAME (ok|fail) Some string to test against
+ Define a regression testing string. You can have more than one
+ regression test string per symbolic test name. Simply specify a
+ string that you wish the test to match.
+
+ These tests are only run as part of the test suite - they should
+ not affect the general running of SpamAssassin.
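+
+ A sketch with a hypothetical rule, one string expected to match
+ and one expected not to:
+
+ ```
+ body LOCAL_BODY_PHRASE /\bclick here to unsubscribe\b/i
+ # "ok" strings should hit the rule, "fail" strings should not
+ test LOCAL_BODY_PHRASE ok Please click here to unsubscribe now
+ test LOCAL_BODY_PHRASE fail Please reply to this message
+ ```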
+
+ rbl_timeout n (default: 15)
+ All DNS queries are made at the beginning of a check and we try
+ to read the results at the end. This value specifies the maximum
+ period of time to wait for a DNS query. If most of the DNS
+ queries have succeeded for a particular message, then
+ SpamAssassin will not wait for the full period to avoid wasting
+ time on unresponsive server(s). For the default 15 second
+ timeout, here is a chart of queries remaining versus the
+ effective timeout in seconds:
+
+ queries left  100%   90%   80%   70%   60%   50%   40%   30%   20%   10%    0%
+ timeout         15    15    14    14    13    11    10     8     5     3     0
+
+ In addition, whenever the effective timeout is lowered due to
+ additional query results returning, the remaining queries are
+ always given at least one more second before timing out, but the
+ wait time will never exceed rbl_timeout.
+
+ For example, if 20 queries are made at the beginning of a
+ message check and 16 queries have returned (leaving 20%), the
+ remaining 4 queries must finish within 5 seconds of the
+ beginning of the check or they will be timed out.
+
+ util_rb_tld tld1 tld2 ...
+ This option allows the addition of new TLDs to the
+ RegistrarBoundaries code. Updates to the list usually happen
+ when new versions of SpamAssassin are released, but sometimes
+ it's necessary to add in new TLDs faster than a release can
+ occur. TLDs include things like com, net, org, etc.
+
+ util_rb_2tld 2tld-1.tld 2tld-2.tld ...
+ This option allows the addition of new 2nd-level TLDs (2TLD) to
+ the RegistrarBoundaries code. Updates to the list usually happen
+ when new versions of SpamAssassin are released, but sometimes
+ it's necessary to add in new 2TLDs faster than a release can
+ occur. 2TLDs include things like co.uk, fed.us, etc.
+
+ bayes_path /path/filename (default: ~/.spamassassin/bayes)
+ This is the directory and filename for Bayes databases. Several
+ databases will be created, with this as the base directory and
+ filename, with "_toks", "_seen", etc. appended to the base. The
+ default setting results in files called
+ "~/.spamassassin/bayes_seen", "~/.spamassassin/bayes_toks", etc.
+
+ By default, each user has their own database in their "~/.spamassassin"
+ directory with mode 0700/0600. For system-wide SpamAssassin use,
+ you may want to reduce disk space usage by sharing this across
+ all users. However, Bayes appears to be more effective with
+ individual user databases.
+
+ bayes_file_mode (default: 0700)
+ The file mode bits used for the Bayesian filtering database
+ files.
+
+ Make sure you specify this using the 'x' mode bits set, as it
+ may also be used to create directories. However, if a file is
+ created, the resulting file will not have any execute bits set
+ (the umask is set to 111).
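+
+ A sketch of a shared, site-wide setup. The directory is a
+ made-up example, and the mode assumes all scanning users belong
+ to one group:
+
+ ```
+ # creates /var/spamassassin/bayes/bayes_toks, bayes_seen, etc.
+ bayes_path /var/spamassassin/bayes/bayes
+ # 'x' bits included so the mode also works for directories
+ bayes_file_mode 0770
+ ```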
+
+ bayes_store_module Name::Of::BayesStore::Module
+ If this option is set, the module given will be used as an
+ alternate to the default bayes storage mechanism. It must
+ conform to the published storage specification (see
+ Mail::SpamAssassin::BayesStore). For example, set this to
+ Mail::SpamAssassin::BayesStore::SQL to use the generic SQL
+ storage module.
+
+ bayes_sql_dsn DBI:databasetype:databasename:hostname:port
+ Used for BayesStore::SQL storage implementation.
+
+ This option gives the connect string used to connect to the SQL
+ based Bayes storage.
+
+ bayes_sql_username
+ Used by BayesStore::SQL storage implementation.
+
+ This option gives the username used by the above DSN.
+
+ bayes_sql_password
+ Used by BayesStore::SQL storage implementation.
+
+ This option gives the password used by the above DSN.
+
+ bayes_sql_username_authorized ( 0 | 1 ) (default: 0)
+ Whether to call the services_authorized_for_username plugin hook
+ in BayesSQL. If the hook does not determine that the user is
+ allowed to use Bayes, or the user is invalid, then the database
+ will not be initialized.
+
+ NOTE: By default the user is considered invalid until a plugin
+ returns a true value. If you enable this, but do not have a
+ proper plugin loaded, all users will turn up as invalid.
+
+ The username passed into the plugin can be affected by the
+ bayes_sql_override_username config option.
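+
+ Taken together, a SQL-backed Bayes setup might look like the
+ following sketch. The database name, host, username and password
+ are placeholders:
+
+ ```
+ bayes_store_module Mail::SpamAssassin::BayesStore::SQL
+ bayes_sql_dsn DBI:mysql:spamassassin:localhost
+ bayes_sql_username sa_user
+ bayes_sql_password example-password
+ ```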
+
+ user_scores_dsn DBI:databasetype:databasename:hostname:port
+ If you load user scores from an SQL database, this will set the
+ DSN used to connect. Example: "DBI:mysql:spamassassin:localhost"
+
+ If you load user scores from an LDAP directory, this will set
+ the DSN used to connect. You have to write the DSN as an LDAP
+ URL, the components being the host and port to connect to, the
+ base DN for the search, the scope of the search (base, one or
+ sub), the single attribute being the multivalued attribute used
+ to hold the configuration data (space separated pairs of key and
+ value, just as in a file) and finally the filter being the
+ expression used to filter out the wanted username. Note that the
+ filter expression is being used in a sprintf statement with the
+ username as the only parameter, thus it can hold a single
+ __USERNAME__ expression. This will be replaced with the
+ username.
+
+ Example:
+ "ldap://localhost:389/dc=koehntopp,dc=de?spamassassinconfig?uid=
+ __USERNAME__"
+
+ user_scores_sql_username username
+ The authorized username to connect to the above DSN.
+
+ user_scores_sql_password password
+ The password for the database username, for the above DSN.
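+
+ For the SQL case, the three options could be combined as in this
+ sketch (the credentials are placeholders):
+
+ ```
+ user_scores_dsn DBI:mysql:spamassassin:localhost
+ user_scores_sql_username sa_user
+ user_scores_sql_password example-password
+ ```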
+
+ user_scores_sql_custom_query query
+ This option gives you the ability to create a custom SQL query
+ to retrieve user scores and preferences. In order to work
+ correctly your query should return two values, the preference
+ name and value, in that order. In addition, there are several
+ "variables" that you can use as part of your query, these
+ variables will be substituted for the current values right
+ before the query is run. The current allowed variables are:
+
+ _TABLE_
+ The name of the table where user scores and preferences are
+ stored. Currently hardcoded to userpref, to change this
+ value you need to create a new custom query with the new
+ table name.
+
+ _USERNAME_
+ The current user's username.
+
+ _MAILBOX_
+ The portion before the @ as derived from the current user's
+ username.
+
+ _DOMAIN_
+ The portion after the @ as derived from the current user's
+ username, this value may be null.
+
+ The query must be on one continuous line in order to parse
+ correctly.
+
+ Here are several example queries. Please note that these are
+ broken up for easy reading; in your config each should be one
+ continuous line.
+
+ Current default query:
+ "SELECT preference, value FROM _TABLE_ WHERE username =
+ _USERNAME_ OR username = '@GLOBAL' ORDER BY username ASC"
+
+ Use global and then domain level defaults:
+ "SELECT preference, value FROM _TABLE_ WHERE username =
+ _USERNAME_ OR username = '@GLOBAL' OR username =
+ '@~'||_DOMAIN_ ORDER BY username ASC"
+
+ Maybe global prefs should override user prefs:
+ "SELECT preference, value FROM _TABLE_ WHERE username =
+ _USERNAME_ OR username = '@GLOBAL' ORDER BY username DESC"
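+
+ Written as it would have to appear in a configuration file, the
+ default query shown above occupies a single line:
+
+ ```
+ user_scores_sql_custom_query SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_ OR username = '@GLOBAL' ORDER BY username ASC
+ ```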
+
+ user_scores_ldap_username
+ This is the Bind DN used to connect to the LDAP server.
+
+ Example: "cn=master,dc=koehntopp,dc=de"
+
+ user_scores_ldap_password
+ This is the password used to connect to the LDAP server.
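+
+ For the LDAP case, a sketch that reuses the example URL and Bind
+ DN from this section; the password is a placeholder:
+
+ ```
+ user_scores_dsn ldap://localhost:389/dc=koehntopp,dc=de?spamassassinconfig?uid=__USERNAME__
+ user_scores_ldap_username cn=master,dc=koehntopp,dc=de
+ user_scores_ldap_password example-password
+ ```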
+
+ loadplugin PluginModuleName [/path/module.pm]
+ Load a SpamAssassin plugin module. The "PluginModuleName" is the
+ perl module name, used to create the plugin object itself.
+
+ "/path/module.pm" is the file to load, containing the
+ module's perl code; if it's specified as a relative path, it's
+ considered to be relative to the current configuration file. If
+ it is omitted, the module will be loaded using perl's search
+ path (the @INC array).
+
+ See "Mail::SpamAssassin::Plugin" for more details on writing
+ plugins.
+
+ tryplugin PluginModuleName [/path/module.pm]
+ Same as "loadplugin", but silently ignored if the .pm file
+ cannot be found in the filesystem.
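+
+ A sketch contrasting the two; both module names and paths are
+ hypothetical:
+
+ ```
+ # loaded from a path relative to this configuration file
+ loadplugin MyPlugin plugintest.pm
+ # skipped without complaint if the .pm file does not exist
+ tryplugin MyOptionalPlugin optional/myoptional.pm
+ ```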
+
+PREPROCESSING OPTIONS
+ include filename
+ Include configuration lines from "filename". Relative paths are
+ considered relative to the current configuration file or user
+ preferences file.
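+
+ For example, assuming a file named "local_extra.cf" sits next to
+ the current configuration file:
+
+ ```
+ # resolved relative to the file containing this line
+ include local_extra.cf
+ ```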
+
+ if (conditional perl expression)
+ Used to support conditional interpretation of the configuration
+ file. Lines between this and a corresponding "else" or "endif"
+ line will be ignored unless the conditional expression
+ evaluates as true (in the perl sense; that is, defined and
+ non-0).
+
+ The conditional accepts a limited subset of perl for security --
+ just enough to perform basic arithmetic comparisons. The
+ following input is accepted:
+
+ numbers, whitespace, arithmetic operations and grouping
+ Namely these characters and ranges:
+
+ ( ) - + * / _ . , < = > ! ~ 0-9 whitespace
+
+ version
+ This will be replaced with the version number of the
+ currently-running SpamAssassin engine. Note: The version
+ used is in the internal SpamAssassin version format which is
+ "x.yyyzzz", where x is major version, y is minor version,
+ and z is maintenance version. So 3.0.0 is 3.000000, and
+ 3.4.80 is 3.004080.
+
+ plugin(Name::Of::Plugin)
+ This is a function call that returns 1 if the plugin named
+ "Name::Of::Plugin" is loaded, or "undef" otherwise.
+
+ If the end of a configuration file is reached while still inside
+ an "if" scope, a warning will be issued, but parsing will restart
+ on the next file.
+
+ For example:
+
+ if (version > 3.000000)
+ header MY_FOO ...
+ endif
+
+ loadplugin MyPlugin plugintest.pm
+
+ if plugin (MyPlugin)
+ header MY_PLUGIN_FOO eval:check_for_foo()
+ score MY_PLUGIN_FOO 0.1
+ endif
+
+ ifplugin PluginModuleName
+ An alias for "if plugin(PluginModuleName)".
+
+ else
+ Used to support conditional interpretation of the configuration
+ file. Lines between this and a corresponding "endif" line will
+ be ignored unless the conditional expression evaluates as false
+ (in the perl sense; that is, not defined or 0).
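+
+ A sketch of a complete if/else/endif block, using a hypothetical
+ rule whose description depends on the running version:
+
+ ```
+ header LOCAL_VER_NOTE exists:X-Local-Marker
+
+ if (version >= 3.002000)
+ describe LOCAL_VER_NOTE Rule file parsed by 3.2.0 or later
+ else
+ describe LOCAL_VER_NOTE Rule file parsed by a pre-3.2.0 release
+ endif
+ ```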
+
+ require_version n.nnnnnn
+ Indicates that the entire file, from this line on, requires a
+ certain version of SpamAssassin to run. If a different (older or
+ newer) version of SpamAssassin tries to read the configuration
+ from this file, it will output a warning instead, and ignore it.
+
+ Note: The version used is in the internal SpamAssassin version
+ format which is "x.yyyzzz", where x is major version, y is minor
+ version, and z is maintenance version. So 3.0.0 is 3.000000, and
+ 3.4.80 is 3.004080.
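+
+ For example, following the "x.yyyzzz" format, a file written for
+ version 3.2.0 would start with:
+
+ ```
+ # 3.2.0 -> x=3, yyy=002, zzz=000
+ require_version 3.002000
+ ```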
+
+TEMPLATE TAGS
+ The following "tags" can be used as placeholders in certain options.
+ They will be replaced by the corresponding value when they are used.
+
+ Some tags can take an argument (in parentheses). The argument is
+ optional, and the default is shown below.
+
+ _YESNOCAPS_ "YES"/"NO" for is/isn't spam
+ _YESNO_ "Yes"/"No" for is/isn't spam
+ _SCORE(PAD)_ message score, if PAD is included and is either spaces or
+ zeroes, then pad scores with that many spaces or zeroes
+ (default, none); e.g. _SCORE(0)_ makes 2.4 become 02.4,
+ _SCORE(00)_ is 002.4. 12.3 would be 12.3 and 012.3
+ respectively.
+ _REQD_ message threshold
+ _VERSION_ version (eg. 3.0.0 or 3.1.0-r26142-foo1)
+ _SUBVERSION_ sub-version/code revision date (eg. 2004-01-10)
+ _HOSTNAME_ hostname of the machine the mail was processed on
+ _REMOTEHOSTNAME_ hostname of the machine the mail was sent from, only
+ available with spamd
+ _REMOTEHOSTADDR_ ip address of the machine the mail was sent from, only
+ available with spamd
+ _BAYES_ bayes score
+ _TOKENSUMMARY_ number of new, neutral, spammy, and hammy tokens found
+ _BAYESTC_ number of new tokens found
+ _BAYESTCLEARNED_ number of seen tokens found
+ _BAYESTCSPAMMY_ number of spammy tokens found
+ _BAYESTCHAMMY_ number of hammy tokens found
+ _HAMMYTOKENS(N)_ the N most significant hammy tokens (default, 5)
+ _SPAMMYTOKENS(N)_ the N most significant spammy tokens (default, 5)
+ _DATE_ rfc-2822 date of scan
+ _STARS(*)_ one "*" (use any character) for each full score point
+ (note: limited to 50 'stars')
+ _RELAYSTRUSTED_ relays used and deemed to be trusted (see the
+ 'X-Spam-Relays-Trusted' pseudo-header)
+ _RELAYSUNTRUSTED_ relays used that can not be trusted (see the
+ 'X-Spam-Relays-Untrusted' pseudo-header)
+ _RELAYSINTERNAL_ relays used and deemed to be internal (see the
+ 'X-Spam-Relays-Internal' pseudo-header)
+ _RELAYSEXTERNAL_ relays used and deemed to be external (see the
+ 'X-Spam-Relays-External' pseudo-header)
+ _LASTEXTERNALIP_ IP address of client in the external-to-internal
+ SMTP handover
+ _LASTEXTERNALRDNS_ reverse-DNS of client in the external-to-internal
+ SMTP handover
+ _LASTEXTERNALHELO_ HELO string used by client in the external-to-internal
+ SMTP handover
+ _AUTOLEARN_ autolearn status ("ham", "no", "spam", "disabled",
+ "failed", "unavailable")
+ _AUTOLEARNSCORE_ portion of message score used by autolearn
+ _TESTS(,)_ tests hit separated by "," (or other separator)
+ _TESTSSCORES(,)_ as above, except with scores appended (eg. AWL=-3.0,...)
+ _SUBTESTS(,)_ subtests (start with "__") hit separated by ","
+ (or other separator)
+ _DCCB_ DCC's "Brand"
+ _DCCR_ DCC's results
+ _PYZOR_ Pyzor results
+ _RBL_ full results for positive RBL queries in DNS URI format
+ _LANGUAGES_ possible languages of mail
+ _PREVIEW_ content preview
+ _REPORT_ terse report of tests hit (for header reports)
+ _SUMMARY_ summary of tests hit for standard report (for body reports)
+ _CONTACTADDRESS_ contents of the 'report_contact' setting
+ _HEADER(NAME)_ includes the value of a message header. value is the same
+ as is found for header rules (see elsewhere in this doc)
+
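The padding rule for _SCORE(PAD)_ and the cap on _STARS(*)_ in the table above can be sketched as follows. This is an illustrative Python approximation, not SpamAssassin's own code; pad_score and stars are hypothetical names, scores are assumed to be shown with one decimal place as in the examples, and negative scores are simply left unpadded because the text does not describe that case.

```python
def pad_score(score: float, pad: str = "") -> str:
    """Pad the integer part of a score to len(pad) + 1 characters."""
    text = f"{score:.1f}"
    if not pad or score < 0:
        # Assumption: behaviour for negative scores is not documented here.
        return text
    whole, frac = text.split(".")
    return f"{whole.rjust(len(pad) + 1, pad[0])}.{frac}"


def stars(score: float, char: str = "*") -> str:
    """One character per full score point, limited to 50."""
    return char * max(0, min(int(score), 50))


print(pad_score(2.4, "0"), pad_score(2.4, "00"))    # 02.4 002.4
print(pad_score(12.3, "0"), pad_score(12.3, "00"))  # 12.3 012.3
print(stars(3.7))                                   # ***
```

The first two lines of output reproduce the four padding examples from the _SCORE(PAD)_ entry.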
+ If a tag reference uses the name of a tag which is not in this list
+ or defined by a loaded plugin, the reference will be left intact and
+ not replaced by any value.
+
+ The "HAMMYTOKENS" and "SPAMMYTOKENS" tags have an optional second
+ argument which specifies a format. See the HAMMYTOKENS/SPAMMYTOKENS
+ TAG FORMAT section, below, for details.
+
+ HAMMYTOKENS/SPAMMYTOKENS TAG FORMAT
+ The "HAMMYTOKENS" and "SPAMMYTOKENS" tags have an optional second
+ argument which specifies a format: "_SPAMMYTOKENS(N,FMT)_",
+ "_HAMMYTOKENS(N,FMT)_". The following formats are available:
+
+ short
+ Only the tokens themselves are listed. *For example, preference
+ file entry:*
+
+ "add_header all Spammy _SPAMMYTOKENS(2,short)_"
+
+ *Results in message header:*
+
+ "X-Spam-Spammy: remove.php, UD:jpg"
+
+ Indicating that the top two spammy tokens found are "remove.php"
+ and "UD:jpg". (The token itself follows the last colon; the text
+ before the colon indicates something about the token. "UD" means
+ the token looks like it might be part of a domain name.)
+
+ compact
+ The token probability, an abbreviated declassification distance
+ (see example), and the token are listed. *For example,
+ preference file entry:*
+
+ "add_header all Spammy _SPAMMYTOKENS(2,compact)_"
+
+ *Results in message header:*
+
+ "X-Spam-Spammy: 0.989-6--remove.php, 0.988-+--UD:jpg"
+
+ Indicating that the probabilities of the top two tokens are
+ 0.989 and 0.988, respectively. The first token has a
+ declassification distance of 6, meaning that if the token had
+ appeared in at least 6 more ham messages it would not be
+ considered spammy. The "+" for the second token indicates a
+ declassification distance greater than 9.
+
+ long
+ Probability, declassification distance, number of times seen in
+ a ham message, number of times seen in a spam message, age and
+ the token are listed.
+
+ *For example, preference file entry:*
+
+ "add_header all Spammy _SPAMMYTOKENS(2,long)_"
+
+ *Results in message header:*
+
+ "X-Spam-Spammy: 0.989-6--0h-4s--4d--remove.php,
+ 0.988-33--2h-25s--1d--UD:jpg"
+
+ In addition to the information provided by the compact option,
+ the long option shows that the first token appeared in zero ham
+ messages and four spam messages, and that it was last seen four
+ days ago. The second token appeared in two ham messages, 25 spam
+ messages and was last seen one day ago. (Unlike the "compact"
+ option, the long option shows declassification distances that
+ are greater than 9.)
+
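To make the field order of the "long" format concrete, here is a small parsing sketch. It is an illustration only, written in Python rather than taken from SpamAssassin; the function name parse_long_tokens and the regular expression are assumptions derived solely from the example header above, and the sketch assumes that entries are separated by a comma plus whitespace and that the age field contains no hyphen.

```python
import re

# Field order taken from the example above:
# probability-distance--<ham>h-<spam>s--age--token
LONG_ENTRY = re.compile(
    r"^(?P<prob>[0-9.]+)-(?P<distance>\d+)"
    r"--(?P<ham>\d+)h-(?P<spam>\d+)s"
    r"--(?P<age>[^-]+)--(?P<token>.+)$"
)


def parse_long_tokens(value: str) -> list:
    """Split a long-format header value into one dict per token."""
    entries = []
    for item in re.split(r",\s+", value.strip()):
        match = LONG_ENTRY.match(item)
        if match is None:
            raise ValueError(f"unrecognised entry: {item!r}")
        entries.append({
            "prob": float(match["prob"]),
            "distance": int(match["distance"]),
            "ham": int(match["ham"]),
            "spam": int(match["spam"]),
            "age": match["age"],      # kept as text, e.g. "4d"
            "token": match["token"],  # may itself contain ":" as in "UD:jpg"
        })
    return entries


sample = "0.989-6--0h-4s--4d--remove.php, 0.988-33--2h-25s--1d--UD:jpg"
for entry in parse_long_tokens(sample):
    print(entry)
```

The age is kept as an opaque string because the text above only shows day values; other units are not assumed.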
+LOCALI[SZ]ATION
+ A line starting with the text "lang xx" will only be interpreted if
+ the user is in that locale, allowing test descriptions and templates
+ to be set for that language.
+
+ The locale string should specify either both the language and
+ country, e.g. "lang pt_BR", or just the language, e.g. "lang de".
+
+SEE ALSO
+ "Mail::SpamAssassin" "spamassassin" "spamd"
+
Added: spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.html
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.html?view=auto&rev=559437
==============================================================================
--- spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.html (added)
+++ spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.html Wed Jul 25 05:52:42 2007
@@ -0,0 +1,56 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<title>Mail::SpamAssassin::Conf::LDAP - load SpamAssassin scores from LDAP database</title>
+<link rev="made" href="mailto:jm@apache.org" />
+</head>
+
+<body style="background-color: white">
+
+<p><a name="__index__"></a></p>
+<!-- INDEX BEGIN -->
+
+<ul>
+
+ <li><a href="#name">NAME</a></li>
+ <li><a href="#synopsis">SYNOPSIS</a></li>
+ <li><a href="#description">DESCRIPTION</a></li>
+ <li><a href="#methods">METHODS</a></li>
+</ul>
+<!-- INDEX END -->
+
+<hr />
+<p>
+</p>
+<h1><a name="name">NAME</a></h1>
+<p>Mail::SpamAssassin::Conf::LDAP - load SpamAssassin scores from LDAP database</p>
+<p>
+</p>
+<hr />
+<h1><a name="synopsis">SYNOPSIS</a></h1>
+<pre>
+ (see Mail::SpamAssassin)</pre>
+<p>
+</p>
+<hr />
+<h1><a name="description">DESCRIPTION</a></h1>
+<p>Mail::SpamAssassin is a module to identify spam using text analysis and
+several internet-based realtime blacklists.</p>
+<p>This class is used internally by SpamAssassin to load scores from an LDAP
+database. Please refer to the <code>Mail::SpamAssassin</code> documentation for public
+interfaces.</p>
+<p>
+</p>
+<hr />
+<h1><a name="methods">METHODS</a></h1>
+<dl>
+<dt><strong><a name="item_load">$f->load ($username)</a></strong><br />
+</dt>
+<dd>
+Read configuration parameters from LDAP server and parse scores from it.
+</dd>
+</dl>
+
+</body>
+
+</html>
Added: spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.txt
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.txt?view=auto&rev=559437
==============================================================================
--- spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.txt (added)
+++ spamassassin/site/full/3.2.x/doc/Mail_SpamAssassin_Conf_LDAP.txt Wed Jul 25 05:52:42 2007
@@ -0,0 +1,20 @@
+NAME
+ Mail::SpamAssassin::Conf::LDAP - load SpamAssassin scores from LDAP
+ database
+
+SYNOPSIS
+ (see Mail::SpamAssassin)
+
+DESCRIPTION
+ Mail::SpamAssassin is a module to identify spam using text analysis and
+ several internet-based realtime blacklists.
+
+ This class is used internally by SpamAssassin to load scores from an
+ LDAP database. Please refer to the "Mail::SpamAssassin" documentation
+ for public interfaces.
+
+METHODS
+ $f->load ($username)
+ Read configuration parameters from LDAP server and parse scores from
+ it.
+