Posted to commits@spamassassin.apache.org by pa...@apache.org on 2011/06/22 17:01:03 UTC

svn commit: r1138498 [13/15] - in /spamassassin/site/full/3.3.x: ./ doc/

Added: spamassassin/site/full/3.3.x/doc/sa-learn.txt
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.3.x/doc/sa-learn.txt?rev=1138498&view=auto
==============================================================================
--- spamassassin/site/full/3.3.x/doc/sa-learn.txt (added)
+++ spamassassin/site/full/3.3.x/doc/sa-learn.txt Wed Jun 22 15:00:59 2011
@@ -0,0 +1,626 @@
+NAME
+    sa-learn - train SpamAssassin's Bayesian classifier
+
+SYNOPSIS
+    sa-learn [options] [file]...
+
+    sa-learn [options] --dump [ all | data | magic ]
+
+    Options:
+
+     --ham                 Learn messages as ham (non-spam)
+     --spam                Learn messages as spam
+     --forget              Forget a message
+     --use-ignores         Use bayes_ignore_from and bayes_ignore_to
+     --sync                Synchronize the database and the journal if needed
+     --force-expire        Force a database sync and expiry run
+     --dbpath <path>       Allows commandline override (in bayes_path form)
+                           for where to read the Bayes DB from
+     --dump [all|data|magic]  Display the contents of the Bayes database
+                           Takes optional argument for what to display
+      --regexp <re>        For dump only, specifies which tokens to
+                           dump based on a regular expression.
+     -f file, --folders=file  Read list of files/directories from file
+     --dir                 Ignored; historical compatibility
+     --file                Ignored; historical compatibility
+     --mbox                Input sources are in mbox format
+     --mbx                 Input sources are in mbx format
+     --showdots            Show progress using dots
+     --progress            Show progress using progress bar
+     --no-sync             Skip synchronizing the database and journal
+                           after learning
+     -L, --local           Operate locally, no network accesses
+     --import              Migrate data from older version/non DB_File
+                           based databases
+     --clear               Wipe out existing database
+     --backup              Backup, to STDOUT, existing database
+     --restore <filename>  Restore a database from filename
+     -u username, --username=username
+                           Override username taken from the runtime
+                           environment, used with SQL
+     -C path, --configpath=path, --config-file=path
+                           Path to standard configuration dir
+     -p prefs, --prefspath=file, --prefs-file=file
+                           Set user preferences file
+     --siteconfigpath=path Path for site configs
+                           (default: /etc/mail/spamassassin)
+     --cf='config line'    Additional line of configuration
+     -D, --debug [area=n,...]  Print debugging messages
+     -V, --version         Print version
+     -h, --help            Print usage message
+
+DESCRIPTION
+    Given a typical selection of your incoming mail classified as spam or
+    ham (non-spam), this tool will feed each mail to SpamAssassin, allowing
+    it to 'learn' what signs are likely to mean spam, and which are likely
+    to mean ham.
+
+    Simply run this command once for each of your mail folders, and it will
+    'learn' from the mail therein.
+
+    Note that csh-style *globbing* in the mail folder names is supported; in
+    other words, listing a folder name as "*" will scan every folder that
+    matches. See "Mail::SpamAssassin::ArchiveIterator" for more details.
+
+    SpamAssassin remembers which mail messages it has learnt already, and
+    will not re-learn those messages again, unless you use the --forget
+    option. Messages learnt as spam will have SpamAssassin markup removed,
+    on the fly.
+
+    If you make a mistake and scan a mail as ham when it is spam, or vice
+    versa, simply rerun this command with the correct classification, and
+    the mistake will be corrected. SpamAssassin will automatically 'forget'
+    the previous indications.
+
+    Users of "spamd" who wish to perform training remotely, over a network,
+    should investigate the "spamc -L" switch.
+
+OPTIONS
+    --ham
+        Learn the input message(s) as ham. If you have previously learnt any
+        of the messages as spam, SpamAssassin will forget them first, then
+        re-learn them as ham. Alternatively, if you have previously learnt
+        them as ham, it'll skip them this time around. If the messages have
+        already been filtered through SpamAssassin, the learner will ignore
+        any modifications SpamAssassin may have made.
+
+    --spam
+        Learn the input message(s) as spam. If you have previously learnt
+        any of the messages as ham, SpamAssassin will forget them first,
+        then re-learn them as spam. Alternatively, if you have previously
+        learnt them as spam, it'll skip them this time around. If the
+        messages have already been filtered through SpamAssassin, the
+        learner will ignore any modifications SpamAssassin may have made.
+
+    --folders=*filename*, -f *filename*
+        sa-learn will read in the list of folders from the specified file,
+        one folder per line in the file. If the folder is prefixed with
+        "ham:type:" or "spam:type:", sa-learn will learn that folder
+        appropriately, otherwise the folders will be assumed to be of the
+        type specified by --ham or --spam.
+
+        "type" above is optional, but is the same as the standard for
+        ArchiveIterator: mbox, mbx, dir, file, or detect (the default if not
+        specified).
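+
+        As an illustration only, one line of such a folders file could be
+        split up as sketched below in Python. This is not sa-learn's own
+        (Perl) code: the function name, the regular expression, and the
+        choice to require both colons even when the type is left empty
+        ("ham::/path") are all assumptions made for the example.
+

```python
import re

# Assumed shape of a prefixed line: class, optional type, then the location.
PREFIX = re.compile(r"^(ham|spam):(\w*):(.*)$")


def parse_folder_line(line, default_class, default_type="detect"):
    """Split one folders-file line into (class, type, location).

    default_class stands in for whatever --ham or --spam selected on the
    command line; "detect" is the documented default type.
    """
    line = line.strip()
    match = PREFIX.match(line)
    if match:
        msg_class, folder_type, location = match.groups()
        return msg_class, folder_type or default_type, location
    # No recognised prefix: fall back to the command-line classification.
    return default_class, default_type, line


if __name__ == "__main__":
    print(parse_folder_line("spam:mbox:/var/mail/junk", "ham"))
    print(parse_folder_line("/home/user/inbox", "ham"))
```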
+
+    --mbox
+        sa-learn will read in the file(s) containing the emails to be
+        learned, and will process them in mbox format (one or more emails
+        per file).
+
+    --mbx
+        sa-learn will read in the file(s) containing the emails to be
+        learned, and will process them in mbx format (one or more emails per
+        file).
+
+    --use-ignores
+        Don't learn the message if a from address matches configuration file
+        item "bayes_ignore_from" or a to address matches "bayes_ignore_to".
+        The option might be used when learning from a large file of messages
+        from which the hammy spam messages or spammy ham messages have not
+        been removed.
+
+    --sync
+        Synchronize the journal and databases. Upon successfully syncing the
+        database with the entries in the journal, the journal file is
+        removed.
+
+    --force-expire
+        Forces an expiry attempt, regardless of whether it may be necessary
+        or not. Note: This doesn't mean any tokens will actually expire.
+        Please see the EXPIRATION section below.
+
+        Note: "--force-expire" also causes the journal data to be
+        synchronized into the Bayes databases.
+
+    --forget
+        Forget a given message previously learnt.
+
+    --dbpath
+        Allows a commandline override of the *bayes_path* configuration
+        option.
+
+    --dump *option*
+        Display the contents of the Bayes database. Without an option or
+        with the *all* option, all magic tokens and data tokens will be
+        displayed. *magic* will only display magic tokens, and *data* will
+        only display the data tokens.
+
+        Can also use the --regexp *RE* option to specify which tokens to
+        display based on a regular expression.
+
+    --clear
+        Clear an existing Bayes database by removing all traces of the
+        database.
+
+        WARNING: This is destructive and should be used with care.
+
+    --backup
+        Performs a dump of the Bayes database in machine/human readable
+        format.
+
+        The dump will include token and seen data. It is suitable for input
+        back into the --restore command.
+
+    --restore=*filename*
+        Performs a restore of the Bayes database defined by *filename*.
+
+        WARNING: This is a destructive operation, previous Bayes data will
+        be wiped out.
+
+    -h, --help
+        Print help message and exit.
+
+    -u *username*, --username=*username*
+        If specified this username will override the username taken from the
+        runtime environment. You can use this option to specify users in a
+        virtual user configuration when using SQL as the Bayes backend.
+
+        NOTE: This option will not change to the given *username*, it will
+        only attempt to act on behalf of that user. Because of this you will
+        need to have proper permissions to be able to change files owned by
+        *username*. In the case of SQL this generally is not a problem.
+
+    -C *path*, --configpath=*path*, --config-file=*path*
+        Use the specified path for locating the distributed configuration
+        files. Ignore the default directories (usually
+        "/usr/share/spamassassin" or similar).
+
+    --siteconfigpath=*path*
+        Use the specified path for locating site-specific configuration
+        files. Ignore the default directories (usually
+        "/etc/mail/spamassassin" or similar).
+
+    --cf='config line'
+        Add additional lines of configuration directly from the
+        command-line, parsed after the configuration files are read.
+        Multiple --cf arguments can be used, and each will be considered a
+        separate line of configuration.
+
+    -p *prefs*, --prefspath=*prefs*, --prefs-file=*prefs*
+        Read user score preferences from *prefs* (usually
+        "$HOME/.spamassassin/user_prefs").
+
+    --progress
+        Prints a progress bar (to STDERR) showing the current progress. In
+        the case where no valid terminal is found this option will behave
+        very much like the --showdots option.
+
+    -D [*area,...*], --debug [*area,...*]
+        Produce debugging output. If no areas are listed, all debugging
+        information is printed. Diagnostic output can also be enabled for
+        each area individually; *area* is the area of the code to
+        instrument. For example, to produce diagnostic output on bayes,
+        learn, and dns, use:
+
+                sa-learn -D bayes,learn,dns
+
+        For more information about which areas (also known as channels) are
+        available, please see the documentation at:
+
+                <http://wiki.apache.org/spamassassin/DebugChannels>
+
+        Higher priority informational messages that are suitable for logging
+        in normal circumstances are available with an area of "info".
+
+    --no-sync
+        Skip the slow synchronization step which normally takes place after
+        changing database entries. If you plan to learn from many folders in
+        a batch, or to learn many individual messages one-by-one, it is
+        faster to use this switch and run "sa-learn --sync" once all the
+        folders have been scanned.
+
+        Clarification: The state of *--no-sync* overrides the
+        *bayes_learn_to_journal* configuration option. If not specified,
+        sa-learn will learn to the database directly. If specified, sa-learn
+        will learn to the journal file.
+
+        Note: *--sync* and *--no-sync* can be specified on the same
+        commandline, which is slightly confusing. In this case, the
+        *--no-sync* option is ignored since there is no learn operation.
+
+    -L, --local
+        Do not perform any network accesses while learning details about the
+        mail messages. This will speed up the learning process, but may
+        result in a slightly lower accuracy.
+
+        Note that this is currently ignored, as current versions of
+        SpamAssassin will not perform network access while learning; but
+        future versions may.
+
+    --import
+        If you previously used SpamAssassin's Bayesian learner without the
+        "DB_File" module installed, it will have created files in other
+        formats, such as "GDBM_File", "NDBM_File", or "SDBM_File". This
+        switch allows you to migrate that old data into the "DB_File"
+        format. It will overwrite any data currently in the "DB_File".
+
+        Can also be used with the --dbpath *path* option to specify the
+        location of the Bayes files to use.
+
+MIGRATION
+    There are now multiple backend storage modules available for storing
+    users' Bayesian data, so you might want to migrate from one backend to
+    another. Here is a simple procedure for doing so.
+
+    Note that if you have individual user databases you will have to perform
+    a similar procedure for each one of them.
+
+    sa-learn --sync
+        This will sync any outstanding journal entries
+
+    sa-learn --backup > backup.txt
+        This will save all your Bayes data to a plain text file.
+
+    sa-learn --clear
+        This is optional, but good to do to clear out the old database.
+
+    Repeat!
+        At this point, if you have multiple databases, you should perform
+        the procedure above for each of them. (i.e. each user's database
+        needs to be backed up before continuing.)
+
+    Switch backends
+        Once you have backed up all databases you can update your
+        configuration for the new database backend. This will involve at
+        least the bayes_store_module config option and may involve some
+        additional config options depending on what is required by the
+        module. (For example, you may need to configure an SQL database.)
+
+    sa-learn --restore backup.txt
+        Again, you need to do this for every database.
+
+    If you are migrating to SQL you can make use of the -u <username> option
+    in sa-learn to populate each user's database. Otherwise, you must run
+    sa-learn as the user whose database you are restoring.
+
+INTRODUCTION TO BAYESIAN FILTERING
+    (Thanks to Michael Bell for this section!)
+
+    For a more lengthy description of how this works, go to
+    http://www.paulgraham.com/ and see "A Plan for Spam". It's reasonably
+    readable, even if statistics make me break out in hives.
+
+    The short semi-inaccurate version: Given training, a spam heuristics
+    engine can take the most "spammy" and "hammy" words and apply
+    probabilistic analysis. Furthermore, once given a basis for the
+    analysis, the engine can continue to learn iteratively by applying both
+    the non-Bayesian and Bayesian rulesets together to create evolving
+    "intelligence".
+
+    SpamAssassin 2.50 and later supports Bayesian spam analysis, in the form
+    of the BAYES rules. This is a new feature, quite powerful, and is
+    disabled until enough messages have been learnt.
+
+    The pros of Bayesian spam analysis:
+
+    Can greatly reduce false positives and false negatives.
+        It learns from your mail, so it is tailored to your unique e-mail
+        flow.
+
+    Once it starts learning, it can continue to learn from SpamAssassin and
+    improve over time.
+
+    And the cons:
+
+    A decent number of messages are required before results are useful for
+    ham/spam determination.
+    It's hard to explain why a message is or isn't marked as spam.
+        For example, a straightforward rule that matches, say, "VIAGRA" is
+        easy to understand. If it generates a false positive or false
+        negative, it is fairly easy to understand why.
+
+        With Bayesian analysis, it's all probabilities - "because the past
+        says it is likely as this falls into a probabilistic distribution
+        common to past spam in your systems". Tell that to your users! Tell
+        that to the client when he asks "what can I do to change this". (By
+        the way, the answer in this case is "use whitelisting".)
+
+    It will take disk space and memory.
+        The databases it maintains take quite a lot of resources to store
+        and use.
+
+GETTING STARTED
+    Still interested? OK, here are the guidelines for getting this working.
+
+    First a high-level overview:
+
+    Build a significant sample of both ham and spam.
+        I suggest several thousand of each, placed in SPAM and HAM
+        directories or mailboxes. Yes, you MUST hand-sort this - otherwise
+        the results won't be much better than SpamAssassin on its own.
+        Verify the spamminess/hamminess of EVERY message. You're urged to
+        avoid using a publicly available corpus (sample) - this must be
+        taken from YOUR mail server, if it is to be statistically useful.
+        Otherwise, the results may be pretty skewed.
+
+    Use this tool to teach SpamAssassin about these samples, like so:
+                sa-learn --spam /path/to/spam/folder
+                sa-learn --ham /path/to/ham/folder
+                ...
+
+        Let SpamAssassin proceed, learning stuff. When it finds ham and spam
+        it will add the "interesting tokens" to the database.
+
+    If you need SpamAssassin to forget about specific messages, use the
+    --forget option.
+        This can be applied to either ham or spam that has run through the
+        sa-learn processes. It's a bit of a hammer, really, lowering the
+        weighting of the specific tokens in that message (only if that
+        message has been processed before).
+
+    Learning from single messages uses a command like this:
+                sa-learn --ham --no-sync mailmessage
+
+        This is handy for binding to a key in your mail user agent. It's
+        very fast, as all the time-consuming stuff is deferred until you run
+        with the "--sync" option.
+
+    Autolearning is enabled by default
+        If you don't have a corpus of mail saved to learn, you can let
+        SpamAssassin automatically learn the mail that you receive. If you
+        are autolearning from scratch, the amount of mail you receive will
+        determine how long until the BAYES_* rules are activated.
+
+EFFECTIVE TRAINING
+    Learning filters require training to be effective. If you don't train
+    them, they won't work. In addition, you need to train them with new
+    messages regularly to keep them up-to-date, or their data will become
+    stale and impact accuracy.
+
+    You need to train with both spam *and* ham mails. One type of mail alone
+    will not have any effect.
+
+    Note that if your mail folders contain things like forwarded spam,
+    discussions of spam-catching rules, etc., this will cause trouble. You
+    should avoid scanning those messages if possible. (An easy way to do
+    this is to move them aside, into a folder which is not scanned.)
+
+    If the messages you are learning from have already been filtered through
+    SpamAssassin, the learner will compensate for this. In effect, it learns
+    what each message would look like if you had run "spamassassin -d" over
+    it in advance.
+
+    Another thing to be aware of is that you should typically aim to train
+    with at least 1000 spam messages and 1000 ham messages, if possible.
+    More is better, but anything over about 5000 messages does not improve
+    accuracy significantly in our tests.
+
+    Be careful that you train from the same source -- for example, if you
+    train on old spam, but new ham mail, then the classifier will think that
+    a mail with an old date stamp is likely to be spam.
+
+    It's also worth noting that training with a very small quantity of ham
+    will produce atrocious results. You should aim to train with at least
+    as much ham data as spam (or more if possible!).
+
+    On an on-going basis, it is best to keep training the filter to make
+    sure it has fresh data to work from. There are various ways to do this:
+
+    1. Supervised learning
+        This means keeping a copy of all or most of your mail, separated
+        into spam and ham piles, and periodically re-training using those.
+        It produces the best results, but requires more work from you, the
+        user.
+
+        (An easy way to do this, by the way, is to create a new folder for
+        'deleted' messages, and instead of deleting them from other folders,
+        simply move them in there instead. Then keep all spam in a separate
+        folder and never delete it. As long as you remember to move
+        misclassified mails into the correct folder set, it is easy enough
+        to keep up to date.)
+
+    2. Unsupervised learning from Bayesian classification
+        Another way to train is to chain the results of the Bayesian
+        classifier back into the training, so it reinforces its own
+        decisions. This is only safe if you then retrain it based on any
+        errors you discover.
+
+        SpamAssassin does not support this method, due to experimental
+        results which strongly indicate that it does not work well, and
+        since Bayes is only one part of the resulting score presented to the
+        user (while Bayes may have made the wrong decision about a mail, it
+        may have been overridden by another system).
+
+    3. Unsupervised learning from SpamAssassin rules
+        Also called 'auto-learning' in SpamAssassin. Based on statistical
+        analysis of the SpamAssassin success rates, we can automatically
+        train the Bayesian database with a certain degree of confidence that
+        our training data is accurate.
+
+        It should be supplemented with some supervised training in addition,
+        if possible.
+
+        This is the default, but can be turned off by setting the
+        SpamAssassin configuration parameter "bayes_auto_learn" to 0.
+
+    4. Mistake-based training
+        This means training on a small number of mails, then only training
+        on messages that SpamAssassin classifies incorrectly. This works,
+        but it takes longer to get it right than a full training session
+        would.
+
+FILES
+    sa-learn and the other parts of SpamAssassin's Bayesian learner use a
+    set of persistent database files to store the learnt tokens, as follows.
+
+    bayes_toks
+        The database of tokens, containing the tokens learnt, their count of
+        occurrences in ham and spam, and the timestamp when the token was
+        last seen in a message.
+
+        This database also contains some 'magic' tokens, as follows: the
+        version number of the database, the number of ham and spam messages
+        learnt, the number of tokens in the database, and timestamps of: the
+        last journal sync, the last expiry run, the last expiry token
+        reduction count, the last expiry timestamp delta, the oldest token
+        timestamp in the database, and the newest token timestamp in the
+        database.
+
+        This is a database file, using "DB_File". The database 'version
+        number' is 0 for databases from 2.5x, 1 for databases from certain
+        2.6x development releases, 2 for 2.6x, and 3 for 3.0 and later
+        releases.
+
+    bayes_seen
+        A map of Message-Id and some data from headers and body to what that
+        message was learnt as. This is used so that SpamAssassin can avoid
+        re-learning a message it has already seen, and so it can reverse the
+        training if you later decide that message was learnt incorrectly.
+
+        This is a database file, using "DB_File".
+
+    bayes_journal
+        While SpamAssassin is scanning mails, it needs to track which tokens
+        it uses in its calculations. To avoid the contention of having each
+        SpamAssassin process attempting to gain write access to the Bayes
+        DB, the token timestamps are written to a 'journal' file which will
+        later (either automatically or via "sa-learn --sync") be used to
+        synchronize the Bayes DB.
+
+        Also, through the use of "bayes_learn_to_journal", or when using the
+        "--no-sync" option with sa-learn, the actual learning data will be
+        placed into the journal for later synchronization. This is
+        typically useful for high-traffic sites to avoid the same contention
+        as stated above.
+
+EXPIRATION
+    Since SpamAssassin can auto-learn messages, the Bayes database files
+    could increase perpetually until they fill your disk. To control this,
+    SpamAssassin performs journal synchronization and bayes expiration
+    periodically when certain criteria (listed below) are met.
+
+    SpamAssassin can sync the journal and expire the DB tokens either
+    manually or opportunistically. A journal sync is due if *--sync* is
+    passed to sa-learn (manual), or if the following is true
+    (opportunistic):
+
+    - bayes_journal_max_size does not equal 0 (a value of 0 means don't sync)
+    - the journal file exists
+
+    and either:
+
+    - the journal file has a size greater than bayes_journal_max_size
+
+    or
+
+    - a journal sync has previously occurred, and at least 1 day has passed
+    since that sync
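+
+    The opportunistic journal sync test above can be restated as a small
+    predicate. The following is a Python sketch for illustration, not
+    SpamAssassin's actual (Perl) implementation; the function and
+    parameter names are hypothetical, and "at least 1 day" is taken to
+    mean 86400 seconds or more.
+

```python
import os
import time

ONE_DAY = 24 * 60 * 60  # seconds


def journal_sync_due(journal_path, journal_max_size, last_sync_time, now=None):
    """Return True if an opportunistic journal sync is due.

    journal_max_size mirrors bayes_journal_max_size (0 disables syncing).
    last_sync_time is the epoch time of the previous sync, or 0 if the
    journal has never been synced.
    """
    now = time.time() if now is None else now
    if journal_max_size == 0:
        return False  # opportunistic syncing is switched off
    if not os.path.exists(journal_path):
        return False  # nothing to sync
    if os.path.getsize(journal_path) > journal_max_size:
        return True  # journal has outgrown the configured limit
    # Otherwise a sync is due only if one happened before and is a day old.
    return last_sync_time > 0 and now - last_sync_time >= ONE_DAY
```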
+
+    Expiry is due if *--force-expire* is passed to sa-learn (manual), or if
+    all of the following are true (opportunistic):
+
+    - the last expire was attempted at least 12hrs ago
+    - bayes_auto_expire does not equal 0
+    - the number of tokens in the DB is > 100,000
+    - the number of tokens in the DB is > bayes_expiry_max_db_size
+    - there is at least a 12 hr difference between the oldest and newest
+    token atimes
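+
+    Read as code, the opportunistic expiry test is a conjunction of the
+    five conditions listed above. The Python sketch below is illustrative
+    only; the names are hypothetical, all times are assumed to be epoch
+    seconds, and "at least 12 hrs" is treated as 43200 seconds or more.
+

```python
TWELVE_HOURS = 12 * 60 * 60  # seconds
MIN_TOKENS = 100_000


def expiry_due(now, last_expire, auto_expire, token_count, max_db_size,
               oldest_atime, newest_atime):
    """Return True if an opportunistic expiry run is due.

    auto_expire mirrors bayes_auto_expire and max_db_size mirrors
    bayes_expiry_max_db_size.
    """
    return (
        now - last_expire >= TWELVE_HOURS
        and auto_expire != 0
        and token_count > MIN_TOKENS
        and token_count > max_db_size
        and newest_atime - oldest_atime >= TWELVE_HOURS
    )
```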
+
+  EXPIRE LOGIC
+    If either the manual or opportunistic method causes an expire run to
+    start, here is the logic that is used:
+
+    - figure out how many tokens to keep: take the larger of either
+    bayes_expiry_max_db_size * 75% or 100,000 tokens. the goal reduction
+    is then the number of tokens minus the number of tokens to keep.
+    - if the reduction number is < 1000 tokens, abort (not worth the
+    effort).
+    - if an expire has been done before, guesstimate the new atime delta
+    based on the old atime delta. (new_atime_delta = old_atime_delta *
+    old_reduction_count / goal)
+    - if no expire has been done before, or the last expire looks "weird",
+    do an estimation pass. The definition of "weird" is:
+
+        - last expire over 30 days ago
+        - last atime delta was < 12 hrs
+        - last reduction count was < 1000 tokens
+        - estimated new atime delta is < 12 hrs
+        - the difference between the last reduction count and the goal
+        reduction count is > 50%
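+
+    The decision above can be sketched in Python as follows. This is an
+    illustration, not SpamAssassin's own code: the function name and
+    return values are hypothetical, last_expire=0 stands for "no expire
+    has been done before", and the "> 50%" test is read here as "the
+    larger of the two counts exceeds 1.5 times the smaller", which is an
+    assumption about how the difference is measured.
+

```python
TWELVE_HOURS = 12 * 3600
THIRTY_DAYS = 30 * 24 * 3600
MIN_REDUCTION = 1000
MIN_KEEP = 100_000


def plan_expire(token_count, max_db_size, now, last_expire=0,
                last_delta=0, last_reduction=0):
    """Decide how an expire run proceeds.

    Returns ("abort", None), ("estimate", None) or ("expire", new_delta),
    where new_delta is the guesstimated atime delta in seconds.
    """
    keep = max(int(max_db_size * 0.75), MIN_KEEP)
    goal = token_count - keep
    if goal < MIN_REDUCTION:
        return ("abort", None)  # not worth the effort

    if last_expire == 0:
        return ("estimate", None)  # first expire: nothing to extrapolate from

    # Guesstimate the new delta from the previous run.
    new_delta = last_delta * last_reduction / goal

    larger = max(last_reduction, goal)
    smaller = min(last_reduction, goal)
    weird = (
        now - last_expire > THIRTY_DAYS
        or last_delta < TWELVE_HOURS
        or last_reduction < MIN_REDUCTION
        or new_delta < TWELVE_HOURS
        or larger > 1.5 * smaller  # assumed reading of "> 50%"
    )
    return ("estimate", None) if weird else ("expire", new_delta)
```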
+
+  ESTIMATION PASS LOGIC
+    Go through each of the DB's tokens. Starting at 12hrs, calculate whether
+    or not the token would be expired (based on the difference between the
+    token's atime and the db's newest token atime) and keep the count. Work
+    out from 12hrs exponentially by powers of 2. i.e.: 12hrs * 1, 12hrs * 2,
+    12hrs * 4, 12hrs * 8, and so on, up to 12hrs * 512 (6144hrs, or 256
+    days).
+
+    The larger the delta, the smaller the number of tokens that will be
+    expired. Conversely, the number of tokens goes up as the delta gets
+    smaller. So starting at the largest atime delta, figure out which delta
+    will expire the most tokens without going above the goal expiration
+    count. Use this to choose the atime delta to use, unless one of the
+    following occurs:
+
+    - the largest atime delta (smallest reduction count) would expire too
+    many tokens. this means the learned tokens are mostly old and there
+    needs to be new tokens learned before an expire can occur.
+    - all of the atime choices result in 0 tokens being removed. this means
+    the tokens are all newer than 12 hours and there needs to be new tokens
+    learned before an expire can occur.
+    - the number of tokens that would be removed is < 1000. the benefit
+    isn't worth the effort. more tokens need to be learned.
+
+    If the expire run gets past this point, it will continue to the end. A
+    new DB is created since the majority of DB libraries don't shrink the DB
+    file when tokens are removed. So we do the "create new, migrate old to
+    new, remove old, rename new" shuffle.
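+
+    A compact way to picture the estimation pass is the Python sketch
+    below. It is illustrative rather than SpamAssassin's actual code: the
+    names are hypothetical, atimes are assumed to be epoch seconds, and a
+    token is counted as expirable when its age is strictly greater than
+    the candidate delta (the text does not say whether the comparison is
+    strict).
+

```python
TWELVE_HOURS = 12 * 3600
# Candidate deltas: 12hrs * 1, 12hrs * 2, ... 12hrs * 512.
DELTAS = [TWELVE_HOURS * 2 ** k for k in range(10)]


def estimate_delta(token_atimes, goal, min_reduction=1000):
    """Pick the atime delta that expires the most tokens without
    exceeding the goal reduction.

    Returns (delta_seconds, tokens_expired), or None when the expire
    run should be abandoned.
    """
    if not token_atimes:
        return None
    newest = max(token_atimes)
    # How many tokens each candidate delta would expire.
    counts = [sum(1 for atime in token_atimes if newest - atime > delta)
              for delta in DELTAS]

    if counts[-1] > goal:
        return None  # even the largest delta expires too many tokens
    if counts[0] == 0:
        return None  # no token is older than 12 hours

    # Walk from the largest delta towards the smallest, keeping the last
    # candidate that stays within the goal.
    chosen = len(DELTAS) - 1
    for index in range(len(DELTAS) - 1, -1, -1):
        if counts[index] > goal:
            break
        chosen = index

    if counts[chosen] < min_reduction:
        return None  # the benefit isn't worth the effort
    return DELTAS[chosen], counts[chosen]
```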
+
+  EXPIRY RELATED CONFIGURATION SETTINGS
+    "bayes_auto_expire" is used to specify whether or not SpamAssassin ought
+    to opportunistically attempt to expire the Bayes database. The default
+    is 1 (yes).
+    "bayes_expiry_max_db_size" specifies both the auto-expire token count
+    point, as well as the resulting number of tokens after expiry as
+    described above. The default value is 150,000, which is roughly
+    equivalent to a 6MB database file if you're using DB_File.
+    "bayes_journal_max_size" specifies how large the Bayes journal will grow
+    before it is opportunistically synced. The default value is 102400.
+
+INSTALLATION
+    The sa-learn command is part of the Mail::SpamAssassin Perl module.
+    Install this as a normal Perl module, using "perl -MCPAN -e shell", or
+    by hand.
+
+SEE ALSO
+    spamassassin(1) spamc(1) Mail::SpamAssassin(3)
+    Mail::SpamAssassin::ArchiveIterator(3)
+
+    <http://www.paulgraham.com/> Paul Graham's "A Plan For Spam" paper
+
+    <http://www.linuxjournal.com/article/6467> Gary Robinson's f(x) and
+    combining algorithms, as used in SpamAssassin
+
+    <http://www.bgl.nu/~glouis/bogofilter/> 'Training on error' page. A
+    discussion of various Bayes training regimes, including 'train on error'
+    and unsupervised training.
+
+PREREQUISITES
+    "Mail::SpamAssassin"
+
+AUTHORS
+    The SpamAssassin(tm) Project <http://spamassassin.apache.org/>
+

Added: spamassassin/site/full/3.3.x/doc/sa-update.html
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.3.x/doc/sa-update.html?rev=1138498&view=auto
==============================================================================
--- spamassassin/site/full/3.3.x/doc/sa-update.html (added)
+++ spamassassin/site/full/3.3.x/doc/sa-update.html Wed Jun 22 15:00:59 2011
@@ -0,0 +1,285 @@
+<?xml version="1.0" ?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<title>sa-update - automate SpamAssassin rule updates</title>
+<meta http-equiv="content-type" content="text/html; charset=utf-8" />
+<link rev="made" href="mailto:parker@minotaur.apache.org" />
+</head>
+
+<body style="background-color: white">
+
+
+<!-- INDEX BEGIN -->
+<div name="index">
+<p><a name="__index__"></a></p>
+
+<ul>
+
+	<li><a href="#name">NAME</a></li>
+	<li><a href="#synopsis">SYNOPSIS</a></li>
+	<li><a href="#description">DESCRIPTION</a></li>
+	<li><a href="#options">OPTIONS</a></li>
+	<li><a href="#exit_codes">EXIT CODES</a></li>
+	<li><a href="#see_also">SEE ALSO</a></li>
+	<li><a href="#prerequesites">PREREQUESITES</a></li>
+	<li><a href="#bugs">BUGS</a></li>
+	<li><a href="#authors">AUTHORS</a></li>
+	<li><a href="#copyright">COPYRIGHT</a></li>
+</ul>
+
+<hr name="index" />
+</div>
+<!-- INDEX END -->
+
+<p>
+</p>
+<h1><a name="name">NAME</a></h1>
+<p>sa-update - automate SpamAssassin rule updates</p>
+<p>
+</p>
+<hr />
+<h1><a name="synopsis">SYNOPSIS</a></h1>
+<p><strong>sa-update</strong> [options]</p>
+<p>Options:</p>
+<pre>
+  --channel channel       Retrieve updates from this channel
+                          Use multiple times for multiple channels
+  --channelfile file      Retrieve updates from the channels in the file
+  --checkonly             Check for update availability, do not install
+  --install filename      Install updates directly from this file. Signature
+                          verification will use &quot;file.asc&quot; and &quot;file.sha1&quot;
+  --allowplugins          Allow updates to load plugin code
+  --gpgkey key            Trust the key id to sign releases
+                          Use multiple times for multiple keys
+  --gpgkeyfile file       Trust the key ids in the file to sign releases
+  --gpghomedir path       Store the GPG keyring in this directory
+  --gpg and --nogpg       Use (or do not use) GPG to verify updates
+                          (--gpg is assumed by use of the above
+                          --gpgkey and --gpgkeyfile options)
+  --import file           Import GPG key(s) from file into sa-update's
+                          keyring. Use multiple times for multiple files
+  --updatedir path        Directory to place updates, defaults to the
+                          SpamAssassin site rules directory
+                          (default: /home/parker/perl5/perlbrew/perls/perl-5.14.1/var/spamassassin/3.003003)
+  --refreshmirrors        Force the MIRRORED.BY file to be updated
+  -D, --debug [area=n,...]  Print debugging messages
+  -v, --verbose           Be more verbose, like print updated channel names
+  -V, --version           Print version
+  -h, --help              Print usage message</pre>
+<p>
+</p>
+<hr />
+<h1><a name="description">DESCRIPTION</a></h1>
+<p>sa-update automates the process of downloading and installing new rules and
+configuration, based on channels.  The default channel is
+<em>updates.spamassassin.org</em>, which has updated rules since the previous
+release.</p>
+<p>Update archives are verified using SHA1 hashes and GPG signatures, by default.</p>
+<p>Note that <code>sa-update</code> will not restart <code>spamd</code> or otherwise cause
+a scanner to reload the now-updated ruleset automatically.  Instead,
+<code>sa-update</code> is typically used in something like the following manner:</p>
+<pre>
+        sa-update &amp;&amp; /etc/init.d/spamassassin reload</pre>
+<p>This works because <code>sa-update</code> only returns an exit status of <code>0</code> if
+it has successfully downloaded and installed an updated ruleset.</p>
+<p>
+</p>
+<hr />
+<h1><a name="options">OPTIONS</a></h1>
+<dl>
+<dt><strong><a name="channel" class="item"><strong>--channel</strong></a></strong></dt>
+
+<dd>
+<p>sa-update can update multiple channels at the same time.  By default, it will
+only access &quot;updates.spamassassin.org&quot;, but more channels can be specified via
+this option.  If there are multiple additional channels, use the option
+multiple times, once per channel, e.g.:</p>
+<pre>
+        sa-update --channel foo.example.com --channel bar.example.com</pre>
+</dd>
+<dt><strong><a name="channelfile" class="item"><strong>--channelfile</strong></a></strong></dt>
+
+<dd>
+<p>Similar to the <strong>--channel</strong> option, except specify the additional channels in a
+file instead of on the commandline.  This is useful when there are a
+lot of additional channels.</p>
+</dd>
+<dt><strong><a name="checkonly" class="item"><strong>--checkonly</strong></a></strong></dt>
+
+<dd>
+<p>Only check whether an update is available; do not actually download or install it.
+The exit code will be <code>0</code> or <code>1</code> as described below.</p>
+</dd>
+<dt><strong><a name="install" class="item"><strong>--install</strong></a></strong></dt>
+
+<dd>
+<p>Install updates &quot;offline&quot;, from the named tar.gz file, instead of performing
+DNS lookups and HTTP invocations.</p>
+<p>Files named <strong>file</strong>.sha1 and <strong>file</strong>.asc will be used for the SHA-1 and GPG
+signature, respectively.  The filename provided must contain a version number
+of at least 3 digits, which will be used as the channel's update version
+number.</p>
+<p>Multiple <strong>--channel</strong> switches cannot be used with <strong>--install</strong>.  To install
+multiple channels from tarballs, run <code>sa-update</code> multiple times with different
+<strong>--channel</strong> and <strong>--install</strong> switches, e.g.:</p>
+<pre>
+        sa-update --channel foo.example.com --install foo-34958.tgz
+        sa-update --channel bar.example.com --install bar-938455.tgz</pre>
+</dd>
+<dt><strong><a name="allowplugins" class="item"><strong>--allowplugins</strong></a></strong></dt>
+
+<dd>
+<p>Allow downloaded updates to activate plugins.  The default is not to
+activate plugins; any <code>loadplugin</code> or <code>tryplugin</code> lines will be commented
+out in the downloaded update rules files.</p>
+</dd>
+<dt><strong><a name="gpg_nogpg" class="item"><strong>--gpg</strong>, <strong>--nogpg</strong></a></strong></dt>
+
+<dd>
+<p>sa-update by default will verify update archives by use of a SHA1 checksum
+and GPG signature.  A SHA1 hash can verify whether or not the downloaded
+archive has been corrupted, but it does not offer any form of security
+regarding whether or not the downloaded archive is legitimate (i.e.
+not modified by evildoers).  GPG verification of the archive is used to
+solve that problem.</p>
+<p>If you wish to skip GPG verification, you can use the <strong>--nogpg</strong> option
+to disable its use.  Use of the following gpgkey-related options will
+override <strong>--nogpg</strong> and keep GPG verification enabled.</p>
+<p>Note: Currently, only GPG itself is supported (i.e. not PGP).  GPG v1.2 has
+been tested, although later versions ought to work as well.</p>
+</dd>
+<dt><strong><a name="gpgkey" class="item"><strong>--gpgkey</strong></a></strong></dt>
+
+<dd>
+<p>sa-update has the concept of &quot;release trusted&quot; GPG keys.  When an archive is
+downloaded and the signature verified, sa-update requires that the signature
+be from one of these &quot;release trusted&quot; keys or else verification fails.  This
+prevents third parties from manipulating the files on a mirror, for instance,
+and signing with their own key.</p>
+<p>By default, sa-update trusts key id <code>265FA05B</code>, which is the standard
+SpamAssassin release key.  Use this option to trust additional keys.  See the
+<strong>--import</strong> option for how to add keys to sa-update's keyring.  For sa-update
+to use a key it must be in sa-update's keyring and trusted.</p>
+<p>For multiple keys, use the option multiple times, e.g.:</p>
+<pre>
+        sa-update --gpgkey E580B363 --gpgkey 298BC7D0</pre>
+<p>Note: use of this option automatically enables GPG verification.</p>
+</dd>
+<dt><strong><a name="gpgkeyfile" class="item"><strong>--gpgkeyfile</strong></a></strong></dt>
+
+<dd>
+<p>Similar to the <strong>--gpgkey</strong> option, except specify the additional keys in a file
+instead of on the commandline.  This is extremely useful when there are a lot
+of additional keys that you wish to trust.</p>
+</dd>
+<dt><strong><a name="gpghomedir" class="item"><strong>--gpghomedir</strong></a></strong></dt>
+
+<dd>
+<p>Specify a directory path to use as a storage area for the <code>sa-update</code> GPG
+keyring.  By default, this is</p>
+<pre>
+        /home/parker/perl5/perlbrew/perls/perl-5.14.1/etc/mail/spamassassin/sa-update-keys</pre>
+</dd>
+<dt><strong><a name="import2" class="item"><strong>--import</strong></a></strong></dt>
+
+<dd>
+<p>Use to import GPG key(s) from a file into the sa-update keyring which is
+located in the directory specified by <strong>--gpghomedir</strong>.  Before using channels
+from third party sources, you should use this option to import the GPG key(s)
+used by those channels.  You must still use the <strong>--gpgkey</strong> or <strong>--gpgkeyfile</strong>
+options above to get sa-update to trust imported keys.</p>
+<p>To import multiple keys, use the option multiple times, e.g.:</p>
+<pre>
+        sa-update --import channel1-GPG.KEY --import channel2-GPG.KEY</pre>
+<p>Note: use of this option automatically enables GPG verification.</p>
+</dd>
+<dt><strong><a name="refreshmirrors" class="item"><strong>--refreshmirrors</strong></a></strong></dt>
+
+<dd>
+<p>Force the list of sa-update mirrors for each channel, stored in the MIRRORED.BY
+file, to be updated.  By default, the MIRRORED.BY file will be cached for up to
+7 days after each time it is downloaded.</p>
+</dd>
+<dt><strong><a name="updatedir2" class="item"><strong>--updatedir</strong></a></strong></dt>
+
+<dd>
+<p>By default, <code>sa-update</code> will use the system-wide rules update directory:</p>
+<pre>
+        /home/parker/perl5/perlbrew/perls/perl-5.14.1/var/spamassassin/3.003003</pre>
+<p>If the updates should be stored in another location, specify it here.</p>
+<p>Note that use of this option is not recommended; if you're just using sa-update
+to download updated rulesets for a scanner, and sa-update is placing updates in
+the wrong directory, you probably need to rebuild SpamAssassin with different
+<code>Makefile.PL</code> arguments, instead of overriding sa-update's runtime behaviour.</p>
+</dd>
+<dt><strong><a name="d_area_debug_area4" class="item"><strong>-D</strong> [<em>area,...</em>], <strong>--debug</strong> [<em>area,...</em>]</a></strong></dt>
+
+<dd>
+<p>Produce debugging output.  If no areas are listed, all debugging information is
+printed.  Diagnostic output can also be enabled for each area individually;
+<em>area</em> is the area of the code to instrument. For example, to produce
+diagnostic output on channel, gpg, and http, use:</p>
+<pre>
+        sa-update -D channel,gpg,http</pre>
+<p>For more information about which areas (also known as channels) are
+available, please see the documentation at
+<a href="http://wiki.apache.org/spamassassin/DebugChannels">http://wiki.apache.org/spamassassin/DebugChannels</a>.</p>
+</dd>
+<dt><strong><a name="h_help4" class="item"><strong>-h</strong>, <strong>--help</strong></a></strong></dt>
+
+<dd>
+<p>Print help message and exit.</p>
+</dd>
+<dt><strong><a name="v_version3" class="item"><strong>-V</strong>, <strong>--version</strong></a></strong></dt>
+
+<dd>
+<p>Print sa-update version and exit.</p>
+</dd>
+</dl>
+<p>
+</p>
+<hr />
+<h1><a name="exit_codes">EXIT CODES</a></h1>
+<p>An exit code of <code>0</code> means an update was available, and was downloaded and
+installed successfully if --checkonly was not specified.</p>
+<p>An exit code of <code>1</code> means no fresh updates were available.</p>
+<p>An exit code of <code>2</code> means that at least one update is available but that a
+lint check of the site pre files failed.  The site pre files must pass a lint
+check before any updates are attempted.</p>
+<p>An exit code of <code>4</code> or higher indicates that errors occurred while
+attempting to download and extract updates.</p>
+<p>
+</p>
+<hr />
+<h1><a name="see_also">SEE ALSO</a></h1>
+<p>Mail::SpamAssassin(3)
+Mail::SpamAssassin::Conf(3)
+<code>spamassassin(1)</code>
+<code>spamd(1)</code>
+&lt;http://wiki.apache.org/spamassassin/RuleUpdates&gt;</p>
+<p>
+</p>
+<hr />
+<h1><a name="prerequesites">PREREQUISITES</a></h1>
+<p><code>Mail::SpamAssassin</code></p>
+<p>
+</p>
+<hr />
+<h1><a name="bugs">BUGS</a></h1>
+<p>See &lt;http://issues.apache.org/SpamAssassin/&gt;</p>
+<p>
+</p>
+<hr />
+<h1><a name="authors">AUTHORS</a></h1>
+<p>The Apache SpamAssassin(tm) Project &lt;http://spamassassin.apache.org/&gt;</p>
+<p>
+</p>
+<hr />
+<h1><a name="copyright">COPYRIGHT</a></h1>
+<p>SpamAssassin is distributed under the Apache License, Version 2.0, as
+described in the file <code>LICENSE</code> included with the distribution.</p>
+
+</body>
+
+</html>

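The --install text above says the tarball filename must contain a version
number of at least 3 digits. A pre-flight check for that rule might look like
the sketch below. It assumes the rule means "a run of three or more
consecutive digits anywhere in the filename"; that reading, and the function
name has_version_digits, are illustrative assumptions rather than sa-update's
actual logic.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the given filename contains
# a run of at least three consecutive digits, which is one reading of the
# --install requirement. This is not taken from sa-update's source.
has_version_digits() {
    printf '%s\n' "$1" | grep -Eq '[0-9]{3}'
}

# Example use (commented out so the block has no side effects):
#   if has_version_digits "foo-34958.tgz"; then
#       sa-update --channel foo.example.com --install foo-34958.tgz
#   fi
```

Checking the name up front gives a clearer failure message than letting the
install step reject the file.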
Added: spamassassin/site/full/3.3.x/doc/sa-update.txt
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.3.x/doc/sa-update.txt?rev=1138498&view=auto
==============================================================================
--- spamassassin/site/full/3.3.x/doc/sa-update.txt (added)
+++ spamassassin/site/full/3.3.x/doc/sa-update.txt Wed Jun 22 15:00:59 2011
@@ -0,0 +1,219 @@
+NAME
+    sa-update - automate SpamAssassin rule updates
+
+SYNOPSIS
+    sa-update [options]
+
+    Options:
+
+      --channel channel       Retrieve updates from this channel
+                              Use multiple times for multiple channels
+      --channelfile file      Retrieve updates from the channels in the file
+      --checkonly             Check for update availability, do not install
+      --install filename      Install updates directly from this file. Signature
+                              verification will use "file.asc" and "file.sha1"
+      --allowplugins          Allow updates to load plugin code
+      --gpgkey key            Trust the key id to sign releases
+                              Use multiple times for multiple keys
+      --gpgkeyfile file       Trust the key ids in the file to sign releases
+      --gpghomedir path       Store the GPG keyring in this directory
+      --gpg and --nogpg       Use (or do not use) GPG to verify updates
+                              (--gpg is assumed by use of the above
+                              --gpgkey and --gpgkeyfile options)
+      --import file           Import GPG key(s) from file into sa-update's
+                              keyring. Use multiple times for multiple files
+      --updatedir path        Directory to place updates, defaults to the
+                              SpamAssassin site rules directory
+                              (default: /home/parker/perl5/perlbrew/perls/perl-5.14.1/var/spamassassin/3.003003)
+      --refreshmirrors        Force the MIRRORED.BY file to be updated
+      -D, --debug [area=n,...]  Print debugging messages
+      -v, --verbose           Be more verbose, like print updated channel names
+      -V, --version           Print version
+      -h, --help              Print usage message
+
+DESCRIPTION
+    sa-update automates the process of downloading and installing new rules
+    and configuration, based on channels. The default channel is
+    *updates.spamassassin.org*, which has updated rules since the previous
+    release.
+
+    Update archives are verified using SHA1 hashes and GPG signatures, by
+    default.
+
+    Note that "sa-update" will not restart "spamd" or otherwise cause a
+    scanner to reload the now-updated ruleset automatically. Instead,
+    "sa-update" is typically used in something like the following manner:
+
+            sa-update && /etc/init.d/spamassassin reload
+
+    This works because "sa-update" only returns an exit status of 0 if it
+    has successfully downloaded and installed an updated ruleset.
+
+OPTIONS
+    --channel
+        sa-update can update multiple channels at the same time. By default,
+        it will only access "updates.spamassassin.org", but more channels
+        can be specified via this option. If there are multiple additional
+        channels, use the option multiple times, once per channel, e.g.:
+
+                sa-update --channel foo.example.com --channel bar.example.com
+
+    --channelfile
+        Similar to the --channel option, except specify the additional
+        channels in a file instead of on the commandline. This is useful
+        when there are a lot of additional channels.
+
+    --checkonly
+        Only check whether an update is available; do not actually download
+        or install it. The exit code will be 0 or 1 as described below.
+
+    --install
+        Install updates "offline", from the named tar.gz file, instead of
+        performing DNS lookups and HTTP invocations.
+
+        Files named file.sha1 and file.asc will be used for the SHA-1 and
+        GPG signature, respectively. The filename provided must contain a
+        version number of at least 3 digits, which will be used as the
+        channel's update version number.
+
+        Multiple --channel switches cannot be used with --install. To
+        install multiple channels from tarballs, run "sa-update" multiple
+        times with different --channel and --install switches, e.g.:
+
+                sa-update --channel foo.example.com --install foo-34958.tgz
+                sa-update --channel bar.example.com --install bar-938455.tgz
+
+    --allowplugins
+        Allow downloaded updates to activate plugins. The default is not to
+        activate plugins; any "loadplugin" or "tryplugin" lines will be
+        commented out in the downloaded update rules files.
+
+    --gpg, --nogpg
+        sa-update by default will verify update archives by use of a SHA1
+        checksum and GPG signature. A SHA1 hash can verify whether or not
+        the downloaded archive has been corrupted, but it does not offer any
+        form of security regarding whether or not the downloaded archive is
+        legitimate (i.e. not modified by evildoers). GPG verification of the
+        archive is used to solve that problem.
+
+        If you wish to skip GPG verification, you can use the --nogpg option
+        to disable its use. Use of the following gpgkey-related options will
+        override --nogpg and keep GPG verification enabled.
+
+        Note: Currently, only GPG itself is supported (i.e. not PGP). GPG
+        v1.2 has been tested, although later versions ought to work as well.
+
+    --gpgkey
+        sa-update has the concept of "release trusted" GPG keys. When an
+        archive is downloaded and the signature verified, sa-update requires
+        that the signature be from one of these "release trusted" keys or
+        else verification fails. This prevents third parties from
+        manipulating the files on a mirror, for instance, and signing with
+        their own key.
+
+        By default, sa-update trusts key id "265FA05B", which is the
+        standard SpamAssassin release key. Use this option to trust
+        additional keys. See the --import option for how to add keys to
+        sa-update's keyring. For sa-update to use a key it must be in
+        sa-update's keyring and trusted.
+
+        For multiple keys, use the option multiple times, e.g.:
+
+                sa-update --gpgkey E580B363 --gpgkey 298BC7D0
+
+        Note: use of this option automatically enables GPG verification.
+
+    --gpgkeyfile
+        Similar to the --gpgkey option, except specify the additional keys
+        in a file instead of on the commandline. This is extremely useful
+        when there are a lot of additional keys that you wish to trust.
+
+    --gpghomedir
+        Specify a directory path to use as a storage area for the
+        "sa-update" GPG keyring. By default, this is
+
+                /home/parker/perl5/perlbrew/perls/perl-5.14.1/etc/mail/spamassassin/sa-update-keys
+
+    --import
+        Use to import GPG key(s) from a file into the sa-update keyring
+        which is located in the directory specified by --gpghomedir. Before
+        using channels from third party sources, you should use this option
+        to import the GPG key(s) used by those channels. You must still use
+        the --gpgkey or --gpgkeyfile options above to get sa-update to trust
+        imported keys.
+
+        To import multiple keys, use the option multiple times, e.g.:
+
+                sa-update --import channel1-GPG.KEY --import channel2-GPG.KEY
+
+        Note: use of this option automatically enables GPG verification.
+
+    --refreshmirrors
+        Force the list of sa-update mirrors for each channel, stored in the
+        MIRRORED.BY file, to be updated. By default, the MIRRORED.BY file
+        will be cached for up to 7 days after each time it is downloaded.
+
+    --updatedir
+        By default, "sa-update" will use the system-wide rules update
+        directory:
+
+                /home/parker/perl5/perlbrew/perls/perl-5.14.1/var/spamassassin/3.003003
+
+        If the updates should be stored in another location, specify it
+        here.
+
+        Note that use of this option is not recommended; if you're just
+        using sa-update to download updated rulesets for a scanner, and
+        sa-update is placing updates in the wrong directory, you probably
+        need to rebuild SpamAssassin with different "Makefile.PL" arguments,
+        instead of overriding sa-update's runtime behaviour.
+
+    -D [*area,...*], --debug [*area,...*]
+        Produce debugging output. If no areas are listed, all debugging
+        information is printed. Diagnostic output can also be enabled for
+        each area individually; *area* is the area of the code to
+        instrument. For example, to produce diagnostic output on channel,
+        gpg, and http, use:
+
+                sa-update -D channel,gpg,http
+
+        For more information about which areas (also known as channels) are
+        available, please see the documentation at
+        <http://wiki.apache.org/spamassassin/DebugChannels>.
+
+    -h, --help
+        Print help message and exit.
+
+    -V, --version
+        Print sa-update version and exit.
+
+EXIT CODES
+    An exit code of 0 means an update was available, and was downloaded and
+    installed successfully if --checkonly was not specified.
+
+    An exit code of 1 means no fresh updates were available.
+
+    An exit code of 2 means that at least one update is available but that a
+    lint check of the site pre files failed. The site pre files must pass a
+    lint check before any updates are attempted.
+
+    An exit code of 4 or higher indicates that errors occurred while
+    attempting to download and extract updates.
+
+SEE ALSO
+    Mail::SpamAssassin(3) Mail::SpamAssassin::Conf(3) spamassassin(1)
+    spamd(1) <http://wiki.apache.org/spamassassin/RuleUpdates>
+
+PREREQUISITES
+    "Mail::SpamAssassin"
+
+BUGS
+    See <http://issues.apache.org/SpamAssassin/>
+
+AUTHORS
+    The Apache SpamAssassin(tm) Project <http://spamassassin.apache.org/>
+
+COPYRIGHT
+    SpamAssassin is distributed under the Apache License, Version 2.0, as
+    described in the file "LICENSE" included with the distribution.
+

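The EXIT CODES section of sa-update.txt above distinguishes "updated",
"nothing new", "lint failed" and "download/extract error". A wrapper that
wants to log these cases separately, rather than rely only on
"sa-update && reload", could start from a sketch like the one below. The
function name and the labels are hypothetical, and treating every
undocumented status (such as 3) as an error is an assumption, not something
the documentation states.

```shell
#!/bin/sh
# Hypothetical helper: map an sa-update exit status to a short label,
# following the EXIT CODES section of the documentation.
classify_sa_update_status() {
    case "$1" in
        0) echo "updated" ;;      # update available (installed unless --checkonly)
        1) echo "no-update" ;;    # no fresh updates were available
        2) echo "lint-failed" ;;  # lint check of the site pre files failed
        *) echo "error" ;;        # documented for 4 or higher; other values
                                  # are treated the same way (assumption)
    esac
}

# Example use (commented out so the block has no side effects):
#   sa-update; status=$?
#   if [ "$(classify_sa_update_status "$status")" = "updated" ]; then
#       /etc/init.d/spamassassin reload
#   fi
```

Keeping the mapping in one function means a cron job can reload the scanner
only on "updated" while still reporting lint failures and download errors.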
Added: spamassassin/site/full/3.3.x/doc/spamassassin-run.html
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.3.x/doc/spamassassin-run.html?rev=1138498&view=auto
==============================================================================
--- spamassassin/site/full/3.3.x/doc/spamassassin-run.html (added)
+++ spamassassin/site/full/3.3.x/doc/spamassassin-run.html Wed Jun 22 15:00:59 2011
@@ -0,0 +1,355 @@
+<?xml version="1.0" ?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+<title>spamassassin - simple front-end filtering script for SpamAssassin</title>
+<meta http-equiv="content-type" content="text/html; charset=utf-8" />
+<link rev="made" href="mailto:parker@minotaur.apache.org" />
+</head>
+
+<body style="background-color: white">
+
+
+<!-- INDEX BEGIN -->
+<div name="index">
+<p><a name="__index__"></a></p>
+
+<ul>
+
+	<li><a href="#name">NAME</a></li>
+	<li><a href="#synopsis">SYNOPSIS</a></li>
+	<li><a href="#description">DESCRIPTION</a></li>
+	<li><a href="#options">OPTIONS</a></li>
+	<li><a href="#see_also">SEE ALSO</a></li>
+	<li><a href="#prerequisites">PREREQUISITES</a></li>
+	<li><a href="#bugs">BUGS</a></li>
+	<li><a href="#authors">AUTHORS</a></li>
+	<li><a href="#copyright">COPYRIGHT</a></li>
+</ul>
+
+<hr name="index" />
+</div>
+<!-- INDEX END -->
+
+<p>
+</p>
+<h1><a name="name">NAME</a></h1>
+<p>spamassassin - simple front-end filtering script for SpamAssassin</p>
+<p>
+</p>
+<hr />
+<h1><a name="synopsis">SYNOPSIS</a></h1>
+<p><strong>spamassassin</strong> [options] [ &lt; <em>mailmessage</em> | <em>path</em> ... ]</p>
+<p><strong>spamassassin</strong> <strong>-d</strong> [ &lt; <em>mailmessage</em> | <em>path</em> ... ]</p>
+<p><strong>spamassassin</strong> <strong>-r</strong> [ &lt; <em>mailmessage</em> | <em>path</em> ... ]</p>
+<p><strong>spamassassin</strong> <strong>-k</strong> [ &lt; <em>mailmessage</em> | <em>path</em> ... ]</p>
+<p><strong>spamassassin</strong> <strong>-W</strong>|<strong>-R</strong> [ &lt; <em>mailmessage</em> | <em>path</em> ... ]</p>
+<p>Options:</p>
+<pre>
+ -L, --local                       Local tests only (no online tests)
+ -r, --report                      Report message as spam
+ -k, --revoke                      Revoke message as spam
+ -d, --remove-markup               Remove spam reports from a message
+ -C path, --configpath=path, --config-file=path
+                                   Path to standard configuration dir
+ -p prefs, --prefspath=file, --prefs-file=file
+                                   Set user preferences file
+ --siteconfigpath=path             Path for site configs
+                                   (def: /etc/mail/spamassassin)
+ --cf='config line'                Additional line of configuration
+ -x, --nocreate-prefs              Don't create user preferences file
+ -e, --exit-code                   Exit with a non-zero exit code if the
+                                   tested message was spam
+ --mbox                            read in messages in mbox format
+ --mbx                             read in messages in UW mbx format
+ -t, --test-mode                   Pipe message through and add extra
+                                   report to the bottom
+ --lint                            Lint the rule set: report syntax errors
+ -W, --add-to-whitelist            Add addresses in mail to persistent address whitelist
+ --add-to-blacklist                Add addresses in mail to persistent address blacklist
+ -R, --remove-from-whitelist       Remove all addresses found in mail from
+                                   persistent address list
+ --add-addr-to-whitelist=addr      Add addr to persistent address whitelist
+ --add-addr-to-blacklist=addr      Add addr to persistent address blacklist
+ --remove-addr-from-whitelist=addr Remove addr from persistent address list
+ --ipv4only, --ipv4-only, --ipv4   Disable attempted use of ipv6 for DNS
+ --progress                        Print progress bar
+ -D, --debug [area=n,...]          Print debugging messages
+ -V, --version                     Print version
+ -h, --help                        Print usage message</pre>
+<p>
+</p>
+<hr />
+<h1><a name="description">DESCRIPTION</a></h1>
+<p>spamassassin is a simple front-end filter for SpamAssassin.</p>
+<p>Using the SpamAssassin rule base, it uses a wide range of heuristic
+tests on mail headers and body text to identify &quot;spam&quot;, also known as
+unsolicited bulk email.  Once identified, the mail is then tagged as
+spam for later filtering using the user's own mail user-agent
+application.</p>
+<p>The default tagging operations that take place are detailed in <em>spamassassin/&quot;TAGGING&quot;</em>.</p>
+<p>By default, message(s) are read in from STDIN (&lt; <em>mailmessage</em>), or
+from specified files and directories (<em>path</em> ...).  STDIN and files
+are assumed to be in <em>file</em> format, with a single message per file.
+Directories are assumed to be in a format where each file in the directory
+contains only one message (directories are not recursed and filenames
+containing whitespace or beginning with &quot;.&quot; or &quot;,&quot; are skipped).
+The options <em>--mbox</em> and <em>--mbx</em> can override the assumed format;
+see the appropriate OPTION information below.</p>
+<p>Please note that SpamAssassin is not designed to scan large
+messages. Don't feed messages larger than about 500 KB to
+SpamAssassin, as this will consume a huge amount of memory.</p>
+<p>
+</p>
+<hr />
+<h1><a name="options">OPTIONS</a></h1>
+<dl>
+<dt><strong><a name="e_error_code_exit_code2" class="item"><strong>-e</strong>, <strong>--error-code</strong>, <strong>--exit-code</strong></a></strong></dt>
+
+<dd>
+<p>Exit with a non-zero error code, if the message is determined to be
+spam.</p>
+</dd>
+<dt><strong><a name="h_help7" class="item"><strong>-h</strong>, <strong>--help</strong></a></strong></dt>
+
+<dd>
+<p>Print help message and exit.</p>
+</dd>
+<dt><strong><a name="v_version6" class="item"><strong>-V</strong>, <strong>--version</strong></a></strong></dt>
+
+<dd>
+<p>Print version and exit.</p>
+</dd>
+<dt><strong><a name="t_test_mode2" class="item"><strong>-t</strong>, <strong>--test-mode</strong></a></strong></dt>
+
+<dd>
+<p>Test mode.  Pipe message through and add extra report.  Note that the report
+text assumes that the message is spam, since in normal use it is only visible
+in this case.  Pay attention to the score instead.</p>
+<p>If you run this with <strong>-d</strong>, the message will first have SpamAssassin
+markup removed before being tested.</p>
+</dd>
+<dt><strong><a name="r_report2" class="item"><strong>-r</strong>, <strong>--report</strong></a></strong></dt>
+
+<dd>
+<p>Report this message as manually-verified spam.  This will submit the mail
+message read from STDIN to various spam-blocker databases.  Currently,
+these are the Distributed Checksum Clearinghouse
+<code>http://www.rhyolite.com/anti-spam/dcc/</code>, Pyzor
+<code>http://pyzor.sourceforge.net/</code>, Vipul's Razor
+<code>http://razor.sourceforge.net/</code>, and SpamCop <code>http://www.spamcop.net/</code>.</p>
+<p>If the message contains SpamAssassin markup, the markup will be stripped
+out automatically before submission.  The support modules for DCC, Pyzor,
+and Razor must be installed for spam to be reported to each service.
+SpamCop reports will have greater effect if you register and set the
+<code>spamcop_to_address</code> option.</p>
+<p>The message will also be submitted to SpamAssassin's learning systems;
+currently this is the internal Bayesian statistical-filtering system (the
+BAYES rules).  (Note that if you <em>only</em> want to perform statistical
+learning, and do not want to report mail to third-parties, you should use
+the <code>sa-learn</code> command directly instead.)</p>
+</dd>
+<dt><strong><a name="k_revoke2" class="item"><strong>-k</strong>, <strong>--revoke</strong></a></strong></dt>
+
+<dd>
+<p>Revoke this message.  This will revoke the mail message read from STDIN from
+various spam-blocker databases.  Currently, the only such database is Vipul's Razor.</p>
+<p>Revocation support for the Distributed Checksum Clearinghouse, Pyzor, and
+SpamCop is not currently available.</p>
+<p>If the message contains SpamAssassin markup, the markup will be stripped
+out automatically before submission.  The support modules for Razor must
+be installed for spam to be revoked from the service.</p>
+<p>The message will also be submitted as 'ham' (non-spam) to SpamAssassin's
+learning systems; currently this is the internal Bayesian
+statistical-filtering system (the BAYES rules).  (Note that if you <em>only</em>
+want to perform statistical learning, and do not want to report mail to
+third-parties, you should use the <code>sa-learn</code> command directly instead.)</p>
+</dd>
+<dt><strong><a name="lint2" class="item"><strong>--lint</strong></a></strong></dt>
+
+<dd>
+<p>Syntax check (lint) the rule set and configuration files, reporting
+typos and rules that do not compile correctly.  Exits with 0 if there
+are no errors, or greater than 0 if any errors are found.</p>
+</dd>
+<dt><strong><a name="w_add_to_whitelist2" class="item"><strong>-W</strong>, <strong>--add-to-whitelist</strong></a></strong></dt>
+
+<dd>
+<p>Add all email addresses, in the headers and body of the mail message read
+from STDIN, to a persistent address whitelist.  Note that you must be running
+<code>spamassassin</code> or <code>spamd</code> with a persistent address list plugin enabled for
+this to work.</p>
+</dd>
+<dt><strong><a name="add_to_blacklist2" class="item"><strong>--add-to-blacklist</strong></a></strong></dt>
+
+<dd>
+<p>Add all email addresses, in the headers and body of the mail message read
+from STDIN, to the persistent address blacklist.  Note that you must be
+running <code>spamassassin</code> or <code>spamd</code> with a persistent address list plugin
+enabled for this to work.</p>
+</dd>
+<dt><strong><a name="r_remove_from_whitelist2" class="item"><strong>-R</strong>, <strong>--remove-from-whitelist</strong></a></strong></dt>
+
+<dd>
+<p>Remove all email addresses, in the headers and body of the mail message read
+from STDIN, from a persistent address list. STDIN must contain a full email
+message, so to remove a single address you should use
+<strong>--remove-addr-from-whitelist</strong> instead.</p>
+<p>Note that you must be running <code>spamassassin</code> or <code>spamd</code> with a persistent
+address list plugin enabled for this to work.</p>
+</dd>
+<dt><strong><a name="add_addr_to_whitelist2" class="item"><strong>--add-addr-to-whitelist</strong></a></strong></dt>
+
+<dd>
+<p>Add the named email address to a persistent address whitelist.  Note that you
+must be running <code>spamassassin</code> or <code>spamd</code> with a persistent address list
+plugin enabled for this to work.</p>
+</dd>
+<dt><strong><a name="add_addr_to_blacklist2" class="item"><strong>--add-addr-to-blacklist</strong></a></strong></dt>
+
+<dd>
+<p>Add the named email address to a persistent address blacklist.  Note that you
+must be running <code>spamassassin</code> or <code>spamd</code> with a persistent address list
+plugin enabled for this to work.</p>
+</dd>
+<dt><strong><a name="remove_addr_from_whitelist2" class="item"><strong>--remove-addr-from-whitelist</strong></a></strong></dt>
+
+<dd>
+<p>Remove the named email address from a persistent address whitelist.  Note that
+you must be running <code>spamassassin</code> or <code>spamd</code> with a persistent address
+list plugin enabled for this to work.</p>
+</dd>
+<dt><strong><a name="ipv4only_ipv4_only_ipv43" class="item"><strong>--ipv4only</strong>, <strong>--ipv4-only</strong>, <strong>--ipv4</strong></a></strong></dt>
+
+<dd>
+<p>Do not use IPv6 for DNS tests. Normally, SpamAssassin will try to detect if
+IPv6 is available, using only IPv4 if it is not. Use this option if the existing
+tests for IPv6 availability produce incorrect results or crashes.</p>
+</dd>
+<dt><strong><a name="l_local4" class="item"><strong>-L</strong>, <strong>--local</strong></a></strong></dt>
+
+<dd>
+<p>Do only the &quot;local&quot; tests, ones that do not require an internet connection to
+operate.  Normally, SpamAssassin will try to detect whether you are connected
+to the net before running network tests anyway, but for faster checks you may
+wish to use this.</p>
+<p>Note that SpamAssassin's network rules are run in parallel.  This can cause
+overhead in terms of the number of file descriptors required if <strong>--local</strong> is
+not used; it is recommended that the minimum limit on fds be raised to at least
+256 for safety.</p>
+</dd>
+<dt><strong><a name="d_remove_markup2" class="item"><strong>-d</strong>, <strong>--remove-markup</strong></a></strong></dt>
+
+<dd>
+<p>Remove SpamAssassin markup (the &quot;SpamAssassin results&quot; report, X-Spam-Status
+headers, etc.) from the mail message.  The resulting message, which will be
+more or less identical to the original, pre-SpamAssassin input, will be output
+to STDOUT.</p>
+<p>(Note: the message will not be exactly identical; some headers will be
+reformatted due to some features of the Mail::Internet package, but the body
+text will be identical.)</p>
+</dd>
+<dt><strong><a name="c_path_configpath_path_config_file_path4" class="item"><strong>-C</strong> <em>path</em>, <strong>--configpath</strong>=<em>path</em>, <strong>--config-file</strong>=<em>path</em></a></strong></dt>
+
+<dd>
+<p>Use the specified path for locating the distributed configuration files.
+Ignore the default directories (usually <code>/usr/share/spamassassin</code> or similar).</p>
+</dd>
+<dt><strong><a name="siteconfigpath_path5" class="item"><strong>--siteconfigpath</strong>=<em>path</em></a></strong></dt>
+
+<dd>
+<p>Use the specified path for locating site-specific configuration files.  Ignore
+the default directories (usually <code>/etc/mail/spamassassin</code> or similar).</p>
+</dd>
+<dt><strong><a name="cf_config_line5" class="item"><strong>--cf='config line'</strong></a></strong></dt>
+
+<dd>
+<p>Add additional lines of configuration directly from the command-line, parsed
+after the configuration files are read.   Multiple <strong>--cf</strong> arguments can be
+used, and each will be considered a separate line of configuration.  For
+example:</p>
+<pre>
+        spamassassin -t --cf=&quot;body NEWRULE /text/&quot; --cf=&quot;score NEWRULE 3.0&quot;</pre>
+</dd>
+<dt><strong><a name="p_prefs_prefspath_prefs_prefs_file_prefs4" class="item"><strong>-p</strong> <em>prefs</em>, <strong>--prefspath</strong>=<em>prefs</em>, <strong>--prefs-file</strong>=<em>prefs</em></a></strong></dt>
+
+<dd>
+<p>Read user score preferences from <em>prefs</em> (usually <code>$HOME/.spamassassin/user_prefs</code>).</p>
+</dd>
+<dt><strong><a name="progress3" class="item"><strong>--progress</strong></a></strong></dt>
+
+<dd>
+<p>Prints a progress bar (to STDERR) showing the current progress.  This option
+will only be useful if you are redirecting STDOUT (and not STDERR).  In the
+case where no valid terminal is found this option will behave very much like
+the --showdots option in other SpamAssassin programs.</p>
+</dd>
+<dt><strong><a name="d_area_debug_area6" class="item"><strong>-D</strong> [<em>area,...</em>], <strong>--debug</strong> [<em>area,...</em>]</a></strong></dt>
+
+<dd>
+<p>Produce debugging output. If no areas are listed, all debugging information is
+printed. Diagnostic output can also be enabled for each area individually;
+<em>area</em> is the area of the code to instrument. For example, to produce
+diagnostic output on bayes, learn, and dns, use:</p>
+<pre>
+        spamassassin -D bayes,learn,dns</pre>
+<p>Higher priority informational messages that are suitable for logging in normal
+circumstances are available with an area of &quot;info&quot;.</p>
+<p>For more information about which areas (also known as channels) are available,
+please see the documentation at:</p>
+<pre>
+        <a href="http://wiki.apache.org/spamassassin/DebugChannels">http://wiki.apache.org/spamassassin/DebugChannels</a></pre>
+</dd>
+<dt><strong><a name="x_nocreate_prefs2" class="item"><strong>-x</strong>, <strong>--nocreate-prefs</strong></a></strong></dt>
+
+<dd>
+<p>Disable creation of user preferences file.</p>
+</dd>
+<dt><strong><a name="mbox3" class="item"><strong>--mbox</strong></a></strong></dt>
+
+<dd>
+<p>Specify that the input message(s) are in mbox format.  mbox is a standard
+Unix message folder format.</p>
+</dd>
+<dt><strong><a name="mbx3" class="item"><strong>--mbx</strong></a></strong></dt>
+
+<dd>
+<p>Specify that the input message(s) are in UW .mbx format.  mbx is
+the mailbox format used within the University of Washington's IMAP
+implementation; see <code>http://www.washington.edu/imap/</code>.</p>
+</dd>
+</dl>
+<p>
+</p>
+<hr />
+<h1><a name="see_also">SEE ALSO</a></h1>
+<p>sa-learn(1)
+<code>spamd(1)</code>
+<code>spamc(1)</code>
+Mail::SpamAssassin::Conf(3)
+Mail::SpamAssassin(3)</p>
+<p>
+</p>
+<hr />
+<h1><a name="prerequisites">PREREQUISITES</a></h1>
+<p><code>Mail::SpamAssassin</code></p>
+<p>
+</p>
+<hr />
+<h1><a name="bugs">BUGS</a></h1>
+<p>See &lt;http://issues.apache.org/SpamAssassin/&gt;</p>
+<p>
+</p>
+<hr />
+<h1><a name="authors">AUTHORS</a></h1>
+<p>The SpamAssassin(tm) Project &lt;http://spamassassin.apache.org/&gt;</p>
+<p>
+</p>
+<hr />
+<h1><a name="copyright">COPYRIGHT</a></h1>
+<p>SpamAssassin is distributed under the Apache License, Version 2.0, as
+described in the file <code>LICENSE</code> included with the distribution.</p>
+
+</body>
+
+</html>

Added: spamassassin/site/full/3.3.x/doc/spamassassin-run.txt
URL: http://svn.apache.org/viewvc/spamassassin/site/full/3.3.x/doc/spamassassin-run.txt?rev=1138498&view=auto
==============================================================================
--- spamassassin/site/full/3.3.x/doc/spamassassin-run.txt (added)
+++ spamassassin/site/full/3.3.x/doc/spamassassin-run.txt Wed Jun 22 15:00:59 2011
@@ -0,0 +1,275 @@
+NAME
+    spamassassin - simple front-end filtering script for SpamAssassin
+
+SYNOPSIS
+    spamassassin [options] [ < *mailmessage* | *path* ... ]
+
+    spamassassin -d [ < *mailmessage* | *path* ... ]
+
+    spamassassin -r [ < *mailmessage* | *path* ... ]
+
+    spamassassin -k [ < *mailmessage* | *path* ... ]
+
+    spamassassin -W|-R [ < *mailmessage* | *path* ... ]
+
+    Options:
+
+     -L, --local                       Local tests only (no online tests)
+     -r, --report                      Report message as spam
+     -k, --revoke                      Revoke message as spam
+     -d, --remove-markup               Remove spam reports from a message
+     -C path, --configpath=path, --config-file=path
+                                       Path to standard configuration dir
+     -p prefs, --prefspath=file, --prefs-file=file
+                                       Set user preferences file
+     --siteconfigpath=path             Path for site configs
+                                       (def: /etc/mail/spamassassin)
+     --cf='config line'                Additional line of configuration
+     -x, --nocreate-prefs              Don't create user preferences file
+     -e, --exit-code                   Exit with a non-zero exit code if the
+                                       tested message was spam
+     --mbox                            Read in messages in mbox format
+     --mbx                             Read in messages in UW mbx format
+     -t, --test-mode                   Pipe message through and add extra
+                                       report to the bottom
+     --lint                            Lint the rule set: report syntax errors
+     -W, --add-to-whitelist            Add addresses in mail to persistent address whitelist
+     --add-to-blacklist                Add addresses in mail to persistent address blacklist
+     -R, --remove-from-whitelist       Remove all addresses found in mail from
+                                       persistent address list
+     --add-addr-to-whitelist=addr      Add addr to persistent address whitelist
+     --add-addr-to-blacklist=addr      Add addr to persistent address blacklist
+     --remove-addr-from-whitelist=addr Remove addr from persistent address list
+     --ipv4only, --ipv4-only, --ipv4   Disable attempted use of ipv6 for DNS
+     --progress                        Print progress bar
+     -D, --debug [area=n,...]          Print debugging messages
+     -V, --version                     Print version
+     -h, --help                        Print usage message
+
+DESCRIPTION
+    spamassassin is a simple front-end filter for SpamAssassin.
+
+    Using the SpamAssassin rule base, it uses a wide range of heuristic
+    tests on mail headers and body text to identify "spam", also known as
+    unsolicited bulk email. Once identified, the mail is then tagged as spam
+    for later filtering using the user's own mail user-agent application.
+
+    The default tagging operations that take place are detailed in "TAGGING"
+    in spamassassin.
+
+    By default, message(s) are read in from STDIN (< *mailmessage*), or from
+    specified files and directories (*path* ...). STDIN and files are assumed
+    to be in *file* format, with a single message per file. Directories are
+    assumed to be in a format where each file in the directory contains only
+    one message (directories are not recursed and filenames containing
+    whitespace or beginning with "." or "," are skipped). The options
+    *--mbox* and *--mbx* can override the assumed format; see the
+    descriptions of those options under OPTIONS below.
+
+    Please note that SpamAssassin is not designed to scan large messages.
+    Don't feed messages larger than about 500 KB to SpamAssassin, as this
+    will consume a huge amount of memory.
+
+OPTIONS
+    -e, --error-code, --exit-code
+        Exit with a non-zero error code if the message is determined to be
+        spam.
+
+    -h, --help
+        Print help message and exit.
+
+    -V, --version
+        Print version and exit.
+
+    -t, --test-mode
+        Test mode. Pipe message through and add extra report. Note that the
+        report text assumes that the message is spam, since in normal use it
+        is only visible in this case. Pay attention to the score instead.
+
+        If you run this with -d, the message will first have SpamAssassin
+        markup removed before being tested.
+
+    -r, --report
+        Report this message as manually-verified spam. This will submit the
+        mail message read from STDIN to various spam-blocker databases.
+        Currently, these are the Distributed Checksum Clearinghouse
+        "http://www.rhyolite.com/anti-spam/dcc/", Pyzor
+        "http://pyzor.sourceforge.net/", Vipul's Razor
+        "http://razor.sourceforge.net/", and SpamCop
+        "http://www.spamcop.net/".
+
+        If the message contains SpamAssassin markup, the markup will be
+        stripped out automatically before submission. The support modules
+        for DCC, Pyzor, and Razor must be installed for spam to be reported
+        to each service. SpamCop reports will have greater effect if you
+        register and set the "spamcop_to_address" option.
+
+        The message will also be submitted to SpamAssassin's learning
+        systems; currently this is the internal Bayesian
+        statistical-filtering system (the BAYES rules). (Note that if you
+        *only* want to perform statistical learning, and do not want to
+        report mail to third-parties, you should use the "sa-learn" command
+        directly instead.)
+
+    -k, --revoke
+        Revoke this message. This will revoke the mail message read from
+        STDIN from spam-blocker databases. Currently, the only supported
+        database is Vipul's Razor.
+
+        Revocation support for the Distributed Checksum Clearinghouse,
+        Pyzor, and SpamCop is not currently available.
+
+        If the message contains SpamAssassin markup, the markup will be
+        stripped out automatically before submission. The support modules
+        for Razor must be installed for spam to be revoked from the service.
+
+        The message will also be submitted as 'ham' (non-spam) to
+        SpamAssassin's learning systems; currently this is the internal
+        Bayesian statistical-filtering system (the BAYES rules). (Note that
+        if you *only* want to perform statistical learning, and do not want
+        to report mail to third-parties, you should use the "sa-learn"
+        command directly instead.)
+
+    --lint
+        Syntax check (lint) the rule set and configuration files, reporting
+        typos and rules that do not compile correctly. Exits with 0 if there
+        are no errors, or greater than 0 if any errors are found.
+
+    -W, --add-to-whitelist
+        Add all email addresses, in the headers and body of the mail message
+        read from STDIN, to a persistent address whitelist. Note that you
+        must be running "spamassassin" or "spamd" with a persistent address
+        list plugin enabled for this to work.
+
+    --add-to-blacklist
+        Add all email addresses, in the headers and body of the mail message
+        read from STDIN, to the persistent address blacklist. Note that you
+        must be running "spamassassin" or "spamd" with a persistent address
+        list plugin enabled for this to work.
+
+    -R, --remove-from-whitelist
+        Remove all email addresses, in the headers and body of the mail
+        message read from STDIN, from a persistent address list. STDIN must
+        contain a full email message, so to remove a single address you
+        should use --remove-addr-from-whitelist instead.
+
+        Note that you must be running "spamassassin" or "spamd" with a
+        persistent address list plugin enabled for this to work.
+
+    --add-addr-to-whitelist
+        Add the named email address to a persistent address whitelist. Note
+        that you must be running "spamassassin" or "spamd" with a persistent
+        address list plugin enabled for this to work.
+
+    --add-addr-to-blacklist
+        Add the named email address to a persistent address blacklist. Note
+        that you must be running "spamassassin" or "spamd" with a persistent
+        address list plugin enabled for this to work.
+
+    --remove-addr-from-whitelist
+        Remove the named email address from a persistent address whitelist.
+        Note that you must be running "spamassassin" or "spamd" with a
+        persistent address list plugin enabled for this to work.
+
+    --ipv4only, --ipv4-only, --ipv4
+        Do not use IPv6 for DNS tests. Normally, SpamAssassin will try to
+        detect if IPv6 is available, using only IPv4 if it is not. Use this
+        option if the existing tests for IPv6 availability produce incorrect
+        results or crashes.
+
+    -L, --local
+        Do only the "local" tests, ones that do not require an internet
+        connection to operate. Normally, SpamAssassin will try to detect
+        whether you are connected to the net before running network tests
+        anyway, but for faster checks you may wish to use this.
+
+        Note that SpamAssassin's network rules are run in parallel. This can
+        cause overhead in terms of the number of file descriptors required
+        if --local is not used; it is recommended that the minimum limit on
+        fds be raised to at least 256 for safety.
+
+    -d, --remove-markup
+        Remove SpamAssassin markup (the "SpamAssassin results" report,
+        X-Spam-Status headers, etc.) from the mail message. The resulting
+        message, which will be more or less identical to the original,
+        pre-SpamAssassin input, will be output to STDOUT.
+
+        (Note: the message will not be exactly identical; some headers will
+        be reformatted due to some features of the Mail::Internet package,
+        but the body text will be identical.)
+
+    -C *path*, --configpath=*path*, --config-file=*path*
+        Use the specified path for locating the distributed configuration
+        files. Ignore the default directories (usually
+        "/usr/share/spamassassin" or similar).
+
+    --siteconfigpath=*path*
+        Use the specified path for locating site-specific configuration
+        files. Ignore the default directories (usually
+        "/etc/mail/spamassassin" or similar).
+
+    --cf='config line'
+        Add additional lines of configuration directly from the
+        command-line, parsed after the configuration files are read.
+        Multiple --cf arguments can be used, and each will be considered a
+        separate line of configuration. For example:
+
+                spamassassin -t --cf="body NEWRULE /text/" --cf="score NEWRULE 3.0"
+
+    -p *prefs*, --prefspath=*prefs*, --prefs-file=*prefs*
+        Read user score preferences from *prefs* (usually
+        "$HOME/.spamassassin/user_prefs").
+
+    --progress
+        Prints a progress bar (to STDERR) showing the current progress. This
+        option will only be useful if you are redirecting STDOUT (and not
+        STDERR). In the case where no valid terminal is found this option
+        will behave very much like the --showdots option in other
+        SpamAssassin programs.
+
+    -D [*area,...*], --debug [*area,...*]
+        Produce debugging output. If no areas are listed, all debugging
+        information is printed. Diagnostic output can also be enabled for
+        each area individually; *area* is the area of the code to
+        instrument. For example, to produce diagnostic output on bayes,
+        learn, and dns, use:
+
+                spamassassin -D bayes,learn,dns
+
+        Higher priority informational messages that are suitable for logging
+        in normal circumstances are available with an area of "info".
+
+        For more information about which areas (also known as channels) are
+        available, please see the documentation at:
+
+                <http://wiki.apache.org/spamassassin/DebugChannels>
+
+    -x, --nocreate-prefs
+        Disable creation of user preferences file.
+
+    --mbox
+        Specify that the input message(s) are in mbox format. mbox is a
+        standard Unix message folder format.
+
+    --mbx
+        Specify that the input message(s) are in UW .mbx format. mbx is the
+        mailbox format used within the University of Washington's IMAP
+        implementation; see "http://www.washington.edu/imap/".
+
+SEE ALSO
+    sa-learn(1) spamd(1) spamc(1) Mail::SpamAssassin::Conf(3)
+    Mail::SpamAssassin(3)
+
+PREREQUISITES
+    "Mail::SpamAssassin"
+
+BUGS
+    See <http://issues.apache.org/SpamAssassin/>
+
+AUTHORS
+    The SpamAssassin(tm) Project <http://spamassassin.apache.org/>
+
+COPYRIGHT
+    SpamAssassin is distributed under the Apache License, Version 2.0, as
+    described in the file "LICENSE" included with the distribution.
+