Posted to dev@spamassassin.apache.org by "Kevin A. McGrail" <KM...@PCCC.com> on 2013/10/11 06:42:56 UTC

Committers/PMC: Call for Vote on SpamAssassin 3.4.0-rc3 release

Committers/PMC:

Please vote to release 3.4.0-rc3.   I'm now running it on one production 
system and have been monitoring it for a while.  So far so good!

Files at http://people.apache.org/~kmcgrail/devel/

Proposed Announcement follows.

Discussing blockers to a real 3.4.0:

 From my perspective, I'm working on some RBL changes.  Not really 
blockers, but it's my focus right now.

I also think Bug 5503 needs to be completed because it could change 
headers people expect and a 3.4.0 major release is the time to do 
something like that.

We also need to work on these issues:

Bug 6422 to test CPAN
Bug 6639 re: Suse 11 and CPAN


After that, I'm hoping to at least spend a few hours documenting some of 
the things AXB pointed out:

    New configuration options  - Imo - these should be documented:

    Plugin/WLBLEval:

    blacklist_uri_host  example.net
    blacklist_uri_host  somehost.example.net

    whitelist_uri_host  example.org
    whitelist_uri_host  somehost.example.org

    Addition:

    - Disable lookups for a specific DNS list (instead of zeroing out rules)

    dns_query_restriction deny  someBL.example.org

      - If possible some minimal rule samples for

    "added the following sub-options to the tflags setting"
    autolearn_force, maxhits=N, ips_only, domains_only, a, ns.

    A lot of features like these remain unused because they're either
    very hidden or not loudly documented with sample rules.


Then we should be able to get 3.4.0 out the door which leads to these 
issues:

Bug 6885 to switch the website's publishing method will be needed to 
update the website to announce the release.

Work on post-3.4.0 issues: KAM will be working through the rest of the 
build/README file and expect lots of issues / problems that will need to 
be overcome to switch to 3.4.X.

Regards,
KAM
To: users, dev, announce
Subject: ANNOUNCE: Apache SpamAssassin 3.4.0 available

Release Notes -- Apache SpamAssassin -- Version 3.4.0

Introduction
------------

This is a major release.  It introduces over two years of bug fixes and
features including the Bayes Redis (http://redis.io/) back-end (bug 6879),
EDNS0 changes (bug 6910), native IPv6 support, numerous URIBL.pm changes
or features and a small API change in libspamc (bug 6562) with many other
subtle changes.

SpamAssassin was tested on perl 5.18.0, and (out of curiosity) also
on a Raspberry Pi (ARM6, Raspbian / Debian 7.0 Wheezy, perl 5.14.2)
... yes it is 20 times slower compared to i7-960 CPU, but all tests pass!

Overall, this release has been tested in many production-level
environments for nearly a year, including testing on an IPv6-only host.
It is highly recommended and stable.

NOTE: Complete changes are available at
http://svn.apache.org/repos/asf/spamassassin/branches/3.4/Changes


Notable Sendmail Bug
--------------------

Sendmail 8.14.5 and below contain a canonicalization misfeature / bug
that can cause DKIM failures.
See https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6462.


Compatibility with version 3.3.2
--------------------------------

* DNS queries generated by SpamAssassin now enable option EDNS0 in query
packets and specify a buffer size of 4096 bytes by default. This allows
DNS replies larger than 512 bytes to be returned in one UDP datagram,
avoiding a need for re-issuing a failed query over a TCP protocol. This
default setting is well suited if a DNS resolver (i.e. a recursive DNS
server) is located on the same LAN as a host running SpamAssassin, which
is the usual setup for all but perhaps some home uses of SpamAssassin.

The option should be disabled (by 'dns_options noedns0') when a recursive
DNS server is only reachable through some old-fashioned firewall or through
some picky router with deep packet inspection which bans DNS UDP messages
larger than 512 bytes, or blocks fragmented UDP datagrams.

The 'dns_options' setting is documented in Mail::SpamAssassin::Conf POD
or man page, more details in bug 6910 and bug 6862.
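
A minimal local.cf sketch of the override described above, for a site
whose resolver is only reachable through such a firewall or router
(whether a given network is affected is an assumption to verify first):

   # disable EDNS0; replies over 512 bytes then fall back to TCP
   dns_options noedns0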


* A default setting for option 'dns_available' was changed from 'test' to
'yes' (bug 6770, bug 6769), so SpamAssassin now assumes by default that
it is running on a host with an internet connection and a working DNS
resolver. If this is not the case, please configure this option explicitly.

The change avoids surprises on an otherwise well connected host which may
experience a temporary DNS unavailability at the system startup time or a
temporary network outage when spamd was starting, and the initial failed
test would disable DNS test permanently. The option is documented in
Mail::SpamAssassin::Conf POD or man page.
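
If a host really has no usable DNS, or the previous probing behaviour is
preferred, the option can be set explicitly. A sketch only: 'test' is the
old default named above, while 'no' is assumed here to be the value that
turns DNS-based checks off (see the Conf POD):

   # restore the 3.3.x default: probe for working DNS before using it
   dns_available test

   # or, on a host without DNS, skip DNS-based checks entirely:
   # dns_available no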


* When Bayes classification is in use and messages are 'learned' as spam
or ham and stored in a database, the Bayes plugin generates internal
message IDs of learned messages and stores them in a 'seen' database to
avoid re-learning duplicates and accidental un-learning messages that
were not previously learned. With changes in bug 5185, the calculation
of message IDs in a bayes 'seen' database has changed, so new code can
no longer associate new messages with those learned before the change.

Note that this change does not affect recognition of old tokens and the
classification algorithm, only duplicate detection and unlearning of old
messages is affected.

Because of this change, if you use Bayes and you are upgrading from a
version prior to 3.4.0, you may consider wiping your Bayes database
and starting fresh.

However, this is not mandatory.  If you choose to keep your current
database tokens, these are the ramifications:

1 - If you re-process emails that have already been learned before,
     it will create duplicate entries because of the new msg_id format.
     The duplicates will expire, eventually, and should cause minimal
     impact unless it occurs frequently.

2 - If you try and unlearn or reclassify an email processed prior to the
     upgrade, the system will be unable to do so because of the new msg_id
     format. If unlearning a message (that was learned before the change)
     is important, consider just clearing your Bayes store and starting
     from scratch.


Dependency changes since version 3.3.2
--------------------------------------

Dependencies on the following Perl modules were dropped: Net::Ident,
IP::Country::Fast and IP::Country.

The dependency on the perl module LWP::UserAgent as used by sa-update is now
optional if any of the programs curl, wget, or fetch is available.

New optional dependencies on the following Perl modules were introduced:

- new optional dependency on Geo::IP in a RelayCountry plugin (bug 6599);
   for backward compatibility IP::Country::Fast is used if Geo::IP is
   not installed

- new optional dependency on IO::Socket::IP for a cleaner IP support
   regardless of a protocol family (IPv4 and IPv6)

- new optional dependency on Net::Patricia to speed up lookups on
   internal_networks, trusted_networks or msa_networks when these lists
   contain a larger number of entries

- new optional dependency on programs curl, wget, or a FreeBSD fetch.
   sa-update will use any of these external programs to download rule
   updates, either over IPv6 or over IPv4. Any of these three programs
   suffices - the installation procedure is currently unclear on this,
   its warning may be understood as if all three programs are needed,
   which is not the case

- minimal required version of NetAddr::IP was bumped to 4.010


Internal changes potentially affecting third party software
using Mail::SpamAssassin library
-----------------------------------------------------------

A caller is now given a choice to call srand() by itself or let a
SpamAssassin library do it as before. This can avoid unnecessary entropy
loss in perl's random number generator. It is controlled by option
skip_prng_reseeding in a call to Mail::SpamAssassin::new(). The change
was documented in bug 6690.

The Mail::SpamAssassin::parser can now accept a message also as a string
reference, avoiding one copy in memory. Documented in bug 6686.

A caller may pass the original mail body size to Mail::SpamAssassin::parse
through the suppl_attrib argument's field 'body_size'. This mail body size
is accessible to the eval rule check_body_length. It can be useful when a
caller only passes a truncated message to SpamAssassin. Documented in bug
6830.

A new plugin callback "prefork_init" was introduced, which should be called
by a master process (e.g. spamd) before forking multiple child processes.
For compatibility this call is currently optional, but recommended for new
versions. Currently only a Redis backend for Bayes checks will benefit from
being notified before a fork. Documented in bug 6942.


Notable bug fixes
-----------------

The sa-update program now avoids repeatedly downloading same rules if
subsequent unpacking of rules and updating fails. Documented in bug 6655.

Several incompatibilities with newer versions of a perl module Net::DNS
as used by sa-update and by the SpamAssassin library were fixed.
See Net::DNS problem [rt.cpan.org #83451].

The Razor agent perl module clobbers the entropy of perl's random number
generator by re-initializing the generator on every call. The SpamAssassin
Razor plugin now provides a workaround, preserving entropy across calls to
the Razor2 agent.

A workaround in BayesStore/MySQL.pm was added for a MySQL server bug,
see http://bugs.mysql.com/bug.php?id=46675 .

Documentation was fixed: trailing dots in DNSBL zone names are not required
since version 3.1.0 of Mail::SpamAssassin (September 2005).


Notable features:
Redis database backend for a Bayes database
-------------------------------------------

In addition to existing backends, version 3.4.0 introduces support for keeping
a Bayes database on a Redis server, either running locally, or accessed
over network. Similar to SQL backends, the database may be concurrently
used by several hosts running SpamAssassin.

The current implementation only supports a global Bayes database, i.e.
per-recipient sub-databases are not supported. The Redis 2.6.* server
supports access over IPv4 or over a Unix socket, starting with version
2.8.0 also IPv6 is supported. Bear in mind that Redis server only offers
limited access controls, so it is advisable to let the Redis server bind
to a loopback interface only, or to use other mechanisms to limit access,
such as local firewall rules.

The Redis backend for Bayes can make good use of Lua scripting support in
the Redis server to improve performance. Lua support is available in the
Redis server since version 2.6. In the absence of Lua support, the Redis
backend uses batched (pipelined) traditional Redis commands, so it should
work with a Redis server version 2.4 (untested), although this is not
recommended for busy sites.

Expiration of token and 'seen' message id entries is left to the Redis
server. There is no provision for manually expiring a database, so it is
highly recommended to leave the setting bayes_auto_expire to its default
value 1 (i.e. enabled).

Example configuration:

   bayes_store_module  Mail::SpamAssassin::BayesStore::Redis
   bayes_sql_dsn server=127.0.0.1:6379;password=foo;database=2
   bayes_token_ttl 21d
   bayes_seen_ttl   8d
   bayes_auto_expire 1


Notable features:
Improved support for IPv6
-------------------------

The rules-updating program sa-update and its infrastructure is now usable
over either IPv4 or IPv6, including from an IPv6-only host (bug 6654).

SpamAssassin is now usable on an IPv6-only host: affects installation,
self-tests, rule updates, client, server, and a command-line spamassassin.

Command line options -4 and -6 were added to prefer/choose/force IPv4 or
IPv6 in programs spamassassin, spamd, spamc, and sa-update.

Command line options --listen and --allowed-ips in spamd can now accept
IPv6 addresses.

Preferably a perl module IO::Socket::IP is used (if it is available) for
network communication regardless of a protocol family - for DNS queries,
by spamd server side, and by a client code in Mail::SpamAssassin::Client.
As a fallback when the module IO::Socket::IP is unavailable, an older
module IO::Socket::INET6 is used, or eventually the IO::Socket::INET is
used as last resort.

The spamd server can now simultaneously listen on multiple sockets,
possibly in different protocol domains (Unix sockets, INET or INET6
protocol families).

DnsResolver was updated allowing it to work on an IPv6-only host (bug 6653).

A plugin RelayCountry now uses module Geo::IP and its database of IPv6
addresses GEOIP_COUNTRY_EDITION_V6 when available.

The following configuration options were extended to accept IPv6 addresses:
dns_server, trusted_networks, internal_networks, msa_networks, (but not yet
the whitelist_from_rcvd), and their defaults were adjusted accordingly.
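
A short sketch of how such settings might look with IPv6 values. The
addresses use the 2001:db8::/32 documentation prefix and are placeholders
only; the bracketed address-with-port form for dns_server is an
assumption that should be checked against the Conf POD:

   dns_server         [::1]:53
   trusted_networks   2001:db8:1::/48
   internal_networks  2001:db8:1::/48
   msa_networks       2001:db8:1:25::/64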

The parser code of Received header fields can now deal with IPv6 addresses
in a mail header section.

The AutoWhitelist plugin was updated and can now deal with IPv6 addresses.

Installation unit tests were updated to prevent them from failing on an
INET6-only host.


New command-line options
------------------------

New command-line option for spamd: added an option --listen (or -i),
which can be specified multiple times and allows spamd to accept requests
over multiple INET (IPv4) or INET6 (IPv6) or UNIX sockets. See bug 6841,
and see also option --port.

New command-line option for spamc: -X (or --unavailable-tempfail) allows
spamc to return EX_TEMPFAIL instead of EX_UNAVAILABLE when using option -x.

As already noted in the 'Improved support for IPv6' section, options -4
and -6 were added to programs spamassassin, spamd, spamc, and sa-update.

The sa-update utility can now take multiple -v or --verbose options to
increase verbosity.

The sa-learn command has a new option --max-size.


New configuration options
-------------------------

Plugin/URIDNSBL: new tflags options 'a' and 'ns' were introduced. They are
documented in the Mail::SpamAssassin::Plugin::URIDNSBL POD or man page.

Plugin/AutoLearnThreshold: new option autolearn_force was added. It is
documented in the Mail::SpamAssassin::Plugin::AutoLearnThreshold POD or
man page.

Plugin/ASN: new options asn_prefix and clear_asn_lookups were added.
They are documented in Mail::SpamAssassin::Plugin::ASN POD or man page.


The following new options, as implemented by various plugins or by
other modules, are all documented in the Mail::SpamAssassin::Conf POD
or man page:

- Plugin/WLBLEval: new configuration options were added: enlist_uri_host,
delist_uri_host, with shorthands blacklist_uri_host and whitelist_uri_host
and an associated eval rule check_uri_host_listed.

- Configuration options dns_query_restriction (allow|deny) and
clear_dns_query_restriction were added (bug 6884).

- A 'dns_options' setting received new sub-options 'dns0x20' and 'edns'.

- Added option 'dns_server' which specifies an IP address of a DNS server
and optionally its port number.

- Added options dns_local_ports_permit, dns_local_ports_avoid and
dns_local_ports_none to control the local source port ranges available to
DNS queries.

- Added the following sub-options to the tflags setting: autolearn_force,
maxhits=N, ips_only, domains_only, a, ns.

- The option whitelist_from_rcvd can now take an IP address as its second
argument (instead of a domain name), which can be useful for whitelisting
a sending mailer which has no reverse DNS mapping.
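
As a starting point for local testing, a few of these options might be
combined as below. This is only a sketch: the host names, the list name
someBL.example.org and the LOCAL_DEMO rule names are placeholders, not
shipped rules, and maxhits is assumed to be used together with the
existing 'multiple' tflag.

   # URI host lists (Plugin/WLBLEval)
   blacklist_uri_host  example.net
   whitelist_uri_host  example.org

   # skip lookups to one DNS list instead of zeroing out its rules
   dns_query_restriction deny  someBL.example.org

   # count at most 3 hits of a body pattern, then score on the count
   body      __LOCAL_DEMO_WORD  /\bexample\b/i
   tflags    __LOCAL_DEMO_WORD  multiple maxhits=3
   meta      LOCAL_DEMO_REPEAT  __LOCAL_DEMO_WORD > 2
   describe  LOCAL_DEMO_REPEAT  Placeholder pattern seen three or more times
   score     LOCAL_DEMO_REPEAT  0.1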



ArchiveIterator has new options opt_max_size and opt_from_regex. They are
documented in Mail::SpamAssassin::ArchiveIterator POD or man page.

A new tag (macro) _RULESVERSION_ was added. It is a comma-separated list of
rules versions, retrieved from an '# UPDATE version' comment in rules files
and can be used in an 'add_header' configuration setting.
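
For example, the following line (the header name Rules-Version is an
arbitrary choice) would add an X-Spam-Rules-Version header to all
scanned messages:

   add_header all Rules-Version _RULESVERSION_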


New plugins
-----------

A new plugin AskDNS was introduced.

Using a DNS query template as specified in a parameter of an askdns rule,
the plugin replaces tag names as found in the template with their values
and launches DNS queries as soon as tag values become available. When DNS
responses trickle in, it filters them according to the requested DNS resource
record type and optional subrule filtering expression, yielding a rule hit
if a response meets filtering conditions.
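
A rule using the plugin might look like the sketch below. It assumes the
plugin is loaded; the zone list.example.org and the rule name are
placeholders, the _DKIMDOMAIN_ tag is assumed to be supplied by the DKIM
plugin, and the argument order (query template, record type, filter)
should be checked against the Mail::SpamAssassin::Plugin::AskDNS POD:

   askdns    LOCAL_ASKDNS_DEMO  _DKIMDOMAIN_.list.example.org A /^127\.0\.0\.2$/
   describe  LOCAL_ASKDNS_DEMO  Signing domain listed in a placeholder zone
   score     LOCAL_ASKDNS_DEMO  0.1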


Optimizations
-------------

Several smaller performance optimizations were introduced, among others:
bug 6508 (uses Net::Patricia if available), bug 6854 (base64 attachments),
bug 6915 (get_tag speedup).

The DNS client code module now caches queries and replies for the duration
of processing one mail message. Duplicate DNS queries by different rules
which happen to query the same DNS resource are now avoided.



Downloading and availability
----------------------------

Downloads are available from:

http://spamassassin.apache.org/downloads.cgi

md5sum of archive files:

01d2561ccee32d00c86a58d2475cd386 Mail-SpamAssassin-3.4.0-rc3.tar.bz2
94141224a71f06f231e14ee21f1f3583 Mail-SpamAssassin-3.4.0-rc3.tar.gz
12937ab14c0521bd2366dbbdd254c46f Mail-SpamAssassin-3.4.0-rc3.zip
cc005150b7521e43222e153b1d11e8c0 Mail-SpamAssassin-rules-3.4.0-rc3.r1530876.tgz

sha1sum of archive files:

2b2ab7fe9d864dd12160b3383e1300b57610b669 Mail-SpamAssassin-3.4.0-rc3.tar.bz2
8e92e16bf1ecf32f73a5c5b4417bcc93877e4a87 Mail-SpamAssassin-3.4.0-rc3.tar.gz
062c28135f4b2c2e1bd2f0119731d3c931b33374 Mail-SpamAssassin-3.4.0-rc3.zip
e9290f51bc268d28e65210d26db7d5063ffd4aca Mail-SpamAssassin-rules-3.4.0-rc3.r1530876.tgz

Note that the *-rules-*.tgz files are only necessary if you cannot, or do
not wish to, run "sa-update" after install to download the latest fresh
rules.

See the INSTALL and UPGRADE files in the distribution for important 
installation notes.


GPG Verification Procedure
--------------------------
The release files also have a .asc accompanying them.  The file serves
as an external GPG signature for the given release file.  The signing
key is available via the wwwkeys.pgp.net key server, as well as
http://www.apache.org/dist/spamassassin/KEYS

The key information is:

pub   4096R/F7D39814 2009-12-02
        Key fingerprint = D809 9BC7 9E17 D7E4 9BC2  1E31 FDE5 2F40 F7D3 9814
uid                  SpamAssassin Project Management Committee <pr...@spamassassin.apache.org>
uid                  SpamAssassin Signing Key (Code Signing Key, replacement for 1024D/265FA05B) <de...@spamassassin.apache.org>
sub   4096R/7B3265A5 2009-12-02

To verify a release file, download the file with the accompanying .asc 
file and run the following commands:

   gpg -v --keyserver wwwkeys.pgp.net --recv-key F7D39814
   gpg --verify Mail-SpamAssassin-3.4.0-rc3.tar.bz2.asc
   gpg --fingerprint F7D39814

Then verify that the key matches the signature.

Note that older versions of gnupg may not be able to complete the steps
above. Specifically, GnuPG v1.0.6, 1.0.7 & 1.2.6 failed while v1.4.11
worked flawlessly.

See http://www.apache.org/info/verification.html for more information on 
verifying Apache releases.

About Apache SpamAssassin
-------------------------

Apache SpamAssassin is a mature, widely-deployed open source project
that serves as a mail filter to identify spam. SpamAssassin uses a
variety of mechanisms including mail header and text analysis, Bayesian
filtering, DNS blocklists, and collaborative filtering databases. In
addition, Apache SpamAssassin has a modular architecture that allows
other technologies to be quickly incorporated as an addition or as a
replacement for existing methods.

Apache SpamAssassin typically runs on a server, classifies and labels
spam before it reaches your mailbox, while allowing other components of
a mail system to act on its results.

Most of the Apache SpamAssassin is written in Perl, with heavily
traversed code paths carefully optimized. Benefits are portability,
robustness and facilitated maintenance. It can run on a wide variety of
POSIX platforms.

The server and the Perl library feel at home on Unix and Linux
platforms, and reportedly also works on MS Windows systems under ActivePerl.

For more information, visit http://spamassassin.apache.org/


About The Apache Software Foundation
------------------------------------

Established in 1999, The Apache Software Foundation provides
organizational, legal, and financial support for more than 100
freely-available, collaboratively-developed Open Source projects. The
pragmatic Apache License enables individual and commercial users to
easily deploy Apache software; the Foundation's intellectual property
framework limits the legal exposure of its 2,500+ contributors.

For more information, visit http://www.apache.org/

Re: Committers/PMC: Call for Vote on SpamAssassin 3.4.0-rc3 release

Posted by Mark Martinec <Ma...@ijs.si>.

> Committers/PMC:
> Please vote to release 3.4.0-rc3.

Looks good, thanks!

+1 from me.

> I'm now running it on one production
> system and have been monitoring it for a while.  So far so good!

Same here.

> Files at http://people.apache.org/~kmcgrail/devel/
> Proposed Announcement follows.

Just committed some small tweaks to the announcement text.

> Discussing blockers to a real 3.4.0:
> 
>  From my perspective, I'm working on some RBL changes.  Not really
> blockers, but it's my focus right now.

Ok.

> I also think Bug 5503 needs to be completed because it could change
> headers people expect and a 3.4.0 major release is the time to do
> something like that.
> We also need to work on these issues:
> Bug 6422 to test CPAN
> Bug 6639 re: Suse 11 and CPAN

I have no opinion on this.

> After that, I'm hoping to at least spend a few hours documenting some of
> the things AXB pointed out:
> 
>     New configuration options  - Imo - these should be documented:
> 
>     Plugin/WLBLEval:
> 
>     blacklist_uri_host  example.net
>     blacklist_uri_host  somehost.example.net
> 
>     whitelist_uri_host  example.org
>     whitelist_uri_host  somehost.example.org
> 
>     Addition:
> 
>     - Disable lookups for a specific DNS list (instead of zeroing out
>       rules)
> 
>     dns_query_restriction deny  someBL.example.org
> 
>       - If possible some minimal rule samples for
> 
>     "added the following sub-options to the tflags setting"
>     autolearn_force, maxhits=N, ips_only, domains_only, a, ns.
> 
>     A lot of features like these remain unused because they're either
>     very hidden or not loudly documented with sample rules.

Good.


Bug 6953 worries me a little, as some Linux folks who have the
module IO::Socket::INET6 installed but not IO::Socket::IP
may come across this issue. I have added a paragraph and a ref to
this bug into the announcement text. I don't have a good solution.


> Then we should be able to get 3.4.0 out the door which leads to these
> issues:
> 
> Bug 6885 to switch the website's publishing method will be needed to
> update the website to announce the release.
> 
> Work on post-3.4.0 issues: KAM will be working through the rest of the
> build/README file and expect lots of issues / problems that will need to
> be overcome to switch to 3.4.X.

Ok.


  Mark