Posted to users@spamassassin.apache.org by Jason Haar <Ja...@trimble.com> on 2014/02/18 02:59:01 UTC

SA 3.4.0 and Redis

Hi there

We have a geographically distributed edge mail relay network (some in
the US and some in Europe) and I'm wondering if the new Redis support
could be used to centralize our Bayes?

Is anything special required to get 4-6 spamd servers to use the same
Redis backend? Will network outages (which will happen) cause
corruption that could impact the others? (e.g. what if spamd is trying
to upload 3 records to Redis and only the first two go through)

Thanks

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1


Re: SA 3.4.0 and Redis

Posted by Jason Haar <Ja...@trimble.com>.
On 18/02/14 15:49, Mark Martinec wrote:
> One server in each continent might be acceptable, but hasn't
> been tried.

Yeah, in fact I can separate into Europe and US (responsible for
different domains), so two Redis servers make more sense.

> No corruption can happen due to network problems. Cases where some
> but not all tokens are learned, or tokens learned but 'seen' entry
> not added are non-problematic if it doesn't happen too often.
> Token updates usually fit within a single IP packet, so in most
> cases either all of the transaction gets committed or none,
> even in case of network problems.

My definition of "corruption" is when it causes the app to be unhappy -
not necessarily that the data is corrupt :-) Obviously a TCP packet
either arrives and passes checksum or it's thrown away. Your last
sentence answers my question: network outages shouldn't cause the Bayes
data to become useless to SA. Good.

>
> A full network breakdown (or server down) would cause SpamAssassin
> to log warnings for each mail message, but it would move on anyway,
> just without Bayes checks.

Yep - that's fine. I think it'll be worth a shot :-)

Thanks!



Re: SA 3.4.0 and Redis

Posted by Mark Martinec <Ma...@ijs.si>.
2014-02-18, Jason Haar wrote:
> We have a geographically distributed edge mail relay network (some in
> the US and some in Europe) and I'm wondering if the new Redis support
> could be used to centralize our Bayes?

If you have a fast and reliable connection between the two,
then in principle it could work. However, the round-trip time
across the globe alone is several times the time needed for a
local transaction, so this is probably not a desirable setup.
One server in each continent might be acceptable, but hasn't
been tried.

Bear in mind that a Redis server offers very little access control
of its own (at most a single shared password via requirepass), so
IP restrictions need to be handled by a firewall if Redis binds to
a publicly reachable interface.

> Is anything special required to get 4-6 spamd servers
> to use the same Redis backend?

No, this is normal. It is no different than having multiple spamd
or amavisd child processes under a single master process: each
process accesses the database completely independently.
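
As a rough illustration of why several independent writers can share
one store, the sketch below uses a hypothetical in-memory
SharedTokenStore as a stand-in for Redis. It is not SpamAssassin's
backend code, and the class, function, and token names are made up for
the example. It assumes every update is a single atomic increment, in
which case the final counts do not depend on how the writers interleave.

```python
import threading
from collections import Counter


class SharedTokenStore:
    """Hypothetical in-memory stand-in for one shared Redis instance."""

    def __init__(self):
        self._counts = Counter()
        # The lock models the server applying one command at a time.
        self._lock = threading.Lock()

    def incr(self, token, amount=1):
        """Atomically add `amount` to a token's count and return the new value."""
        with self._lock:
            self._counts[token] += amount
            return self._counts[token]

    def get(self, token):
        """Return the current count for a token (0 if never learned)."""
        with self._lock:
            return self._counts[token]


def run_writers(store, n_writers, tokens, rounds):
    """Run n_writers independent threads that each learn `tokens` `rounds` times."""

    def writer():
        for _ in range(rounds):
            for token in tokens:
                store.incr(token)

    threads = [threading.Thread(target=writer) for _ in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


store = SharedTokenStore()
run_writers(store, n_writers=6, tokens=["invoice", "lottery"], rounds=100)
print(store.get("invoice"), store.get("lottery"))  # 600 600
```

The six writers never coordinate with each other; only the store
serializes the individual increments.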

> Will network outages (which will happen) cause
> corruption that could impact the others? (e.g. what if spamd is trying
> to upload 3 records to Redis and only the first two go through)

No corruption can happen due to network problems. Cases where some
but not all tokens are learned, or tokens learned but 'seen' entry
not added are non-problematic if it doesn't happen too often.
Token updates usually fit within a single IP packet, so in most
cases either all of the transaction gets committed or none,
even in case of network problems.
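
The rarer partial case (two of three records written, 'seen' entry
missing) can be sketched as follows. This is a toy model with a plain
dict and set standing in for the token and 'seen' data, not the real
storage layout; learn_message and its fail_after hook are hypothetical
names invented for the illustration.

```python
def learn_message(counts, seen, msg_id, tokens, fail_after=None):
    """Increment each token, then record msg_id as seen.

    fail_after is a test hook (an assumption of this sketch): if set, the
    simulated connection drops once that many tokens have been written.
    Returns True only if the whole update, including 'seen', went through.
    """
    try:
        for written, token in enumerate(tokens):
            if fail_after is not None and written >= fail_after:
                raise ConnectionError("simulated network outage")
            counts[token] = counts.get(token, 0) + 1
        seen.add(msg_id)
        return True
    except ConnectionError:
        return False


counts, seen = {}, set()
ok = learn_message(counts, seen, "msg-1", ["free", "offer", "click"], fail_after=2)
print(ok, counts, seen)  # False {'free': 1, 'offer': 1} set()
```

After the interrupted update every stored count is still a valid
number; the data is merely missing one increment and the 'seen' entry.
In this toy model, learning the same message again later would count
the first two tokens twice, which is the kind of small skew that only
matters if it happens often.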

A full network breakdown (or server down) would cause SpamAssassin
to log warnings for each mail message, but it would move on anyway,
just without Bayes checks. Depending on the mail traffic rate
and the duration of the outage, the volume of such warnings may be
undesirable. Intermittent network problems or slowness would be
more problematic, as they could slow down mail checking substantially,
because timeouts for failing rules and checks are rather large.

   Mark
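
Since slow or intermittent links are the costly case, one option is an
external monitoring probe with a deliberately short timeout, so that a
degraded link is noticed before mail checking starts to stall. The
sketch below is only that: a standalone TCP check, not something
SpamAssassin does itself. The function name, the host, and the timeout
value are assumptions; 6379 is the default Redis port.

```python
import socket


def backend_reachable(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Placeholder address; substitute the real Redis host.
    print(backend_reachable("127.0.0.1", 6379))
```

A probe like this only shows that the port accepts connections in
time; it says nothing about whether Redis itself answers commands
promptly.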