Posted to users@spamassassin.apache.org by hospice admin <ho...@outlook.com> on 2014/06/07 12:19:42 UTC

Advice re- SA 3.4.0

Hi Team,

 

I've finally completed the upgrade of all my mail servers from FC18 + SA 3.3.2 + Perl 5.15.3 to FC20 + SA 3.4.0 + Perl 5.18.2. I run SA from within MIMEDefang 2.74 in both cases.

 

I've simply moved all the rules and plug-ins I used in 3.3.2 across to 3.4.0, and during our beta testing over the last month or so, everything has worked like a dream. Thanks to all involved in producing such a great piece of software. I'd have made the change months ago if I'd realised it was going to be this easy :)

 

I've been reading through all the new features in 3.4.0, looking at new plug-ins, etc. There's a lot of really interesting looking stuff ...

 

Just wondering if anyone had any advice along the lines of "you really must do this", or "you'd be crazy to do that" re- all the new stuff, etc.?

 

I'm particularly 'interested' in things relating to Bayes, which has bitten me in the rear so many times, but seems to have migrated over faultlessly.

 

Thanks

 

Judy.

 

 

Re: Advice re- SA 3.4.0

Posted by Axb <ax...@gmail.com>.
On 06/07/2014 01:47 PM, hospice admin wrote:
>
>
>
>> Date: Sat, 7 Jun 2014 13:43:37 +0200
>> From: axb.lists@gmail.com
>> To: users@spamassassin.apache.org
>> Subject: Re: Advice re- SA 3.4.0
>>
>> On 06/07/2014 01:33 PM, hospice admin wrote:
>>>
>>>
>>>
>>>> Date: Sat, 7 Jun 2014 13:22:13 +0200
>>>> From: axb.lists@gmail.com
>>>> To: users@spamassassin.apache.org
>>>> Subject: Re: Advice re- SA 3.4.0
>>>>
>>>> On 06/07/2014 01:09 PM, hospice admin wrote:
>>>>> I was wondering about this one and had put it to one side until I
>>>>> had a chance to look at the memory implications in more detail.
>>>>> We run a VM infrastructure here, so I'll load it up on one server
>>>>> and keep throwing resources at it until I have something
>>>>> approaching stability.
>>>>
>>>> my Redis DB's memory usage
>>>> # Memory
>>>> used_memory:3919713920
>>>> used_memory_human:3.65G
>>>> used_memory_rss:6440914944
>>>> used_memory_peak:6307356768
>>>> used_memory_peak_human:5.87G
>>>> used_memory_lua:99328
>>>>
>>>> the peak is hit when Redis dumps the DB to file, every N minutes.
>>>> To do this, a second Redis server instance is started and does the
>>>> dump, so if you think your Bayes DB will be 4GB... double that (at
>>>> least) for safety. If the box starts swapping hard it will all
>>>> become incredibly slow or even crash.
>>>>
>>>> free
>>>>               total       used       free     shared    buffers     cached
>>>> Mem:      14262652    8116144    6146508          0     151260    1349872
>>>> -/+ buffers/cache:    6615012    7647640
>>>> Swap:      2046968      10396    2036572
>>>>
>>>>
>>>>> An almost related item ... have you found the 'RelayCountry'
>>>>> plugin to be worth the effort?
>>>> Due to the nature of my traffic I see little use for it.
>>>>
>>>> If you mainly deal with regional traffic it's probably worth
>>>> trying.
>>>>
>>>>
>>>
>>> WOW! My DB isn't anything like 4GB, but our whole setup presently
>>> runs in 6GB, so the increase is likely to be scary big. I guess you
>>> get what you pay for though!
>>
>> Don't let my Redis DB size scare you unless you're handling mail
>> for tens of thousands of users.
>>
>> I'd suggest you start off with a dedicated VM for the Redis server,
>> assign it 8 GB (to be on the safe side), set your tokens to expire in 2
>> weeks, and watch it closely. This way, if you have a Redis issue, it
>> won't affect mail processing.
>>
>> # FOR REDIS ONLY!!!!
>> bayes_token_ttl 14d
>> bayes_seen_ttl 7d
>>
>> should get you started...
>> You can also limit Redis' memory usage (if you want)
>>

I put some Redis/Bayes stuff in
http://svn.apache.org/repos/asf/spamassassin/trunk/contrib/HOWTO.Bayes-Redis/

hope this helps

Axb

RE: Advice re- SA 3.4.0

Posted by hospice admin <ho...@outlook.com>.
 

> Date: Sat, 7 Jun 2014 13:43:37 +0200
> From: axb.lists@gmail.com
> To: users@spamassassin.apache.org
> Subject: Re: Advice re- SA 3.4.0
> 
> On 06/07/2014 01:33 PM, hospice admin wrote:
> >
> >
> >
> >> Date: Sat, 7 Jun 2014 13:22:13 +0200
> >> From: axb.lists@gmail.com
> >> To: users@spamassassin.apache.org
> >> Subject: Re: Advice re- SA 3.4.0
> >>
> >> On 06/07/2014 01:09 PM, hospice admin wrote:
> >>> I was wondering about this one and had put it to one side until I
> >>> had a chance to look at the memory implications in more detail.
> >>> We run a VM infrastructure here, so I'll load it up on one server
> >>> and keep throwing resources at it until I have something
> >>> approaching stability.
> >>
> >> my Redis DB's memory usage
> >> # Memory
> >> used_memory:3919713920
> >> used_memory_human:3.65G
> >> used_memory_rss:6440914944
> >> used_memory_peak:6307356768
> >> used_memory_peak_human:5.87G
> >> used_memory_lua:99328
> >>
> >> the peak is hit when Redis dumps the DB to file, every N minutes.
> >> To do this, a second Redis server instance is started and does the
> >> dump, so if you think your Bayes DB will be 4GB... double that (at
> >> least) for safety. If the box starts swapping hard it will all
> >> become incredibly slow or even crash.
> >>
> >> free
> >>               total       used       free     shared    buffers     cached
> >> Mem:      14262652    8116144    6146508          0     151260    1349872
> >> -/+ buffers/cache:    6615012    7647640
> >> Swap:      2046968      10396    2036572
> >>
> >>
> >>> An almost related item ... have you found the 'RelayCountry'
> >>> plugin to be worth the effort?
> >> Due to the nature of my traffic I see little use for it.
> >>
> >> If you mainly deal with regional traffic it's probably worth
> >> trying.
> >>
> >>
> >
> > WOW! My DB isn't anything like 4GB, but our whole setup presently
> > runs in 6GB, so the increase is likely to be scary big. I guess you
> > get what you pay for though!
> 
> Don't let my Redis DB size scare you unless you're handling mail
> for tens of thousands of users.
> 
> I'd suggest you start off with a dedicated VM for the Redis server,
> assign it 8 GB (to be on the safe side), set your tokens to expire in 2
> weeks, and watch it closely. This way, if you have a Redis issue, it
> won't affect mail processing.
> 
> # FOR REDIS ONLY!!!!
> bayes_token_ttl 14d
> bayes_seen_ttl 7d
> 
> should get you started...
> You can also limit Redis' memory usage (if you want)
> 
> 
 
Thanks for that!
 
Judy.

Re: Advice re- SA 3.4.0

Posted by Axb <ax...@gmail.com>.
On 06/07/2014 01:33 PM, hospice admin wrote:
>
>
>
>> Date: Sat, 7 Jun 2014 13:22:13 +0200
>> From: axb.lists@gmail.com
>> To: users@spamassassin.apache.org
>> Subject: Re: Advice re- SA 3.4.0
>>
>> On 06/07/2014 01:09 PM, hospice admin wrote:
>>> I was wondering about this one and had put it to one side until I
>>> had a chance to look at the memory implications in more detail.
>>> We run a VM infrastructure here, so I'll load it up on one server
>>> and keep throwing resources at it until I have something
>>> approaching stability.
>>
>> my Redis DB's memory usage
>> # Memory
>> used_memory:3919713920
>> used_memory_human:3.65G
>> used_memory_rss:6440914944
>> used_memory_peak:6307356768
>> used_memory_peak_human:5.87G
>> used_memory_lua:99328
>>
>> the peak is hit when Redis dumps the DB to file, every N minutes.
>> To do this, a second Redis server instance is started and does the
>> dump, so if you think your Bayes DB will be 4GB... double that (at
>> least) for safety. If the box starts swapping hard it will all
>> become incredibly slow or even crash.
>>
>> free
>>               total       used       free     shared    buffers     cached
>> Mem:      14262652    8116144    6146508          0     151260    1349872
>> -/+ buffers/cache:    6615012    7647640
>> Swap:      2046968      10396    2036572
>>
>>
>>> An almost related item ... have you found the 'RelayCountry'
>>> plugin to be worth the effort?
>> Due to the nature of my traffic I see little use for it.
>>
>> If you mainly deal with regional traffic it's probably worth
>> trying.
>>
>>
>
> WOW! My DB isn't anything like 4GB, but our whole setup presently
> runs in 6GB, so the increase is likely to be scary big. I guess you
> get what you pay for though!

Don't let my Redis DB size scare you unless you're handling mail
for tens of thousands of users.

I'd suggest you start off with a dedicated VM for the Redis server,
assign it 8 GB (to be on the safe side), set your tokens to expire in 2
weeks, and watch it closely. This way, if you have a Redis issue, it
won't affect mail processing.

# FOR REDIS ONLY!!!!
bayes_token_ttl	14d
bayes_seen_ttl	7d

should get you started...
You can also limit Redis' memory usage (if you want)
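For anyone wiring this up from scratch, the Redis backend plus the TTLs above amount to something like the following sketch (the server address, database number, and memory cap are illustrative values, not settings from this thread):

# local.cf - point Bayes at Redis (address/database illustrative)
bayes_store_module  Mail::SpamAssassin::BayesStore::Redis
bayes_sql_dsn       server=127.0.0.1:6379;database=2
bayes_token_ttl     14d
bayes_seen_ttl      7d

# redis.conf - optional hard cap on Redis' memory usage (value illustrative)
maxmemory 8gb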



RE: Advice re- SA 3.4.0

Posted by hospice admin <ho...@outlook.com>.
 

> Date: Sat, 7 Jun 2014 13:22:13 +0200
> From: axb.lists@gmail.com
> To: users@spamassassin.apache.org
> Subject: Re: Advice re- SA 3.4.0
> 
> On 06/07/2014 01:09 PM, hospice admin wrote:
> > I was wondering about this one and had put it to one side until I had
> > a chance to look at the memory implications in more detail. We run a
> > VM infrastructure here, so I'll load it up on one server and keep
> > throwing resources at it until I have something approaching
> > stability.
> 
> my Redis DB's memory usage
> # Memory
> used_memory:3919713920
> used_memory_human:3.65G
> used_memory_rss:6440914944
> used_memory_peak:6307356768
> used_memory_peak_human:5.87G
> used_memory_lua:99328
> 
> the peak is hit when Redis dumps the DB to file, every N minutes.
> To do this, a second Redis server instance is started and does the dump,
> so if you think your Bayes DB will be 4GB... double that (at least) for
> safety. If the box starts swapping hard it will all become incredibly
> slow or even crash.
> 
> free
> total used free shared buffers cached
> Mem: 14262652 8116144 6146508 0 151260 1349872
> -/+ buffers/cache: 6615012 7647640
> Swap: 2046968 10396 2036572
> 
> 
> > An almost related item ... have you found the 'RelayCountry' plugin
> > to be worth the effort?
> Due to the nature of my traffic I see little use for it.
> 
> If you mainly deal with regional traffic it's probably worth trying.
> 
> 
 
WOW! My DB isn't anything like 4GB, but our whole setup presently runs in 6GB, so the increase is likely to be scary big. I guess you get what you pay for though!
 
Thanks
 
Judy

Re: Advice re- SA 3.4.0

Posted by Axb <ax...@gmail.com>.
On 06/07/2014 01:09 PM, hospice admin wrote:
> I was wondering about this one and had put it to one side until I had
> a chance to look at the memory implications in more detail. We run a
> VM infrastructure here, so I'll load it up on one server and keep
> throwing resources at it until I have something approaching
> stability.

my Redis DB's memory usage
# Memory
used_memory:3919713920
used_memory_human:3.65G
used_memory_rss:6440914944
used_memory_peak:6307356768
used_memory_peak_human:5.87G
used_memory_lua:99328

the peak is hit when Redis dumps the DB to file, every N minutes.
To do this, a second Redis server instance is started and does the dump,
so if you think your Bayes DB will be 4GB... double that (at least) for
safety. If the box starts swapping hard it will all become incredibly
slow or even crash.

free
              total       used       free     shared    buffers     cached
Mem:      14262652    8116144    6146508          0     151260    1349872
-/+ buffers/cache:    6615012    7647640
Swap:      2046968      10396    2036572
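The every-N-minutes dump described above is Redis's background save (BGSAVE), which forks the process; that fork is why peak memory can approach double the steady-state size on a write-heavy Bayes DB. The schedule lives in redis.conf; a sketch with illustrative values:

# redis.conf - background dump (BGSAVE) schedule; values are examples only
save 900 1        # dump if at least 1 key changed in the last 15 min
save 300 10000    # dump if at least 10000 keys changed in the last 5 min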


> An almost related item ... have you found the 'RelayCountry' plugin
> to be worth the effort?
Due to the nature of my traffic I see little use for it.

If you mainly deal with regional traffic it's probably worth trying.
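If you do try it, the setup is just a plugin load plus a rule keyed on the pseudo-header it adds; a minimal sketch (the rule name, score, and country codes are made up for illustration, and the plugin also needs a GeoIP backend installed):

# init.pre - enable the plugin
loadplugin Mail::SpamAssassin::Plugin::RelayCountry

# local.cf - hypothetical rule for mail relayed through countries XX/YY
header   RELAYCOUNTRY_SUSPECT  X-Relay-Countries =~ /\b(?:XX|YY)\b/
score    RELAYCOUNTRY_SUSPECT  1.0
describe RELAYCOUNTRY_SUSPECT  Relayed through a country we rarely see ham from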



RE: Advice re- SA 3.4.0

Posted by hospice admin <ho...@outlook.com>.
 

> Date: Sat, 7 Jun 2014 12:49:32 +0200
> From: axb.lists@gmail.com
> To: users@spamassassin.apache.org
> Subject: Re: Advice re- SA 3.4.0
> 
> On 06/07/2014 12:19 PM, hospice admin wrote:
> > Just wondering if anyone had any advice along the lines of "you really
> > must do this", or "you'd be crazy to do that" re- all the new stuff,
> > etc?
> >
> >
> >
> > I'm particularly 'interested' in things relating to Bayes, which has
> > bitten me in the rear so many times, but seems to have migrated over
> > faultlessly.
> 
> 
> For my setup one of the best new features has been the Redis backend for 
> Bayes.
> 
> Although it requires a ton of memory (which is cheap), it allows keeping 
> a *huge* number of tokens for a long time without a decrease in 
> performance.
> 
> Feeding it trap data as spam (with a different expiration time than ham) 
> via autolearn has made Bayes way more useful.
> 
> 0.000 0 22712317 0 non-token data: nspam
> 0.000 0 10031781 0 non-token data: nham
> 
> 
> try that with SQL or file-based Bayes - it wouldn't scale, and it could 
> probably push scan times into the many-seconds-per-message range
> 
> Bayes/Redis is so fast I don't notice a performance difference whether I 
> enable it or not.
> So depending on your traffic load, imo, it's become a must-have.
> 
> also forced autolearn has helped a lot with failsafe metas/rules
> 
> tflags RULE_NAME autolearn_force
> 
> hth
> 
> Axb
 


Thanks. That sounds like great advice.
 
I was wondering about this one and had put it to one side until I had a chance to look at the memory implications in more detail. We run a VM infrastructure here, so I'll load it up on one server and keep throwing resources at it until I have something approaching stability.
 
An almost related item ... have you found the 'RelayCountry' plugin to be worth the effort?
 
Thanks again
 
Judy.

Re: Advice re- SA 3.4.0

Posted by Axb <ax...@gmail.com>.
On 06/07/2014 12:19 PM, hospice admin wrote:
> Just wondering if anyone had any advice along the lines of "you really
> must do this", or "you'd be crazy to do that" re- all the new stuff,
> etc?
>
>
>
> I'm particularly 'interested' in things relating to Bayes, which has
> bitten me in the rear so many times, but seems to have migrated over
> faultlessly.


For my setup one of the best new features has been the Redis backend for 
Bayes.

Although it requires a ton of memory (which is cheap), it allows keeping 
a *huge* number of tokens for a long time without a decrease in 
performance.

Feeding it trap data as spam (with a different expiration time than ham) 
via autolearn has made Bayes way more useful.

0.000          0   22712317          0  non-token data: nspam
0.000          0   10031781          0  non-token data: nham


try that with SQL or file-based Bayes - it wouldn't scale, and it could 
probably push scan times into the many-seconds-per-message range
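For reference, the nspam/nham counters above come from the Bayes "magic" dump; on a standard install you can pull the same numbers for your own DB with:

sa-learn --dump magic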

Bayes/Redis is so fast I don't notice a performance difference whether I 
enable it or not.
So depending on your traffic load, imo, it's become a must-have.

also forced autolearn has helped a lot with failsafe metas/rules

tflags  RULE_NAME  autolearn_force
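Spelled out, a failsafe meta using it might look like this (the rule and sub-rule names here are hypothetical; only the tflags line is the actual mechanism):

# local.cf - hypothetical failsafe meta; autolearn_force lets its hits be
# learned even when the usual autolearn header/body points requirements
# aren't met
meta     TRAP_FAILSAFE   (__TRAP_SIGNATURE && BAYES_99)
score    TRAP_FAILSAFE   5.0
tflags   TRAP_FAILSAFE   autolearn_force
describe TRAP_FAILSAFE   Trap signature plus high Bayes score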

hth

Axb