Posted to users@spamassassin.apache.org by Stefan Ewert <nu...@gmx.net> on 2005/06/13 17:06:45 UTC

Boost up Spamassassin option

Hi,

does anyone know of an option that speeds up SpamAssassin dramatically:

order the tests: fastest first, then progressively slower, with the slowest
tests last in the list (DNS perhaps, Razor, Pyzor, DCC).

and now: stop testing the mail as soon as the spam score is greater than
needed to mark it as spam.
I don't want to know whether this mail has got 30 points, I'm just interested
in a decision between spam and not spam.

regards s.
-- 
"UNIX is user-friendly - it's just a bit picky..." (Walter Misar)

Re: Boost up Spamassassin option

Posted by Matt Kettler <mk...@comcast.net>.
At 11:06 AM 6/13/2005, Stefan Ewert wrote:
>does anyone know about a option which speeds up spamassassin extremly:
>
>order the tests: fastest first, getting slower , slowest is the last test in
>the list (dns perhaps, razor, pyzor, dcc).
>
>and now: stop testing the mail, as soon as spamscore is greater than needed to
>be marked as a spam mail.
>i dont want to know if this mail has got 30 points, im just interested in a
>decision between spam and not spam.

Doesn't work. It was tried a long time ago. It greatly slows down SA.

In order to do this without FPs you must run all negative score tests 
first, then positive score tests.

Doing that slows SA down in the average case because you have to make 
multiple passes over the message body.

Your approach would actually involve making half a dozen passes over the 
message body. Painfully slow in the average case.

Your approach also winds up eliminating the benefit of parallelization that 
the current approach uses. SA starts the network checks FIRST, so that they 
are running while SA runs other body checks. The total scan time winds up 
being the time of the slowest network check, instead of the sum of all the 
tests.

For example, suppose the slowest network test takes 4 seconds, the other 
network tests take 3 and 2 seconds, and the body scan takes 1 second. Wildly 
contrived numbers, but good enough to make a point with.

In your approach a message would scan in 1+2+3+4 = 10 seconds. Maybe you'll 
get lucky and skip the last test, so your time will be 1+2+3 = 6 seconds.

The current approach would take 4 seconds in all cases, because the network 
checks run in parallel with each other and the body scan. SA starts all the 
network checks, then runs the body rules, then waits for the network check 
answers.

In order to be faster you'd have to be lucky enough to skip two tests, and 
then you'd take 1+2 = 3 seconds, a 25% speed improvement over 4 seconds. But 
the normal case (10 seconds) takes 2.5 times as long to run. On average, 
you'll probably wind up being about half the speed.... Ouch.
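
The arithmetic above can be checked with a small sketch. This is only a toy
model of the two strategies using the contrived timings from this message, not
SpamAssassin code; the function names are made up for illustration.

```python
# Contrived timings in seconds, taken from the example in this message.
BODY_SCAN = 1
NETWORK_CHECKS = [2, 3, 4]


def sequential_time(body, network, skipped=0):
    """Body scan first, then the network checks one after another,
    fastest first, skipping the `skipped` slowest checks."""
    ordered = sorted(network)
    kept = ordered[:len(ordered) - skipped]
    return body + sum(kept)


def parallel_time(body, network):
    """All network checks are started up front and run alongside the
    body scan, so the total is the duration of the slowest single task."""
    return max([body] + list(network))


if __name__ == "__main__":
    for skipped in range(3):
        print("sequential, skipping", skipped, "->",
              sequential_time(BODY_SCAN, NETWORK_CHECKS, skipped), "s")
    print("parallel ->", parallel_time(BODY_SCAN, NETWORK_CHECKS), "s")
```

Under these assumed numbers the sequential ordering only beats the parallel
one when the two slowest checks are both skipped.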





Re: Boost up Spamassassin option

Posted by Theo Van Dinter <fe...@apache.org>.
On Mon, Jun 13, 2005 at 05:06:45PM +0200, Stefan Ewert wrote:
> does anyone know about a option which speeds up spamassassin extremly:

Used to exist in 2.4, but it didn't work and caused a bigger performance
drag than it was worth anyway, so we took it out.  There's talk about a new
way to add it back in so it doesn't suck, but no actual code has been
written yet.  Maybe 3.2.

BTW: it's called "short-circuit", for future reference. :)

-- 
Randomly Generated Tagline:
Do the voices in my head bother you?

Re: Boost up Spamassassin option

Posted by jdow <jd...@earthlink.net>.
From: "Stefan Ewert" <nu...@gmx.net>

> > On another paw, more memory is generally a good way to speed up the
> > spamassassin operation. A good DNS setup is also required so that you
> > do not get delays in DNS lookups. Do not select DNS tests for sites
> > that no longer exist. That is a major slow down.
> >
> sorry, i cant follow you, where can i read something about this topic?

A not uncommon problem I see on this list is people attempting to use
DNS test sites (Black List sites) that no longer exist. Then you get
a 10 second or so timeout on the DNS lookup attempts to that site.
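
One way to check whether a particular list is the culprit is to time a single
lookup against it by hand. The sketch below is a rough illustration, not part
of SpamAssassin: the helper names and the zone `dnsbl.example.org` are
hypothetical, and it assumes the usual blocklist convention of querying the
reversed IPv4 octets followed by the list's zone.

```python
import socket
import time


def dnsbl_query_name(ipv4, zone):
    """Build the name a DNS blocklist lookup resolves: the IPv4 octets
    in reverse order, followed by the list's zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone


def timed_lookup(name, resolver=socket.gethostbyname):
    """Resolve `name` once and return (answer or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        answer = resolver(name)
    except OSError:
        # Covers "not listed" (no such name) as well as resolver failures.
        answer = None
    return answer, time.monotonic() - start
```

Calling `timed_lookup(dnsbl_query_name("127.0.0.2", "dnsbl.example.org"))`
against each zone in use and comparing the elapsed times should show which
lookups hang; a dead list tends to show up as a delay of several seconds
rather than a quick answer.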

> > Now, when you complain about speeds how about some numbers. What is
> > the processor speed, what is the amount of memory, what other things
> > run on the computer, how long is "spamassassin --lint" taking, how
> > long is a typical message processing take, and so forth. Give us
> > something to work with. THen we can tell you what is wrong. Of
> > particular interest are non-stock things you are doing. Did you add
> > additional DNS tests, for example?
> >
> first of all im not complaining, this is a misunderstanding. i just want to
> help a little in the development of SA. im new in this list, so  never heard
> about this idea before ;)
> here are the facts: amd3000+, 512 MB, desktop pc, lint  takes 5 secs, id guess
> a typical messages takes about 25 secs.

That sounds very much like your DNS based tests are failing. Try turning
off the DNS based tests and see what you get. Or run a test message
through "spamassassin -D -t < message". The -D turns on debug messages,
so you can see where the long delays come from. They do not come from
the normal rules or Bayes.

> im using the standard configuration file, so from my side i didnt add any
> other tests like dns. theres no nameserver running on my pc.
>
> just thougt, it would be very simple to stop testing at the right moment, but
> it seems like i know to few about this filtering process.
>
> -- 
> MUM, CAN I GO OUT AND CODE TONIGHT?

Inasmuch as I'm not your mother I can't say. Were you indeed my son
I'd be surprised if he was doing heavy coding. He's a mechanical
engineer working for an automaker.

{^_-}



Re: Boost up Spamassassin option

Posted by Stefan Ewert <nu...@gmx.net>.
> On another paw, more memory is generally a good way to speed up the
> spamassassin operation. A good DNS setup is also required so that you
> do not get delays in DNS lookups. Do not select DNS tests for sites
> that no longer exist. That is a major slow down.
>
Sorry, I can't follow you. Where can I read something about this topic?

> Now, when you complain about speeds how about some numbers. What is
> the processor speed, what is the amount of memory, what other things
> run on the computer, how long is "spamassassin --lint" taking, how
> long is a typical message processing take, and so forth. Give us
> something to work with. THen we can tell you what is wrong. Of
> particular interest are non-stock things you are doing. Did you add
> additional DNS tests, for example?
>
First of all, I'm not complaining; this is a misunderstanding. I just want to 
help a little in the development of SA. I'm new to this list, so I never heard 
about this idea before ;)
Here are the facts: AMD 3000+, 512 MB, desktop PC, lint takes 5 secs, and I'd 
guess a typical message takes about 25 secs.
I'm using the standard configuration file, so from my side I didn't add any 
other tests like DNS. There's no nameserver running on my PC.

I just thought it would be very simple to stop testing at the right moment, but 
it seems I know too little about this filtering process.

-- 
MUM, CAN I GO OUT AND CODE TONIGHT?

Re: Boost up Spamassassin option

Posted by jdow <jd...@earthlink.net>.
Some scores have negative values. Some of the negative values are
big enough to make 30 into a negative score.
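
A toy example of why stopping early is unsafe. This is an illustrative sketch
only: the rule names and scores are invented, and the threshold of 5.0 is an
assumed required score, not a claim about any particular setup.

```python
THRESHOLD = 5.0  # assumed required score for this sketch

# (rule name, score) in the order the rules happen to run; all hypothetical.
HITS = [
    ("HYPOTHETICAL_SPAMMY_SUBJECT", 3.0),
    ("HYPOTHETICAL_SPAMMY_BODY", 2.5),
    ("HYPOTHETICAL_TRUSTED_SENDER", -100.0),
]


def full_scan(hits, threshold=THRESHOLD):
    """Run every rule, then compare the final total with the threshold."""
    return sum(score for _, score in hits) >= threshold


def short_circuit_scan(hits, threshold=THRESHOLD):
    """Stop and call the mail spam as soon as the running total
    reaches the threshold."""
    total = 0.0
    for _, score in hits:
        total += score
        if total >= threshold:
            return True
    return False


def negatives_first(hits):
    """Reorder so that negative-score rules run before positive ones."""
    return sorted(hits, key=lambda hit: hit[1] >= 0)
```

With these hits, `short_circuit_scan(HITS)` flags the mail after the second
rule, while `full_scan(HITS)` does not, because the large negative score never
gets counted. Running `short_circuit_scan(negatives_first(HITS))` agrees with
the full scan again, which is the "negative score tests first" ordering that
costs the extra passes over the message.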

This is a discussion that comes up quite often. And it's been decided
every time that no change should be made.

On another paw, more memory is generally a good way to speed up the
spamassassin operation. A good DNS setup is also required so that you
do not get delays in DNS lookups. Do not select DNS tests for sites
that no longer exist. That is a major slowdown.

Now, when you complain about speeds, how about some numbers? What is
the processor speed, what is the amount of memory, what other things
run on the computer, how long does "spamassassin --lint" take, how
long does a typical message take to process, and so forth. Give us
something to work with. Then we can tell you what is wrong. Of
particular interest are non-stock things you are doing. Did you add
additional DNS tests, for example?

{^_^}
----- Original Message ----- 
From: "Stefan Ewert" <nu...@gmx.net>
To: <us...@spamassassin.apache.org>
Sent: 2005 June, 13, Monday 08:06
Subject: Boost up Spamassassin option


Hi,

does anyone know about a option which speeds up spamassassin extremly:

order the tests: fastest first, getting slower , slowest is the last test in
the list (dns perhaps, razor, pyzor, dcc).

and now: stop testing the mail, as soon as spamscore is greater than needed to
be marked as a spam mail.
i dont want to know if this mail has got 30 points, im just interested in a
decision between spam and not spam.

regards s.
-- 
"UNIX is user-friendly - it's just a bit picky..." (Walter Misar)