Posted to users@spamassassin.apache.org by Michelle Konzack <li...@tamay-dogan.net> on 2008/09/04 09:33:04 UTC
spamassassin takes ten minutes for a message
Hello,
I download my messages with my laptop in an Internet cafe, transfer
them to my server at home, and then let a filter run over them...
Note: I am working offline (no Internet at home)
Last weekend I took my server to a friend's place, connected it over ADSL
to the Internet, and downloaded around 30,000 messages from the roughly
10 days I was away from Strasbourg.
The first 18,000 messages went fine, but then "spamassassin" began to
misbehave: it took around 10 minutes for each message, and by the time I
noticed the problem it had already been running that way for several
hours...
How can this be?
----[ command 'cd .spamassassin && ls -Al' ]----------------------------
total 25528
-rw------- 1 michelle.konzack private 1327104 2008-08-31 18:29 auto-whitelist
-rw------- 1 michelle.konzack private 93840 2008-08-31 18:29 bayes_journal
-rw------- 1 michelle.konzack private 2629632 2008-08-31 18:29 bayes_seen
-rw------- 1 michelle.konzack private 20377600 2008-08-31 18:29 bayes_toks
-rw------- 1 michelle.konzack private 4718592 2008-08-30 19:46 bayes_toks.expire30302
-rw------- 1 michelle.konzack private 4513792 2008-08-30 20:42 bayes_toks.expire32331
-rw------- 1 michelle.konzack private 4517888 2008-08-31 18:29 bayes_toks.expire7360
-rw-r--r-- 1 michelle.konzack private 1510 2008-08-30 21:52 user_prefs
------------------------------------------------------------------------
As you can see, I had to stop spamassassin on 2008-08-31. And even if I
move the files out of the way, "spamassassin" refuses to run at its
normal speed (4-6 messages per second).
Note: The server is a quad-Xeon with plenty of memory and
10,000 RPM SCSI drives in RAID-1.
Thanks, Greetings and nice Day/Evening
Michelle Konzack
Systemadministrator
24V Electronic Engineer
Tamay Dogan Network
Debian GNU/Linux Consultant
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
##################### Debian GNU/Linux Consultant #####################
Michelle Konzack Apt. 917 ICQ #328449886
+49/177/9351947 50, rue de Soultz MSN LinuxMichi
+33/6/61925193 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Re: spamassassin takes ten minutes for a message
Posted by Michelle Konzack <li...@tamay-dogan.net>.
Hello John,
On 2008-09-21 09:40:38, John Hardin wrote:
> Some questions:
>
> (1) How are you passing messages to spamassassin for scoring?
In procmail with:
:0fw
|spamc
> (2) Exactly what command line options are you using for
> spamc/spamassassin? Are network tests enabled?
Standard Debian installation without network tests, since I am offline.
> (3) Do you have bayes auto-expire enabled?
It seems to be the default when you install spamassassin. I have now set
bayes_auto_expire 0
in my ~/.spamassassin/user_prefs and am waiting to see whether the error
occurs again. I have also set up a cron job with
0 6 * * * /usr/bin/sa-learn --force-expire
> (4) Does it exhibit the same poor performance when you run one message
> through spamassassin manually?
Yes
> Please run one message through spamassassin with debugging turned on,
> capture the results, and post them to a website somewhere (e.g. pastebin)
> and send the URL for that to the list so we can see timing and such.
I think it is not necessary (see my other message), but if the error
happens again, I will come back.
Thanks, Greetings and nice Day/Evening
Michelle Konzack
Re: spamassassin takes ten minutes for a message
Posted by John Hardin <jh...@impsec.org>.
On Thu, 4 Sep 2008, Michelle Konzack wrote:
> The first 18,000 messages went fine, but then "spamassassin" began to
> misbehave: it took around 10 minutes for each message, and by the time I
> noticed the problem it had already been running that way for several
> hours...
>
> How can this be?
>
> As you can see, I had to stop spamassassin on 2008-08-31. And even if I
> move the files out of the way, "spamassassin" refuses to run at its
> normal speed (4-6 messages per second).
Some questions:
(1) How are you passing messages to spamassassin for scoring?
(2) Exactly what command line options are you using for
spamc/spamassassin? Are network tests enabled?
(3) Do you have bayes auto-expire enabled?
(4) Does it exhibit the same poor performance when you run one message
through spamassassin manually?
Please run one message through spamassassin with debugging turned on,
capture the results, and post them to a website somewhere (e.g. pastebin)
and send the URL for that to the list so we can see timing and such.
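A minimal sketch of such a timed debug run, assuming a POSIX shell and a
`date` that supports `+%s`. The function name `sa_debug_run`, the `SA_BIN`
override and the output file names are made up for illustration; `-D`
(debug), `-L` (local tests only) and `-t` (test mode) are standard
spamassassin options. `-L` is included only because the server in question
is offline; drop it to see what the network tests contribute.

```shell
#!/bin/sh
# SA_BIN can be overridden, e.g. to point at a stub while trying this out.
SA_BIN="${SA_BIN:-spamassassin}"

# sa_debug_run MESSAGE OUTDIR
# Scores one raw message, keeps the scored message and the debug log in
# OUTDIR, and prints the wall-clock time the run took.
sa_debug_run() {
    msg="$1"
    outdir="$2"
    start=$(date +%s)
    # stdout: scored message with report; stderr: debug output
    "$SA_BIN" -D -L -t < "$msg" > "$outdir/scored.txt" 2> "$outdir/debug.log"
    status=$?
    end=$(date +%s)
    echo "elapsed_seconds=$((end - start))"
    return "$status"
}
```

Calling `sa_debug_run one-message.eml /tmp/sa-debug` (both paths
hypothetical, the directory must exist) leaves `debug.log` ready to be
posted to a pastebin.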
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Democrats '63: Ask not what your country can do for you,
ask what you can do for your country.
Democrats '07: Ask not what your country can do for you,
demand it!
-----------------------------------------------------------------------
44 days until the Presidential Election
Re: spamassassin takes ten minutes for a message
Posted by Bob Proulx <bo...@proulx.com>.
Michelle Konzack wrote:
> I have filtered over 800,000 messages in the last 4 months, and it was
> working perfectly without any flaws, and then stopped from one minute to
> the next.
Well, something in your environment has changed. You might not ever
determine how things used to be but you will need to understand how
they are now and react to them.
> Since I am offline, I have had NO update to the system for 4 months,
> which means absolutely nothing has changed.
It is probably "Bit Rot". :-)
> Since online checks are too slow, I would like to see a solution for very
> reliable RBL checks and such.
You would probably benefit by keeping statistics about which DNSBLs
are triggering on which messages.
> ################ list.dsbl.org ###################################
> ...
> { REV2CHECKIP=`host ${RECEIVIP2REV}.list.dsbl.org 2>&1 | grep -v 'not found.'` }
> ...
> host ${RECEIVIP2REV}.list.dsbl.org
>
> are very slow...
Note that dsbl.org is gone. Please see http://www.dsbl.org/ and update
your configuration.
Bob
Re: spamassassin takes ten minutes for a message
Posted by mouss <mo...@netoyen.net>.
Michelle Konzack wrote:
> [snip]
>
> but unfortunately the two/four lookups with
>
> host ${RECEIVIP2REV}.zen.spamhaus.org
> host ${RECEIVIP2REV}.list.dsbl.org
>
> are very slow...
>
> My idea was that, if I do not filter directly, I could collect the IPs,
> put them into a cache file, sort and deduplicate it, and use an
> independent process which fetches the status and writes out a file that
> I can easily import into my own DNS server (bind9) at home, and then do
> the final filtering.
>
> On my <samba3>, with the quad-Xeon, I have enough resources to install
> some instances of bind9 as VHosts, which could be set up as
> <zen.spamhaus.org> and <list.dsbl.org> and would then be deactivated
> whenever <samba3> gets an Internet connection...
>
> Question: Is it possible to get the lists from the two servers (via FTP)
> for private, non-public use? If yes, how big are they?
dsbl was rsync-able but is now gone. For spamhaus, you would have to pay
a fee (too expensive if you don't receive a lot of mail).
> Since I am online only 2-3 times per week, it would be nice
> if I could fetch the whole list. (I assume this takes fewer
> resources than making several thousand DNS lookups.)
It will reduce the latency of your "real time" checks, but will
certainly increase the overall bandwidth usage (if you add up the sizes
of the dns packets, the result will be much smaller than that of the list).
Re: spamassassin takes ten minutes for a message
Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
> On 2008-09-20 18:22:25, Bob Proulx wrote:
> > I don't really know and hopefully others will have better
> > suggestions. But the first thing I would try is to run spamassassin
> > in local mode.
> >
> > Options:
> > -L, --local Local tests only (no online tests)
On 23.09.08 13:04, Michelle Konzack wrote:
> I have been using this since I reinstalled my intranet server 4 months ago.
[...]
> Since online checks are too slow, I would like to see a solution for very
> reliable RBL checks and such.
>
> I have a procmail recipe which catches the first and second IP from the
> Received headers, reverses them, and makes DNS lookups like:
SA does its lookups in parallel. You can even set a timeout for them. I
guess the lookups in procmail take longer...
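For comparison, here is a rough shell sketch of what "parallel with a
timeout" could look like outside SA. It is not how SpamAssassin itself is
implemented. The names `check_zones`, `LOOKUP_BIN` and `LOOKUP_TIMEOUT`
are invented for this example; `host -W` (wait at most that many seconds
for a reply) is a real option of the `host` utility.

```shell
#!/bin/sh
# Overridable so the sketch can be tried against a stub resolver.
LOOKUP_BIN="${LOOKUP_BIN:-host}"
LOOKUP_TIMEOUT="${LOOKUP_TIMEOUT:-2}"

# check_zones REVERSED_IP ZONE...
# Starts one lookup per zone in the background, waits for all of them,
# then prints "<zone> listed" for every zone that returned a 127.0.0.x
# answer. Total time is roughly one timeout, not one timeout per zone.
check_zones() {
    rev="$1"
    shift
    tmpdir=$(mktemp -d) || return 1
    for zone in "$@"; do
        "$LOOKUP_BIN" -W "$LOOKUP_TIMEOUT" "$rev.$zone" \
            > "$tmpdir/$zone" 2>&1 &
    done
    wait    # all background lookups have finished or timed out
    for zone in "$@"; do
        if grep -q 'has address 127\.0\.0\.' "$tmpdir/$zone"; then
            echo "$zone listed"
        fi
    done
    rm -rf "$tmpdir"
}
```

A call such as `check_zones 10.2.0.192 zen.spamhaus.org` would query the
reversed form of the documentation address 192.0.2.10; which return codes
count as a hit is left to the caller.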
--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Enter any 12-digit prime number to continue.
Re: spamassassin takes ten minutes for a message
Posted by Michelle Konzack <li...@tamay-dogan.net>.
Hi Bob,
On 2008-09-20 18:22:25, Bob Proulx wrote:
> I don't really know and hopefully others will have better
> suggestions. But the first thing I would try is to run spamassassin
> in local mode.
>
> Options:
> -L, --local Local tests only (no online tests)
I have been using this since I reinstalled my intranet server 4 months ago.
> Since you are running it offline I am guessing that SA is trying to do
> network lookups and this is taking the extra time.
I have filtered over 800,000 messages in the last 4 months, and it was
working perfectly without any flaws, and then stopped from one minute to
the next.
Since I am offline, I have had NO update to the system for 4 months,
which means absolutely nothing has changed.
> Why did this start? I will make a second guess that something on your
> laptop is different in the networking system. The first file I would
> check would be /etc/resolv.conf to see if dns name lookup is different
> than you expect when offline. DNS lookups are "blocking" calls and
> can cause processes to wait during lookup. Double check everything
> and make sure that dns lookups fail quickly when offline.
Spamassassin is on <samba3.private.tamay-dogan.net> and my laptop is
<tp570.private.tamay-dogan.net>. That means I download the messages in
an Internet cafe onto my laptop, sorted hourly, and when I connect my
laptop at home, the folders are transferred automatically to <samba3>,
where a script starts, reads one message after another, and passes each
to procmail, which does the filtering (including "spamc").
This setup has been working for over 8 years...
But when spamassassin stopped, I had over 30,000 messages in the
queue and it stopped after 12,000 or so...
I should note that I use a global lock file for procmail, which means
it handles only one message at a time, so several simultaneous spamc
requests cannot be what is upsetting spamassassin...
> I actually do my own spamassassin online before getting to the laptop
> where I read mail offline. The online tests and DNSBLs are much more
> effective than the offline tests. I fear that offline spam testing
I was away from Strasbourg from 2008-09-01 to 2008-09-18 and got 78,000
messages in the mailboxes... With a small TP570 it is not possible to
do any spamassassin work...
It runs only fetchmail and procmail (which sorts the messages into hourly
folders), and that way I get around 3,200 messages per hour.
If I installed spamassassin on my TP570, I would get fewer than 1,000
per hour.
> isn't good enough. If you can get the spamassassin part running
> online before getting to your laptop I am sure you will have a
> superior result.
Since online checks are too slow, I would like to see a solution for very
reliable RBL checks and such.
I have a procmail recipe which catches the first and second IP from the
Received headers, reverses them, and makes DNS lookups like:
----[ '/usr/share/tdtools-procmail/FLT_spamhaus' ]----------------------
<snip>
:0
* ? test -f "`which host`"
{
  SUB1=`formail -zxSubject:`
  DATE1=`date +"%d/%m/%Y %T"`
  ########## first IP ##########
  :0 H
  * Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
  {
    RECEIVIP=${MATCH}
    :0
    * ! RECEIVIP ?? 127.0.0.1
    {
      :0
      * RECEIVIP ?? ()\/[0-9]+
      {
        QUAD1=${MATCH}
        :0
        * RECEIVIP ?? [0-9]+\.\/[0-9]+
        {
          QUAD2=${MATCH}
          :0
          * RECEIVIP ?? [0-9]+\.[0-9]+\.\/[0-9]+
          {
            QUAD3=${MATCH}
            :0
            * RECEIVIP ?? [0-9]+\.[0-9]+\.[0-9]+\.\/[0-9]+
            {
              RECEIVIPREV="${MATCH}.${QUAD3}.${QUAD2}.${QUAD1}"
            }
          }
        }
        ################ sbl-xbl.spamhaus.org ##############################
        :0
        { REVCHECKIP=`host ${RECEIVIPREV}.zen.spamhaus.org 2>&1 | grep -v 'not found.'` }
        :0
        * $ REVCHECKIP ?? 127\.0\.0\.(2|4)
        { IP=`echo $RECEIVIP >>$HOME/log/spamhaus/\`date +%Y-%m\`.log`
          :0fhw
          | formail -i "Subject: ***zen.spamhaus.org*** $SUB1" -i "X-TDSpamHaus: $RECEIVIP"
          :0
          * ^Subject:.*(\*\*\*zen.spamhaus.org\*\*\*)
          ${TDTP_SPAM_PREFIX}${MSG_DATE}${SPAMTAG}.FLT_spamhaus.zen_spamhaus_org/
        }
        ################ list.dsbl.org #####################################
        :0
        { REVCHECKIP=`host ${RECEIVIPREV}.list.dsbl.org 2>&1 | grep -v 'not found.'` }
        :0
        * $ REVCHECKIP ?? 127\.0\.0\.(2|4)
        { IP=`echo $RECEIVIP >>$HOME/log/spamhaus/\`date +%Y-%m\`.log`
          :0fhw
          | formail -i "Subject: ***list.dsbl.org*** $SUB1" -i "X-TDSpamHaus: $RECEIVIP"
          :0
          * ^Subject:.*(\*\*\*list.dsbl.org\*\*\*)
          ${TDTP_SPAM_PREFIX}${MSG_DATE}${SPAMTAG}.FLT_spamhaus.list_dsbl_org/
        }
      }
    }
  }
  ########## second IP ##########
  :0 H
  * Received: from.*\[.*\](.*$)+Received:.*\[\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
  {
    RECEIVIP2=${MATCH}
    :0
    * ! RECEIVIP2 ?? 127.0.0.1
    {
      :0
      * RECEIVIP2 ?? ()\/[0-9]+
      {
        QUAD1=${MATCH}
        :0
        * RECEIVIP2 ?? [0-9]+\.\/[0-9]+
        {
          QUAD2=${MATCH}
          :0
          * RECEIVIP2 ?? [0-9]+\.[0-9]+\.\/[0-9]+
          {
            QUAD3=${MATCH}
            :0
            * RECEIVIP2 ?? [0-9]+\.[0-9]+\.[0-9]+\.\/[0-9]+
            {
              RECEIVIP2REV="${MATCH}.${QUAD3}.${QUAD2}.${QUAD1}"
            }
          }
        }
        ################ sbl-xbl.spamhaus.org ##############################
        :0
        { REV2CHECKIP=`host ${RECEIVIP2REV}.zen.spamhaus.org 2>&1 | grep -v 'not found.'` }
        :0
        * $ REV2CHECKIP ?? 127\.0\.0\.(2|4)
        { IP=`echo $RECEIVIP >>$HOME/log/spamhaus/\`date +%Y-%m\`.log`
          :0fhw
          | formail -i "Subject: ***zen.spamhaus.org*** $SUB1" -i "X-TDSpamHaus: $RECEIVIP2"
          :0
          * ^Subject:.*(\*\*\*zen.spamhaus.org\*\*\*)
          ${TDTP_SPAM_PREFIX}${MSG_DATE}${SPAMTAG}.FLT_spamhaus.zen_spamhaus_org/
        }
        ################ list.dsbl.org ###################################
        :0
        { REV2CHECKIP=`host ${RECEIVIP2REV}.list.dsbl.org 2>&1 | grep -v 'not found.'` }
        :0
        * $ REV2CHECKIP ?? 127\.0\.0\.(2|4)
        { IP=`echo $RECEIVIP >>$HOME/log/spamhaus/\`date +%Y-%m\`.log`
          :0fhw
          | formail -i "Subject: ***list.dsbl.org*** $SUB1" -i "X-TDSpamHaus: $RECEIVIP2"
          :0
          * ^Subject:.*(\*\*\*list.dsbl.org\*\*\*)
          ${TDTP_SPAM_PREFIX}${MSG_DATE}${SPAMTAG}.FLT_spamhaus.list_dsbl_org/
        }
      }
    }
  }
}
:0E
{ LOG="${SHOW_FILTER}executable \"host\" not found.${NL}" }
------------------------------------------------------------------------
but unfortunately the two/four lookups with
host ${RECEIVIP2REV}.zen.spamhaus.org
host ${RECEIVIP2REV}.list.dsbl.org
are very slow...
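The octet reversal that the nested recipes above perform can also be
written as a small shell helper, which may be easier to test on its own.
This is only a sketch; the function name `reverse_ip` is made up here and
octet ranges (0-255) are not checked.

```shell
#!/bin/sh
# reverse_ip A.B.C.D  ->  prints D.C.B.A (the form used in DNSBL queries)
# Returns non-zero for anything that is not four dot-separated numbers.
reverse_ip() {
    # Reject empty input, non-digit characters, and empty octets.
    case "$1" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    echo "$4.$3.$2.$1"
}
```

With that, `host "$(reverse_ip 192.0.2.10).zen.spamhaus.org"` would issue
the same kind of query as the recipe (192.0.2.10 is a documentation
address used only as an example).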
My idea was that, if I do not filter directly, I could collect the IPs,
put them into a cache file, sort and deduplicate it, and use an
independent process which fetches the status and writes out a file that
I can easily import into my own DNS server (bind9) at home, and then do
the final filtering.
On my <samba3>, with the quad-Xeon, I have enough resources to install
some instances of bind9 as VHosts, which could be set up as
<zen.spamhaus.org> and <list.dsbl.org> and would then be deactivated
whenever <samba3> gets an Internet connection...
Question: Is it possible to get the lists from the two servers (via FTP)
for private, non-public use? If yes, how big are they?
Since I am online only 2-3 times per week, it would be nice
if I could fetch the whole list. (I assume this takes fewer
resources than making several thousand DNS lookups.)
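The "cache file, sorted and deduplicated" step of that idea could look
roughly like the sketch below. The function name `cache_ips` and the cache
path in the usage note are hypothetical; the later steps (looking up each
cached IP while online and exporting the result for bind9) are not shown.

```shell
#!/bin/sh
# cache_ips CACHEFILE
# Reads one IP address per line on stdin and merges them into CACHEFILE,
# keeping the file sorted with every address listed only once.
cache_ips() {
    cache="$1"
    [ -f "$cache" ] || : > "$cache"     # start with an empty cache
    tmp=$(mktemp) || return 1
    # sort -u merges the old cache and stdin ("-") and drops duplicates
    if LC_ALL=C sort -u "$cache" - > "$tmp"; then
        mv "$tmp" "$cache"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be along the lines of
`echo "$RECEIVIP" | cache_ips "$HOME/log/ip-cache.txt"` from the filter,
so that each distinct address needs only one lookup the next time the
server is online.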
Thanks, Greetings and nice Day/Evening
Michelle Konzack
Re: spamassassin takes ten minutes for a message
Posted by Bob Proulx <bo...@proulx.com>.
Michelle Konzack wrote:
> I download my messages with my laptop in an Internet cafe, transfer
> them to my server at home, and then let a filter run over them...
>
> Note: I am working offline (no Internet at home)
I do something very similar.
> misbehave: it took around 10 minutes for each message, and by the time I
> ...
> move the files out of the way, "spamassassin" refuses to run at its
> normal speed (4-6 messages per second).
I don't really know and hopefully others will have better
suggestions. But the first thing I would try is to run spamassassin
in local mode.
Options:
-L, --local Local tests only (no online tests)
Since you are running it offline I am guessing that SA is trying to do
network lookups and this is taking the extra time.
Why did this start? I will make a second guess that something on your
laptop is different in the networking system. The first file I would
check would be /etc/resolv.conf to see if dns name lookup is different
than you expect when offline. DNS lookups are "blocking" calls and
can cause processes to wait during lookup. Double check everything
and make sure that dns lookups fail quickly when offline.
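A small sketch of that check, assuming a POSIX shell with `awk`, `host`
and a `date` that supports `+%s`. The function names and the `HOST_BIN`
override are invented for this example; `host -W` limits how long a
lookup waits for a reply.

```shell
#!/bin/sh
HOST_BIN="${HOST_BIN:-host}"

# list_nameservers [FILE]
# Prints the resolver addresses configured in resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# time_lookup NAME [TIMEOUT_SECONDS]
# Runs one lookup and reports its exit status and how long it took, so
# a slow failure while offline shows up as a large "seconds" value.
time_lookup() {
    name="$1"
    timeout="${2:-2}"
    start=$(date +%s)
    "$HOST_BIN" -W "$timeout" "$name" > /dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$name: exit=$status seconds=$((end - start))"
}
```

Running `list_nameservers` followed by `time_lookup example.org` while
offline shows which resolvers are configured and whether a failing lookup
returns within a second or two or blocks for much longer.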
I actually do my own spamassassin online before getting to the laptop
where I read mail offline. The online tests and DNSBLs are much more
effective than the offline tests. I fear that offline spam testing
isn't good enough. If you can get the spamassassin part running
online before getting to your laptop I am sure you will have a
superior result.
Hope this helps,
Bob
Re: spamassassin takes ten minutes for a message
Posted by John Hardin <jh...@impsec.org>.
On Tue, 23 Sep 2008, Michelle Konzack wrote:
> On 2008-09-21 08:56:15, Matt Kettler wrote:
>> It looks like spamassassin is attempting to perform a bayes expiry, and
>> you keep killing it before it can finish. It does need to do that once
>> in a while, and it is slow.
>
> I was not killing it; I was only watching the logfiles using tail
> instead of a nice film on TV. Spamassassin took several minutes per
> message at the beginning, and then, after several hundred messages, over
> 20 minutes per message...
SA does have an internal time limit for how long it is willing to wait on
the expiry. SA was probably killing the expiry process.
>> If you want to, you can run sa-learn --force-expire in order to make
>> expiry run manually. If no expiry has been run recently, SA will attempt
>> to do so during mail delivery.
>
> Oops... running...
>
> Hmm, I have found a bunch of "bayes_toks.expireNNNNN" files in the folder...
Yes, that is a clear sign of interrupted bayes expiry attempts.
At this point you should probably turn off auto-expire and run a manual
expiry from cron daily.
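As a sketch of what such a daily cron job could run (with
`bayes_auto_expire 0` set in user_prefs), the wrapper below warns about
leftover `bayes_toks.expire*` files and then calls
`sa-learn --force-expire`. The function names and the `SA_LEARN` and
`BAYES_DIR` overrides are made up for illustration; the script only
reports leftovers and does not delete anything.

```shell
#!/bin/sh
SA_LEARN="${SA_LEARN:-sa-learn}"
BAYES_DIR="${BAYES_DIR:-$HOME/.spamassassin}"

# count_stale: number of bayes_toks.expireNNNNN files left behind by
# interrupted expiry runs.
count_stale() {
    n=0
    for f in "$BAYES_DIR"/bayes_toks.expire*; do
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# run_expiry: warn about leftovers on stderr, then force a manual expiry.
run_expiry() {
    stale=$(count_stale)
    if [ "$stale" -gt 0 ]; then
        echo "warning: $stale leftover bayes_toks.expire* file(s) in $BAYES_DIR" >&2
    fi
    "$SA_LEARN" --force-expire
}
```

A script that sources these functions and ends with `run_expiry` could
replace the bare `sa-learn --force-expire` in a crontab entry; cron will
then mail the warning if interrupted expiries show up again.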
> I assume the "bayes_toks.expireNNNNN" files were created by previous
> "expire" runs and left over... I deleted them...
They remain when an expiry has been interrupted.
Re: spamassassin takes ten minutes for a message
Posted by Michelle Konzack <li...@tamay-dogan.net>.
On 2008-09-21 08:56:15, Matt Kettler wrote:
> It looks like spamassassin is attempting to perform a bayes expiry, and
> you keep killing it before it can finish. It does need to do that once
> in a while, and it is slow.
I was not killing it; I was only watching the logfiles using tail
instead of a nice film on TV. Spamassassin took several minutes per
message at the beginning, and then, after several hundred messages, over
20 minutes per message...
> If you want to, you can run sa-learn --force-expire in order to make
> expiry run manually. If no expiry has been run recently, SA will attempt
> to do so during mail delivery.
Oops... running...
Hmm, I have found a bunch of "bayes_toks.expireNNNNN" files in the folder...
[michelle.konzackm@samba3:~] sa-learn --force-expire
expired old bayes database entries in 113 seconds
122163 entries kept, 12157 deleted
token frequency: 1-occurrence tokens: 48.69%
token frequency: less than 8 occurrences: 27.96%
I assume the "bayes_toks.expireNNNNN" files were created by previous
"expire" runs and left over... I deleted them...
Thanks, Greetings and nice Day/Evening
Michelle Konzack
Re: spamassassin takes ten minutes for a message
Posted by John Hardin <jh...@impsec.org>.
On Sun, 21 Sep 2008, Matt Kettler wrote:
> Michelle Konzack wrote:
>
>> The first 18,000 messages went fine, but then "spamassassin" began to
>> misbehave: it took around 10 minutes for each message, and by the time I
>> noticed the problem it had already been running that way for several
>> hours...
>
> It looks like spamassassin is attempting to perform a bayes expiry, and
> you keep killing it before it can finish. It does need to do that once
> in a while, and it is slow.
That's what I thought of first as well, but...
>> -rw------- 1 michelle.konzack private 20377600 2008-08-31 18:29 bayes_toks
20MB of tokens doesn't seem all that large to me.
>> As you can see, I had to stop spamassassin on 2008-08-31. And even if
>> I move the files out of the way, "spamassassin" refuses to run at its
>> normal speed (4-6 messages per second).
...and if the bayes files disappear, shouldn't expiry-related problems
stop? (granted, they got a lot better, and a manual expiry might help a
_lot_...)
I'd like to see a little more data on this one first. But if she does a
manual expiry and says "It is working now!" I won't complain. :)
Re: spamassassin takes ten minutes for a message
Posted by Matt Kettler <mk...@verizon.net>.
Michelle Konzack wrote:
> Hello,
>
> I download my messages with my laptop in an Internet cafe, transfer
> them to my server at home, and then let a filter run over them...
>
> Note: I am working offline (no Internet at home)
>
> Last weekend I took my server to a friend's place, connected it over ADSL
> to the Internet, and downloaded around 30,000 messages from the roughly
> 10 days I was away from Strasbourg.
>
> The first 18,000 messages went fine, but then "spamassassin" began to
> misbehave: it took around 10 minutes for each message, and by the time I
> noticed the problem it had already been running that way for several
> hours...
>
It looks like spamassassin is attempting to perform a bayes expiry, and
you keep killing it before it can finish. It does need to do that once
in a while, and it is slow.
If you want to, you can run sa-learn --force-expire in order to make
expiry run manually. If no expiry has been run recently, SA will attempt
to do so during mail delivery.
> How can this be?
>
> ----[ command 'cd .spamassassin && ls -Al' ]----------------------------
> total 25528
> -rw------- 1 michelle.konzack private 1327104 2008-08-31 18:29 auto-whitelist
> -rw------- 1 michelle.konzack private 93840 2008-08-31 18:29 bayes_journal
> -rw------- 1 michelle.konzack private 2629632 2008-08-31 18:29 bayes_seen
> -rw------- 1 michelle.konzack private 20377600 2008-08-31 18:29 bayes_toks
> -rw------- 1 michelle.konzack private 4718592 2008-08-30 19:46 bayes_toks.expire30302
> -rw------- 1 michelle.konzack private 4513792 2008-08-30 20:42 bayes_toks.expire32331
> -rw------- 1 michelle.konzack private 4517888 2008-08-31 18:29 bayes_toks.expire7360
> -rw-r--r-- 1 michelle.konzack private 1510 2008-08-30 21:52 user_prefs
> ------------------------------------------------------------------------
>
> As you can see, I had to stop spamassassin on 2008-08-31. And even if I
> move the files out of the way, "spamassassin" refuses to run at its
> normal speed (4-6 messages per second).
>
> Note: The server is a quad-Xeon with plenty of memory and
> 10,000 RPM SCSI drives in RAID-1.