Posted to users@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/09/08 19:17:47 UTC

Re: timing/performance issues

Ralf Hildebrandt writes:
> I'm using spamassassin from within amavisd-new. Logging shows that
> amavis's SA check is the most time-consuming step while processing an
> email (well over 80% of the total time needed).
> 
> How can I run spamassassin against some mails to find out what part of
> spamassassin (maybe some sort of external DNS query) takes how much
> time?
> 
> I was thinking of a sort of "profile" (subroutine x takes y seconds)
> to find out where my particular installation of SA is "slow".

perldoc Devel::DProf -- that's the Perl profiler.  But as you said,
it now appears to be Bayes -- it could be that if a scan is taking
a *very* long time, what's actually taking place is a Bayes expiration
run, which happens once every N days (typically).

Run spamassassin with -D (debug output) to see if that's it.

--j.
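One rough way to act on the -D suggestion above is to timestamp each debug line as it arrives, so that a long pause between two lines points at the slow step (a DNS lookup, a Bayes expiry run, and so on). This is only a sketch: stamp_lines and message.eml are hypothetical names, and the Devel::DProf invocation in the comments assumes the installed spamassassin script runs in taint mode.

```shell
#!/bin/sh
# stamp_lines: prefix every line read from stdin with the number of whole
# seconds elapsed since the function started. A jump between two
# consecutive stamps marks a slow step.
stamp_lines() {
    start=$(date +%s)
    while IFS= read -r line; do
        now=$(date +%s)
        printf '%4ds %s\n' "$((now - start))" "$line"
    done
}

# Usage (message.eml is a hypothetical saved test mail). Debug output goes
# to stderr, so stderr is sent down the pipe and stdout is discarded:
#   spamassassin -D < message.eml 2>&1 >/dev/null | stamp_lines
#
# Per-subroutine profile with Devel::DProf (assumed invocation; -T is
# given because the script is assumed to use taint mode). The profile is
# written to tmon.out in the current directory and read by dprofpp:
#   perl -T -d:DProf "$(command -v spamassassin)" < message.eml >/dev/null
#   dprofpp
```

The resolution is one second, which is coarse but enough to tell a scan that stalls for many seconds in one place from one that is uniformly slow.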


Re: timing/performance issues

Posted by Lucas Albers <ad...@cs.montana.edu>.
And my Bayes database has a good number of entries:
sa-learn --dump | head

0.000          0          2          0  non-token data: bayes db version
0.000          0     333304          0  non-token data: nspam
0.000          0     237785          0  non-token data: nham
0.000          0    5807722          0  non-token data: ntokens
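For tracking these counters over time, the header lines of the dump can be reduced to name=value pairs. The following is a minimal sketch that assumes the column layout shown above (value in the third column, label after "non-token data:"); bayes_counts is a hypothetical helper name.

```shell
#!/bin/sh
# bayes_counts: read `sa-learn --dump` output on stdin and print each
# "non-token data" counter as name=value, one per line. Spaces in the
# label are replaced with underscores.
bayes_counts() {
    awk '/non-token data: / {
        name = $0
        sub(/.*non-token data: /, "", name)   # keep only the label
        sub(/[ \t]+$/, "", name)              # drop trailing whitespace
        gsub(/ /, "_", name)
        print name "=" $3
    }'
}

# Usage:
#   sa-learn --dump | bayes_counts
```

With the dump shown above this would report nspam, nham and ntokens alongside the database version, which makes it easy to log the values from cron and watch how fast the token count grows.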



Ralf Hildebrandt said:
> * Lucas Albers <ad...@cs.montana.edu>:
>
>> I've had good results setting bayes_learn_to_journal and then running a
>> rebuild every hour.
>
> Whoa, hourly? I can try that.
>
>> This runs quickly, even with concurrent accesses.
>> Bayes gets updated quickly.
>> Bayes is only locked for a few seconds every hour, less than 3 seconds.
>
> --
> _________________________________________________
>
>   Charité - Universitätsmedizin Berlin
> _________________________________________________
>
>   Ralf Hildebrandt
>    i.A. des IT-Zentrum | Netzwerkdienste
>    Stabsstelle des Klinikumsvorstandes
>    Campus Mitte
>    Schumannstr. 20/21 | D-10177 Berlin
>    Tel. +49 30 450 570155 | Fax +49 30 450 570916
>    Ralf.Hildebrandt@Charite.de
>    http://www.charite.de
>
>


-- 
Luke Computer Science System Administrator
Security Administrator,College of Engineering
Montana State University-Bozeman,Montana



Re: timing/performance issues

Posted by Ralf Hildebrandt <Ra...@charite.de>.
* Lucas Albers <ad...@cs.montana.edu>:

> I've had good results setting bayes_learn_to_journal and then running a
> rebuild every hour.

Whoa, hourly? I can try that.

> This runs quickly, even with concurrent accesses.
> Bayes gets updated quickly.
> Bayes is only locked for a few seconds every hour, less than 3 seconds.

-- 
_________________________________________________

  Charité - Universitätsmedizin Berlin
_________________________________________________

  Ralf Hildebrandt
   i.A. des IT-Zentrum | Netzwerkdienste
   Stabsstelle des Klinikumsvorstandes
   Campus Mitte
   Schumannstr. 20/21 | D-10177 Berlin
   Tel. +49 30 450 570155 | Fax +49 30 450 570916
   Ralf.Hildebrandt@Charite.de
   http://www.charite.de



Re: timing/performance issues

Posted by Lucas Albers <ad...@cs.montana.edu>.
I've had good results setting bayes_learn_to_journal and then running a
rebuild every hour.
This runs quickly, even with concurrent accesses.
Bayes gets updated quickly.
Bayes is only locked for a few seconds every hour, less than 3 seconds.
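
A sketch of that setup, for anyone who wants to check the lock time on their own system. The local.cf line and crontab entry are shown as comments; the timed wrapper is a hypothetical helper that just reports how long a command took. Releases from 3.0 on name the rebuild step `sa-learn --sync`, so the command is passed in rather than hard-coded.

```shell
#!/bin/sh
# In local.cf (SpamAssassin configuration, not shell):
#   bayes_learn_to_journal 1
#
# Hourly crontab entry for the user that owns the Bayes database
# (the script path is an assumption):
#   0 * * * * /usr/local/bin/bayes-rebuild.sh

# timed: run the given command, then report the elapsed whole seconds and
# the exit status on stderr. Returns the command's own exit status.
timed() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s (exit status $status)" >&2
    return "$status"
}

# Body of the hypothetical bayes-rebuild.sh:
#   timed sa-learn --rebuild
```

If the reported time stays in the low single digits, the journal is being merged often enough that scans rarely wait on the lock.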


Ralf Hildebrandt said:
> * Justin Mason <jm...@jmason.org>:
>
>> perldoc Devel::DProf -- that's the Perl profiler.  But as you said,
>> it now appears to be Bayes -- it could be that if a scan is taking
>> a *very* long time, what's actually taking place is a Bayes expiration
>> run, which happens once every N days (typically).
>
> Bayes expiration is done nightly by a cron job.


-- 
Luke Computer Science System Administrator
Security Administrator,College of Engineering
Montana State University-Bozeman,Montana



Re: timing/performance issues

Posted by Ralf Hildebrandt <Ra...@charite.de>.
* Justin Mason <jm...@jmason.org>:

> perldoc Devel::DProf -- that's the Perl profiler.  But as you said,
> it now appears to be Bayes -- it could be that if a scan is taking
> a *very* long time, what's actually taking place is a Bayes expiration
> run, which happens once every N days (typically).

Bayes expiration is done nightly by a cron job.

-- 
_________________________________________________

  Charité - Universitätsmedizin Berlin
_________________________________________________

  Ralf Hildebrandt
   i.A. des IT-Zentrum | Netzwerkdienste
   Stabsstelle des Klinikumsvorstandes
   Campus Mitte
   Schumannstr. 20/21 | D-10177 Berlin
   Tel. +49 30 450 570155 | Fax +49 30 450 570916
   Ralf.Hildebrandt@Charite.de
   http://www.charite.de