Posted to users@spamassassin.apache.org by ar...@waikato.ac.nz on 2005/12/05 00:23:11 UTC

spamd 3.1.0 dies regularly.

Learned friends, my spamd is ill.  It dies so often I have a cron job
check it every three minutes.  Over the past week it has averaged
about one death per day, but it's not regular: on Saturday it died
twice, an hour apart, but has been fine for the 36 hours since.

Logs follow (apologies for the line length):

Dec  3 23:12:24 jess spamd[12331]: spamd: result: Y 27 - BAYES_99,DNS_FROM_RFC_ABUSE,HTML_50_60,HTML_MESSAGE,RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL,URIBL_AB_SURBL,URIBL_JP_SURBL,URIBL_OB_SURBL,URIBL_SBL,URIBL_SC_SURBL,URIBL_WS_SURBL,URI_NOVOWEL scantime=6.7,size=52775,user=user1,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=45705,mid=<20...@WYTTNRLE>,bayes=1,autolearn=unavailable 
[...many lines elided...]
Dec  3 23:13:15 host spamd[10689]: prefork: child states: BBBIIIIIIIIIIIIIIIIII 
Dec  3 23:13:19 host spamd[2000]: spamd: clean message (0.8/5.0) for user2:0 in 4.5 seconds, 7454 bytes. 
Dec  3 23:13:19 host spamd[2000]: spamd: result: .  0 - AWL,BAYES_50,FORGED_RCVD_HELO,HTML_MESSAGE,MIME_HTML_ONLY,UNPARSEABLE_RELAY scantime=4.5,size=7454,user=user2,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=46290,mid=<28...@psc1927>,bayes=0.479493261214263,autolearn=no 
Dec  3 23:13:19 host spamd[10689]: prefork: child states: BIBIIIIIIIIIIIIIIIIII 
Dec  3 23:13:19 host spamd[1998]: spamd: clean message (4.8/5.0) for user3:0 in 5.6 seconds, 12978 bytes. 
Dec  3 23:13:19 host spamd[1998]: spamd: result: .  4 - ALL_TRUSTED,AWL,BAYES_99 scantime=5.6,size=12978,user=user3,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=46287,mid=<20...@mx.waikato.ac.nz>,bayes=1,autolearn=no 
Dec  3 23:13:19 host spamd[4084]: spamd: clean message (0.8/5.0) for user4:0 in 5.3 seconds, 26114 bytes. 
Dec  3 23:13:19 host spamd[4084]: spamd: result: .  0 - AWL,BAYES_50,FORGED_RCVD_HELO,HTML_MESSAGE,MIME_HTML_ONLY,UNPARSEABLE_RELAY scantime=5.3,size=26114,user=user4,uid=0,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=46292,mid=<32...@psc1927>,bayes=0.50000000000005,autolearn=no 
Dec  3 23:13:19 host spamd[10689]: prefork: child states: IIBIIIIIIIIIIIIIIIIII 
Dec  3 23:13:20 host spamd[10689]: prefork: child states: IIIIIIIIIIIIIIIIIIIII 
Dec  3 23:13:20 host spamd[10689]: spamd: handled cleanup of child pid 12331 due to SIGCHLD 
Dec  3 23:13:23 host spamc[12302]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused
Dec  3 23:13:24 host spamc[12302]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#2 of 3): Connection refused

These errors continued until the three-minute check noticed spamd's
absence from the process list and restarted it at 23:15:

Dec  3 23:15:01 host spamc[13236]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused
Dec  3 23:15:02 host spamd[13232]: logger: removing stderr method 
Dec  3 23:15:07 host spamd[13239]: spamd: server started on port 783/tcp (running version 3.1.0) 
Dec  3 23:15:07 host spamd[13239]: spamd: server pid: 13239 
Dec  3 23:15:07 host spamd[13239]: spamd: server successfully spawned child process, pid 13287 
Dec  3 23:15:07 host spamd[13239]: spamd: server successfully spawned child process, pid 13288 
[...18 more children spawned...]

Spamd is started as follows (IPs obfuscated).

/usr/bin/spamd --daemonize --sql-config --nouser-config \
  --listen-ip=0.0.0.0 \
  --allowed-ips=127.0.0.1,a.b.c.d,a.b.c.e \
  --max-children=30 --min-spare=10 --max-spare=20

spamc is called from procmail, either running as the recipient or as
root with "-u recipient".

One night last week, spamd stopped answering queries but remained
alive.  The three-minute sanity checker didn't see the need to restart
it and 35,000 incoming messages went unchecked before I arrived in the
morning.  That may not be related to this problem - it's just a grumble,
and an explanation for why I'm feeling a bit sideways about spamd at
the moment.
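
A checker that only looks at the process list cannot catch the
"alive but not answering" case; it would have to talk to spamd.  What
follows is a minimal sketch, not the checker described here.  It
assumes spamd answers the protocol's PING request with a
"SPAMD/<version> 0 PONG" line.  The function names, the default
host/port and the timeout are illustrative choices, and reading the
reply with a single recv() is a simplification that relies on the
reply being one short line.

```python
import socket


def is_pong(response: bytes) -> bool:
    """Return True if the first reply line looks like 'SPAMD/<ver> 0 PONG'."""
    fields = response.split(b"\r\n", 1)[0].split()
    return (
        len(fields) >= 3
        and fields[0].startswith(b"SPAMD/")
        and fields[1] == b"0"
        and fields[2] == b"PONG"
    )


def spamd_alive(host: str = "127.0.0.1", port: int = 783,
                timeout: float = 10.0) -> bool:
    """Connect to spamd, send PING, and report whether a PONG came back.

    Any socket error (refused connection, timeout, reset) counts as
    "not alive", so a hung daemon is treated the same as a dead one.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"PING SPAMC/1.2\r\n")
            response = sock.recv(256)
    except OSError:
        return False
    return is_pong(response)
```

A cron wrapper could call spamd_alive() and restart spamd when it
returns False, in place of (or as well as) the process-list test.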

This three-minute checker, by the way, was originally written to slap
dccifd 1.2 back into action.  It was worse than spamd is now.  I'm
happy to say that dccifd 1.3 is much better, though I still check it
every three minutes and kill it every 24 hours.

The box has 1.5G of RAM free, and oodles of empty disk.  I don't think
spamd's health concerns are environmental.

Because SA checks about 100,000 messages per day, I need to be
selective about the debugging I turn on.  Can you recommend a -D
option that will help me diagnose this problem?

Many thanks in advance.

-- 
_________________________________________________________________________
Andrew Donkin                  Waikato University, Hamilton,  New Zealand