Posted to users@spamassassin.apache.org by Daniel O'Connor <da...@dons.net.au> on 2005/07/02 07:09:35 UTC

Perl crashing in sa-learn

Hi,
Currently I have MIMEDefang set up to call SpamAssassin for all incoming
messages. I am trying to set up Bayes for Mailman lists, so I have the
script 'mmlearn' (attached), which runs sa-learn on pickled emails.

The problem is that sa-learn crashes on certain messages. I have attached
a tar file of 2 examples (so they don't get marked as spam :)

[midget 14:31] ~ >sudo su -m mailman -c 'env HOME=/usr/local/mailman sa-learn -u mailman -D --showdots --mbox --spam' <crashmsg1
debug: SpamAssassin version 3.0.4
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/home/darius/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/games', keeping.
debug: PATH included '/usr/local/sbin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/X11R6/bin', keeping.
debug: PATH included '/home/darius/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/sbin', keeping.
debug: Final PATH set to: /home/darius/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin:/home/darius/bin:/usr/sbin:/sbin:/usr/sbin:/sbin
debug: using "/usr/local/etc/mail/spamassassin/init.pre" for site rules init.pre
debug: config: read file /usr/local/etc/mail/spamassassin/init.pre
debug: using "/usr/local/share/spamassassin" for default rules dir
debug: config: read file /usr/local/share/spamassassin/10_misc.cf
debug: config: read file /usr/local/share/spamassassin/20_anti_ratware.cf
debug: config: read file /usr/local/share/spamassassin/20_body_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_compensate.cf
debug: config: read file /usr/local/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_drugs.cf
debug: config: read file /usr/local/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_head_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_html_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_meta_tests.cf
debug: config: read file /usr/local/share/spamassassin/20_phrases.cf
debug: config: read file /usr/local/share/spamassassin/20_porn.cf
debug: config: read file /usr/local/share/spamassassin/20_ratware.cf
debug: config: read file /usr/local/share/spamassassin/20_uri_tests.cf
debug: config: read file /usr/local/share/spamassassin/23_bayes.cf
debug: config: read file /usr/local/share/spamassassin/25_body_tests_es.cf
debug: config: read file /usr/local/share/spamassassin/25_hashcash.cf
debug: config: read file /usr/local/share/spamassassin/25_spf.cf
debug: config: read file /usr/local/share/spamassassin/25_uribl.cf
debug: config: read file /usr/local/share/spamassassin/30_text_de.cf
debug: config: read file /usr/local/share/spamassassin/30_text_fr.cf
debug: config: read file /usr/local/share/spamassassin/30_text_nl.cf
debug: config: read file /usr/local/share/spamassassin/30_text_pl.cf
debug: config: read file /usr/local/share/spamassassin/50_scores.cf
debug: config: read file /usr/local/share/spamassassin/60_whitelist.cf
debug: using "/usr/local/etc/mail/spamassassin" for site rules dir
debug: using "/usr/local/mailman/.spamassassin/user_prefs" for user prefs file
debug: config: read file /usr/local/mailman/.spamassassin/user_prefs
debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8a0bd90)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::Hashcash=HASH(0x8a1b794)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered Mail::SpamAssassin::Plugin::SPF=HASH(0x8a33edc)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x8a0bd90) implements 'parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0x8a1b794) implements 'parse_config'
debug: bayes: 98028 tie-ing to DB file R/O /usr/local/mailman/.spamassassin/bayes_toks
debug: bayes: 98028 tie-ing to DB file R/O /usr/local/mailman/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 2 chosen.
debug: Initialising learner
debug: Syncing Bayes and expiring old tokens...
debug: lock: 98028 created /usr/local/mailman/.spamassassin/bayes.lock.midget.dons.net.au.98028
debug: lock: 98028 trying to get lock on /usr/local/mailman/.spamassassin/bayes with 0 retries
debug: lock: 98028 link to /usr/local/mailman/.spamassassin/bayes.lock: link ok
debug: bayes: 98028 tie-ing to DB file R/W /usr/local/mailman/.spamassassin/bayes_toks
debug: bayes: 98028 tie-ing to DB file R/W /usr/local/mailman/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: refresh: 98028 refresh /usr/local/mailman/.spamassassin/bayes.lock
debug: Syncing complete.
debug: Learning Spam
debug: received-header: parsed as [ ip=80.68.88.245 rdns=server1.aladan.net helo=server1.aladan.net by=midget.dons.net.au ident= envfrom=entertainers@artdirectors.com intl=0 id=j610d5Id081088 auth= ]
debug: received-header: parsed as [ ip=195.47.42.5 rdns=195.47.42.5 helo=195.47.42.5 by=server1.aladan.net ident= envfrom= intl=0 id=j610ctAg015021 auth= ]
debug: is DNS available? 0
debug: received-header: parsed as [ ip=92.132.203.224 rdns= helo= by=195.47.42.5 ident= envfrom= intl=0 id=2142659851detailing23659 auth= ]
debug: received-header: cannot use DNS, do not trust any hosts from here on
debug: received-header: relay 80.68.88.245 trusted? no internal? no
debug: received-header: relay 195.47.42.5 trusted? no internal? no
debug: received-header: relay 92.132.203.224 trusted? no internal? no
debug: metadata: X-Spam-Relays-Trusted:
debug: metadata: X-Spam-Relays-Untrusted: [ ip=80.68.88.245 rdns=server1.aladan.net helo=server1.aladan.net by=midget.dons.net.au ident= envfrom=entertainers@artdirectors.com intl=0 id=j610d5Id081088 auth= ] [ ip=195.47.42.5 rdns=195.47.42.5 helo=195.47.42.5 by=server1.aladan.net ident= envfrom= intl=0 id=j610ctAg015021 auth= ] [ ip=92.132.203.224 rdns= helo= by=195.47.42.5 ident= envfrom= intl=0 id=2142659851detailing23659 auth= ]
debug: ---- MIME PARSER START ----
debug: main message type: text/plain
debug: parsing normal part
debug: added part, type: text/plain
debug: ---- MIME PARSER END ----
debug: decoding: other encoding type (7bit), ignoring
debug: uri found: http://uhdzu.azwpd9alp2az7ts.zorromf.info
debug: refresh: 98028 refresh /usr/local/mailman/.spamassassin/bayes.lock
debug: tokenize: header tokens for Mime-Version = " 1.0 (Apple Message framework v728)"
debug: tokenize: header tokens for Content-Transfer-Encoding = " 7bit"
debug: tokenize: header tokens for *m = "  1681980078 575569277 195 47 42 5 "
debug: tokenize: header tokens for *c = " /plain; charset=US-ASCII; format=flowed"
debug: tokenize: header tokens for To = "U*all D*fucs.org.au D*org.au D*au"
debug: tokenize: header tokens for *F = "U*entertainers D*artdirectors.com D*com"
debug: tokenize: header tokens for *x = " Apple Mail (2.728)"
debug: tokenize: header tokens for *RT = " "
debug: tokenize: header tokens for *RU = " [ ip=80.68.88.245 rdns=server1.aladan.net helo=server1.aladan.net by=midget.dons.net.au ident= envfrom=entertainers@artdirectors.com intl=0 id=j610d5Id081088 auth= ] [ ip=195.47.42.5 rdns=195.47.42.5 helo=195.47.42.5 by=server1.aladan.net ident= envfrom= intl=0 id=j610ctAg015021 auth= ] [ ip=92.132.203.224 rdns= helo= by=195.47.42.5 ident= envfrom= intl=0 id=2142659851detailing23659 auth= ]"
debug: tokenize: header tokens for *r = "   [92.132.203 ip*92.132.203.224 ] (port=4461 helo=[homesteaders]) by 195.47.42 ip*195.47.42.5    esmtp id 2142659851detailing23659   all@fucs.org.au; "
debug: tokenize: header tokens for *r = "   [92.132.203 ip*92.132.203.224 ] (port=4461 helo=[homesteaders]) by 195.47.42 ip*195.47.42.5    esmtp id 2142659851detailing23659   all@fucs.org.au;     195.47.42 ip*195.47.42.5  ([195.47.42 ip*195.47.42.5 ]) by server1.aladan.net (8.13.1/8.13.1)         <al...@fucs.org.au>; "
Segmentation fault

If I move the bayes_toks file out of the way it doesn't crash. I could
accept that the file is corrupt, but I get the same result on 2
separate systems, so perhaps something is causing the toks file to become
bogus.

I am running SA v3.0.4 on both systems. One system is FreeBSD 4.11 with
Perl 5.6.2, and the other is FreeBSD 5.4 with Perl 5.8.7 (both built from
ports).
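
For anyone trying to reproduce this, here is a rough sketch of how one might
find which messages in an mbox trigger the crash. It assumes Python 3 on a
POSIX system; the function name find_crashing_messages is made up for
illustration and is not part of SpamAssassin or mmlearn. It feeds each message
to a command on stdin and records the runs that were killed by a signal
(such as the segmentation fault above).

```python
import mailbox
import signal
import subprocess


def find_crashing_messages(mbox_path, command):
    """Run `command` once per message in the mbox, with the message on stdin.

    Returns a list of (key, signal_name) pairs for runs killed by a signal.
    `command` is an argument list, e.g. ["sa-learn", "--mbox", "--spam"].
    """
    crashed = []
    box = mailbox.mbox(mbox_path)
    try:
        for key in box.iterkeys():
            # from_=True keeps the leading "From " line, so the child still
            # sees a valid one-message mbox with the original bytes.
            data = box.get_bytes(key, from_=True)
            result = subprocess.run(
                command,
                input=data,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            # On POSIX, a negative return code -N means "killed by signal N".
            if result.returncode < 0:
                crashed.append((key, signal.Signals(-result.returncode).name))
    finally:
        box.close()
    return crashed
```

With the sa-learn options from the transcript above, a call might look like
find_crashing_messages("crashmsg1", ["sa-learn", "-u", "mailman", "--mbox",
"--spam"]). The sudo/env wrapper is left out here and would need to be added
to match the real setup.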

Any help greatly appreciated!
Thanks.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C