You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@spamassassin.apache.org by John Narron <jn...@cdsinet.net> on 2005/07/26 18:43:57 UTC

SpamAssassin, FreeBSD, Perl 5.8.7, bus errors, oh my!

I've searched the archives, loosely, and I haven't really seen this issue
discussed much, if at all, but I know several people have been having this
problem, including myself.  So, I took it upon myself to find the issue and
come up with answer.

The situation started when Perl 5.8.7 was released into the FreeBSD ports
collection.  Many people upgraded and many people noticed that upon
restarted amavisd-new that it would stop working and were seeing messages in
their syslog:

Jul 22 11:36:06 xxxx kernel: pid 67831 (perl5.8.7), uid xxxx: exited on
signal 10

So, of course, naturally, we were pointing our fingers at amavisd-new.
After running through all of its debug information, I found out that it was
SpamAssassin tripping the signal 10, causing amavisd-new to fail.  Skipping
ahead of some boring details, I found out that simply 'spamassassin --lint'
was enough to repeatedly cause this signal 10 (Bus Error)

User$ spamassassin --lint
Bus error (core dumped)

Ok, good, we got us a core to work with.  Not running a debug version of
Perl however made this core pretty useless.

Skipping ahead even more, and after using the perl debugger, I traced the
fault to line 1958 of Mail::SpamAssassin::PerMsgStatus

  DB<4> v
1955    EOT
1956 
1957      # and run it.
1958==>   eval $evalstr;
1959:     if ($@) {
1960:       warn("Failed to compile URI SpamAssassin tests, skipping:\n".
1961              "\t($@)\n");
1962:       $self->{rule_errors}++;
1963      }
1964      else {

Doing an examine (x) on $evalstr revealed over 1MB worth of statements to be
evaluated.  Copying the results of 'x $evalstr' to a separate file, adding
my own 'eval $evalstr' to it results in the same Bus Error.  Finally using a
debugging version of Perl, I find out where in $evalstr that the Bus Error
occurs.  Removing the offending line, and everything after results with a
successful eval call.  Leaving lots of technical details out, I've come to
the conclusion that the perl 5.8.7 process is running clean out of stack
space and bailing out with a Bus Error.

I left a lot of the technical details out, because I'm not really sure where
the fault lies.  This same set up worked just fine under Perl 5.8.6, the
test cases work fine w/ a larger stack and the test cases work just fine on
other platforms such as an OpenBSD (which had a reported stack size of only
8MB) and Linux running perl 5.8.7.  Increasing the default stack size (from
64M to 128M) on FreeBSD 5.4 seemed to help but only so much.  On a whim, I
recompiled perl 5.8.7 w/ no optimizations (-O0) and still it would result in
a Bus Error.  Recompiling perl 5.8.7 with -Os, however, seems to work.

So if you're experiencing this problem, here's some suggestions:
	1> Reduce the number of rules/rulesets
	2> Go back and stay using perl 5.8.6 until someone blows the
all-clear
	3> Increase your stack space (the MAXSSIZ setting in the kernel
configuration)
	4> Recompile perl 5.8.7 using CFLAGS=-Os

If you're a developer for perl or spamassassin and want to try a more
official fix for this, let me know and I can provide you with more
information.  I no longer have the coredumps or logs (liberal use of 'rm' is
bad), but they can be recreated easily.  Thanks to g32, flap, RacerX, and
Mongo for helping me test and confirm my suspicions.


John Narron            | "Sacrifice, they always say
Network Administration |  Is a sign of nobility
CDS/CDSinet, LLC       |  But where does one draw the line
http://www.cdsinet.net |  In the face of injury?"
(660) 886 4045         |     - Queensryche 



Re: SpamAssassin, FreeBSD, Perl 5.8.7, bus errors, oh my!

Posted by jdow <jd...@earthlink.net>.
From: "John Narron" <jn...@cdsinet.net>

> Skipping ahead even more, and after using the perl debugger, I traced the
> fault to line 1958 of Mail::SpamAssassin::PerMsgStatus
> 
>   DB<4> v
> 1955    EOT
> 1956 
> 1957      # and run it.
> 1958==>   eval $evalstr;
> 1959:     if ($@) {
> 1960:       warn("Failed to compile URI SpamAssassin tests, skipping:\n".
> 1961              "\t($@)\n");
> 1962:       $self->{rule_errors}++;
> 1963      }
> 1964      else {
> 

Bingo! I think we have our various PerMsgStatus.pm evalstr problems
located now. Maybe a solution can be developed.

{^_^}


Re: SpamAssassin, FreeBSD, Perl 5.8.7, bus errors, oh my!

Posted by Loren Wilton <lw...@earthlink.net>.
> The line in $evalstr that causes the stack fault is:
>
> if ($self->{conf}->{scores}->{q{WLS_URI_OPT_377}}) {
>
> Which is on line 21720 (out of ~30800 lines).  From what I've gathered,
each
> rule in the .cf files results in 4 lines being strung together in
$evalstr:
>
>       if ($self->{conf}->{scores}->{q{<RULE_NAME>}}) {
>         <RULE_NAME>_uri_test($self, @_); # call procedurally for speed
>       }
>
> It doesn't matter which rule is on line 21720 (as I've changed them around
> to make sure it wasn't specifically WLS_URI_OPT_377, from
> sa-blacklist.current.uri.cf),  It seems when you get to line 21720 w/ a
64MB
> stack on FreeBSD 5.4 w/ perl 5.8.7 etc, etc. that you've ran out of stack
> space.  This is part of the gdb bt from a perl5.8.7.core that I had:
>
> #0  0x280a2144 in Perl_malloc (nbytes=25) at malloc.c:1411
> #1  0x281075b5 in S_save_hek_flags (str=0x935b328 "WLS_URI_OPT_377",
len=15,
> hash=1020016264, flags=0) at hv.c:97
etc.

Ok.  This sounds to me like a perl bug that can probably be worked around.
It is dying in malloc, apparently trying to malloc heap data on the stack,
which gcc tends to do a lot.  Since it happens on the same line in the file
regardless of which rule is there, it is happening at the same 'call a rule'
number in the procedure containing the calls to the rules.

So the first useful question is to convert that line number to a rule call
number in the containing procedure.  Subtract the line number for the first
one of those 'if ($self...' lines from the failing line, and then divide by
4 to convert line numbers into rule counts.  (Assuming there are 4 lines
including a blank line, else divide by 3.)

Once that number is known, I suspect the trivial (but somewhat ugly) change
is to maintain a count in sa of the number of rules being generated, and
split the rule calling procedure into multiple procedures every N rules,
where N is some small number (16 or so) less than the number you compute
above.  Then generate a final wrapper procedure that will call all of these
split procedures.

The end result will be one extra level of call to evaluate the rules, but
the procedure flipping will probably be fairly rare (I'm assuming there will
be several hundred rules per sub-procedure), so there won't be any noticible
change in performance.  And the erroneous heap buildup will disappear each
time one of the sub-procedures is exited.

A possible second workaround is to try to figure out what hash it is that
Perl seems to be setting flags in as it is calling the rules.  I don't see
it obviously in the code; but then I don't really read Perl, and it may be
obvious to someone else.  If this is obvious, there may be some trivial code
formatting change that will unconvince Perl that it needs to be mucking with
the hash in the first place.

And of course I'd be inclined to supply the 1MB file as a bug attachment to
perl, explaining that it blows stack on BSD and works fine everywhere else.
However, that is likely a longer-term fix than changing the rule generating
code to build a second level of calling procedure and segment the rule calls
into multiple procedures.

        Loren


RE: SpamAssassin, FreeBSD, Perl 5.8.7, bus errors, oh my!

Posted by John Narron <jn...@cdsinet.net>.
The line in $evalstr that causes the stack fault is:

if ($self->{conf}->{scores}->{q{WLS_URI_OPT_377}}) {

Which is on line 21720 (out of ~30800 lines).  From what I've gathered, each
rule in the .cf files results in 4 lines being strung together in $evalstr:

      if ($self->{conf}->{scores}->{q{<RULE_NAME>}}) {
        <RULE_NAME>_uri_test($self, @_); # call procedurally for speed
      }

It doesn't matter which rule is on line 21720 (as I've changed them around
to make sure it wasn't specifically WLS_URI_OPT_377, from
sa-blacklist.current.uri.cf),  It seems when you get to line 21720 w/ a 64MB
stack on FreeBSD 5.4 w/ perl 5.8.7 etc, etc. that you've ran out of stack
space.  This is part of the gdb bt from a perl5.8.7.core that I had:

#0  0x280a2144 in Perl_malloc (nbytes=25) at malloc.c:1411
#1  0x281075b5 in S_save_hek_flags (str=0x935b328 "WLS_URI_OPT_377", len=15,
hash=1020016264, flags=0) at hv.c:97
#2  0x2810aa51 in S_share_hek_flags (str=0x935b328 "WLS_URI_OPT_377",
len=15, hash=1020016264, flags=0) at hv.c:2114
#3  0x2810a940 in Perl_share_hek (str=0x935b328 "WLS_URI_OPT_377", len=15,
hash=1020016264) at hv.c:2074
#4  0x28129e29 in Perl_newSVpvn_share (src=0x935b328 "WLS_URI_OPT_377",
len=15, hash=1020016264) at sv.c:6876
#5  0x280def33 in Perl_peep (o=0x935f288) at op.c:6699
#6  0x280debcc in Perl_peep (o=0x935ef88) at op.c:6634
... (lots of calls to Perl_peep, o= different numbers each time.. I'm
assuming it's a recursive function)
#5432 0x280debcc in Perl_peep (o=0x8081108) at op.c:6634
#5433 0x280debcc in Perl_peep (o=0x8587a48) at op.c:6634
#5434 0x280debcc in Perl_peep (o=0x8062b08) at op.c:6634
#5435 0x280d10d5 in Perl_newPROG (o=0x8062b88) at op.c:1952
#5436 0x280cac39 in Perl_yyparse () at perly.y:140
#5437 0x28155eb7 in S_doeval (gimme=128, startop=0x0, outside=0x805829c,
seq=0) at pp_ctl.c:2892
#5438 0x281588ef in Perl_pp_entereval () at pp_ctl.c:3486
#5439 0x280f8ebe in Perl_runops_debug () at dump.c:1452
#5440 0x2809be1f in S_run_body (oldscope=1) at perl.c:2000
#5441 0x2809b8d5 in perl_run (my_perl=0x804e1f0) at perl.c:1919
#5442 0x08049188 in main ()

(5434 (first Perl_peep line) - 4 (line after last Perl_peep) * 4 = 21720

John Narron            | "Sacrifice, they always say
Network Administration |  Is a sign of nobility
CDS/CDSinet, LLC       |  But where does one draw the line
http://www.cdsinet.net |  In the face of injury?"
(660) 886 4045         |     - Queensryche  

> -----Original Message-----
> From: Loren Wilton [mailto:lwilton@earthlink.net] 
> Sent: Tuesday, July 26, 2005 12:06 PM
> To: users@spamassassin.apache.org
> Subject: Re: SpamAssassin, FreeBSD, Perl 5.8.7, bus errors, oh my!
> 
> > I find out where in $evalstr that the Bus Error occurs.  
> Removing the 
> > offending line, and everything after results with a successful eval 
> > call.
> 
> Hum.  It might be mildly interesting to know what the line is 
> that causes the stack fault.  I assume this is generated 
> code, and possibly a slight change in what is being generated 
> could reduce or eliminate the recursion that I assume is 
> occuring.  I'd think that code should be reasonably simple but...
> 
>         Loren
> 
> 
> 



Re: SpamAssassin, FreeBSD, Perl 5.8.7, bus errors, oh my!

Posted by Loren Wilton <lw...@earthlink.net>.
> I find out where in $evalstr that the Bus Error
> occurs.  Removing the offending line, and everything after results with a
> successful eval call.

Hum.  It might be mildly interesting to know what the line is that causes
the stack fault.  I assume this is generated code, and possibly a slight
change in what is being generated could reduce or eliminate the recursion
that I assume is occuring.  I'd think that code should be reasonably simple
but...

        Loren