Posted to dev@spamassassin.apache.org by Karsten Bräckelmann <gu...@rudersport.de> on 2011/03/21 04:44:54 UTC
Fwd: Re: Reproducing Bug 6559
Lengthy, I mean *verbose* reply by Matt Elson below. Awesome! The
important bits of my original private reply have been communicated to
the users list already.
First come a lot of test machine details (which I mostly ignored), then
the meat: findings that might explain the "use space rather than \s in
body rules" problem with re2c compilation, which shows up only in some
highly specific circumstances.
Approval to publish:
> > Matt, since you sent this off-list to me, is it OK to share the info
> > with the (public) dev list, also?
>
> Sure; I'm just feeling a little embarrassed/paranoid about the fact my
> initial email with debugging info triggered the issue on other people's
> systems - just wasn't thinking too clearly when I sent it out.
>
> So assuming it doesn't destroy systems, go ahead ;).
Don't worry. One of the original reports a few hours ago already sported
the patterns that trigger the issue. We'll be fine. ;)
-------- Forwarded Message --------
From: Matt Elson
To: Karsten Bräckelmann
Subject: Re: Reproducing Bug 6559
Date: Sun, 20 Mar 2011 22:45:43 -0400
> Since there have been offers for further testing: One data point is to
> collect details about systems, CPU architecture, instruction set used
> for compiling, versions (OS, kernel, compiler, re2c, Perl) and patch-
> level.
>
I've seen the issue on six different hosts now. Three of them are my
normal production machines, three were ones I'm testing on specifically
for this purpose. Here are the specs (let me know if I missed something
or misunderstood the request, which I suspect I have).
Sorry, this is a bit long and conceivably unclear..
RHEL4 32bit machine, production box #1
---
CPU:
model name : Intel(R) Xeon(TM) CPU 2.80GHz
uname -a:
Linux spam2 2.6.9-89.0.19.ELsmp #1 SMP Wed Dec 30 12:53:30 EST 2009 i686
i686 i386 GNU/Linux
gcc -v:
Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--disable-checking --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-java-awt=gtk
--host=i386-redhat-linux
Thread model: posix
gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)
re2c -v:
re2c 0.13.2
spamassassin -V:
SpamAssassin version 3.3.1
running on Perl version 5.8.5
(from spamassassin.apache.org)
/etc/redhat-release
Red Hat Enterprise Linux AS release 4 (Nahant Update 9)
(should be up to date)
RHEL5 32bit machine, production box #2
---
CPU:
model name : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
uname -a
Linux spam3 2.6.18-238.5.1.el5PAE #1 SMP Mon Feb 21 06:01:16 EST 2011
i686 i686 i386 GNU/Linux
gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --disable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)
re2c -v
re2c 0.13.5
spamassassin -V:
SpamAssassin version 3.3.1
running on Perl version 5.8.8
(from spamassassin.apache.org)
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.6 (Tikanga)
(should only be off on patches by a week or so)
RHEL5 32bit machine, production box #3
---
Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
uname -a:
Linux spam4 2.6.18-194.32.1.el5PAE #1 SMP Mon Dec 20 11:00:23 EST 2010
i686 i686 i386 GNU/Linux
gcc -v:
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
re2c -v:
re2c 0.13.2
spamassassin -V:
SpamAssassin version 3.3.1
running on Perl version 5.8.8
(from spamassassin.apache.org)
cat /etc/redhat-release:
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
(behind a bit)
New RHEL6 64bit machine, virtual (vmware)
---
CPU: model name : Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
uname -a:
Linux rhel6-x64 2.6.32-71.18.2.el6.x86_64 #1 SMP Wed Mar 2 14:17:40 EST
2011 x86_64 x86_64 x86_64 GNU/Linux
gcc -v
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
--enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic
--with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC)
re2c -v:
re2c 0.13.5
spamassassin -V:
SpamAssassin version 3.3.1
running on Perl version 5.10.1
(redhat packaged)
Latest patches from RedHat as of today.
New RHEL6 32bit machine, virtual (vmware)
----
CPU reported: model name : Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
uname -a:
Linux rhel6-32 2.6.32-71.18.2.el6.i686 #1 SMP Wed Mar 2 14:38:52 EST
2011 i686 i686 i386 GNU/Linux
gcc -v:
Target: i686-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
--enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic
--with-arch=i686 --build=i686-redhat-linux
Thread model: posix
gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC)
re2c -v:
re2c 0.13.5
spamassassin -V:
SpamAssassin version 3.3.1
running on Perl version 5.10.1
(redhat packaged)
Latest patches from all programs from RedHat as of today
Debian Desktop (work desktop)
----
CPU:
model name : Intel(R) Pentium(R) 4 CPU 2.66GHz
uname -a:
Linux workDesktop 2.6.37-2-686 #1 SMP Sun Feb 27 10:51:32 UTC 2011 i686
GNU/Linux
gcc -v:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i486-linux-gnu/4.5.2/lto-wrapper
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.5.2-5'
--with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.5 --enable-shared --enable-multiarch
--enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib
--enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-plugin --enable-gold
--enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc
--enable-targets=all --with-arch-32=i586 --with-tune=generic
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu
--target=i486-linux-gnu
Thread model: posix
gcc version 4.5.2 (Debian 4.5.2-5)
spamassassin -V:
SpamAssassin version 3.3.1
running on Perl version 5.10.1
(from Debian packaging)
re2c -v:
re2c 0.13.5
(from Debian packaging)
Up to date as of.. last week; Debian Unstable.
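
For anyone gathering the same per-host details, here is a minimal Python sketch. It only wraps the commands shown in the listings above (uname -a, gcc -v, re2c -v, spamassassin -V); the function name, the labels, and the handling of missing tools are illustrative assumptions, not part of any SpamAssassin tooling.

```python
import shutil
import subprocess

# Commands copied from the host listings above; the labels are illustrative.
COMMANDS = {
    "kernel": ["uname", "-a"],
    "gcc": ["gcc", "-v"],
    "re2c": ["re2c", "-v"],
    "spamassassin": ["spamassassin", "-V"],
}


def collect_details(commands=COMMANDS):
    """Run each command; return its combined stdout/stderr, or None if not installed."""
    details = {}
    for label, argv in commands.items():
        if shutil.which(argv[0]) is None:
            details[label] = None  # tool is not on PATH on this host
            continue
        # gcc -v prints to stderr, so merge both streams into one string.
        result = subprocess.run(
            argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        details[label] = result.stdout.strip()
    return details


if __name__ == "__main__":
    for label, output in collect_details().items():
        print(f"{label}:")
        print(output if output is not None else "(not installed)")
        print()
```

The distribution release file (/etc/redhat-release on the RHEL hosts) would still need to be read separately.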
> Another might be to reproduce the issue, and get a minimal test-case.
I'm not super familiar with any advanced usage of SpamAssassin, so
apologies for the next bits.
> For that, can you reproduce the problem with trivial REs for the three
> __PILL_PRICE_x sub-rules?
>
Not quite sure what this means (sorry); does this mean trying to get a
simplified form of the __PILL_PRICE rules that still trigger the problem?
If so, here are a few crude manglings of __PILL_PRICE_3 that still cause
the loop on one of my test machines:
(using: http://pastebin.com/iGQ2RJ6v)
body __PILL_PRICE_3 /free\s(?:pill|cap(?:sule|let))s/i
body __PILL_PRICE_3 /free\s(?:pill|cap)s/i
body __PILL_PRICE_3 /free\spills/i
body __PILL_PRICE_3 /free\s/i
body __PILL_PRICE_3 /Free\s/
body __PILL_PRICE_3 /ree\s/
The following do *not* cause the problem, however:
body __PILL_PRICE_3 /ee\s/
body __PILL_PRICE_3 /free pills/i
body __PILL_PRICE_3 /free\ pills/i
body __PILL_PRICE_3 /\s/
body __PILL_PRICE_3 /s\s/
Playing around a bit, the following also causes the problem (using my
sample text from the pastebin)
body __PILL_PRICE_3 /shipping\s/i
body __PILL_PRICE_3 /ping\s/i
body __PILL_PRICE_3 /ing\s/i
body __PILL_PRICE_3 /quality\s/i
body __PILL_PRICE_3 /ity\s/i
Whereas the following do not:
body __PILL_PRICE_3 /ty\s/i
body __PILL_PRICE_3 /ng\s/i
I have no idea why, but it seems:
\s preceded by three or more characters and tflags multiple
regularly hits the problem for me.
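
The observation can be checked against the reported patterns with a small Python sketch. It does not reproduce the re2c problem; it only tabulates the patterns listed in this message and tests whether "at least three characters before the first \s" separates the looping ones from the others. The helper names and the threshold parameter are illustrative.

```python
# Patterns and outcomes as reported in this message (True = caused the loop).
REPORTED = {
    r"free\s(?:pill|cap(?:sule|let))s": True,
    r"free\s(?:pill|cap)s": True,
    r"free\spills": True,
    r"free\s": True,
    r"Free\s": True,
    r"ree\s": True,
    r"ee\s": False,
    r"free pills": False,
    r"free\ pills": False,
    r"\s": False,
    r"s\s": False,
    r"shipping\s": True,
    r"ping\s": True,
    r"ing\s": True,
    r"quality\s": True,
    r"ity\s": True,
    r"ty\s": False,
    r"ng\s": False,
}


def literal_prefix_len(pattern):
    # Number of characters before the first "\s", or None if there is no "\s".
    idx = pattern.find(r"\s")
    return None if idx < 0 else idx


def predicted_to_loop(pattern, threshold=3):
    # Heuristic from the message: "\s" preceded by three or more characters.
    n = literal_prefix_len(pattern)
    return n is not None and n >= threshold


if __name__ == "__main__":
    for pattern, looped in REPORTED.items():
        verdict = "agrees" if predicted_to_loop(pattern) == looped else "DIFFERS"
        print(f"{pattern!r:40} reported={looped!s:5} {verdict}")
```

On this data the heuristic agrees with every reported outcome, which is consistent with the observation but says nothing about the underlying cause.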
> Can you reproduce the problem by keeping (a renamed copy of) the
> original sub-rules and tflags, using a simple meta rule?
Not quite sure what to do here either; if I disable the rules with the meta
trick and then make one that mirrors it (based on __PILL_PRICE_3), I get
the same behavior:
i.e.
meta __PILL_PRICE_1 (0)
meta __PILL_PRICE_3 (0)
meta __PILL_PRICE_2 (0)
body LOCAL_TEST /free\s(?:pill|tablet|cap(?:sule|let))s/i
tflags LOCAL_TEST multiple
Will still cause the problem, but of course hitting LOCAL_TEST as
opposed to the __PILL_PRICE rules.
I think I completely misunderstood the question though ;).
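
To try many candidate regexes with the same setup, the snippet above can be generated. This is a hedged Python sketch: the function name and defaults are made up for illustration, and it only emits the same four kinds of lines shown above (three meta overrides, one body rule, one tflags line). It assumes the regex contains no "/" since that is used as the delimiter.

```python
def build_test_cf(
    regex,
    rule_name="LOCAL_TEST",
    disabled=("__PILL_PRICE_1", "__PILL_PRICE_2", "__PILL_PRICE_3"),
):
    """Return rule text that disables the stock sub-rules and adds one test rule."""
    if "/" in regex:
        raise ValueError("regex must not contain '/', which is used as the delimiter")
    # Disable the original sub-rules with the meta (0) trick.
    lines = [f"meta {name} (0)" for name in disabled]
    # Add the candidate as a body rule that may hit multiple times.
    lines.append(f"body {rule_name} /{regex}/i")
    lines.append(f"tflags {rule_name} multiple")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_test_cf(r"free\s(?:pill|tablet|cap(?:sule|let))s"), end="")
```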
> Are two of them
> sufficient? Or even one?
I disabled __PILL_PRICE_1 and __PILL_PRICE_2 with meta (0) and can still get
the error with just __PILL_PRICE_3 being active.
And not sure if it helps, but I ran into a similar behavior w/ re2c a
long time ago:
http://mail-archives.apache.org/mod_mbox/spamassassin-users/200907.mbox/%3C4A4A4EC6.2000301@fastmail.net%3E
Like I said, it may just be noise but I figured it can't hurt to have
another data point.
Anyway, hope this all helps.
Matt
--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: [SA-dev] Fwd: Re: Reproducing Bug 6559
Posted by John Hardin <jh...@impsec.org>.
On Mon, 21 Mar 2011, Adam Katz wrote:
> I would want to try this, which should be a faster regex anyway:
Thanks for your suggestions, I'll take a look at them soonest.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Watch... Wallet... Gun... Knee... -- Denny Crane
-----------------------------------------------------------------------
8 days until the M1911 is 100 years old - and still going strong!
Re: [SA-dev] Fwd: Re: Reproducing Bug 6559
Posted by Adam Katz <an...@khopis.com>.
On 03/20/2011 08:44 PM, Karsten Bräckelmann forwarded a message from Matt Elson:
> I have no idea why, but it seems:
> \s preceded by three or more characters and tflags multiple
> regularly hits the problem for me.
I don't have much experience with non-production re2c; how do I properly
reproduce (and therefore test) this bug on svn trunk?
I would want to try this, which should be a faster regex anyway:
/free\s[ptc](?:ill|ablet|ap(?:sule|let))s/i
I also wanted to try a leading word-break ("\b") in front of the regex,
though I don't know how many spams that will skip.
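
A quick way to see what the factored form changes is to run both patterns over a few strings. This Python sketch assumes the factored regex is meant to have balanced parentheses, i.e. [ptc](?:ill|ablet|ap(?:sule|let))s, and uses Python's re module rather than Perl or re2c, so it says nothing about speed or about the bug itself. The sample strings are invented.

```python
import re

# __PILL_PRICE_3 as quoted in this thread, and the factored variant
# (parentheses balanced, which is an assumption about the intended regex).
ORIGINAL = re.compile(r"free\s(?:pill|tablet|cap(?:sule|let))s", re.I)
FACTORED = re.compile(r"free\s[ptc](?:ill|ablet|ap(?:sule|let))s", re.I)

SAMPLES = [
    "free pills",
    "free tablets",
    "free capsules",
    "free caplets",
    "free tills",    # not a pill word, but [ptc] + "ill" accepts it
    "free cablets",  # likewise [ptc] + "ablet"
]


def compare(samples=SAMPLES):
    """Map each sample to (original_hit, factored_hit)."""
    return {s: (bool(ORIGINAL.search(s)), bool(FACTORED.search(s))) for s in samples}


if __name__ == "__main__":
    for sample, (orig_hit, fact_hit) in compare().items():
        print(f"{sample!r:16} original={orig_hit!s:5} factored={fact_hit}")
```

The factored form hits everything the original does, plus cross-combinations such as "free tills", so it is slightly broader rather than strictly equivalent.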
While looking at the PILL_PRICE rules,
body __PILL_PRICE_1 m;\$?[\d\s.]{3,8}(?:/|per|each)\s?(?:pill|tablet|cap(?:sule|let));i
What is the point of leading with an optional piece? For an unanchored
hit/no-hit test, that regex is equivalent to this simpler one:
m;[\d\s.]{3}(?:/|per|each)\s?(?:pill|tablet|cap(?:sule|let));i
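
The reasoning is that an unanchored match can start anywhere, so an optional leading \$ never decides whether there is a hit, and any run of 3 to 8 characters ending right before "/", "per" or "each" also ends in a run of exactly 3. A small Python sketch to spot-check this on a few made-up strings (Python's re, not Perl or re2c; it compares hit/no-hit only, not the matched text):

```python
import re

# Shared tail of both variants, as quoted above.
TAIL = r"(?:/|per|each)\s?(?:pill|tablet|cap(?:sule|let))"

WITH_LEAD = re.compile(r"\$?[\d\s.]{3,8}" + TAIL, re.I)
SIMPLER = re.compile(r"[\d\s.]{3}" + TAIL, re.I)

# Invented sample strings, not taken from real spam.
SAMPLES = [
    "$1.50 per pill",
    "only 0.99/tablet",
    "12345678 each capsule",
    "99/pill",        # only two characters before "/"
    "2 per caplet",   # only two characters before "per"
    "free pills",
]


def compare(samples=SAMPLES):
    """Map each sample to (with_lead_hit, simpler_hit)."""
    return {s: (bool(WITH_LEAD.search(s)), bool(SIMPLER.search(s))) for s in samples}


if __name__ == "__main__":
    for sample, (a, b) in compare().items():
        print(f"{sample!r:26} with_lead={a!s:5} simpler={b}")
```

With tflags multiple the two could still differ in where each hit starts, which this check does not cover.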
Another point; what if we merge _1 and _3 from
_1 m;\$?[\d\s.]{3,8}(?:/|per|each)\s?(?:pill|tablet|cap(?:sule|let));i
_2 /(?:pill|tablet|cap(?:sule|let))s\s\$?[\d\s.]{3,8}/i
_3 /free\s(?:pill|tablet|cap(?:sule|let))s/i
into (note removal of _1's optional lead)
m;(?:[\d\s.]{3}(?:/|per|each)|free)\s?(?:pill|tablet|cap(?:sule|let));i
Matt already showed that disabling _1 and _2 didn't prevent the problem
with _3, so this isn't as much of a potential remedy as it initially
seems, but it should be slightly more efficient and might avoid the re2c
bug.
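
Before swapping in the merged rule it may be worth checking how its hits compare with _1 and _3 combined. A rough Python sketch with invented sample strings follows (Python's re only, so it shows matching behaviour, not re2c behaviour):

```python
import re

# _1 and _3 as quoted above, plus the proposed merged pattern.
P1 = re.compile(r"\$?[\d\s.]{3,8}(?:/|per|each)\s?(?:pill|tablet|cap(?:sule|let))", re.I)
P3 = re.compile(r"free\s(?:pill|tablet|cap(?:sule|let))s", re.I)
MERGED = re.compile(
    r"(?:[\d\s.]{3}(?:/|per|each)|free)\s?(?:pill|tablet|cap(?:sule|let))", re.I
)


def hits(text):
    """Return (hit by _1 or _3, hit by merged pattern)."""
    either = bool(P1.search(text) or P3.search(text))
    return either, bool(MERGED.search(text))


if __name__ == "__main__":
    for text in ["free pills", "$1.50 per pill", "free pill", "freetablets", "cheap pills"]:
        print(f"{text!r:18} {hits(text)}")
```

On these samples the merged pattern hits everything _1 or _3 hits, and additionally "free pill" and "freetablets", because the whitespace becomes optional and the trailing "s" of _3 is dropped.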
Re: Fwd: Re: Reproducing Bug 6559
Posted by Karsten Bräckelmann <gu...@rudersport.de>.
On Mon, 2011-03-21 at 04:44 +0100, Karsten Bräckelmann wrote:
> > > Matt, since you sent this off-list to me, is it OK to share the info
> > > with the (public) dev list, also?
> >
> > Sure; I'm just feeling a little embarrassed/paranoid about the fact my
> > initial email with debugging info triggered the issue on other people's
> > systems - just wasn't thinking too clearly when I sent it out.
> >
> > So assuming it doesn't destroy systems, go ahead ;).
>
> Don't worry. One of the original reports a few hours ago already sported
> the patterns that trigger the issue. We'll be fine. ;)
Oh, wait, now I realize -- you were actually speaking about exactly
that, and your report, which was the very first! :-D
Anyway, don't worry -- it's either that, or the next male enhancement
spam coming in to trigger the bug. And I still advise not to scan SA
list mail.