Posted to c-users@xerces.apache.org by Russell Gulli <ru...@nortel.com> on 2007/05/11 23:52:38 UTC

SAX parser causing stack overflow on Solaris

Hi All,

Our interpreter receives a page with an attribute that has a very long
value (actually the value is made up of thousands of small words).
Parsing this page results in a stack overflow.

Would anyone know if there is a patch available (on Solaris) that would
guard against stack overflow while processing such a page?
We are currently using: Xerces-C++ Version  2.3.0

The following is a sample of the stack trace:

  [1] 0x168c94(0x16d368, 0x137238, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0x168c93
  [2] 0x168c90(0x0, 0x3fd00088, 0xbdd00000, 0x3fd00088, 0xb, 0xfda00088), at 0x168c8f
  [3] xercesc_2_3::RangeToken::match(0x282de8, 0x20, 0xff400224, 0x20, 0x1, 0xdc00), at 0xfe326484
  [4] xercesc_2_3::RegularExpression::matchRange(0x5adaf0, 0x27ef18, 0x31caf0, 0xff4002d8, 0x1, 0x0), at 0xfe335c98
  [5] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x31caf0, 0xc, 0x1, 0xfe309e1c), at 0xfe334aa0
  [6] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [7] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [8] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [9] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [10] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [11] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x48f6c0, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
...
<snip>
...
  [87320] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x27ed18, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [87321] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x27ed18, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [87322] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x27ed18, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [87323] xercesc_2_3::RegularExpression::match(0x5adaf0, 0x27ef18, 0x27ed18, 0x1c, 0x1, 0xffffffff), at 0xfe334dac
  [87324] xercesc_2_3::RegularExpression::matches(0x5adaf0, 0x56ce00, 0x200, 0xffbfeb30, 0x0, 0x0), at 0xfe331740
  [87325] xercesc_2_3::AbstractStringValidator::checkContent(0x294470, 0x56ce00, 0x0, 0xfe1bde3c, 0xfe4c992c, 0x0), at 0xfe1be06c
  [87326] xercesc_2_3::SchemaValidator::validateAttrValue(0x20e160, 0x479100, 0x56ce00, 0x0, 0x3f8f30, 0x4), at 0xfe37de28
  [87327] xercesc_2_3::IGXMLScanner::buildAttList(0x209360, 0x479100, 0x465038, 0x3f8f30, 0x56ce00, 0xfe52c640), at 0xfe2d98a8
  [87328] xercesc_2_3::IGXMLScanner::scanStartTagNS(0xc, 0x3f8f30, 0x2, 0x5, 0x20b068, 0xfe4c992c), at 0xfe2d2e28
  [87329] xercesc_2_3::IGXMLScanner::scanContent(0x209360, 0x0, 0x3c0000, 0xfe4f6730, 0xfe2e2e40, 0x1), at 0xfe2cb10c
=>[87330] xercesc_2_3::IGXMLScanner::scanDocument(0x209360, 0x0, 0x3feffc0b, 0xfd80fc0b, 0x1, 0xfe4f6730), at 0xfe2c8b38
  [87331] xercesc_2_3::SAX2XMLReaderImpl::parse(0x208490, 0xffbff0dc, 0x19b0d, 0x2accf8, 0x0, 0xfe3518e8), at 0xfe349118

Thanks in advance.

Regards,
Russ


Re: SAX parser causing stack overflow on Solaris

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Boris Kolpackov <bo...@codesynthesis.com> wrote on 05/21/2007 03:35:21 PM:

> Michael Glavassevich <mr...@ca.ibm.com> writes:
> 
> > Looks like the same problem [1] Xerces-J has where the number of recursive
> > calls to match() is proportional to the length of the input. If the string
> > is too long the stack overflows.
> >
> > [1] https://issues.apache.org/jira/browse/XERCESJ-589
> 
> Wow, that bug was opened in Jan 2003 and it is still not fixed. Anyway,
> I've created a Xerces-C++ equivalent and captured all the information
> there:
> 
> https://issues.apache.org/jira/browse/XERCESC-1708

The regex bug reports (at least in Xerces-J) tend to sit around for years 
because none of the current developers know this part of the code very 
well. Not that many of the former developers did either. I'd say this one 
is most likely to get fixed in the foreseeable future if someone from the 
community contributes a patch with some unit tests to help verify it.

> thanks,
> -boris
> 
> 
> -- 
> Boris Kolpackov
> Code Synthesis Tools CC
> http://www.codesynthesis.com
> Open-Source, Cross-Platform C++ XML Data Binding

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

Re: SAX parser causing stack overflow on Solaris

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Michael Glavassevich <mr...@ca.ibm.com> writes:

> Looks like the same problem [1] Xerces-J has where the number of recursive
> calls to match() is proportional to the length of the input. If the string
> is too long the stack overflows.
>
> [1] https://issues.apache.org/jira/browse/XERCESJ-589

Wow, that bug was opened in Jan 2003 and it is still not fixed. Anyway,
I've created a Xerces-C++ equivalent and captured all the information
there:

https://issues.apache.org/jira/browse/XERCESC-1708

thanks,
-boris


-- 
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding


Re: SAX parser causing stack overflow on Solaris

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Looks like the same problem [1] Xerces-J has where the number of recursive 
calls to match() is proportional to the length of the input. If the string 
is too long the stack overflows.

[1] https://issues.apache.org/jira/browse/XERCESJ-589
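
To make the failure mode above concrete, here is a minimal sketch. It is not the Xerces code: the character class, the function names, and the depth counter are all hypothetical, chosen only to show a matcher that makes one recursive call per input character next to a loop-based one that accepts the same inputs.

```cpp
#include <cstddef>
#include <string>

// Hypothetical character class standing in for a schema pattern like [a-z ]*.
static bool inClass(char c) {
    return (c >= 'a' && c <= 'z') || c == ' ';
}

// Recursive form: every consumed character adds one call, so the call
// depth grows linearly with the input length. 'depth' counts the calls.
bool matchRecursive(const std::string& s, std::size_t pos, std::size_t& depth) {
    ++depth;
    if (pos == s.size()) return true;       // consumed the whole input
    if (!inClass(s[pos])) return false;     // character outside the class
    return matchRecursive(s, pos + 1, depth);
}

// Iterative form: same result, constant stack usage.
bool matchIterative(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!inClass(s[i])) return false;
    }
    return true;
}
```

For an input of n matching characters the recursive form makes n + 1 calls, which is the pattern visible in the roughly 87,000 match() frames of the trace in the original post. A real backtracking regex engine cannot always be flattened into a loop this easily, so this only illustrates why depth tracks input length.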

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

news <ne...@sea.gmane.org> wrote on 05/18/2007 02:03:50 PM:

> Hi Russell,
> 
> "Russell Gulli" <ru...@nortel.com> writes:
> 
> > Thanks for the quick reply.  Xerces 2.7.0 exhibits the same behavior.
> >
> > One interesting point is that the Windows version of Xerces-C++ Version
> > 2.3.0 does not crash when given the same document.  It throws a parse
> > exception instead of overflowing the stack (Note: same schema for
> > Solaris and Windows)?
> 
> Hm, that's strange. Can you provide a test case and file a bug report
> (with the information above) so that this won't get missed:
> 
> https://issues.apache.org/jira/secure/Dashboard.jspa
> 
> Thanks!
> 
> -boris
> -- 
> Boris Kolpackov
> Code Synthesis Tools CC
> http://www.codesynthesis.com
> Open-Source, Cross-Platform C++ XML Data Binding


Re: SAX parser causing stack overflow on Solaris

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi Russell,

"Russell Gulli" <ru...@nortel.com> writes:

> Thanks for the quick reply.  Xerces 2.7.0 exhibits the same behavior.
>
> One interesting point is that the Windows version of Xerces-C++ Version
> 2.3.0 does not crash when given the same document.  It throws a parse
> exception instead of overflowing the stack (Note: same schema for
> Solaris and Windows)?

Hm, that's strange. Can you provide a test case and file a bug report
(with the information above) so that this won't get missed:

https://issues.apache.org/jira/secure/Dashboard.jspa

Thanks!

-boris
-- 
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding


RE: Re: SAX parser causing stack overflow on Solaris

Posted by Russell Gulli <ru...@nortel.com>.
Hi Boris,

Thanks for the quick reply.  Xerces 2.7.0 exhibits the same behavior.

One interesting point is that the Windows version of Xerces-C++ Version
2.3.0 does not crash when given the same document.  It throws a parse
exception instead of overflowing the stack (Note: same schema for
Solaris and Windows)?

Changing the schema should work; however, the schema is the standard
one for interpreting VoiceXML pages.

Thanks,

- - Russ

-----Original Message-----
From: news [mailto:news@sea.gmane.org] On Behalf Of Boris Kolpackov
Sent: Monday, May 14, 2007 3:15 PM
To: c-users@xerces.apache.org
Subject: Re: SAX parser causing stack overflow on Solaris


Hi Russell,

"Russell Gulli" <ru...@nortel.com> writes:

> Would anyone know if there is a patch available (on Solaris) that 
> would guard against stack overflow while processing such a page? We 
> are currently using: Xerces-C++ Version  2.3.0

The stack appears to overflow in the regular expression validation code.
Two things you may want to try are to upgrade to version 2.7.0 or
change/remove the regex pattern in your schema.

hth,
-boris

-- 
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding


Re: SAX parser causing stack overflow on Solaris

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi Russell,

"Russell Gulli" <ru...@nortel.com> writes:

> Would anyone know if there is a patch available (on Solaris) that would
> guard against stack overflow while processing such a page?
> We are currently using: Xerces-C++ Version  2.3.0

The stack appears to overflow in the regular expression validation
code. Two things you may want to try are to upgrade to version 2.7.0
or change/remove the regex pattern in your schema.
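
Neither suggestion above is a patch. As an illustration of what a guard of the kind asked about could look like, here is a toy recursive matcher that stops with an exception once it passes a fixed depth. This is a sketch under stated assumptions, not Xerces code: kMaxDepth, isWordChar, and guardedMatch are hypothetical names, and a real limit would have to be derived from the thread's stack size and the matcher's frame size.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical limit on recursion depth (an assumption, not a Xerces setting).
const std::size_t kMaxDepth = 10000;

// Hypothetical character class standing in for a schema pattern like [a-z ]*.
static bool isWordChar(char c) {
    return (c >= 'a' && c <= 'z') || c == ' ';
}

// Toy recursive matcher that refuses to recurse past kMaxDepth and reports
// the problem as an exception instead of running off the end of the stack.
bool guardedMatch(const std::string& s, std::size_t pos = 0, std::size_t depth = 0) {
    if (depth > kMaxDepth) {
        throw std::length_error("input too long for recursive matcher");
    }
    if (pos == s.size()) return true;       // consumed the whole input
    if (!isWordChar(s[pos])) return false;  // character outside the class
    return guardedMatch(s, pos + 1, depth + 1);
}
```

A caller would catch the exception and report the value as unvalidatable. The trade-off is that long but legitimate values are rejected, so the limit only turns a crash into an error and does not remove the underlying recursion.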

hth,
-boris

-- 
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding