Posted to c-dev@xerces.apache.org by "Alberto Massari (JIRA)" <xe...@xml.apache.org> on 2008/07/14 09:49:33 UTC

[jira] Issue Comment Edited: (XERCESC-1816) Multi-character escape classes don't work correctly in regular expressions

    [ https://issues.apache.org/jira/browse/XERCESC-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613253#action_12613253 ] 

amassari edited comment on XERCESC-1816 at 7/14/08 12:48 AM:
--------------------------------------------------------------------

From what I can see, \c, \C, \i and \I are implemented by the Xerces-C regex parser, but only when you specify that the regex should follow the syntax defined by the XML Schema spec (implemented in ParserForXMLSchema.cpp). The code listed here uses the "normal" regex syntax, which I guess doesn't define these XML-specific features.

Clearly, it shouldn't hang....
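
To make the mode distinction concrete, here is a minimal sketch of an escape handler that branches on the regex mode. All names in it (RegexMode, EscapeKind, Escape, parseEscape) are hypothetical and not part of the Xerces-C API. The reading of \c as a Perl-style control-character escape in "normal" mode is my assumption, inferred from the error message quoted in the report.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical types for illustration only; not the Xerces-C API.
enum class RegexMode { Normal, XmlSchema };

enum class EscapeKind {
    NameStartChar,     // \i in XML Schema mode
    NotNameStartChar,  // \I in XML Schema mode
    NameChar,          // \c in XML Schema mode
    NotNameChar,       // \C in XML Schema mode
    ControlChar,       // \cX in normal mode (assumed Perl-style meaning)
    Literal            // any other escaped character
};

struct Escape {
    EscapeKind kind;
    char value;  // meaningful only for ControlChar and Literal
};

// Interprets the escape starting at pattern[pos] (which must be '\\') and
// advances pos past everything it consumed.
Escape parseEscape(const std::string& pattern, std::size_t& pos, RegexMode mode) {
    if (pos + 1 >= pattern.size())
        throw std::invalid_argument("dangling backslash");
    const char letter = pattern[pos + 1];
    pos += 2;  // always consume input, so a caller looping on pos cannot hang

    if (mode == RegexMode::XmlSchema) {
        switch (letter) {
            case 'i': return {EscapeKind::NameStartChar, '\0'};
            case 'I': return {EscapeKind::NotNameStartChar, '\0'};
            case 'c': return {EscapeKind::NameChar, '\0'};
            case 'C': return {EscapeKind::NotNameChar, '\0'};
            default: break;
        }
    } else if (letter == 'c') {
        // Assumed control escape: \c@ .. \c_ map to code points 0x00 .. 0x1f.
        if (pos >= pattern.size() || pattern[pos] < 0x40 || pattern[pos] > 0x5f)
            throw std::invalid_argument(
                "A character in U+0040-U+005f must follow '\\c'.");
        const char control = static_cast<char>(pattern[pos] - 0x40);
        ++pos;
        return {EscapeKind::ControlChar, control};
    }
    return {EscapeKind::Literal, letter};
}
```

Under these assumptions, "\c" on its own is an error in normal mode and "\i" degrades to a literal "i", which would match the symptoms in the report. The hang on "\I" and "\C" is the kind of failure that can happen when an escape branch returns without advancing the position.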

> Multi-character escape classes don't work correctly in regular expressions
> --------------------------------------------------------------------------
>
>                 Key: XERCESC-1816
>                 URL: https://issues.apache.org/jira/browse/XERCESC-1816
>             Project: Xerces-C++
>          Issue Type: Bug
>          Components: Validating Parser (XML Schema)
>    Affects Versions: 2.8.0, 3.0.0
>            Reporter: John Snelson
>
> The regular expressions "\i", "\I", "\c" and "\C" do not work as specified in the XML Schema specification:
> http://www.w3.org/TR/xmlschema-2/#nt-MultiCharEsc
> In fact, "\I" and "\C" cause an infinite loop during the parsing of the regular expression, "\i" seems to only match the letter "i", and "\c" gives the error:
> A character in U+0040-U+005f must follow '\c'.
> I'd be happy to attempt to fix this bug, but I need some guidance as to what the code for "\c" is actually meant to be doing.
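
For reference, the behaviour the XML Schema spec asks for can be sketched as plain character-class predicates. This is a simplified illustration restricted to ASCII: the real classes also cover the non-ASCII Letter, Digit, CombiningChar and Extender productions of XML 1.0. The function names are hypothetical and not taken from Xerces-C.

```cpp
#include <string>

// ASCII-only approximation of the XML "Letter" production.
bool isAsciiLetter(char ch) {
    return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

// \i : initial name characters (Letter | '_' | ':'), ASCII subset only.
bool isInitialNameChar(char ch) {
    return isAsciiLetter(ch) || ch == '_' || ch == ':';
}

// \c : name characters (NameChar), ASCII subset only.
bool isNameChar(char ch) {
    return isInitialNameChar(ch) || (ch >= '0' && ch <= '9') ||
           ch == '.' || ch == '-';
}

// \I and \C are simply the complements of \i and \c.
bool isNotInitialNameChar(char ch) { return !isInitialNameChar(ch); }
bool isNotNameChar(char ch) { return !isNameChar(ch); }

// Hand-written equivalent of matching a whole string against \i\c*.
bool matchesNamePattern(const std::string& s) {
    if (s.empty() || !isInitialNameChar(s[0]))
        return false;
    for (std::string::size_type k = 1; k < s.size(); ++k) {
        if (!isNameChar(s[k]))
            return false;
    }
    return true;
}
```

With these definitions, "xs:element" and "_a-b.c" satisfy \i\c*, while "1abc" does not, because a digit is a name character but not an initial name character.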
