Posted to c-dev@xerces.apache.org by "David Bertoni (JIRA)" <xe...@xml.apache.org> on 2005/03/29 23:42:16 UTC

[jira] Created: (XERCESC-1390) Regular expressions with unions do not work properly with replacing and tokenizing.

Regular expressions with unions do not work properly with replacing and tokenizing.
-----------------------------------------------------------------------------------

         Key: XERCESC-1390
         URL: http://issues.apache.org/jira/browse/XERCESC-1390
     Project: Xerces-C++
        Type: Bug
  Components: Utilities  
    Versions: 2.6.0    
    Reporter: David Bertoni
    Priority: Critical
 Attachments: patch.txt

Consider the following regular expression:

"(ab) | (a)"

with the following input string:

"abracadabra"

If you use an instance of the RegularExpression class to replace every matching substring with the empty string, the result should be the following string:

"rcdr"

Instead, just the last "a" in the string is replaced:

"abracadabr"
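
For illustration only, the expected replace behavior can be sketched with std::regex standing in for the Xerces-C++ RegularExpression class. This is not the Xerces-C++ API: the helper name replaceAll is hypothetical, and the pattern is written without the spaces shown above, on the assumption that they are not meant to be significant.

```cpp
#include <regex>
#include <string>

// Replace every match of `pattern` in `input` with `replacement`.
// Assumption: std::regex (ECMAScript grammar) is used as a stand-in
// for Xerces-C++'s RegularExpression; replaceAll is a hypothetical helper.
std::string replaceAll(const std::string& input,
                       const std::string& pattern,
                       const std::string& replacement)
{
    const std::regex re(pattern);
    // regex_replace substitutes all non-overlapping matches by default.
    return std::regex_replace(input, re, replacement);
}
```

Calling replaceAll("abracadabra", "(ab)|(a)", "") yields "rcdr", which is the result the report expects from RegularExpression.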

If you use the same RegularExpression instance to tokenize the input string, the result should be the following set of strings:

""
"r"
"c"
"d"
"r"
""

Instead, the result is

"abracadabr"
""
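
The expected tokenizing behavior can be sketched in the same hedged way: split the input at every match and keep the pieces in between, including empty leading and trailing pieces. Again, std::regex is only a stand-in for RegularExpression, the helper name tokenize is hypothetical, and the pattern is assumed to be equivalent to the one above without its spaces.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Split `input` at every match of `pattern`, keeping empty tokens.
// Assumption: std::regex stands in for Xerces-C++'s RegularExpression;
// tokenize is a hypothetical helper, not the library's own method.
std::vector<std::string> tokenize(const std::string& input,
                                  const std::string& pattern)
{
    const std::regex re(pattern);
    std::vector<std::string> tokens;
    std::size_t last = 0;  // index just past the previous match

    for (std::sregex_iterator it(input.begin(), input.end(), re), end;
         it != end; ++it) {
        const std::size_t start = static_cast<std::size_t>(it->position());
        // Text between the previous match and this one (may be empty).
        tokens.push_back(input.substr(last, start - last));
        last = start + static_cast<std::size_t>(it->length());
    }

    // Text after the final match (may be empty).
    tokens.push_back(input.substr(last));
    return tokens;
}
```

For the input "abracadabra" and the pattern "(ab)|(a)", this produces the six tokens listed above as the expected result.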

I will attach a proposed patch, but I don't know this code well, so it would be great if someone could review it.


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


[jira] Assigned: (XERCESC-1390) Regular expressions with unions do not work properly with replacing and tokenizing.

Posted by "David Bertoni (JIRA)" <xe...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XERCESC-1390?page=history ]

David Bertoni reassigned XERCESC-1390:
--------------------------------------

    Assign To: David Bertoni



[jira] Commented: (XERCESC-1390) Regular expressions with unions do not work properly with replacing and tokenizing.

Posted by "Gareth Reakes (JIRA)" <xe...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XERCESC-1390?page=comments#action_63035 ]
     
Gareth Reakes commented on XERCESC-1390:
----------------------------------------

I have added functionality to this code in the past, and the patch looks OK to me, but I don't think I am in a much better position than you, as I have not touched this particular piece of it.




[jira] Updated: (XERCESC-1390) Regular expressions with unions do not work properly with replacing and tokenizing.

Posted by "David Bertoni (JIRA)" <xe...@xml.apache.org>.
     [ http://issues.apache.org/jira/browse/XERCESC-1390?page=history ]

David Bertoni updated XERCESC-1390:
-----------------------------------

    Attachment: patch.txt

