Posted to j-users@xerces.apache.org by "Hollenbeck, Scott" <sh...@verisign.com> on 2001/11/16 14:24:39 UTC

Pattern Error or Xerces-J Bug?

I have a simpleType declaration that looks like this:

  <simpleType name="fooType">
    <restriction base="token">
      <pattern value="[a-zA-Z0-9_-]{4,24}"/>
    </restriction>
  </simpleType>

I want to accept values containing a minimum of 4 and a maximum of 24
characters in the set {a-z|A-Z|0-9|_|-}.  Xerces-J 1.4.3 and 1.4.4 do not
reject element values that contain more than 24 of the allowable characters.
They do reject elements that contain fewer than 4 characters, though.  This
element gets rejected:

<foo>abc</foo>
(error:  "does not match regular expression facet '[a-zA-Z0-9_-]{4,24}'")

This one doesn't:

<foo>abc-123_AB1111111111111111111</foo>

Is the error in my pattern, or is Xerces-J doing something wrong?  I can
make this work by putting a maxLength facet inside the <restriction>, but I
don't see why that should be necessary.
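
As a cross-check of the intended behaviour, here is a minimal sketch that runs the same expression through java.util.regex rather than through Xerces-J. It assumes that Matcher.matches(), which requires the whole input to match, is a fair stand-in for the implicit anchoring of a schema pattern facet. The class and method names are made up for illustration, and matchesWithMaxLength only imitates the maxLength workaround; it is not how Xerces-J evaluates facets.

```java
import java.util.regex.Pattern;

final class FooTypeCheck {
    // Same expression as the pattern facet in the schema above.
    static final Pattern FOO = Pattern.compile("[a-zA-Z0-9_-]{4,24}");

    // matches() succeeds only if the entire value matches the expression.
    static boolean matchesPattern(String value) {
        return FOO.matcher(value).matches();
    }

    // Imitates the maxLength workaround: an explicit length cap plus the pattern.
    static boolean matchesWithMaxLength(String value, int maxLength) {
        return value.length() <= maxLength && matchesPattern(value);
    }

    public static void main(String[] args) {
        String[] samples = {
            "abc",                        // 3 characters
            "abcd",                       // 4 characters
            "abc-123_AB",                 // 10 characters
            "abcdefghijklmnopqrstuvwx",   // 24 characters
            "abcdefghijklmnopqrstuvwxy"   // 25 characters
        };
        for (String s : samples) {
            System.out.println(s.length() + " chars, matches: " + matchesPattern(s));
        }
    }
}
```

If the {4,24} bound is honoured, only the 4, 10 and 24 character samples should be reported as matching, with or without the extra length cap.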

FWIW Xerces-J 2.0.0 beta 3 catches the error:

"[Error] test.xml:15:83: cvc-type.3.1.3: The value
'abc-123_AB1111111111111111111' of element 'foo' is not valid."

-Scott- 

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Pattern Error or Xerces-J Bug?

Posted by Akikatsu Nakagita <na...@ssgw.ss.ntts.co.jp>.
Hello,
I am an English beginner.

A patch for this bug is attached.

diff -c 
"c:/nakagita/download/java/Xerces/xerces-1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java" 
"c:/java/xerces/1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java"
*** c:/nakagita/download/java/Xerces/xerces-1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java	Mon Nov 19 10:52:59 2001
--- c:/java/xerces/1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java	Mon Nov 12 17:22:57 2001
***************
*** 669,674 ****
--- 669,682 ----
           //

           else if (ch == ',') {
+           // add
+           if (off < this.regexlen) {
+             ch = this.regex.charAt(off++);
+             if (ch != '}' && (ch < '0' || ch > '9'))
+                                 // REVISIT: This should be ex()
+               throw new RuntimeException("Invalid quantifier");
+           }
+           // add
             if (ch == '}') {
               max = -1;           // {min,}
             } else {

Diff finished at Mon Nov 19 10:54:10

reference at http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=532
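
To illustrate the idea behind the patch (read the character that follows the comma before deciding between the {min,} and {min,max} forms), here is a small standalone sketch of a bounds parser. It is not the Xerces RegexParser code; the class QuantifierBounds and its parse method are hypothetical names used only for this example, and it handles just the three forms {n}, {n,} and {n,m}.

```java
final class QuantifierBounds {
    final int min;
    final int max; // -1 means no upper bound, as in the {min,} case

    private QuantifierBounds(int min, int max) {
        this.min = min;
        this.max = max;
    }

    private static boolean isDigit(char ch) {
        return ch >= '0' && ch <= '9';
    }

    private static IllegalArgumentException invalid(String q) {
        return new IllegalArgumentException("Invalid quantifier: " + q);
    }

    // Parses "{n}", "{n,}" or "{n,m}" and rejects everything else.
    static QuantifierBounds parse(String q) {
        int len = q.length();
        if (len == 0 || q.charAt(0) != '{') throw invalid(q);

        int off = 1;
        int start = off;
        while (off < len && isDigit(q.charAt(off))) off++;
        if (off == start || off >= len) throw invalid(q); // no digits, or input ended
        int min = Integer.parseInt(q.substring(start, off));
        int max = min; // plain {n}

        char ch = q.charAt(off++);
        if (ch == ',') {
            if (off >= len) throw invalid(q);
            // Look at the character after the comma before choosing a form.
            if (q.charAt(off) == '}') {
                max = -1; // {n,}
                off++;
            } else {
                start = off;
                while (off < len && isDigit(q.charAt(off))) off++;
                if (off == start || off >= len || q.charAt(off) != '}') throw invalid(q);
                max = Integer.parseInt(q.substring(start, off)); // {n,m}
                off++;
                if (max < min) throw invalid(q);
            }
        } else if (ch != '}') {
            throw invalid(q);
        }
        if (off != len) throw invalid(q); // trailing characters
        return new QuantifierBounds(min, max);
    }

    public static void main(String[] args) {
        QuantifierBounds b = parse("{4,24}");
        System.out.println("min=" + b.min + " max=" + b.max);
    }
}
```

With this sketch, parse("{4,24}") yields min 4 and max 24, parse("{4,}") yields max -1, and a malformed quantifier such as "{4,x}" is rejected with an exception.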






Re: Pattern Error or Xerces-J Bug?

Posted by Judd Wilcox <jw...@lucent.com>.
Scott,

I experimented with {a,b} counting in regular expressions some time ago in
1.4.3 and convinced myself that it wasn't implemented correctly, even though
its use is documented. I never did follow up to see if there's a bug report
on it, though.

Judd Wilcox
Lucent Technologies






Re: Pattern Error or Xerces-J Bug?

Posted by Klaus Malorny <Kl...@knipp.de>.
Hi Scott,

I haven't tried it out, but maybe Xerces-J 1.4.3 interprets the last dash as
the dash of a range, like the ones before it, and gets a bit confused. Perhaps
try quoting it.

[a-zA-Z0-9_-]{4,24}
           ^
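
One way to try the quoting suggestion outside Xerces-J is sketched below. It compares the expression with the trailing dash left bare against a variant where the dash is escaped as \-, using java.util.regex as a stand-in engine. XML Schema regular expressions also accept \- as a single-character escape, but whether Xerces-J 1.4.3 treats the two spellings differently is exactly what is untested here. The class name DashQuoting is made up for the example.

```java
import java.util.regex.Pattern;

final class DashQuoting {
    // Trailing dash left bare, as in the original pattern facet.
    static final Pattern UNQUOTED = Pattern.compile("[a-zA-Z0-9_-]{4,24}");
    // Same character class with the dash escaped ("\\-" in Java source is \- in the regex).
    static final Pattern QUOTED = Pattern.compile("[a-zA-Z0-9_\\-]{4,24}");

    // True if both spellings accept or reject the value identically.
    static boolean sameResult(String value) {
        return UNQUOTED.matcher(value).matches() == QUOTED.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"abc", "a-b_", "abc-123_AB", "ab cd", "abcdefghijklmnopqrstuvwxy"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" same result: " + sameResult(s));
        }
    }
}
```

In java.util.regex both spellings describe the same character set, so any difference seen under Xerces-J would point at the parser rather than at the pattern.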

regards,

Klaus Malorny


___________________________________________________________________________
      |       |
      | knipp |                   Knipp  Medien und Kommunikation GmbH
       -------                           Technologiepark
                                         Martin-Schmeißer-Weg 9
      Dipl. Inf. Klaus Malorny           44227 Dortmund
      Klaus.Malorny@knipp.de             Tel. +49 231 9703 0



