Posted to j-users@xerces.apache.org by "Hollenbeck, Scott" <sh...@verisign.com> on 2001/11/16 14:24:39 UTC
Pattern Error or Xerces-J Bug?
I have a simpleType declaration that looks like this:
<simpleType name="fooType">
  <restriction base="token">
    <pattern value="[a-zA-Z0-9_-]{4,24}"/>
  </restriction>
</simpleType>
I want to accept values containing a minimum of 4 and a maximum of 24
characters in the set {a-z|A-Z|0-9|_|-}. Xerces-J 1.4.3 and 1.4.4 do not
reject element values that contain more than 24 of the allowable characters.
They do reject elements that contain fewer than 4 characters, though. This
element gets rejected:
<foo>abc</foo>
(error: "does not match regular expression facet '[a-zA-Z0-9_-]{4,24}'")
This one doesn't:
<foo>abc-123_AB1111111111111111111</foo>
Is the error in my pattern, or is Xerces-J doing something wrong? I can
make this work by putting a maxLength facet inside the <restriction>, but I
don't see why that should be necessary.
FWIW Xerces-J 2.0.0 beta 3 catches the error:
"[Error] test.xml:15:83: cvc-type.3.1.3: The value
'abc-123_AB1111111111111111111' of element 'foo' is not valid."
-Scott-
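As a cross-check of what the pattern should accept, here is a minimal Java sketch. It uses java.util.regex rather than the Xerces regex engine, which is an assumption: the two dialects differ in general, but for this particular expression java.util.regex treats the trailing '-' in the character class as a literal and {4,24} as a bounded repeat. The names FooTypeCheck and matchesFooType are hypothetical. Matcher.matches() requires the whole value to match, which mirrors how a pattern facet is applied.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Xerces. It checks a value against the
// fooType pattern with java.util.regex (an assumption; Xerces ships its
// own regex engine).
final class FooTypeCheck {
    private static final Pattern FOO = Pattern.compile("[a-zA-Z0-9_-]{4,24}");

    // matches() anchors at both ends, so the entire value must fit the pattern.
    static boolean matchesFooType(String value) {
        return FOO.matcher(value).matches();
    }

    public static void main(String[] args) {
        // The two element values from the post, plus one of valid length.
        String[] samples = { "abc", "abc-123_AB", "abc-123_AB1111111111111111111" };
        for (String s : samples) {
            System.out.println(s + " (" + s.length() + " chars): " + matchesFooType(s));
        }
    }
}
```

A conforming engine should reject any value longer than 24 characters without a separate maxLength facet, which is what Xerces-J 2.0.0 beta 3 reports above.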
---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org
Re: Pattern Error or Xerces-J Bug?
Posted by Akikatsu Nakagita <na...@ssgw.ss.ntts.co.jp>.
Hello,
I am a beginner in English.
Attached is a patch for this bug.
diff -c
"c:/nakagita/download/java/Xerces/xerces-1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java"
"c:/java/xerces/1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java"
*** c:/nakagita/download/java/Xerces/xerces-1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java Mon Nov 19 10:52:59 2001
--- c:/java/xerces/1_4_3/src/org/apache/xerces/utils/regex/RegexParser.java Mon Nov 12 17:22:57 2001
***************
*** 669,674 ****
--- 669,682 ----
              //
          else if (ch == ',') {
+             // add
+             if (off < this.regexlen) {
+                 ch = this.regex.charAt(off++);
+                 if (ch != '}' && (ch < '0' || ch > '9'))
+                     // REVISIT: This should be ex()
+                     throw new RuntimeException("Invalid quantifier");
+             }
+             // add
              if (ch == '}') {
                  max = -1; // {min,}
              } else {
Diff finished at Mon Nov 19 10:54:10
Reference: http://nagoya.apache.org/bugzilla/showattachment.cgi?attach_id=532
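To make the intent of the patch easier to follow, here is a standalone sketch of parsing a {min}, {min,} or {min,max} quantifier and rejecting anything else after the comma. It is not the Xerces RegexParser code: the class QuantifierSketch, the methods parseBounds and parseNumber, the use of IllegalArgumentException, and the rejection of reversed bounds are all assumptions made for illustration. Only the convention that max = -1 means "no upper bound" is taken from the patch context.

```java
// Hypothetical standalone sketch, not the Xerces RegexParser.
final class QuantifierSketch {

    // Parses "{min}", "{min,}" or "{min,max}" and returns {min, max}.
    // max == -1 means "no upper bound", as in the patch context.
    static int[] parseBounds(String q) {
        if (q.length() < 3 || q.charAt(0) != '{' || q.charAt(q.length() - 1) != '}') {
            throw new IllegalArgumentException("Invalid quantifier: " + q);
        }
        String body = q.substring(1, q.length() - 1);
        int comma = body.indexOf(',');
        if (comma < 0) {
            int n = parseNumber(body, q);
            return new int[] { n, n };              // {min}
        }
        int min = parseNumber(body.substring(0, comma), q);
        String rest = body.substring(comma + 1);
        if (rest.isEmpty()) {
            return new int[] { min, -1 };           // {min,}
        }
        int max = parseNumber(rest, q);             // {min,max}
        if (max < min) {
            // Assumption: reversed bounds are treated as an error.
            throw new IllegalArgumentException("Invalid quantifier: " + q);
        }
        return new int[] { min, max };
    }

    // Accepts only a non-empty run of ASCII digits; overflow is not handled.
    private static int parseNumber(String digits, String q) {
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("Invalid quantifier: " + q);
        }
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char ch = digits.charAt(i);
            if (ch < '0' || ch > '9') {
                throw new IllegalArgumentException("Invalid quantifier: " + q);
            }
            value = value * 10 + (ch - '0');
        }
        return value;
    }
}
```

With this sketch, parseBounds("{4,24}") yields the pair 4 and 24, so the upper bound is kept rather than dropped, and a malformed form such as "{4,x}" is rejected instead of being silently accepted.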
Re: Pattern Error or Xerces-J Bug?
Posted by Judd Wilcox <jw...@lucent.com>.
Scott,
I experimented with {a,b} counting in regular expressions some time ago in
1.4.3 and convinced myself it wasn't implemented correctly, even though its
use is documented. I never followed up to see whether there's a bug report
on it, though.
Judd Wilcox
Lucent Technologies
Re: Pattern Error or Xerces-J Bug?
Posted by Klaus Malorny <Kl...@knipp.de>.
Hi Scott,
I haven't tried it out, but maybe Xerces-J 1.4.3 interprets the last dash as
the dash of a range, like the ones before it, and gets a bit confused. Perhaps
try escaping it.
[a-zA-Z0-9_-]{4,24}
           ^
regards,
Klaus Malorny
___________________________________________________________________________
| |
| knipp | Knipp Medien und Kommunikation GmbH
------- Technologiepark
Martin-Schmeißer-Weg 9
Dipl. Inf. Klaus Malorny 44227 Dortmund
Klaus.Malorny@knipp.de Tel. +49 231 9703 0
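The trailing-dash suggestion can be checked outside Xerces with a short Java sketch. It uses java.util.regex, which is an assumption since Xerces has its own engine; there, a '-' placed last in a character class is a literal, and "\\-" in a Java string literal is the escaped form. The names DashCheck and sameVerdict are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical comparison of the unescaped and escaped trailing dash,
// using java.util.regex (an assumption; not the Xerces regex engine).
final class DashCheck {
    static final Pattern TRAILING = Pattern.compile("[a-zA-Z0-9_-]{4,24}");
    static final Pattern ESCAPED = Pattern.compile("[a-zA-Z0-9_\\-]{4,24}");

    // True when both spellings of the pattern agree on the given value.
    static boolean sameVerdict(String value) {
        return TRAILING.matcher(value).matches() == ESCAPED.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = { "abc", "a-b-", "----", "abc-123_AB" };
        for (String s : samples) {
            System.out.println(s + ": same verdict = " + sameVerdict(s));
        }
    }
}
```

In java.util.regex the two spellings are equivalent. Whether Xerces-J 1.4.3 treats them alike would have to be tested against that release.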