Posted to cvs@httpd.apache.org by pg...@apache.org on 2007/11/26 18:04:37 UTC
svn commit: r598343 [22/22] - in /httpd/httpd/vendor/pcre/current: ./ doc/ doc/html/ testdata/
Modified: httpd/httpd/vendor/pcre/current/testdata/testoutput3
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/testdata/testoutput3?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/testdata/testoutput3 (original)
+++ httpd/httpd/vendor/pcre/current/testdata/testoutput3 Mon Nov 26 09:04:19 2007
@@ -1,3 +1,5 @@
+PCRE version 5.0 13-Sep-2004
+
/^[\w]+/
*** Failers
No match
@@ -93,8 +95,8 @@
No need char
Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
- ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â
- ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ
+ µ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä
+ å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ
/^[\xc8-\xc9]/iLfr_FR
École
@@ -109,55 +111,5 @@
No match
école
No match
-
-/\W+/Lfr_FR
- >>>\xaa<<<
- 0: >>>
- >>>\xba<<<
- 0: >>>
-
-/[\W]+/Lfr_FR
- >>>\xaa<<<
- 0: >>>
- >>>\xba<<<
- 0: >>>
-
-/[^[:alpha:]]+/Lfr_FR
- >>>\xaa<<<
- 0: >>>
- >>>\xba<<<
- 0: >>>
-
-/\w+/Lfr_FR
- >>>\xaa<<<
- 0: ª
- >>>\xba<<<
- 0: º
-
-/[\w]+/Lfr_FR
- >>>\xaa<<<
- 0: ª
- >>>\xba<<<
- 0: º
-
-/[[:alpha:]]+/Lfr_FR
- >>>\xaa<<<
- 0: ª
- >>>\xba<<<
- 0: º
-
-/[[:alpha:]][[:lower:]][[:upper:]]/DZLfr_FR
-------------------------------------------------------------------
- Bra
- [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
- [a-z\xb5\xdf-\xf6\xf8-\xff]
- [A-Z\xc0-\xd6\xd8-\xde]
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-No options
-No first char
-No need char
/ End of testinput3 /
Modified: httpd/httpd/vendor/pcre/current/testdata/testoutput4
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/testdata/testoutput4?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/testdata/testoutput4 (original)
+++ httpd/httpd/vendor/pcre/current/testdata/testoutput4 Mon Nov 26 09:04:19 2007
@@ -1,3 +1,5 @@
+PCRE version 5.0 13-Sep-2004
+
/-- Do not use the \x{} construct except with patterns that have the --/
/-- /8 option set, because PCRE doesn't recognize them as UTF-8 unless --/
No match
@@ -897,45 +899,5 @@
/^\x{85}$/8i
\x{85}
0: \x{85}
-
-/^á´/8
- á´
- 0: \x{1234}
-
-/^\á´/8
- á´
- 0: \x{1234}
-
-"(?s)(.{1,5})"8
- abcdefg
- 0: abcde
- 1: abcde
- ab
- 0: ab
- 1: ab
-
-/a*\x{100}*\w/8
- a
- 0: a
-
-/\S\S/8g
- A\x{a3}BC
- 0: A\x{a3}
- 0: BC
-
-/\S{2}/8g
- A\x{a3}BC
- 0: A\x{a3}
- 0: BC
-
-/\W\W/8g
- +\x{a3}==
- 0: +\x{a3}
- 0: ==
-
-/\W{2}/8g
- +\x{a3}==
- 0: +\x{a3}
- 0: ==
/ End of testinput4 /
Modified: httpd/httpd/vendor/pcre/current/testdata/testoutput5
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/testdata/testoutput5?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/testdata/testoutput5 (original)
+++ httpd/httpd/vendor/pcre/current/testdata/testoutput5 Mon Nov 26 09:04:19 2007
@@ -1,105 +1,116 @@
-/\x{100}/8DZ
+PCRE version 5.0 13-Sep-2004
+
+/\x{100}/8DM
+Memory allocation (code space): 10
------------------------------------------------------------------
- Bra
- \x{100}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{100}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 196
Need char = 128
-/\x{1000}/8DZ
+/\x{1000}/8DM
+Memory allocation (code space): 11
------------------------------------------------------------------
- Bra
- \x{1000}
- Ket
- End
+ 0 7 Bra 0
+ 3 \x{1000}
+ 7 7 Ket
+ 10 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 225
Need char = 128
-/\x{10000}/8DZ
+/\x{10000}/8DM
+Memory allocation (code space): 12
------------------------------------------------------------------
- Bra
- \x{10000}
- Ket
- End
+ 0 8 Bra 0
+ 3 \x{10000}
+ 8 8 Ket
+ 11 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 240
Need char = 128
-/\x{100000}/8DZ
+/\x{100000}/8DM
+Memory allocation (code space): 12
------------------------------------------------------------------
- Bra
- \x{100000}
- Ket
- End
+ 0 8 Bra 0
+ 3 \x{100000}
+ 8 8 Ket
+ 11 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 244
Need char = 128
-/\x{1000000}/8DZ
+/\x{1000000}/8DM
+Memory allocation (code space): 13
------------------------------------------------------------------
- Bra
- \x{1000000}
- Ket
- End
+ 0 9 Bra 0
+ 3 \x{1000000}
+ 9 9 Ket
+ 12 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 249
Need char = 128
-/\x{4000000}/8DZ
+/\x{4000000}/8DM
+Memory allocation (code space): 14
------------------------------------------------------------------
- Bra
- \x{4000000}
- Ket
- End
+ 0 10 Bra 0
+ 3 \x{4000000}
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 252
Need char = 128
-/\x{7fffFFFF}/8DZ
+/\x{7fffFFFF}/8DM
+Memory allocation (code space): 14
------------------------------------------------------------------
- Bra
- \x{7fffffff}
- Ket
- End
+ 0 10 Bra 0
+ 3 \x{7fffffff}
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 253
Need char = 191
-/[\x{ff}]/8DZ
+/[\x{ff}]/8DM
+Memory allocation (code space): 10
------------------------------------------------------------------
- Bra
- \x{ff}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{ff}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 195
Need char = 191
-/[\x{100}]/8DZ
+/[\x{100}]/8DM
+Memory allocation (code space): 47
------------------------------------------------------------------
- Bra
- [\x{100}]
- Ket
- End
+ 0 11 Bra 0
+ 3 [\x{100}]
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -116,36 +127,36 @@
\x{100}a\x{1234}bcd
0: \x{100}a\x{1234}
-/\x80/8DZ
+/\x80/8D
------------------------------------------------------------------
- Bra
- \x{80}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{80}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 194
Need char = 128
-/\xff/8DZ
+/\xff/8D
------------------------------------------------------------------
- Bra
- \x{ff}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{ff}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 195
Need char = 191
-/\x{0041}\x{2262}\x{0391}\x{002e}/DZ8
+/\x{0041}\x{2262}\x{0391}\x{002e}/D8
------------------------------------------------------------------
- Bra
- A\x{2262}\x{391}.
- Ket
- End
+ 0 14 Bra 0
+ 3 A\x{2262}\x{391}.
+ 14 14 Ket
+ 17 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -154,12 +165,12 @@
\x{0041}\x{2262}\x{0391}\x{002e}
0: A\x{2262}\x{391}.
-/\x{D55c}\x{ad6d}\x{C5B4}/DZ8
+/\x{D55c}\x{ad6d}\x{C5B4}/D8
------------------------------------------------------------------
- Bra
- \x{d55c}\x{ad6d}\x{c5b4}
- Ket
- End
+ 0 15 Bra 0
+ 3 \x{d55c}\x{ad6d}\x{c5b4}
+ 15 15 Ket
+ 18 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -168,12 +179,12 @@
\x{D55c}\x{ad6d}\x{C5B4}
0: \x{d55c}\x{ad6d}\x{c5b4}
-/\x{65e5}\x{672c}\x{8a9e}/DZ8
+/\x{65e5}\x{672c}\x{8a9e}/D8
------------------------------------------------------------------
- Bra
- \x{65e5}\x{672c}\x{8a9e}
- Ket
- End
+ 0 15 Bra 0
+ 3 \x{65e5}\x{672c}\x{8a9e}
+ 15 15 Ket
+ 18 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -182,74 +193,74 @@
\x{65e5}\x{672c}\x{8a9e}
0: \x{65e5}\x{672c}\x{8a9e}
-/\x{80}/DZ8
+/\x{80}/D8
------------------------------------------------------------------
- Bra
- \x{80}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{80}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 194
Need char = 128
-/\x{084}/DZ8
+/\x{084}/D8
------------------------------------------------------------------
- Bra
- \x{84}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{84}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 194
Need char = 132
-/\x{104}/DZ8
+/\x{104}/D8
------------------------------------------------------------------
- Bra
- \x{104}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{104}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 196
Need char = 132
-/\x{861}/DZ8
+/\x{861}/D8
------------------------------------------------------------------
- Bra
- \x{861}
- Ket
- End
+ 0 7 Bra 0
+ 3 \x{861}
+ 7 7 Ket
+ 10 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 224
Need char = 161
-/\x{212ab}/DZ8
+/\x{212ab}/D8
------------------------------------------------------------------
- Bra
- \x{212ab}
- Ket
- End
+ 0 8 Bra 0
+ 3 \x{212ab}
+ 8 8 Ket
+ 11 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 240
Need char = 171
-/.{3,5}X/DZ8
+/.{3,5}X/D8
------------------------------------------------------------------
- Bra
- Any{3}
- Any{0,2}
- X
- Ket
- End
+ 0 13 Bra 0
+ 3 Any{3}
+ 7 Any{0,2}
+ 11 X
+ 13 13 Ket
+ 16 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -260,13 +271,13 @@
0: \x{212ab}\x{212ab}\x{212ab}\x{861}X
-/.{3,5}?/DZ8
+/.{3,5}?/D8
------------------------------------------------------------------
- Bra
- Any{3}
- Any{0,2}?
- Ket
- End
+ 0 11 Bra 0
+ 3 Any{3}
+ 7 Any{0,2}?
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -276,9 +287,11 @@
\x{212ab}\x{212ab}\x{212ab}\x{861}
0: \x{212ab}\x{212ab}\x{212ab}
-/-- These tests are here rather than in testinput4 because Perl 5.6 has some
-problems with UTF-8 support, in the area of \x{..} where the value is < 255.
-It grumbles about invalid UTF-8 strings. --/
+/-- These tests are here rather than in testinput4 because Perl 5.6 has --/
+/-- some problems with UTF-8 support, in the area of \x{..} where the --/
+No match
+/-- value is < 255. It grumbles about invalid UTF-8 strings. --/
+No match
/^[a\x{c0}]b/8
\x{c0}b
@@ -318,9 +331,11 @@
/(?<=\C)X/8
Failed: \C not allowed in lookbehind assertion at offset 6
-/-- This one is here not because it's different to Perl, but because the way
-the captured single-byte is displayed. (In Perl it becomes a character, and you
-can't tell the difference.) --/
+/-- This one is here not because it's different to Perl, but because the --/
+/-- way the captured single-byte is displayed. (In Perl it becomes a --/
+No match
+/-- character, and you can't tell the difference.) --/
+No match
/X(\C)(.*)/8
X\x{1234}
@@ -332,13 +347,13 @@
1: \x{0a}
2: abc
-/^[ab]/8DZ
+/^[ab]/8D
------------------------------------------------------------------
- Bra
- ^
- [ab]
- Ket
- End
+ 0 37 Bra 0
+ 3 ^
+ 4 [ab]
+ 37 37 Ket
+ 40 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: anchored utf8
@@ -355,13 +370,13 @@
\x{100}
No match
-/^[^ab]/8DZ
+/^[^ab]/8D
------------------------------------------------------------------
- Bra
- ^
- [\x00-`c-\xff] (neg)
- Ket
- End
+ 0 37 Bra 0
+ 3 ^
+ 4 [\x00-`c-\xff] (neg)
+ 37 37 Ket
+ 40 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: anchored utf8
@@ -378,12 +393,12 @@
aaa
No match
-/[^ab\xC0-\xF0]/8SDZ
+/[^ab\xC0-\xF0]/8SD
------------------------------------------------------------------
- Bra
- [\x00-`c-\xbf\xf1-\xff] (neg)
- Ket
- End
+ 0 36 Bra 0
+ 3 [\x00-`c-\xbf\xf1-\xff] (neg)
+ 36 36 Ket
+ 39 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -414,13 +429,13 @@
\x{f0}
No match
-/Ä{3,4}/8SDZ
+/Ä{3,4}/8SD
------------------------------------------------------------------
- Bra
- \x{100}{3}
- \x{100}?
- Ket
- End
+ 0 13 Bra 0
+ 3 \x{100}{3}
+ 8 \x{100}{,1}
+ 13 13 Ket
+ 16 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -431,16 +446,16 @@
\x{100}\x{100}\x{100}\x{100\x{100}
0: \x{100}\x{100}\x{100}
-/(\x{100}+|x)/8SDZ
+/(\x{100}+|x)/8SD
------------------------------------------------------------------
- Bra
- CBra 1
- \x{100}+
- Alt
- x
- Ket
- Ket
- End
+ 0 17 Bra 0
+ 3 6 Bra 1
+ 6 \x{100}+
+ 9 5 Alt
+ 12 x
+ 14 11 Ket
+ 17 17 Ket
+ 20 End
------------------------------------------------------------------
Capturing subpattern count = 1
Partial matching not supported
@@ -449,17 +464,17 @@
No need char
Starting byte set: x \xc4
-/(\x{100}*a|x)/8SDZ
+/(\x{100}*a|x)/8SD
------------------------------------------------------------------
- Bra
- CBra 1
- \x{100}*+
- a
- Alt
- x
- Ket
- Ket
- End
+ 0 19 Bra 0
+ 3 8 Bra 1
+ 6 \x{100}*
+ 9 a
+ 11 5 Alt
+ 14 x
+ 16 13 Ket
+ 19 19 Ket
+ 22 End
------------------------------------------------------------------
Capturing subpattern count = 1
Partial matching not supported
@@ -468,17 +483,17 @@
No need char
Starting byte set: a x \xc4
-/(\x{100}{0,2}a|x)/8SDZ
+/(\x{100}{0,2}a|x)/8SD
------------------------------------------------------------------
- Bra
- CBra 1
- \x{100}{0,2}
- a
- Alt
- x
- Ket
- Ket
- End
+ 0 21 Bra 0
+ 3 10 Bra 1
+ 6 \x{100}{,2}
+ 11 a
+ 13 5 Alt
+ 16 x
+ 18 15 Ket
+ 21 21 Ket
+ 24 End
------------------------------------------------------------------
Capturing subpattern count = 1
Partial matching not supported
@@ -487,18 +502,18 @@
No need char
Starting byte set: a x \xc4
-/(\x{100}{1,2}a|x)/8SDZ
+/(\x{100}{1,2}a|x)/8SD
------------------------------------------------------------------
- Bra
- CBra 1
- \x{100}
- \x{100}{0,1}
- a
- Alt
- x
- Ket
- Ket
- End
+ 0 24 Bra 0
+ 3 13 Bra 1
+ 6 \x{100}
+ 9 \x{100}{,1}
+ 14 a
+ 16 5 Alt
+ 19 x
+ 21 18 Ket
+ 24 24 Ket
+ 27 End
------------------------------------------------------------------
Capturing subpattern count = 1
Partial matching not supported
@@ -531,24 +546,24 @@
\x{100}\x{100}abcd
No match
-/\x{100}/8DZ
+/\x{100}/8D
------------------------------------------------------------------
- Bra
- \x{100}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{100}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 196
Need char = 128
-/\x{100}*/8DZ
+/\x{100}*/8D
------------------------------------------------------------------
- Bra
- \x{100}*
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{100}*
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -556,13 +571,13 @@
No first char
No need char
-/a\x{100}*/8DZ
+/a\x{100}*/8D
------------------------------------------------------------------
- Bra
- a
- \x{100}*
- Ket
- End
+ 0 8 Bra 0
+ 3 a
+ 5 \x{100}*
+ 8 8 Ket
+ 11 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -570,13 +585,13 @@
First char = 'a'
No need char
-/ab\x{100}*/8DZ
+/ab\x{100}*/8D
------------------------------------------------------------------
- Bra
- ab
- \x{100}*
- Ket
- End
+ 0 10 Bra 0
+ 3 ab
+ 7 \x{100}*
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -584,13 +599,13 @@
First char = 'a'
Need char = 'b'
-/a\x{100}\x{101}*/8DZ
+/a\x{100}\x{101}*/8D
------------------------------------------------------------------
- Bra
- a\x{100}
- \x{101}*
- Ket
- End
+ 0 11 Bra 0
+ 3 a\x{100}
+ 8 \x{101}*
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -598,13 +613,13 @@
First char = 'a'
Need char = 128
-/a\x{100}\x{101}+/8DZ
+/a\x{100}\x{101}+/8D
------------------------------------------------------------------
- Bra
- a\x{100}
- \x{101}+
- Ket
- End
+ 0 11 Bra 0
+ 3 a\x{100}
+ 8 \x{101}+
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -612,13 +627,13 @@
First char = 'a'
Need char = 129
-/\x{100}*A/8DZ
+/\x{100}*A/8D
------------------------------------------------------------------
- Bra
- \x{100}*+
- A
- Ket
- End
+ 0 8 Bra 0
+ 3 \x{100}*
+ 6 A
+ 8 8 Ket
+ 11 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -628,16 +643,14 @@
A
0: A
-/\x{100}*\d(?R)/8DZ
+/\x{100}*\d(?R)/8D
------------------------------------------------------------------
- Bra
- \x{100}*+
- \d
- Once
- Recurse
- Ket
- Ket
- End
+ 0 10 Bra 0
+ 3 \x{100}*
+ 6 \d
+ 7 0 Recurse
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -645,36 +658,37 @@
No first char
No need char
-/[^\x{c4}]/DZ
+/[^\x{c4}]/D
------------------------------------------------------------------
- Bra
- [^\xc4]
- Ket
- End
+ 0 36 Bra 0
+ 3 [\x01-35-bd-z|~-\xff] (neg)
+ 36 36 Ket
+ 39 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
-/[^\x{c4}]/8DZ
+/[^\x{c4}]/8D
------------------------------------------------------------------
- Bra
- [\x00-\xc3\xc5-\xff] (neg)
- Ket
- End
+ 0 36 Bra 0
+ 3 [\x00-\xc3\xc5-\xff] (neg)
+ 36 36 Ket
+ 39 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
No first char
No need char
-/[\x{100}]/8DZ
+/[\x{100}]/8DM
+Memory allocation (code space): 47
------------------------------------------------------------------
- Bra
- [\x{100}]
- Ket
- End
+ 0 11 Bra 0
+ 3 [\x{100}]
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -689,12 +703,13 @@
*** Failers
No match
-/[Z\x{100}]/8DZ
+/[Z\x{100}]/8DM
+Memory allocation (code space): 47
------------------------------------------------------------------
- Bra
- [Z\x{100}]
- Ket
- End
+ 0 43 Bra 0
+ 3 [Z\x{100}]
+ 43 43 Ket
+ 46 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -724,24 +739,24 @@
\x{ff}
No match
-/[z-\x{100}]/8DZ
+/[z-\x{100}]/8D
------------------------------------------------------------------
- Bra
- [z-\x{100}]
- Ket
- End
+ 0 12 Bra 0
+ 3 [z-\x{100}]
+ 12 12 Ket
+ 15 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
No first char
No need char
-/[z\Qa-d]Ä\E]/8DZ
+/[z\Qa-d]Ä\E]/8D
------------------------------------------------------------------
- Bra
- [\-\]adz\x{100}]
- Ket
- End
+ 0 43 Bra 0
+ 3 [\-\]adz\x{100}]
+ 43 43 Ket
+ 46 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -752,12 +767,12 @@
Ä
0: \x{100}
-/[\xFF]/DZ
+/[\xFF]/D
------------------------------------------------------------------
- Bra
- \xff
- Ket
- End
+ 0 5 Bra 0
+ 3 \xff
+ 5 5 Ket
+ 8 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
@@ -766,12 +781,12 @@
>\xff<
0: \xff
-/[\xff]/DZ8
+/[\xff]/D8
------------------------------------------------------------------
- Bra
- \x{ff}
- Ket
- End
+ 0 6 Bra 0
+ 3 \x{ff}
+ 6 6 Ket
+ 9 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -780,24 +795,24 @@
>\x{ff}<
0: \x{ff}
-/[^\xFF]/DZ
+/[^\xFF]/D
------------------------------------------------------------------
- Bra
- [^\xff]
- Ket
- End
+ 0 5 Bra 0
+ 3 [^\xff]
+ 5 5 Ket
+ 8 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
-/[^\xff]/8DZ
+/[^\xff]/8D
------------------------------------------------------------------
- Bra
- [\x00-\xfe] (neg)
- Ket
- End
+ 0 36 Bra 0
+ 3 [\x00-\xfe] (neg)
+ 36 36 Ket
+ 39 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -837,12 +852,12 @@
/ÃÃÃxxx/8
Failed: invalid UTF-8 string at offset 1
-/ÃÃÃxxx/8?DZ
+/ÃÃÃxxx/8?D
------------------------------------------------------------------
- Bra
- \X{c0}\X{c0}\X{c0}xxx
- Ket
- End
+ 0 15 Bra 0
+ 3 \X{c0}\X{c0}\X{c0}xxx
+ 15 15 Ket
+ 18 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8 no_utf8_check
@@ -887,186 +902,160 @@
\xf1\x8f\x80\x80
No match
\xf8\x88\x80\x80\x80
-Error -10
- \xf9\x87\x80\x80\x80
-Error -10
- \xfc\x84\x80\x80\x80\x80
-Error -10
- \xfd\x83\x80\x80\x80\x80
-Error -10
- \?\xf8\x88\x80\x80\x80
No match
- \?\xf9\x87\x80\x80\x80
+ \xf9\x87\x80\x80\x80
No match
- \?\xfc\x84\x80\x80\x80\x80
+ \xfc\x84\x80\x80\x80\x80
No match
- \?\xfd\x83\x80\x80\x80\x80
+ \xfd\x83\x80\x80\x80\x80
No match
-/\x{100}abc(xyz(?1))/8DZ
+/\x{100}abc(xyz(?1))/8D
------------------------------------------------------------------
- Bra
- \x{100}abc
- CBra 1
- xyz
- Once
- Recurse
- Ket
- Ket
- Ket
- End
+ 0 27 Bra 0
+ 3 \x{100}abc
+ 12 12 Bra 1
+ 15 xyz
+ 21 12 Recurse
+ 24 12 Ket
+ 27 27 Ket
+ 30 End
------------------------------------------------------------------
Capturing subpattern count = 1
Options: utf8
First char = 196
Need char = 'z'
-/[^\x{100}]abc(xyz(?1))/8DZ
+/[^\x{100}]abc(xyz(?1))/8D
------------------------------------------------------------------
- Bra
- [^\x{100}]
- abc
- CBra 1
- xyz
- Once
- Recurse
- Ket
- Ket
- Ket
- End
+ 0 32 Bra 0
+ 3 [^\x{100}]
+ 11 abc
+ 17 12 Bra 1
+ 20 xyz
+ 26 17 Recurse
+ 29 12 Ket
+ 32 32 Ket
+ 35 End
------------------------------------------------------------------
Capturing subpattern count = 1
Options: utf8
No first char
Need char = 'z'
-/[ab\x{100}]abc(xyz(?1))/8DZ
+/[ab\x{100}]abc(xyz(?1))/8D
------------------------------------------------------------------
- Bra
- [ab\x{100}]
- abc
- CBra 1
- xyz
- Once
- Recurse
- Ket
- Ket
- Ket
- End
+ 0 64 Bra 0
+ 3 [ab\x{100}]
+ 43 abc
+ 49 12 Bra 1
+ 52 xyz
+ 58 49 Recurse
+ 61 12 Ket
+ 64 64 Ket
+ 67 End
------------------------------------------------------------------
Capturing subpattern count = 1
Options: utf8
No first char
Need char = 'z'
-/(\x{100}(b(?2)c))?/DZ8
+/(\x{100}(b(?2)c))?/D8
------------------------------------------------------------------
- Bra
- Brazero
- CBra 1
- \x{100}
- CBra 2
- b
- Once
- Recurse
- Ket
- c
- Ket
- Ket
- Ket
- End
+ 0 26 Bra 0
+ 3 Brazero
+ 4 19 Bra 1
+ 7 \x{100}
+ 10 10 Bra 2
+ 13 b
+ 15 10 Recurse
+ 18 c
+ 20 10 Ket
+ 23 19 Ket
+ 26 26 Ket
+ 29 End
------------------------------------------------------------------
Capturing subpattern count = 2
Options: utf8
No first char
No need char
-/(\x{100}(b(?2)c)){0,2}/DZ8
+/(\x{100}(b(?2)c)){0,2}/D8
------------------------------------------------------------------
- Bra
- Brazero
- Bra
- CBra 1
- \x{100}
- CBra 2
- b
- Once
- Recurse
- Ket
- c
- Ket
- Ket
- Brazero
- CBra 1
- \x{100}
- CBra 2
- b
- Once
- Recurse
- Ket
- c
- Ket
- Ket
- Ket
- Ket
- End
+ 0 55 Bra 0
+ 3 Brazero
+ 4 48 Bra 0
+ 7 19 Bra 1
+ 10 \x{100}
+ 13 10 Bra 2
+ 16 b
+ 18 13 Recurse
+ 21 c
+ 23 10 Ket
+ 26 19 Ket
+ 29 Brazero
+ 30 19 Bra 1
+ 33 \x{100}
+ 36 10 Bra 2
+ 39 b
+ 41 13 Recurse
+ 44 c
+ 46 10 Ket
+ 49 19 Ket
+ 52 48 Ket
+ 55 55 Ket
+ 58 End
------------------------------------------------------------------
Capturing subpattern count = 2
Options: utf8
No first char
No need char
-/(\x{100}(b(?1)c))?/DZ8
+/(\x{100}(b(?1)c))?/D8
------------------------------------------------------------------
- Bra
- Brazero
- CBra 1
- \x{100}
- CBra 2
- b
- Once
- Recurse
- Ket
- c
- Ket
- Ket
- Ket
- End
+ 0 26 Bra 0
+ 3 Brazero
+ 4 19 Bra 1
+ 7 \x{100}
+ 10 10 Bra 2
+ 13 b
+ 15 4 Recurse
+ 18 c
+ 20 10 Ket
+ 23 19 Ket
+ 26 26 Ket
+ 29 End
------------------------------------------------------------------
Capturing subpattern count = 2
Options: utf8
No first char
No need char
-/(\x{100}(b(?1)c)){0,2}/DZ8
+/(\x{100}(b(?1)c)){0,2}/D8
------------------------------------------------------------------
- Bra
- Brazero
- Bra
- CBra 1
- \x{100}
- CBra 2
- b
- Once
- Recurse
- Ket
- c
- Ket
- Ket
- Brazero
- CBra 1
- \x{100}
- CBra 2
- b
- Once
- Recurse
- Ket
- c
- Ket
- Ket
- Ket
- Ket
- End
+ 0 55 Bra 0
+ 3 Brazero
+ 4 48 Bra 0
+ 7 19 Bra 1
+ 10 \x{100}
+ 13 10 Bra 2
+ 16 b
+ 18 7 Recurse
+ 21 c
+ 23 10 Ket
+ 26 19 Ket
+ 29 Brazero
+ 30 19 Bra 1
+ 33 \x{100}
+ 36 10 Bra 2
+ 39 b
+ 41 7 Recurse
+ 44 c
+ 46 10 Ket
+ 49 19 Ket
+ 52 48 Ket
+ 55 55 Ket
+ 58 End
------------------------------------------------------------------
Capturing subpattern count = 2
Options: utf8
@@ -1083,516 +1072,4 @@
\x{100}X
0: X
-/a\x{1234}b/P8
- a\x{1234}b
- 0: a\x{1234}b
-
-/^\á´/8DZ
-------------------------------------------------------------------
- Bra
- ^
- \x{1234}
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Options: anchored utf8
-No first char
-No need char
-
-/\777/I
-Failed: octal value is greater than \377 (not in UTF-8 mode) at offset 3
-
-/\777/8I
-Capturing subpattern count = 0
-Options: utf8
-First char = 199
-Need char = 191
- \x{1ff}
- 0: \x{1ff}
- \777
- 0: \x{1ff}
-
-/\x{100}*\d/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}*+
- \d
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-No first char
-No need char
-
-/\x{100}*\s/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}*+
- \s
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-No first char
-No need char
-
-/\x{100}*\w/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}*+
- \w
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-No first char
-No need char
-
-/\x{100}*\D/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}*
- \D
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-No first char
-No need char
-
-/\x{100}*\S/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}*
- \S
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-No first char
-No need char
-
-/\x{100}*\W/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}*
- \W
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-No first char
-No need char
-
-/\x{100}+\x{200}/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}++
- \x{200}
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-First char = 196
-Need char = 128
-
-/\x{100}+X/8DZ
-------------------------------------------------------------------
- Bra
- \x{100}++
- X
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-First char = 196
-Need char = 'X'
-
-/X+\x{200}/8DZ
-------------------------------------------------------------------
- Bra
- X++
- \x{200}
- Ket
- End
-------------------------------------------------------------------
-Capturing subpattern count = 0
-Partial matching not supported
-Options: utf8
-First char = 'X'
-Need char = 128
-
-/()()()()()()()()()()
- ()()()()()()()()()()
- ()()()()()()()()()()
- ()()()()()()()()()()
- A (x) (?41) B/8x
- AxxB
-Matched, but too many substrings
- 0: AxxB
- 1:
- 2:
- 3:
- 4:
- 5:
- 6:
- 7:
- 8:
- 9:
-10:
-11:
-12:
-13:
-14:
-
-/^[\x{100}\E-\Q\E\x{150}]/BZ8
-------------------------------------------------------------------
- Bra
- ^
- [\x{100}-\x{150}]
- Ket
- End
-------------------------------------------------------------------
-
-/^[\QÄ\E-\QÅ\E]/BZ8
-------------------------------------------------------------------
- Bra
- ^
- [\x{100}-\x{150}]
- Ket
- End
-------------------------------------------------------------------
-
-/^[\QÄ\E-\QÅ\E/BZ8
-Failed: missing terminating ] for character class at offset 15
-
-/^abc./mgx8<any>
- abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x{0085}abc7 \x{2028}abc8 \x{2029}abc9 JUNK
- 0: abc1
- 0: abc2
- 0: abc3
- 0: abc4
- 0: abc5
- 0: abc6
- 0: abc7
- 0: abc8
- 0: abc9
-
-/abc.$/mgx8<any>
- abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x{0085} abc7\x{2028} abc8\x{2029} abc9
- 0: abc1
- 0: abc2
- 0: abc3
- 0: abc4
- 0: abc5
- 0: abc6
- 0: abc7
- 0: abc8
- 0: abc9
-
-/^a\Rb/8<bsr_unicode>
- a\nb
- 0: a\x{0a}b
- a\rb
- 0: a\x{0d}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- a\x0bb
- 0: a\x{0b}b
- a\x0cb
- 0: a\x{0c}b
- a\x{85}b
- 0: a\x{85}b
- a\x{2028}b
- 0: a\x{2028}b
- a\x{2029}b
- 0: a\x{2029}b
- ** Failers
-No match
- a\n\rb
-No match
-
-/^a\R*b/8<bsr_unicode>
- ab
- 0: ab
- a\nb
- 0: a\x{0a}b
- a\rb
- 0: a\x{0d}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- a\x0bb
- 0: a\x{0b}b
- a\x0c\x{2028}\x{2029}b
- 0: a\x{0c}\x{2028}\x{2029}b
- a\x{85}b
- 0: a\x{85}b
- a\n\rb
- 0: a\x{0a}\x{0d}b
- a\n\r\x{85}\x0cb
- 0: a\x{0a}\x{0d}\x{85}\x{0c}b
-
-/^a\R+b/8<bsr_unicode>
- a\nb
- 0: a\x{0a}b
- a\rb
- 0: a\x{0d}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- a\x0bb
- 0: a\x{0b}b
- a\x0c\x{2028}\x{2029}b
- 0: a\x{0c}\x{2028}\x{2029}b
- a\x{85}b
- 0: a\x{85}b
- a\n\rb
- 0: a\x{0a}\x{0d}b
- a\n\r\x{85}\x0cb
- 0: a\x{0a}\x{0d}\x{85}\x{0c}b
- ** Failers
-No match
- ab
-No match
-
-/^a\R{1,3}b/8<bsr_unicode>
- a\nb
- 0: a\x{0a}b
- a\n\rb
- 0: a\x{0a}\x{0d}b
- a\n\r\x{85}b
- 0: a\x{0a}\x{0d}\x{85}b
- a\r\n\r\nb
- 0: a\x{0d}\x{0a}\x{0d}\x{0a}b
- a\r\n\r\n\r\nb
- 0: a\x{0d}\x{0a}\x{0d}\x{0a}\x{0d}\x{0a}b
- a\n\r\n\rb
- 0: a\x{0a}\x{0d}\x{0a}\x{0d}b
- a\n\n\r\nb
- 0: a\x{0a}\x{0a}\x{0d}\x{0a}b
- ** Failers
-No match
- a\n\n\n\rb
-No match
- a\r
-No match
-
-/\H\h\V\v/8
- X X\x0a
- 0: X X\x{0a}
- X\x09X\x0b
- 0: X\x{09}X\x{0b}
- ** Failers
-No match
- \x{a0} X\x0a
-No match
-
-/\H*\h+\V?\v{3,4}/8
- \x09\x20\x{a0}X\x0a\x0b\x0c\x0d\x0a
- 0: \x{09} \x{a0}X\x{0a}\x{0b}\x{0c}\x{0d}
- \x09\x20\x{a0}\x0a\x0b\x0c\x0d\x0a
- 0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}\x{0d}
- \x09\x20\x{a0}\x0a\x0b\x0c
- 0: \x{09} \x{a0}\x{0a}\x{0b}\x{0c}
- ** Failers
-No match
- \x09\x20\x{a0}\x0a\x0b
-No match
-
-/\H\h\V\v/8
- \x{3001}\x{3000}\x{2030}\x{2028}
- 0: \x{3001}\x{3000}\x{2030}\x{2028}
- X\x{180e}X\x{85}
- 0: X\x{180e}X\x{85}
- ** Failers
-No match
- \x{2009} X\x0a
-No match
-
-/\H*\h+\V?\v{3,4}/8
- \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x0c\x0d\x0a
- 0: \x{1680}\x{180e}\x{2007}X\x{2028}\x{2029}\x{0c}\x{0d}
- \x09\x{205f}\x{a0}\x0a\x{2029}\x0c\x{2028}\x0a
- 0: \x{09}\x{205f}\x{a0}\x{0a}\x{2029}\x{0c}\x{2028}
- \x09\x20\x{202f}\x0a\x0b\x0c
- 0: \x{09} \x{202f}\x{0a}\x{0b}\x{0c}
- ** Failers
-No match
- \x09\x{200a}\x{a0}\x{2028}\x0b
-No match
-
-/[\h]/8BZ
-------------------------------------------------------------------
- Bra
- [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]
- Ket
- End
-------------------------------------------------------------------
- >\x{1680}
- 0: \x{1680}
-
-/[\h]{3,}/8BZ
-------------------------------------------------------------------
- Bra
- [\x09 \xa0\x{1680}\x{180e}\x{2000}-\x{200a}\x{202f}\x{205f}\x{3000}]{3,}
- Ket
- End
-------------------------------------------------------------------
- >\x{1680}\x{180e}\x{2000}\x{2003}\x{200a}\x{202f}\x{205f}\x{3000}<
- 0: \x{1680}\x{180e}\x{2000}\x{2003}\x{200a}\x{202f}\x{205f}\x{3000}
-
-/[\v]/8BZ
-------------------------------------------------------------------
- Bra
- [\x0a-\x0d\x85\x{2028}-\x{2029}]
- Ket
- End
-------------------------------------------------------------------
-
-/[\H]/8BZ
-------------------------------------------------------------------
- Bra
- [\x00-\x08\x0a-\x1f!-\x9f\xa1-\xff\x{100}-\x{167f}\x{1681}-\x{180d}\x{180f}-\x{1fff}\x{200b}-\x{202e}\x{2030}-\x{205e}\x{2060}-\x{2fff}\x{3001}-\x{7fffffff}]
- Ket
- End
-------------------------------------------------------------------
-
-/[\V]/8BZ
-------------------------------------------------------------------
- Bra
- [\x00-\x09\x0e-\x84\x86-\xff\x{100}-\x{2027}\x{2029}-\x{7fffffff}]
- Ket
- End
-------------------------------------------------------------------
-
-/.*$/8<any>
- \x{1ec5}
- 0: \x{1ec5}
-
-/-- This tests the stricter UTF-8 check according to RFC 3629. --/
-
-/X/8
- \x{0}\x{d7ff}\x{e000}\x{10ffff}
-No match
- \x{d800}
-Error -10
- \x{d800}\?
-No match
- \x{da00}
-Error -10
- \x{da00}\?
-No match
- \x{dfff}
-Error -10
- \x{dfff}\?
-No match
- \x{110000}
-Error -10
- \x{110000}\?
-No match
- \x{2000000}
-Error -10
- \x{2000000}\?
-No match
- \x{7fffffff}
-Error -10
- \x{7fffffff}\?
-No match
-
-/a\Rb/I8<bsr_anycrlf>
-Capturing subpattern count = 0
-Options: bsr_anycrlf utf8
-First char = 'a'
-Need char = 'b'
- a\rb
- 0: a\x{0d}b
- a\nb
- 0: a\x{0a}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- ** Failers
-No match
- a\x{85}b
-No match
- a\x0bb
-No match
-
-/a\Rb/I8<bsr_unicode>
-Capturing subpattern count = 0
-Options: bsr_unicode utf8
-First char = 'a'
-Need char = 'b'
- a\rb
- 0: a\x{0d}b
- a\nb
- 0: a\x{0a}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- a\x{85}b
- 0: a\x{85}b
- a\x0bb
- 0: a\x{0b}b
- ** Failers
-No match
- a\x{85}b\<bsr_anycrlf>
-No match
- a\x0bb\<bsr_anycrlf>
-No match
-
-/a\R?b/I8<bsr_anycrlf>
-Capturing subpattern count = 0
-Options: bsr_anycrlf utf8
-First char = 'a'
-Need char = 'b'
- a\rb
- 0: a\x{0d}b
- a\nb
- 0: a\x{0a}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- ** Failers
-No match
- a\x{85}b
-No match
- a\x0bb
-No match
-
-/a\R?b/I8<bsr_unicode>
-Capturing subpattern count = 0
-Options: bsr_unicode utf8
-First char = 'a'
-Need char = 'b'
- a\rb
- 0: a\x{0d}b
- a\nb
- 0: a\x{0a}b
- a\r\nb
- 0: a\x{0d}\x{0a}b
- a\x{85}b
- 0: a\x{85}b
- a\x0bb
- 0: a\x{0b}b
- ** Failers
-No match
- a\x{85}b\<bsr_anycrlf>
-No match
- a\x0bb\<bsr_anycrlf>
-No match
-
/ End of testinput5 /
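
Note on the removed tests above: the [\h] and [\v] class dumps and the RFC 3629 check boil down to simple code point predicates. The C sketch below restates them. It is not PCRE code: the function names are made up for illustration, and the ranges are copied from the removed expected output (the [\h] and [\v] dumps and the "Error -10" cases), not from the PCRE sources.

```c
#include <stdbool.h>

/* Hypothetical helper: true if cp may appear in well-formed UTF-8 under
   RFC 3629, i.e. it is at most U+10FFFF and is not a UTF-16 surrogate.
   The removed tests expect an error for d800, da00, dfff, 110000 and above. */
static bool is_valid_rfc3629_code_point(unsigned long cp)
{
  return cp <= 0x10FFFFUL && !(cp >= 0xD800UL && cp <= 0xDFFFUL);
}

/* Hypothetical helper: horizontal white space, following the removed
   [\h] class dump. */
static bool is_hspace(unsigned long cp)
{
  return cp == 0x09 || cp == 0x20 || cp == 0xA0 ||
         cp == 0x1680 || cp == 0x180E ||
         (cp >= 0x2000 && cp <= 0x200A) ||
         cp == 0x202F || cp == 0x205F || cp == 0x3000;
}

/* Hypothetical helper: vertical white space, following the removed
   [\v] class dump. */
static bool is_vspace(unsigned long cp)
{
  return (cp >= 0x0A && cp <= 0x0D) || cp == 0x85 ||
         cp == 0x2028 || cp == 0x2029;
}
```

Under these assumptions, the subject \x{3001}\x{3000}\x{2030}\x{2028} from the removed /\H\h\V\v/8 test classifies as non-horizontal, horizontal, non-vertical, vertical, which agrees with the match that test expected.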
Modified: httpd/httpd/vendor/pcre/current/testdata/testoutput6
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/testdata/testoutput6?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/testdata/testoutput6 (original)
+++ httpd/httpd/vendor/pcre/current/testdata/testoutput6 Mon Nov 26 09:04:19 2007
@@ -1,3 +1,5 @@
+PCRE version 5.0 13-Sep-2004
+
/^\pC\pL\pM\pN\pP\pS\pZ</8
\x7f\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
0: \x{7f}\x{c0}\x{30f}\x{660}\x{66c}\x{f01}\x{1680}<
@@ -83,8 +85,6 @@
No match
/^\p{Cn}/8
- \x{e0000}
- 0: \x{e0000}
** Failers
No match
\x{09f}
@@ -99,7 +99,7 @@
No match
/^\p{Cs}/8
- \?\x{dfff}
+ \x{dfff}
0: \x{dfff}
** Failers
No match
@@ -113,7 +113,7 @@
No match
Z
No match
- \x{e000}
+ \x{dfff}
No match
/^\p{Lm}/8
@@ -127,24 +127,12 @@
/^\p{Lo}/8
\x{1bb}
0: \x{1bb}
- \x{3400}
- 0: \x{3400}
- \x{3401}
- 0: \x{3401}
- \x{4d00}
- 0: \x{4d00}
- \x{4db4}
- 0: \x{4db4}
- \x{4db5}
- 0: \x{4db5}
** Failers
No match
a
No match
\x{2b0}
No match
- \x{4db6}
-No match
/^\p{Lt}/8
\x{1c5}
@@ -548,72 +536,73 @@
WXYZ
No match
-/[\p{L}]/DZ
+/[\p{L}]/D
------------------------------------------------------------------
- Bra
- [\p{L}]
- Ket
- End
+ 0 10 Bra 0
+ 3 [\p{L}]
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
-/[\p{^L}]/DZ
+/[\p{^L}]/D
------------------------------------------------------------------
- Bra
- [\P{L}]
- Ket
- End
+ 0 10 Bra 0
+ 3 [\P{L}]
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
-/[\P{L}]/DZ
+/[\P{L}]/D
------------------------------------------------------------------
- Bra
- [\P{L}]
- Ket
- End
+ 0 10 Bra 0
+ 3 [\P{L}]
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
-/[\P{^L}]/DZ
+/[\P{^L}]/D
------------------------------------------------------------------
- Bra
- [\p{L}]
- Ket
- End
+ 0 10 Bra 0
+ 3 [\p{L}]
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
-/[abc\p{L}\x{0660}]/8DZ
+/[abc\p{L}\x{0660}]/8D
------------------------------------------------------------------
- Bra
- [a-c\p{L}\x{660}]
- Ket
- End
+ 0 45 Bra 0
+ 3 [a-c\p{L}\x{660}]
+ 45 45 Ket
+ 48 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
No first char
No need char
-/[\p{Nd}]/8DZ
+/[\p{Nd}]/8DM
+Memory allocation (code space): 46
------------------------------------------------------------------
- Bra
- [\p{Nd}]
- Ket
- End
+ 0 10 Bra 0
+ 3 [\p{Nd}]
+ 10 10 Ket
+ 13 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
@@ -622,12 +611,13 @@
1234
0: 1
-/[\p{Nd}+-]+/8DZ
+/[\p{Nd}+-]+/8DM
+Memory allocation (code space): 47
------------------------------------------------------------------
- Bra
- [+\-\p{Nd}]+
- Ket
- End
+ 0 43 Bra 0
+ 3 [+\-\p{Nd}]+
+ 43 43 Ket
+ 46 End
------------------------------------------------------------------
Capturing subpattern count = 0
Partial matching not supported
@@ -777,48 +767,48 @@
A\x{391}\x{10427}\x{ff3a}\x{1fb8}
0: A\x{391}\x{10427}\x{ff3a}\x{1fb8}
-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iDZ
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8iD
------------------------------------------------------------------
- Bra
- NC A\x{391}\x{10427}\x{ff3a}\x{1fb0}
- Ket
- End
+ 0 21 Bra 0
+ 3 NC A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 21 21 Ket
+ 24 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf8
First char = 'A' (caseless)
No need char
-/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8DZ
+/A\x{391}\x{10427}\x{ff3a}\x{1fb0}/8D
------------------------------------------------------------------
- Bra
- A\x{391}\x{10427}\x{ff3a}\x{1fb0}
- Ket
- End
+ 0 21 Bra 0
+ 3 A\x{391}\x{10427}\x{ff3a}\x{1fb0}
+ 21 21 Ket
+ 24 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 'A'
Need char = 176
-/AB\x{1fb0}/8DZ
+/AB\x{1fb0}/8D
------------------------------------------------------------------
- Bra
- AB\x{1fb0}
- Ket
- End
+ 0 11 Bra 0
+ 3 AB\x{1fb0}
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: utf8
First char = 'A'
Need char = 176
-/AB\x{1fb0}/8DZi
+/AB\x{1fb0}/8Di
------------------------------------------------------------------
- Bra
- NC AB\x{1fb0}
- Ket
- End
+ 0 11 Bra 0
+ 3 NC AB\x{1fb0}
+ 11 11 Ket
+ 14 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf8
@@ -855,12 +845,12 @@
\x{e0}
0: \x{e0}
-/[\x{105}-\x{109}]/8iDZ
+/[\x{105}-\x{109}]/8iD
------------------------------------------------------------------
- Bra
- [\x{104}-\x{109}]
- Ket
- End
+ 0 13 Bra 0
+ 3 [\x{104}-\x{109}]
+ 13 13 Ket
+ 16 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf8
@@ -879,12 +869,12 @@
\x{10a}
No match
-/[z-\x{100}]/8iDZ
+/[z-\x{100}]/8iD
------------------------------------------------------------------
- Bra
- [Z\x{39c}\x{178}z-\x{101}]
- Ket
- End
+ 0 20 Bra 0
+ 3 [Z\x{39c}\x{178}z-\x{101}]
+ 20 20 Ket
+ 23 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf8
@@ -917,12 +907,12 @@
y
No match
-/[z-\x{100}]/8DZi
+/[z-\x{100}]/8Di
------------------------------------------------------------------
- Bra
- [Z\x{39c}\x{178}z-\x{101}]
- Ket
- End
+ 0 20 Bra 0
+ 3 [Z\x{39c}\x{178}z-\x{101}]
+ 20 20 Ket
+ 23 End
------------------------------------------------------------------
Capturing subpattern count = 0
Options: caseless utf8
@@ -1020,506 +1010,4 @@
0: A\x{300}\x{301}B\x{300}C
1: C
-/^\p{Han}+/8
- \x{2e81}\x{3007}\x{2f804}\x{31a0}
- 0: \x{2e81}\x{3007}\x{2f804}
- ** Failers
-No match
- \x{2e7f}
-No match
-
-/^\P{Katakana}+/8
- \x{3105}
- 0: \x{3105}
- ** Failers
- 0: ** Failers
- \x{30ff}
-No match
-
-/^[\p{Arabic}]/8
- \x{06e9}
- 0: \x{6e9}
- \x{060b}
- 0: \x{60b}
- ** Failers
-No match
- X\x{06e9}
-No match
-
-/^[\P{Yi}]/8
- \x{2f800}
- 0: \x{2f800}
- ** Failers
- 0: *
- \x{a014}
-No match
- \x{a4c6}
-No match
-
-/^\p{Any}X/8
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- ** Failers
-No match
- X
-No match
-
-/^\P{Any}X/8
- ** Failers
-No match
- AX
-No match
-
-/^\p{Any}?X/8
- XYZ
- 0: X
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- ** Failers
-No match
- ABXYZ
-No match
-
-/^\P{Any}?X/8
- XYZ
- 0: X
- ** Failers
-No match
- AXYZ
-No match
- \x{1234}XYZ
-No match
- ABXYZ
-No match
-
-/^\p{Any}+X/8
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- A\x{1234}XYZ
- 0: A\x{1234}X
- ** Failers
-No match
- XYZ
-No match
-
-/^\P{Any}+X/8
- ** Failers
-No match
- AXYZ
-No match
- \x{1234}XYZ
-No match
- A\x{1234}XYZ
-No match
- XYZ
-No match
-
-/^\p{Any}*X/8
- XYZ
- 0: X
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- A\x{1234}XYZ
- 0: A\x{1234}X
- ** Failers
-No match
-
-/^\P{Any}*X/8
- XYZ
- 0: X
- ** Failers
-No match
- AXYZ
-No match
- \x{1234}XYZ
-No match
- A\x{1234}XYZ
-No match
-
-/^[\p{Any}]X/8
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- ** Failers
-No match
- X
-No match
-
-/^[\P{Any}]X/8
- ** Failers
-No match
- AX
-No match
-
-/^[\p{Any}]?X/8
- XYZ
- 0: X
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- ** Failers
-No match
- ABXYZ
-No match
-
-/^[\P{Any}]?X/8
- XYZ
- 0: X
- ** Failers
-No match
- AXYZ
-No match
- \x{1234}XYZ
-No match
- ABXYZ
-No match
-
-/^[\p{Any}]+X/8
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- A\x{1234}XYZ
- 0: A\x{1234}X
- ** Failers
-No match
- XYZ
-No match
-
-/^[\P{Any}]+X/8
- ** Failers
-No match
- AXYZ
-No match
- \x{1234}XYZ
-No match
- A\x{1234}XYZ
-No match
- XYZ
-No match
-
-/^[\p{Any}]*X/8
- XYZ
- 0: X
- AXYZ
- 0: AX
- \x{1234}XYZ
- 0: \x{1234}X
- A\x{1234}XYZ
- 0: A\x{1234}X
- ** Failers
-No match
-
-/^[\P{Any}]*X/8
- XYZ
- 0: X
- ** Failers
-No match
- AXYZ
-No match
- \x{1234}XYZ
-No match
- A\x{1234}XYZ
-No match
-
-/^\p{Any}{3,5}?/8
- abcdefgh
- 0: abc
- \x{1234}\n\r\x{3456}xyz
- 0: \x{1234}\x{0a}\x{0d}
-
-/^\p{Any}{3,5}/8
- abcdefgh
- 0: abcde
- \x{1234}\n\r\x{3456}xyz
- 0: \x{1234}\x{0a}\x{0d}\x{3456}x
-
-/^\P{Any}{3,5}?/8
- ** Failers
-No match
- abcdefgh
-No match
- \x{1234}\n\r\x{3456}xyz
-No match
-
-/^\p{L&}X/8
- AXY
- 0: AX
- aXY
- 0: aX
- \x{1c5}XY
- 0: \x{1c5}X
- ** Failers
-No match
- \x{1bb}XY
-No match
- \x{2b0}XY
-No match
- !XY
-No match
-
-/^[\p{L&}]X/8
- AXY
- 0: AX
- aXY
- 0: aX
- \x{1c5}XY
- 0: \x{1c5}X
- ** Failers
-No match
- \x{1bb}XY
-No match
- \x{2b0}XY
-No match
- !XY
-No match
-
-/^\p{L&}+X/8
- AXY
- 0: AX
- aXY
- 0: aX
- AbcdeXyz
- 0: AbcdeX
- \x{1c5}AbXY
- 0: \x{1c5}AbX
- abcDEXypqreXlmn
- 0: abcDEXypqreX
- ** Failers
-No match
- \x{1bb}XY
-No match
- \x{2b0}XY
-No match
- !XY
-No match
-
-/^[\p{L&}]+X/8
- AXY
- 0: AX
- aXY
- 0: aX
- AbcdeXyz
- 0: AbcdeX
- \x{1c5}AbXY
- 0: \x{1c5}AbX
- abcDEXypqreXlmn
- 0: abcDEXypqreX
- ** Failers
-No match
- \x{1bb}XY
-No match
- \x{2b0}XY
-No match
- !XY
-No match
-
-/^\p{L&}+?X/8
- AXY
- 0: AX
- aXY
- 0: aX
- AbcdeXyz
- 0: AbcdeX
- \x{1c5}AbXY
- 0: \x{1c5}AbX
- abcDEXypqreXlmn
- 0: abcDEX
- ** Failers
-No match
- \x{1bb}XY
-No match
- \x{2b0}XY
-No match
- !XY
-No match
-
-/^[\p{L&}]+?X/8
- AXY
- 0: AX
- aXY
- 0: aX
- AbcdeXyz
- 0: AbcdeX
- \x{1c5}AbXY
- 0: \x{1c5}AbX
- abcDEXypqreXlmn
- 0: abcDEX
- ** Failers
-No match
- \x{1bb}XY
-No match
- \x{2b0}XY
-No match
- !XY
-No match
-
-/^\P{L&}X/8
- !XY
- 0: !X
- \x{1bb}XY
- 0: \x{1bb}X
- \x{2b0}XY
- 0: \x{2b0}X
- ** Failers
-No match
- \x{1c5}XY
-No match
- AXY
-No match
-
-/^[\P{L&}]X/8
- !XY
- 0: !X
- \x{1bb}XY
- 0: \x{1bb}X
- \x{2b0}XY
- 0: \x{2b0}X
- ** Failers
-No match
- \x{1c5}XY
-No match
- AXY
-No match
-
-/^(\p{Z}[^\p{C}\p{Z}]+)*$/
- \xa0!
- 0: \xa0!
- 1: \xa0!
-
-/^[\pL](abc)(?1)/
- AabcabcYZ
- 0: Aabcabc
- 1: abc
-
-/([\pL]=(abc))*X/
- L=abcX
- 0: L=abcX
- 1: L=abc
- 2: abc
-
-/The next two should be Perl-compatible, but it fails to match \x{e0}. PCRE
-will match it only with UCP support, because without that it has no notion
-of case for anything other than the ASCII letters. /
-
-/((?i)[\x{c0}])/8
- \x{c0}
- 0: \x{c0}
- 1: \x{c0}
- \x{e0}
- 0: \x{e0}
- 1: \x{e0}
-
-/(?i:[\x{c0}])/8
- \x{c0}
- 0: \x{c0}
- \x{e0}
- 0: \x{e0}
-
-/^\p{Balinese}\p{Cuneiform}\p{Nko}\p{Phags_Pa}\p{Phoenician}/8
- \x{1b00}\x{12000}\x{7c0}\x{a840}\x{10900}
- 0: \x{1b00}\x{12000}\x{7c0}\x{a840}\x{10900}
-
-/The next two are special cases where the lengths of the different cases of the
-same character differ. The first went wrong with heap frame storage; the 2nd
-was broken in all cases./
-
-/^\x{023a}+?(\x{0130}+)/8i
- \x{023a}\x{2c65}\x{0130}
- 0: \x{23a}\x{2c65}\x{130}
- 1: \x{130}
-
-/^\x{023a}+([^X])/8i
- \x{023a}\x{2c65}X
- 0: \x{23a}\x{2c65}
- 1: \x{2c65}
-
-/Check property support in non-UTF-8 mode/
-
-/\p{L}{4}/
- 123abcdefg
- 0: abcd
- 123abc\xc4\xc5zz
- 0: abc\xc4
-
-/\X{1,3}\d/
- \x8aBCD
-No match
-
-/\X?\d/
- \x8aBCD
-No match
-
-/\P{L}?\d/
- \x8aBCD
-No match
-
-/[\PPP\x8a]{1,}\x80/
- A\x80
- 0: A\x80
-
-/(?:[\PPa*]*){8,}/
-
-/[\P{Any}]/BZ
-------------------------------------------------------------------
- Bra
- [\P{Any}]
- Ket
- End
-------------------------------------------------------------------
-
-/[\P{Any}\E]/BZ
-------------------------------------------------------------------
- Bra
- [\P{Any}]
- Ket
- End
-------------------------------------------------------------------
-
-/(\P{Yi}+\277)/
-
-/(\P{Yi}+\277)?/
-
-/(?<=\P{Yi}{3}A)X/
-
-/\p{Yi}+(\P{Yi}+)(?1)/
-
-/(\P{Yi}{2}\277)?/
-
-/[\P{Yi}A]/
-
-/[\P{Yi}\P{Yi}\P{Yi}A]/
-
-/[^\P{Yi}A]/
-
-/[^\P{Yi}\P{Yi}\P{Yi}A]/
-
-/(\P{Yi}*\277)*/
-
-/(\P{Yi}*?\277)*/
-
-/(\p{Yi}*+\277)*/
-
-/(\P{Yi}?\277)*/
-
-/(\P{Yi}??\277)*/
-
-/(\p{Yi}?+\277)*/
-
-/(\P{Yi}{0,3}\277)*/
-
-/(\P{Yi}{0,3}?\277)*/
-
-/(\p{Yi}{0,3}+\277)*/
-
/ End of testinput6 /
Modified: httpd/httpd/vendor/pcre/current/ucp.h
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/ucp.h?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/ucp.h (original)
+++ httpd/httpd/vendor/pcre/current/ucp.h Mon Nov 26 09:04:19 2007
@@ -1,16 +1,8 @@
/*************************************************
-* Unicode Property Table handler *
+* libucp - Unicode Property Table handler *
*************************************************/
-#ifndef _UCP_H
-#define _UCP_H
-
-/* This file contains definitions of the property values that are returned by
-the function _pcre_ucp_findprop(). New values that are added for new releases
-of Unicode should always be at the end of each enum, for backwards
-compatibility. */
-
-/* These are the general character categories. */
+/* These are the character categories that are returned by ucp_findchar */
enum {
ucp_C, /* Other */
@@ -22,7 +14,7 @@
ucp_Z /* Separator */
};
-/* These are the particular character types. */
+/* These are the detailed character types that are returned by ucp_findchar */
enum {
ucp_Cc, /* Control */
@@ -57,77 +49,10 @@
ucp_Zs /* Space separator */
};
-/* These are the script identifications. */
-
-enum {
- ucp_Arabic,
- ucp_Armenian,
- ucp_Bengali,
- ucp_Bopomofo,
- ucp_Braille,
- ucp_Buginese,
- ucp_Buhid,
- ucp_Canadian_Aboriginal,
- ucp_Cherokee,
- ucp_Common,
- ucp_Coptic,
- ucp_Cypriot,
- ucp_Cyrillic,
- ucp_Deseret,
- ucp_Devanagari,
- ucp_Ethiopic,
- ucp_Georgian,
- ucp_Glagolitic,
- ucp_Gothic,
- ucp_Greek,
- ucp_Gujarati,
- ucp_Gurmukhi,
- ucp_Han,
- ucp_Hangul,
- ucp_Hanunoo,
- ucp_Hebrew,
- ucp_Hiragana,
- ucp_Inherited,
- ucp_Kannada,
- ucp_Katakana,
- ucp_Kharoshthi,
- ucp_Khmer,
- ucp_Lao,
- ucp_Latin,
- ucp_Limbu,
- ucp_Linear_B,
- ucp_Malayalam,
- ucp_Mongolian,
- ucp_Myanmar,
- ucp_New_Tai_Lue,
- ucp_Ogham,
- ucp_Old_Italic,
- ucp_Old_Persian,
- ucp_Oriya,
- ucp_Osmanya,
- ucp_Runic,
- ucp_Shavian,
- ucp_Sinhala,
- ucp_Syloti_Nagri,
- ucp_Syriac,
- ucp_Tagalog,
- ucp_Tagbanwa,
- ucp_Tai_Le,
- ucp_Tamil,
- ucp_Telugu,
- ucp_Thaana,
- ucp_Thai,
- ucp_Tibetan,
- ucp_Tifinagh,
- ucp_Ugaritic,
- ucp_Yi,
- ucp_Balinese, /* New for Unicode 5.0.0 */
- ucp_Cuneiform, /* New for Unicode 5.0.0 */
- ucp_Nko, /* New for Unicode 5.0.0 */
- ucp_Phags_Pa, /* New for Unicode 5.0.0 */
- ucp_Phoenician /* New for Unicode 5.0.0 */
-};
+/* For use in PCRE we make this function static so that there is no conflict if
+PCRE is linked with an application that makes use of an external version -
+assuming an external version is ever released... */
-#endif
+static int ucp_findchar(const int, int *, int *);
/* End of ucp.h */
Modified: httpd/httpd/vendor/pcre/current/ucpinternal.h
URL: http://svn.apache.org/viewvc/httpd/httpd/vendor/pcre/current/ucpinternal.h?rev=598343&r1=598342&r2=598343&view=diff
==============================================================================
--- httpd/httpd/vendor/pcre/current/ucpinternal.h (original)
+++ httpd/httpd/vendor/pcre/current/ucpinternal.h Mon Nov 26 09:04:19 2007
@@ -1,92 +1,91 @@
/*************************************************
-* Unicode Property Table handler *
+* libucp - Unicode Property Table handler *
*************************************************/
-#ifndef _UCPINTERNAL_H
-#define _UCPINTERNAL_H
-
-/* Internal header file defining the layout of the bits in each pair of 32-bit
-words that form a data item in the table. */
+/* Internal header file defining the layout of compact nodes in the tree. */
typedef struct cnode {
- pcre_uint32 f0;
- pcre_uint32 f1;
+ unsigned short int f0;
+ unsigned short int f1;
+ unsigned short int f2;
} cnode;
/* Things for the f0 field */
-#define f0_scriptmask 0xff000000 /* Mask for script field */
-#define f0_scriptshift 24 /* Shift for script value */
-#define f0_rangeflag 0x00f00000 /* Flag for a range item */
-#define f0_charmask 0x001fffff /* Mask for code point value */
-
-/* Things for the f1 field */
-
-#define f1_typemask 0xfc000000 /* Mask for char type field */
-#define f1_typeshift 26 /* Shift for the type field */
-#define f1_rangemask 0x0000ffff /* Mask for a range offset */
-#define f1_casemask 0x0000ffff /* Mask for a case offset */
-#define f1_caseneg 0xffff8000 /* Bits for negation */
-
-/* The data consists of a vector of structures of type cnode. The two unsigned
-32-bit integers are used as follows:
-
-(f0) (1) The most significant byte holds the script number. The numbers are
- defined by the enum in ucp.h.
-
- (2) The 0x00800000 bit is set if this entry defines a range of characters.
- It is not set if this entry defines a single character
-
- (3) The 0x00600000 bits are spare.
-
- (4) The 0x001fffff bits contain the code point. No Unicode code point will
- ever be greater than 0x0010ffff, so this should be OK for ever.
-
-(f1) (1) The 0xfc000000 bits contain the character type number. The numbers are
- defined by an enum in ucp.h.
-
- (2) The 0x03ff0000 bits are spare.
-
- (3) The 0x0000ffff bits contain EITHER the unsigned offset to the top of
- range if this entry defines a range, OR the *signed* offset to the
- character's "other case" partner if this entry defines a single
- character. There is no partner if the value is zero.
-
--------------------------------------------------------------------------------
-| script (8) |.|.|.| codepoint (21) || type (6) |.|.| spare (8) | offset (16) |
--------------------------------------------------------------------------------
- | | | | |
- | | |-> spare | |-> spare
- | | |
- | |-> spare |-> spare
- |
- |-> range flag
+#define f0_leftexists 0x8000 /* Left child exists */
+#define f0_typemask 0x3f00 /* Type bits */
+#define f0_typeshift 8 /* Type shift */
+#define f0_chhmask 0x00ff /* Character high bits */
+
+/* Things for the f2 field */
+
+#define f2_rightmask 0xf000 /* Mask for right offset bits */
+#define f2_rightshift 12 /* Shift for right offset */
+#define f2_casemask 0x0fff /* Mask for case offset */
+
+/* The tree consists of a vector of structures of type cnode, with the root
+node as the first element. The three short ints (16-bits) are used as follows:
+
+(f0) (1) The 0x8000 bit of f0 is set if a left child exists. The child's node
+ is the next node in the vector.
+ (2) The 0x4000 bit of f0 is spare.
+ (3) The 0x3f00 bits of f0 contain the character type; this is a number
+ defined by the enumeration in ucp.h (e.g. ucp_Lu).
+ (4) The bottom 8 bits of f0 contain the most significant byte of the
+ character's 24-bit codepoint.
+
+(f1) (1) The f1 field contains the two least significant bytes of the
+ codepoint.
+
+(f2) (1) The 0xf000 bits of f2 contain zero if there is no right child of this
+ node. Otherwise, they contain one plus the exponent of the power of
+ two of the offset to the right node (e.g. a value of 3 means 8). The
+ units of the offset are node items.
+
+ (2) The 0x0fff bits of f2 contain the signed offset from this character to
+ its alternate cased value. They are zero if there is no such
+ character.
+
+
+-----------------------------------------------------------------------------
+||.|.| type (6) | ms char (8) || ls char (16) ||....| case offset (12) ||
+-----------------------------------------------------------------------------
+ | | |
+ | |-> spare |
+ | exponent of right
+ |-> left child exists child offset
+
The upper/lower casing information is set only for characters that come in
-pairs. The non-one-to-one mappings in the Unicode data are ignored.
+pairs. There are (at present) four non-one-to-one mappings in the Unicode data.
+These are ignored. They are:
+
+ 1FBE Greek Prosgegrammeni (lower, with upper -> capital iota)
+ 2126 Ohm
+ 212A Kelvin
+ 212B Angstrom
-When searching the data, proceed as follows:
+Certainly for the last three, having an alternate case would seem to be a
+mistake. I don't know any Greek, so cannot comment on the first one.
-(1) Set up for a binary chop search.
-(2) If the top is not greater than the bottom, the character is not in the
- table. Its type must therefore be "Cn" ("Undefined").
+When searching the tree, proceed as follows:
-(3) Find the middle vector element.
+(1) Start at the first node.
-(4) Extract the code point and compare. If equal, we are done.
+(2) Extract the character value from f1 and the bottom 8 bits of f0;
-(5) If the test character is smaller, set the top to the current point, and
- goto (2).
+(3) Compare with the character being sought. If equal, we are done.
-(6) If the current entry defines a range, compute the last character by adding
- the offset, and see if the test character is within the range. If it is,
- we are done.
+(4) If the test character is smaller, inspect the f0_leftexists flag. If it is
+ not set, the character is not in the tree. If it is set, move to the next
+ node, and go to (2).
-(7) Otherwise, set the bottom to one element past the current point and goto
- (2).
+(5) If the test character is bigger, extract the f2_rightmask bits from f2, and
+ shift them right by f2_rightshift. If the result is zero, the character is
+ not in the tree. Otherwise, skip forward by 1 shifted left by (this number
+ minus one) nodes, and go to (2).
*/
-#endif /* _UCPINTERNAL_H */
-/* End of ucpinternal.h */
+/* End of internal.h */
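
The tree search described in the added comment can be sketched in C as follows. This is an illustrative reading of that comment, not the contents of ucp.c. The helper names (find_node, other_case, node_type), the three-node demo_tree and its DEMO_ type numbers are assumptions made up for the example; only the cnode layout and the f0_/f2_ macros are taken from the header above.

```c
#include <stddef.h>

typedef struct cnode {
  unsigned short int f0;
  unsigned short int f1;
  unsigned short int f2;
} cnode;

#define f0_leftexists 0x8000 /* Left child exists */
#define f0_typemask   0x3f00 /* Type bits */
#define f0_typeshift  8      /* Type shift */
#define f0_chhmask    0x00ff /* Character high bits */
#define f2_rightmask  0xf000 /* Mask for right offset bits */
#define f2_rightshift 12     /* Shift for right offset */
#define f2_casemask   0x0fff /* Mask for case offset */

/* Walk the tree from the root; return the node for c, or NULL if absent. */
static const cnode *find_node(const cnode *tree, unsigned long c)
{
  const cnode *node = tree;
  for (;;) {
    /* Code point = bottom 8 bits of f0 (high byte) followed by f1. */
    unsigned long ch = ((unsigned long)(node->f0 & f0_chhmask) << 16) | node->f1;
    if (c == ch) return node;
    if (c < ch) {
      if ((node->f0 & f0_leftexists) == 0) return NULL;
      node++;                          /* left child is the next node */
    } else {
      int exponent = (node->f2 & f2_rightmask) >> f2_rightshift;
      if (exponent == 0) return NULL;
      node += 1 << (exponent - 1);     /* right child is 2^(exponent-1) nodes on */
    }
  }
}

/* Character type number stored in the 0x3f00 bits of f0. */
static int node_type(const cnode *node)
{
  return (node->f0 & f0_typemask) >> f0_typeshift;
}

/* Other-case partner of c, or -1 if c is absent or has no partner. */
static long other_case(const cnode *tree, unsigned long c)
{
  const cnode *node = find_node(tree, c);
  long offset;
  if (node == NULL) return -1;
  offset = node->f2 & f2_casemask;
  if (offset == 0) return -1;
  if (offset & 0x0800) offset -= 0x1000;  /* sign-extend the 12-bit offset */
  return (long)c + offset;
}

/* Hypothetical demo data: placeholder type numbers and a three-node tree
   holding 'A' (root), '5' (left child) and 'a' (right child). */
enum { DEMO_Ll = 5, DEMO_Lu = 9, DEMO_Nd = 13 };

static const cnode demo_tree[] = {
  { f0_leftexists | (DEMO_Lu << f0_typeshift), 0x0041, (2 << f2_rightshift) | 0x020 },
  { DEMO_Nd << f0_typeshift, 0x0035, 0 },
  { DEMO_Ll << f0_typeshift, 0x0061, 0x0fe0 }
};
```

With this demo tree, a lookup of 0x61 takes the right branch from the root (exponent 2, so two nodes forward), and other_case(demo_tree, 0x61) yields 0x41 through the sign-extended 12-bit offset. A lookup of 0x42 goes right to the 'a' node, finds no left child there, and reports the character as absent.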