Posted to dev@turbine.apache.org by Jindrich Vimr <vi...@hsf.cz> on 2006/01/10 12:25:34 UTC

[PATCH] - Intake - unicode regexps

Hi,

for our Turbine-based application we started using Intake for form field
validation. Our customers enter data in non-Latin scripts, and we were
unable to validate those fields with Intake.
Intake currently uses ORO for regexp validation, but ORO doesn't handle
the Unicode Perl 5 regexps (such as \p{L}, which matches a letter in any
script, not only A-Za-z).
Sun's java.util.regex package does implement these Unicode regexps, so
I decided to rewrite the StringValidator in Intake to use the
java.util.regex classes instead of ORO. The patch against turbine-2.3.2
is attached below. We are still testing this patch internally, but so far
it seems to work OK. With it, Turbine can handle Unicode input checked by
Intake.
If anyone has objections to choosing Sun's java.util.regex for
this patch, drop me a note. The only downside of choosing
java.util.regex is the dependency on JDK 1.4 or higher.
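
To illustrate the difference, here is a minimal standalone sketch (not part
of the patch) comparing a \p{L}-based mask with an ASCII-only one under
java.util.regex. The class name UnicodeMaskDemo and the sample strings are
made up for illustration; non-ASCII characters are written as \u escapes so
the source compiles regardless of file encoding.

```java
import java.util.regex.Pattern;

public class UnicodeMaskDemo {
    // One or more letters from any script; \p{L} is the Unicode "Letter" category.
    static final Pattern UNICODE_WORD = Pattern.compile("\\p{L}+");

    // ASCII-only mask, for comparison.
    static final Pattern ASCII_WORD = Pattern.compile("[A-Za-z]+");

    static boolean isUnicodeWord(String s) {
        // matches() requires the whole input to match the pattern.
        return UNICODE_WORD.matcher(s).matches();
    }

    static boolean isAsciiWord(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String czech = "Plze\u0148";                              // Czech city name with n-caron
        String cyrillic = "\u041c\u043e\u0441\u043a\u0432\u0430"; // a Cyrillic word

        System.out.println(isUnicodeWord(czech));    // accepted by the Unicode mask
        System.out.println(isAsciiWord(czech));      // rejected by the ASCII mask
        System.out.println(isUnicodeWord(cyrillic)); // accepted by the Unicode mask
    }
}
```

Note that in a Java string literal the backslash has to be doubled
("\\p{L}"), while in an Intake mask read from a configuration file it would
normally be written once.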



Jindrich Vimr

the patch:
---------------
vimr@bene2:/var/data/src/java/turbine> diff \
  turbine-unchanged/turbine-2.3.2/src/java/org/apache/turbine/services/intake/validator/StringValidator.java \
  turbine-2.3.2/src/java/org/apache/turbine/services/intake/validator/StringValidator.java

19a20,22
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
> import java.util.regex.PatternSyntaxException;
23,26d25
< import org.apache.oro.text.regex.MalformedPatternException;
< import org.apache.oro.text.regex.Pattern;
< import org.apache.oro.text.regex.Perl5Compiler;
< import org.apache.oro.text.regex.Perl5Matcher;
43a43
>  * @author <a href="mailto:vimr@hsf.cz">Jindrich Vimr HSF.cz</a> - patch to use java.util.regex
52c52
<     /** The compiled perl5 Regular expression from the ORO Perl5Compiler */
---
>     /** The compiled perl5 Regular expression from the java.util.regex.Pattern */
142,143c142,143
<                 /** perl5 matcher */
<                 Perl5Matcher patternMatcher = new Perl5Matcher();
---
>                 /** java.util.regex.Matcher */
>                 Matcher patternMatcher = maskPattern.matcher(testValue);
146c146
<                         patternMatcher.matches(testValue, maskPattern);
---
>                         patternMatcher.matches();
184d183
<         Perl5Compiler patternCompiler = new Perl5Compiler();
189c188
<         int maskOptions = Perl5Compiler.DEFAULT_MASK;
---
>         //int maskOptions = Perl5Compiler.DEFAULT_MASK;
194c193
<             maskPattern = patternCompiler.compile(maskString, maskOptions);
---
>             maskPattern = Pattern.compile(maskString);
196c195
<         catch (MalformedPatternException mpe)
---
>         catch (PatternSyntaxException mpe)

---------------
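
For readers who prefer to see the result rather than the diff, the following
is a rough standalone sketch of the compile-and-match flow the patch moves
to. It is not the actual StringValidator code: the class MaskCheck, its
method names, and the choice to rethrow as IllegalArgumentException are
assumptions made for illustration only.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical stand-in for the mask handling in a string validator.
public class MaskCheck {
    // Compiled mask, or null when no mask has been configured.
    private Pattern maskPattern;

    // Compile the mask once, up front, as the patch does with Pattern.compile().
    public void setMask(String maskString) {
        try {
            maskPattern = Pattern.compile(maskString);
        } catch (PatternSyntaxException pse) {
            // Replaces ORO's MalformedPatternException in the original code path.
            throw new IllegalArgumentException("Invalid mask: " + maskString, pse);
        }
    }

    // True when no mask is set, or when the whole value matches the mask.
    public boolean isValid(String testValue) {
        if (maskPattern == null) {
            return true;
        }
        // Matcher.matches() tests the entire input, not just a substring.
        return maskPattern.matcher(testValue).matches();
    }
}
```

One point worth checking when porting masks: Matcher.matches() only succeeds
when the entire value matches, so a mask written with a partial-match idea in
mind would need find() instead, or explicit ".*" around it.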

-- 
Jindrich Vimr                                      <vi...@hsf.cz>
HSF Sokolov, spol. s r.o.                  +420 724 293 903
Morseova 3, Plzen                            http://www.hsf.cz

GPG public key: http://shop.hsf.cz/gpg/vimr/gpgpubkey-vimr_at_hsf.cz.gpg



Re: [PATCH] - Intake - unicode regexps

Posted by Jindrich Vimr <vi...@hsf.cz>.
Hi again,

the attachment now contains the same patch, but created using the
"diff -Naur" command:

diff -Naur
turbine-unchanged/turbine-2.3.2/src/java/org/apache/turbine/services/intake/validator/StringValidator.java
turbine-2.3.2-JV-20060109/turbine-2.3.2/src/java/org/apache/turbine/services/intake/validator/StringValidator.java
 > StringValidator.java.JV-20060109.patch


A patch to project.properties that changes the dependency to JDK 1.4 is also attached.

Jindrich Vimr

