Posted to dev@commons.apache.org by Sven Schliesing <sc...@subshell.com> on 2004/04/20 11:40:38 UTC

validator: email-validation not accepting german "umlaute"

As of version 1.1.1, Jakarta Commons-Validator does not accept German 
umlauts ("Umlaute") in domain names such as müller.de or münchen.de.

Is this a known issue in validator or might this be a setting in Struts?

Thanks!

Sven Schliesing

PS: Umlauts cause no problems with Struts in other parts of the application.


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: validator: email-validation not accepting german "umlaute"

Posted by Robert Leland <rl...@apache.org>.
Michael Davey wrote:

>
> So, there are at least three things going on here:
>
> 1.  The testcase above should fail in validator, but it doesn't 
> because the validation check isn't good enough.
> 2.  validator doesn't support punycode and doesn't support the 
> quoted-printable unicode encoding mechanism used in email addresses.
> 3.  The problem you describe in your emails.
>
> [1] could be fixed easily enough.  [2] could be fixed by enhancing 
> validator.  Your testcase shows that the problem isn't with validator, 
> so [1] and [2] are not really of consequence to you right now, but 
> they have got my interest.  After re-reading your original mail and 
> your latest mail together, I don't understand exactly what it is you 
> are trying to achieve - could you demonstrate with some code or 
> describe how you would demonstrate the problem to me?


If you look at stripComments() in EmailValidator.java, there was an 
attempt to strip email comments based on a very successful Perl 
script (Mail::RFC822::Address). Unfortunately it's only a partial 
translation. If you are familiar with Perl's =~ operator and can 
translate that to Java (using ORO) and to JavaScript, then you'll have 
part of the solution to [2].


>
> If you are now fairly sure that the problem lies within Struts, it may 
> be beneficial to post to the struts mailing list and copy me personally.
>
> Cheers,

-Rob




Re: validator: email-validation not accepting german "umlaute"

Posted by Michael Davey <Mi...@coderage.org>.
Sven Schliesing wrote:

> I wrote a test to make sure where the problem is:
>
> public class ValidatorTest extends TestCase {
>     public void testEmail() {
>         EmailValidator emailValidator = EmailValidator.getInstance();
>         boolean result = emailValidator.isValid("test@müller.de");
>         assertTrue("invalid email", result);
>     }
> }
>
> Runs with success. So the address "test@müller.de" is validated by the 
> EmailValidator with success.
>
> Seems that the problem is with struts. I also explicitly set the 
> charset in the struts-config:
>
> <controller contentType="text/html;charset=iso-8859-1" 
> processorClass="org.apache.struts.action.RequestProcessor" />
>
> No change.
>
> Any other ideas?

So, there are at least three things going on here:

1.  The testcase above should fail in validator, but it doesn't because 
the validation check isn't good enough.
2.  validator doesn't support punycode and doesn't support the 
quoted-printable unicode encoding mechanism used in email addresses.
3.  The problem you describe in your emails.

[1] could be fixed easily enough.  [2] could be fixed by enhancing 
validator.  Your testcase shows that the problem isn't with validator, 
so [1] and [2] are not really of consequence to you right now, but they 
have got my interest.  After re-reading your original mail and your 
latest mail together, I don't understand exactly what it is you are 
trying to achieve - could you demonstrate with some code or describe how 
you would demonstrate the problem to me?

If you are now fairly sure that the problem lies within Struts, it may 
be beneficial to post to the struts mailing list and copy me personally.

Cheers,
-- 
Michael




Re: validator: email-validation not accepting german "umlaute"

Posted by Sven Schliesing <sc...@subshell.com>.
> Runs with success. So the address "test@müller.de" is validated by the 
> EmailValidator with success.

Never mind, the charset was set wrongly in Eclipse.
Thanks anyway.
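
For anyone who hits the same thing: the thread does not say which encodings were involved, so the following is only a sketch of one possible mismatch (UTF-8 bytes read back as ISO-8859-1). The class and method names are made up for illustration. It shows that the string the validator receives can differ from the one typed in the editor, and writing the umlaut as the escape \u00fc keeps the test independent of the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {
    /** Encodes text as UTF-8, then decodes the bytes as ISO-8859-1 (a deliberate mismatch). */
    public static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00fc is the letter u-umlaut, written as an escape so the file encoding does not matter.
        String intended = "test@m\u00fcller.de";
        String seen = misread(intended);
        // The single umlaut becomes two characters, so the strings differ in length.
        System.out.println(intended.length() + " -> " + seen.length()); // prints "14 -> 15"
        System.out.println(intended.equals(seen));                      // prints "false"
    }
}
```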


Sven




Re: validator: email-validation not accepting german "umlaute"

Posted by Sven Schliesing <sc...@subshell.com>.
I wrote a test to pin down where the problem is:

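import junit.framework.TestCase;
import org.apache.commons.validator.EmailValidator;

public class ValidatorTest extends TestCase {
    // Checks whether EmailValidator accepts an address with an umlaut in the domain.
    public void testEmail() {
        EmailValidator emailValidator = EmailValidator.getInstance();
        boolean result = emailValidator.isValid("test@müller.de");
        assertTrue("invalid email", result);
    }
}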

The test passes, so EmailValidator accepts the address 
"test@müller.de".

It seems the problem is with Struts. I also explicitly set the charset 
in the struts-config:

<controller contentType="text/html;charset=iso-8859-1" 
processorClass="org.apache.struts.action.RequestProcessor" />

No change.

Any other ideas?


Michael Davey wrote:
> Valid domain names must contain only the characters a-z, A-Z, 0-9, "."
> and "-".  They must start with a letter and end with a letter or
> digit.  The "." symbol is used exclusively to separate 
> subdomains (see RFC 1035 section 2.3.1 
> <http://www.ietf.org/rfc/rfc1035.txt>).
> 
> To support internationalised domain names (IDN), both the client and the 
> server must be punycode aware.  Punycode is a fairly new standards 
> proposal (rfc3492) that encodes non-ascii characters into an ascii 
> string, prefixed with "xn--".  For instance, müller.de is encoded as 
> xn--mller-kva.de.
> 
> <http://www.faqs.org/rfcs/rfc3492.html>
> <http://www.afilias.info/cgi-bin/convert_punycode.cgi>
> 
> Commons-Validator would need to be made Punycode-aware to achieve what 
> you need, or alternatively, you could do the punycode translation in 
> your own code, before passing the string to validator.
> 




Re: validator: email-validation not accepting german "umlaute"

Posted by Michael Davey <Mi...@coderage.org>.
Sven Schliesing wrote:

> As of version 1.1.1, Jakarta Commons-Validator does not accept 
> German umlauts ("Umlaute") in domain names such as müller.de or 
> münchen.de.
>
> Is this a known issue in validator or might this be a setting in Struts?

Valid domain names must contain only the characters a-z, A-Z, 0-9, "." 
and "-".  They must start with a letter and end with a letter or 
digit.  The "." symbol is used exclusively to separate 
subdomains (see RFC 1035 section 2.3.1 
<http://www.ietf.org/rfc/rfc1035.txt>).
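
As a rough illustration of the rule just described, here is a minimal sketch that applies it to each dot-separated label. It is not how Commons-Validator checks domains; the class and method names are hypothetical, and length limits are left out.

```java
import java.util.regex.Pattern;

public class DomainSyntax {
    // One label: starts with a letter, ends with a letter or digit,
    // with letters, digits, and hyphens allowed in between.
    private static final String LABEL = "[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

    // A domain: one or more labels separated by single dots.
    private static final Pattern DOMAIN =
            Pattern.compile(LABEL + "(?:\\." + LABEL + ")*");

    /** Returns true if the whole string follows the ASCII-only rule above. */
    public static boolean isAsciiDomain(String domain) {
        return DOMAIN.matcher(domain).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAsciiDomain("muller.de"));        // prints "true"
        System.out.println(isAsciiDomain("m\u00fcller.de"));   // prints "false"
    }
}
```

Under this reading, a name containing an umlaut is rejected simply because "ü" is outside the permitted character set.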

To support internationalised domain names (IDN), both the client and the 
server must be Punycode-aware.  Punycode is a fairly new standards 
proposal (RFC 3492) that encodes non-ASCII characters into an ASCII 
string, prefixed with "xn--".  For instance, müller.de is encoded as 
xn--mller-kva.de.

<http://www.faqs.org/rfcs/rfc3492.html>
<http://www.afilias.info/cgi-bin/convert_punycode.cgi>

Commons-Validator would need to be made Punycode-aware to achieve what 
you need, or alternatively, you could do the Punycode translation in 
your own code, before passing the string to validator.
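
The second option might look something like the sketch below. It assumes a JDK that ships java.net.IDN (Java 6 or later), which is an assumption about your environment and not something Commons-Validator provides. The class and method names are hypothetical. Only the domain part is converted; the local part is left untouched.

```java
import java.net.IDN;

public class IdnEmail {
    /** Converts the domain part of an address to its ASCII (Punycode) form. */
    public static String toAsciiAddress(String address) {
        int at = address.lastIndexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("no '@' in address: " + address);
        }
        String local = address.substring(0, at);
        String domain = address.substring(at + 1);
        // IDN.toASCII converts each non-ASCII label to its "xn--" form.
        return local + "@" + IDN.toASCII(domain);
    }

    public static void main(String[] args) {
        // \u00fc is the letter u-umlaut; the escape avoids source-encoding surprises.
        System.out.println(toAsciiAddress("test@m\u00fcller.de")); // prints "test@xn--mller-kva.de"
    }
}
```

The converted string could then be handed to the validator in place of the original address.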

-- 
Michael

