Posted to issues@commons.apache.org by "Sorin Postelnicu (JIRA)" <ji...@apache.org> on 2015/04/09 19:13:13 UTC

[jira] [Comment Edited] (VALIDATOR-361) UrlValidator rejects new gTLDs with more than 4 characters,

    [ https://issues.apache.org/jira/browse/VALIDATOR-361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487677#comment-14487677 ] 

Sorin Postelnicu edited comment on VALIDATOR-361 at 4/9/15 5:12 PM:
--------------------------------------------------------------------

By inspecting the source code of DomainValidator (which is the common component for validating domain names, used by UrlValidator and EmailValidator), we can see the following possibilities for improvement:

1) The list of generic TLDs is hard-coded in the source, which is hard to maintain: every time ICANN updates the list of generic TLDs, the commons-validator library needs to be updated as well.

2) The class is defined as a singleton with a private constructor, which prevents a programmer who runs into this problem from subclassing it and overriding isValidGenericTld() as a workaround.

3) The first and simplest improvement is to replace the private constructor with a protected constructor (similar to the one in EmailValidator):
{code}
    /**
     * Protected constructor for subclasses to use.
     *
     * @param allowLocal Should local addresses be considered valid?
     */
    protected DomainValidator(boolean allowLocal) {
        this.allowLocal = allowLocal;
    }
{code}

4) The next step would be to extract the hard-coded list of generic TLDs into a separate type called DomainValidatorGenericTldsLoader. Preferably this would be an interface, with the simplest implementation (DomainValidatorGenericTldsLoaderHardCodedImpl) returning the current hard-coded list of values.

5) After that, the list of generic TLDs could be moved into a separate file (possibly a .txt file located in the classpath at /org/apache/commons/validator/routines/DomainValidatorGenericTLDs.txt). Another implementation of the DomainValidatorGenericTldsLoader interface would load (and sort) the list from that file when DomainValidator is initialized.

6) Another possibility is to replace the singleton pattern with a Dependency-Injection pattern, in which the user of DomainValidator injects the DomainValidatorGenericTldsLoader and is free to use any implementation (DomainValidatorGenericTldsLoaderHardCodedImpl, DomainValidatorGenericTldsLoaderTextFileImpl, or a custom one).
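
To make points 4) to 6) more concrete, here is a rough sketch of how the loader interface and its two implementations could look. None of this exists in commons-validator: the interface and implementation names are the ones proposed above, InjectableDomainValidator is a hypothetical stand-in for DomainValidator that models only the generic TLD check, the hard-coded list is shortened for illustration, and the text loader takes a Reader (which in practice might wrap a classpath resource) and assumes one TLD per line with '#' marking comment lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical interface from point 4: supplies the generic TLDs. */
interface DomainValidatorGenericTldsLoader {
    Set<String> loadGenericTlds();
}

/** Simplest implementation: a hard-coded list (shortened here for illustration). */
class DomainValidatorGenericTldsLoaderHardCodedImpl implements DomainValidatorGenericTldsLoader {
    private static final String[] GENERIC_TLDS = {"com", "info", "museum", "net", "org"};

    @Override
    public Set<String> loadGenericTlds() {
        return new HashSet<>(Arrays.asList(GENERIC_TLDS));
    }
}

/** Point 5: reads one TLD per line; blank lines and '#' comment lines are skipped (assumed format). */
class DomainValidatorGenericTldsLoaderTextFileImpl implements DomainValidatorGenericTldsLoader {
    private final Reader source;

    DomainValidatorGenericTldsLoaderTextFileImpl(Reader source) {
        this.source = source;
    }

    @Override
    public Set<String> loadGenericTlds() {
        Set<String> tlds = new TreeSet<>(); // TreeSet keeps the entries sorted
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String tld = line.trim().toLowerCase(Locale.ENGLISH);
                if (!tld.isEmpty() && !tld.startsWith("#")) {
                    tlds.add(tld);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tlds;
    }
}

/** Point 6: stand-in for DomainValidator (NOT the real class) with an injected loader. */
class InjectableDomainValidator {
    private final Set<String> genericTlds;

    InjectableDomainValidator(DomainValidatorGenericTldsLoader loader) {
        // The list is loaded once, when the validator is constructed.
        this.genericTlds = Collections.unmodifiableSet(loader.loadGenericTlds());
    }

    boolean isValidGenericTld(String tld) {
        if (tld == null) {
            return false;
        }
        // Accept both "tokyo" and ".tokyo", case-insensitively.
        String bare = tld.startsWith(".") ? tld.substring(1) : tld;
        return genericTlds.contains(bare.toLowerCase(Locale.ENGLISH));
    }
}
```

With something like this, a user who needs a newer gTLD such as "tokyo" could pass in a text-based or custom loader instead of waiting for a new release, while the hard-coded implementation would keep the current behaviour as the default.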



> UrlValidator rejects new gTLDs with more than 4 characters, 
> ------------------------------------------------------------
>
>                 Key: VALIDATOR-361
>                 URL: https://issues.apache.org/jira/browse/VALIDATOR-361
>             Project: Commons Validator
>          Issue Type: Bug
>    Affects Versions: 1.4.1 Release
>            Reporter: Hiroyuki, Ohnaka
>
> org.apache.commons.validator.UrlValidator#isValid rejects TLDs with more than 4 characters (for example, http://hello.tokyo/ ).
> Many new gTLDs have more than 4 characters ( http://www.icann.org/registries/listing.html ), and these domains cannot pass URL validation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)