Posted to dev@nutch.apache.org by "Sebastian Nagel (Jira)" <ji...@apache.org> on 2019/09/23 08:48:00 UTC
[jira] [Updated] (NUTCH-2705) urlfilter-validator rejects IPv6 URLs
[ https://issues.apache.org/jira/browse/NUTCH-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel updated NUTCH-2705:
-----------------------------------
Fix Version/s: 1.17 (was: 1.16)
> urlfilter-validator rejects IPv6 URLs
> -------------------------------------
>
> Key: NUTCH-2705
> URL: https://issues.apache.org/jira/browse/NUTCH-2705
> Project: Nutch
> Issue Type: Bug
> Components: plugin
> Affects Versions: 1.15
> Reporter: Sebastian Nagel
> Priority: Minor
> Fix For: 1.17
>
>
> The plugin urlfilter-validator rejects URLs whose hostname/authority is an IPv6 address literal (written according to [RFC 2732|https://tools.ietf.org/html/rfc2732]):
> {noformat}
> % echo "http://[2010:836B:4179::836B:4179]/" \
> | bin/nutch filterchecker -filterName urlfilter-validator -stdin
> Checking combination of these URLFilters: UrlValidator
> -http://[2010:836B:4179::836B:4179]/
> {noformat}
> We should also consider using the class [UrlValidator|https://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/routines/UrlValidator.html] from commons-validator directly instead of a modified copy. This would let us pick up updates and improvements with little effort. IPv6 is already supported there, see the [class implementation|https://commons.apache.org/proper/commons-validator/apidocs/src-html/org/apache/commons/validator/routines/UrlValidator.html#line.380].
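For illustration, here is a minimal sketch showing that the URL from the report is syntactically valid when parsed with the JDK's own java.net.URI, which accepts bracketed IPv6 literals. This is not the code of urlfilter-validator or of commons-validator. The class name Ipv6UrlCheck and its helper methods are hypothetical and exist only for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Nutch or commons-validator.
public class Ipv6UrlCheck {

    /** Returns the host component of the URL, or null if it cannot be parsed or has no host. */
    static String hostOf(String url) {
        try {
            return new URI(url).getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    /** True if the URL parses and its host is a bracketed IPv6 literal as in RFC 2732. */
    static boolean hasIpv6LiteralHost(String url) {
        String host = hostOf(url);
        // java.net.URI reports IPv6 hosts including the enclosing square brackets.
        return host != null && host.startsWith("[") && host.endsWith("]");
    }

    public static void main(String[] args) {
        String url = "http://[2010:836B:4179::836B:4179]/";
        System.out.println(url + " -> host " + hostOf(url)
                + ", IPv6 literal: " + hasIpv6LiteralHost(url));
    }
}
```

A check like this only covers URI syntax. Whether a URL filter should accept such URLs in a crawl remains a separate policy decision.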