Posted to dev@nutch.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2014/01/23 14:07:37 UTC
[jira] [Created] (NUTCH-1711) Normalizer does not encode exclamation mark
Markus Jelsma created NUTCH-1711:
------------------------------------
Summary: Normalizer does not encode exclamation mark
Key: NUTCH-1711
URL: https://issues.apache.org/jira/browse/NUTCH-1711
Project: Nutch
Issue Type: Bug
Affects Versions: 1.7
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Fix For: 1.8
{code}
$ bin/nutch org.apache.nutch.net.URLNormalizerChecker
Checking combination of all URLNormalizers available
http://nutch.apache.org/bla!
http://nutch.apache.org/bla!
{code}
Until just now I had never noticed that many URL encoders do not encode the exclamation mark. SolrCloud uses the character as the delimiter in its composite IDs, so an ID that ends with an exclamation mark causes an error.
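To illustrate the behaviour described above, here is a minimal Java sketch. It assumes nothing about how the Nutch normalizers are implemented: it only shows that the standard `java.net.URI` class leaves `!` untouched in a path, and how a hypothetical helper (`encodeExclamation`, a name made up for this example and not part of Nutch) could percent-encode it as `%21`. The class and method names are illustrative assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ExclamationMarkEncoding {

    // Build a URL from its parts with java.net.URI. The multi-argument
    // constructor quotes characters that are illegal in a path, but '!'
    // is a legal path character, so it is left as-is.
    public static String viaJavaNetUri(String host, String path) {
        try {
            return new URI("http", host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical helper (not part of Nutch): percent-encode every '!'
    // in the given URL string as "%21".
    public static String encodeExclamation(String url) {
        return url.replace("!", "%21");
    }

    public static void main(String[] args) {
        // prints http://nutch.apache.org/bla!
        System.out.println(viaJavaNetUri("nutch.apache.org", "/bla!"));
        // prints http://nutch.apache.org/bla%21
        System.out.println(encodeExclamation("http://nutch.apache.org/bla!"));
    }
}
```

Whether encoding belongs in the normalizer or closer to the indexing step is a design question this sketch does not answer; it only demonstrates the character handling.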
Any thoughts on this?
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)