Posted to dev@nutch.apache.org by "stack@archive.org (JIRA)" <ji...@apache.org> on 2005/11/10 23:34:03 UTC
[jira] Commented: (NUTCH-110) OpenSearchServlet outputs illegal xml characters
[ http://issues.apache.org/jira/browse/NUTCH-110?page=comments#action_12357300 ]
stack@archive.org commented on NUTCH-110:
-----------------------------------------
Scrub NUTCH-110-version2.patch. This patch double-encodes certain entities (first in the new toValidXmlText method, then again in the javax.xml.transform.Transformer used by OpenSearchServlet).
Use the original patch, fixIllegalXmlChars.patch, to address the problem described in this issue.
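
To illustrate the double-encoding effect, here is a minimal sketch. It assumes the output is built as a DOM tree and serialized with a default javax.xml.transform.Transformer; the class name DoubleEncodingDemo and the helper serialize are hypothetical and are not taken from either patch. Text that has already been escaped by hand gets escaped a second time by the serializer.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of Nutch.
public class DoubleEncodingDemo {

    // Wraps the given text in a <title> element and serializes it
    // with an identity Transformer, which escapes '&' and '<' itself.
    public static String serialize(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element title = doc.createElement("title");
            title.appendChild(doc.createTextNode(text));
            doc.appendChild(title);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Raw text: the Transformer escapes the ampersand once.
        System.out.println(serialize("Tom & Jerry"));
        // Pre-escaped text: the ampersand of "&amp;" is escaped again,
        // so the output contains "&amp;amp;".
        System.out.println(serialize("Tom &amp; Jerry"));
    }
}
```

Under these assumptions, any escaping done before the text reaches the Transformer is redundant; only characters that the Transformer cannot represent legally need separate handling.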
> OpenSearchServlet outputs illegal xml characters
> ------------------------------------------------
>
> Key: NUTCH-110
> URL: http://issues.apache.org/jira/browse/NUTCH-110
> Project: Nutch
> Type: Bug
> Components: searcher
> Versions: 0.7
> Environment: linux, jdk 1.5
> Reporter: stack@archive.org
> Attachments: NUTCH-110-version2.patch, fixIllegalXmlChars.patch
>
> OpenSearchServlet does not check the text it outputs for illegal XML characters; depending on the search result, it's possible for OSS to output XML that is not well-formed. For example, if the text contains the FF (form feed) character, i.e. the ASCII character at position (decimal) 12, the produced XML will show the FF character as '&#12;'. The character/entity '&#12;' is not legal in XML according to http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char.
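
One way to guard against this is to drop every code point that falls outside the Char production of the XML 1.0 specification before the text is handed to the serializer. The following is a sketch only: the class name XmlCharFilter and its methods are hypothetical, and it is not claimed to match what fixIllegalXmlChars.patch does.

```java
// Hypothetical helper, not part of Nutch or of either attached patch.
public class XmlCharFilter {

    // True if the code point matches the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the text with all illegal code points removed.
    // Iterates by code point so that supplementary characters survive
    // and unpaired surrogates are dropped.
    public static String stripIllegalXmlChars(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isLegalXmlChar(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The form feed (decimal 12) between "a" and "b" is removed.
        System.out.println(stripIllegalXmlChars("a\fb"));
    }
}
```

The filter removes characters and does no escaping, so it does not interact with the entity encoding that the Transformer performs later.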
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira