Posted to dev@knox.apache.org by "Kevin Risden (Jira)" <ji...@apache.org> on 2020/01/27 19:04:00 UTC

[jira] [Work started] (KNOX-2202) Knox should use UTF-8 as default encoding instead of ISO-8859-1

     [ https://issues.apache.org/jira/browse/KNOX-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on KNOX-2202 started by Kevin Risden.
------------------------------------------
> Knox should use UTF-8 as default encoding instead of ISO-8859-1
> ---------------------------------------------------------------
>
>                 Key: KNOX-2202
>                 URL: https://issues.apache.org/jira/browse/KNOX-2202
>             Project: Apache Knox
>          Issue Type: Bug
>            Reporter: Kevin Risden
>            Assignee: Kevin Risden
>            Priority: Major
>             Fix For: 1.4.0
>
>
> If you send Knox an XML document containing non-ASCII Unicode characters, you get the following:
> {code:java}
> ...
> Caused by: com.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog
>  at [row,col {unknown-source}]: [1,0]
>         at com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:687)
>         at com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2220)
>         at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2126)
>         at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1181)
>         at org.codehaus.stax2.ri.Stax2EventReaderImpl.nextEvent(Stax2EventReaderImpl.java:255)
>         at org.apache.knox.gateway.filter.rewrite.impl.xml.XmlFilterReader.read(XmlFilterReader.java:122)
>         ... 133 more
> {code}
> By default, Knox falls back to ISO-8859-1 encoding instead of UTF-8.
> I did some research, and the specified default encoding has changed over the years. It looks like ISO-8859-1 was the default historically, but currently it should be UTF-8.
> https://stackoverflow.com/questions/58337900/how-to-change-default-character-encoding-configuration-in-jetty-app-server-from
> There are very few cases where ISO-8859-1 and UTF-8 are incompatible, and all of them involve characters outside the ASCII range, which both charsets encode identically.
> I also found that the default XML encoding is UTF-8, so even if we don't change all the defaults to UTF-8, we should do so for XML.
> https://www.w3schools.com/xml/xml_syntax.asp
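
To illustrate the compatibility point in the description, here is a minimal sketch of how the two charsets behave on the same bytes. It is not Knox code and does not reproduce the WstxEOFException above; the class and method names (CharsetFallbackDemo, decodeLatin1, decodeUtf8) are hypothetical and exist only for this example. It assumes the client sent the payload as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Knox.
public class CharsetFallbackDemo {

    // Decode bytes as ISO-8859-1, the historical fallback discussed in the issue.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decode bytes as UTF-8, the proposed default.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ASCII-only payload: both charsets decode it to the same string.
        byte[] ascii = "<name>knox</name>".getBytes(StandardCharsets.UTF_8);
        System.out.println("ASCII identical: "
                + decodeLatin1(ascii).equals(decodeUtf8(ascii)));

        // U+00E9 (e with acute accent) is the two bytes 0xC3 0xA9 in UTF-8.
        byte[] nonAscii = "<name>caf\u00e9</name>".getBytes(StandardCharsets.UTF_8);

        // Decoding as ISO-8859-1 maps each byte to its own character,
        // so the single accented letter becomes U+00C3 followed by U+00A9.
        System.out.println("Non-ASCII identical: "
                + decodeLatin1(nonAscii).equals(decodeUtf8(nonAscii)));
        System.out.println("UTF-8 length: " + decodeUtf8(nonAscii).length());
        System.out.println("ISO-8859-1 length: " + decodeLatin1(nonAscii).length());
    }
}
```

The ASCII payload decodes identically either way, which is why switching the default should be safe for ASCII-only traffic. Only the payload with a non-ASCII character changes, and that is the case the issue is about.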



--
This message was sent by Atlassian Jira
(v8.3.4#803005)