Posted to notifications@jclouds.apache.org by Andrew Gaul <no...@github.com> on 2017/07/27 00:57:03 UTC

[jclouds/jclouds] Consume Unicode byte order mark in XML parser (#1124)

This caused failures to parse Azure Queue Storage list requests.
You can view, comment on, or merge this pull request online at:

  https://github.com/jclouds/jclouds/pull/1124

-- Commit Summary --

  * Consume Unicode byte order mark in XML parser

-- File Changes --

    M core/src/main/java/org/jclouds/http/functions/ParseXMLWithJAXB.java (7)

-- Patch Links --

https://github.com/jclouds/jclouds/pull/1124.patch
https://github.com/jclouds/jclouds/pull/1124.diff

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/jclouds/jclouds/pull/1124

Re: [jclouds/jclouds] JCLOUDS-1325: Ignore Unicode BOM in XML parser (#1124)

Posted by Andrew Gaul <no...@github.com>.
Thanks for the background @neykov!

-- 
https://github.com/jclouds/jclouds/pull/1124#issuecomment-318434112

Re: [jclouds/jclouds] Consume Unicode byte order mark in XML parser (#1124)

Posted by Andrew Gaul <no...@github.com>.
Added test.

-- 
https://github.com/jclouds/jclouds/pull/1124#issuecomment-318290320

Re: [jclouds/jclouds] Consume Unicode byte order mark in XML parser (#1124)

Posted by Svet <no...@github.com>.
neykov approved this pull request.

-- 
https://github.com/jclouds/jclouds/pull/1124#pullrequestreview-52592661

Re: [jclouds/jclouds] Consume Unicode byte order mark in XML parser (#1124)

Posted by Ignasi Barrera <no...@github.com>.
nacx approved this pull request.

Thanks! Worth registering a JIRA issue for this?

-- 
https://github.com/jclouds/jclouds/pull/1124#pullrequestreview-52586145

Re: [jclouds/jclouds] Consume Unicode byte order mark in XML parser (#1124)

Posted by Svet <no...@github.com>.
The change was surprising to me; I expected Java to handle the BOM. After digging deeper I found [[1]](https://stackoverflow.com/questions/4897876/reading-utf-8-bom-marker), which points to two JDK bugs, [[2]](http://bugs.java.com/view_bug.do?bug_id=4508058) and [[3]](http://bugs.java.com/view_bug.do?bug_id=6378911). It turns out the JDK was changed at some point to consume the BOM, but the change was then reverted because it broke backwards compatibility.

Also of interest: the Unicode character `U+FEFF` (the byte order mark) is serialized as the byte sequence `EF BB BF` in UTF-8 [[4]](http://www.unicode.org/faq/utf_bom.html#BOM).

\[1\] https://stackoverflow.com/questions/4897876/reading-utf-8-bom-marker
\[2\] http://bugs.java.com/view_bug.do?bug_id=4508058
\[3\] http://bugs.java.com/view_bug.do?bug_id=6378911
\[4\] http://www.unicode.org/faq/utf_bom.html#BOM
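
To illustrate the behavior described above, here is a minimal sketch showing that Java's UTF-8 decoding leaves the BOM in place as a leading `U+FEFF` character, so the caller has to drop it before handing the text to an XML parser. This is not the code from the pull request; the class name `BomExample` and the helper `stripLeadingBom` are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;

public class BomExample {
    // U+FEFF, the Unicode byte order mark (EF BB BF when encoded as UTF-8).
    static final char BOM = '\uFEFF';

    /** Hypothetical helper: returns the input without a leading U+FEFF, if present. */
    public static String stripLeadingBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == BOM) ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // A tiny XML document preceded by the UTF-8 encoded BOM.
        byte[] bytes = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};

        // The decoder does not consume the BOM; it shows up as the first char.
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(decoded.charAt(0) == BOM); // true

        // After stripping, the text starts with the XML content itself.
        System.out.println(stripLeadingBom(decoded)); // <a/>
    }
}
```

Stripping at the string level keeps the sketch short; a real parser integration might instead skip the three bytes on the input stream before decoding.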

-- 
https://github.com/jclouds/jclouds/pull/1124#issuecomment-318299404

Re: [jclouds/jclouds] Consume Unicode byte order mark in XML parser (#1124)

Posted by Ignasi Barrera <no...@github.com>.
LGTM so far. Is there any existing test that verifies the behavior of this class? If not, could we add some to make sure we don't introduce regressions?

-- 
https://github.com/jclouds/jclouds/pull/1124#issuecomment-318272069