Posted to issues@camel.apache.org by "Claus Ibsen (JIRA)" <ji...@apache.org> on 2015/07/16 07:51:04 UTC

[jira] [Comment Edited] (CAMEL-8905) encoding problems in jsonpath

    [ https://issues.apache.org/jira/browse/CAMEL-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628255#comment-14628255 ] 

Claus Ibsen edited comment on CAMEL-8905 at 7/16/15 5:50 AM:
-------------------------------------------------------------

Also, it seems a bit too much that we have to do this in Camel and that it's not a function of the jsonpath library itself. Have you been in contact with them?

Isn't this a general problem in json-path that would be better fixed/improved there?


was (Author: davsclaus):
Also it seems a bit strange we have to do this in Camel, and its not a function of jsonpath library itself. Have you got in contact with them?

> encoding problems in jsonpath
> -----------------------------
>
>                 Key: CAMEL-8905
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8905
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-jsonpath
>    Affects Versions: 2.15.2
>            Reporter: Franz Forsthofer
>             Fix For: 2.16.0, 2.15.3
>
>         Attachments: 0001-jsonpath-automatic-charset-detection.patch, booksUTF16BE.json, booksUTF16LE.json, jsonUCS2BigEndianWithBOM.txt, jsonUCS2BigEndianWithoutBOM.txt, jsonUCS2LittleEndianWithBom.txt, jsonUCS2LittleEndianWithoutBOM.txt, jsonUTF32BEWithBOM.txt, jsonUTF32BEWithoutBOM.txt, jsonUTF32LEWithBOM.txt, jsonUTF32LEWithoutBOM.txt
>
>
> I detected three different encoding problems in jsonpath:
> - if jsonpath is called with an input stream which has an encoding different from the default encoding (given by Charset.defaultCharset()) then jsonpath still uses the default encoding. Error location in JsonPathEngine:
>         else if (json instanceof InputStream) {
>             InputStream is = (InputStream) json;
>             return path.read(is, Charset.defaultCharset().displayName(), configuration);
>         }
>       
> - if jsonpath is called with a json file whose encoding is different from UTF-8, then jsonpath still parses the document with UTF-8. Error location in JsonPathEngine:
>        else if (json instanceof File) {
>             File file = (File) json;
>             return path.read(file, configuration);
>        }
> path.read(file, configuration) always uses UTF-8
> - if jsonpath is called with a URL pointing to a JSON document whose encoding is different from UTF-8, then jsonpath still parses the document with UTF-8. Error location in JsonPathEngine:
>          else if (json instanceof URL) {
>             URL url = (URL) json;
>             return path.read(url, configuration);
>          }
> path.read(url, configuration) uses UTF-8
> My proposed solution is to determine the encoding of the JSON document automatically according to RFC 4627 (https://www.ietf.org/rfc/rfc4627.txt; see section 3, "Encoding") and then call the method path.read(jsonDocument,foundEncoding,configuration) with the detected encoding. See attached patch.
> Actually I can commit the patch myself. However, I would like somebody who is more familiar with jsonpath than I am to review my patch.
> So please tell me whether my patch can be accepted. I will then either do the actual commit or discard the patch.
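
For readers without the attachment, the RFC 4627 detection mentioned in the description can be sketched roughly as follows. This is not the attached patch. The class and method names (JsonCharsetSniffer, detect, open) are hypothetical, and the sketch only applies the zero-byte patterns from RFC 4627 section 3 to the first four bytes. It does not handle byte order marks, so the "WithBOM" attachments would need extra logic. It also assumes the JVM provides the UTF-32BE and UTF-32LE charsets, which common JDKs do but the Java specification does not guarantee.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of Camel or json-path. */
final class JsonCharsetSniffer {

    private JsonCharsetSniffer() {
    }

    /**
     * Guesses the encoding from the zero-byte pattern of the first four bytes
     * (RFC 4627, section 3). Falls back to UTF-8 when fewer than four bytes
     * are available or no pattern matches. BOMs are NOT handled here.
     */
    static Charset detect(byte[] head, int len) {
        if (len < 4) {
            return StandardCharsets.UTF_8;
        }
        boolean z0 = head[0] == 0;
        boolean z1 = head[1] == 0;
        boolean z2 = head[2] == 0;
        boolean z3 = head[3] == 0;
        if (z0 && z1 && z2 && !z3) {
            return Charset.forName("UTF-32BE"); // 00 00 00 xx (assumes JVM support)
        }
        if (!z0 && z1 && z2 && z3) {
            return Charset.forName("UTF-32LE"); // xx 00 00 00 (assumes JVM support)
        }
        if (z0 && !z1 && z2 && !z3) {
            return StandardCharsets.UTF_16BE;   // 00 xx 00 xx
        }
        if (!z0 && z1 && !z2 && z3) {
            return StandardCharsets.UTF_16LE;   // xx 00 xx 00
        }
        return StandardCharsets.UTF_8;          // xx xx xx xx
    }

    /** Peeks at up to four bytes, pushes them back, and wraps the stream in a Reader. */
    static Reader open(InputStream in) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 4);
        byte[] head = new byte[4];
        int len = 0;
        while (len < 4) {
            int n = pin.read(head, len, 4 - len);
            if (n < 0) {
                break; // stream ended before four bytes were read
            }
            len += n;
        }
        if (len > 0) {
            pin.unread(head, 0, len); // make the peeked bytes readable again
        }
        return new InputStreamReader(pin, detect(head, len));
    }
}
```

A caller could pass the name of the detected charset to the three-argument path.read(...) call quoted in the description instead of Charset.defaultCharset().displayName(). Whether that belongs in Camel or in json-path itself is the question raised in the comment above.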



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)