Posted to notifications@couchdb.apache.org by "Loke (JIRA)" <ji...@apache.org> on 2016/10/03 10:08:20 UTC
[jira] [Created] (COUCHDB-3173) Views return corrupt data for text
fields containing non-BMP characters
Loke created COUCHDB-3173:
-----------------------------
Summary: Views return corrupt data for text fields containing non-BMP characters
Key: COUCHDB-3173
URL: https://issues.apache.org/jira/browse/COUCHDB-3173
Project: CouchDB
Issue Type: Bug
Components: View Server Support
Reporter: Loke
When a document contains non-BMP characters (i.e. characters with a Unicode code point above {{U+FFFF}}), the content is corrupted when it is read back through a view. Every instance of such a character is returned with an appended {{U+FFFD REPLACEMENT CHARACTER}}.
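For reference, a character lies outside the BMP exactly when its code point is above {{U+FFFF}}. The following is a minimal Python sketch of that check; the helper name {{non_bmp_chars}} is hypothetical and is not part of CouchDB.

```python
def non_bmp_chars(text):
    """Return the characters in text whose code point is above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# U+1F604 is the character used in the reproduction steps of this report.
sample = "a\U0001F604b"
print([f"U+{ord(ch):04X}" for ch in non_bmp_chars(sample)])  # ['U+1F604']
```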
To reproduce, use the following commands.
Create the document containing a field with the character {{U+1F604 SMILING FACE WITH OPEN MOUTH AND SMILING EYES}}:
{noformat}
$ curl -X PUT -d '{"type":"foo","value":"😄"}' http://localhost:5984/foo/foo2
{"ok":true,"id":"foo2","rev":"1-d7da3cd352ef74f6391cc13601081214"}
{noformat}
Get the document to ensure that it was saved properly:
{noformat}
$ curl -X GET http://localhost:5984/foo/foo2
{"_id":"foo2","_rev":"1-d7da3cd352ef74f6391cc13601081214","type":"foo","value":"😄"}
{noformat}
Create a view that will return that document:
{noformat}
$ curl --user user:password -X PUT -d '{"language":"javascript","views":{"v":{"map":"function(doc){if(doc.type===\"foo\")emit(doc._id,doc);}"}}}' http://localhost:5984/foo/_design/bugdemo
{"ok":true,"id":"_design/bugdemo","rev":"1-817af2dafecb4cf8213aa7063551daac"}
{noformat}
Get the document from the view:
{noformat}
$ curl -X GET http://localhost:5984/foo/_design/bugdemo/_view/v
{"total_rows":1,"offset":0,"rows":[
{"id":"foo2","key":"foo2","value":{"_id":"foo2","_rev":"1-d7da3cd352ef74f6391cc13601081214","type":"foo","value":"😄�"}}
]}
{noformat}
The field {{value}} now contains two characters: the original character followed by {{U+FFFD}}.
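The corruption pattern seen in the view output can also be checked mechanically. The following Python sketch assumes the field value has already been fetched and decoded into a string; it reports every non-BMP character that is immediately followed by {{U+FFFD}}. The function name {{find_corrupted_positions}} is hypothetical and only illustrates the check.

```python
REPLACEMENT = "\uFFFD"


def find_corrupted_positions(text):
    """Return indexes of non-BMP characters directly followed by U+FFFD."""
    return [
        i
        for i in range(len(text) - 1)
        if ord(text[i]) > 0xFFFF and text[i + 1] == REPLACEMENT
    ]


stored = "\U0001F604"           # value as returned by GET /foo/foo2
from_view = "\U0001F604\uFFFD"  # value as returned by the view in this report
print(find_corrupted_positions(stored))     # []
print(find_corrupted_positions(from_view))  # [0]
```

A non-empty result for a value that was stored without {{U+FFFD}} would indicate the behaviour reported here.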
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)