Posted to dev@couchdb.apache.org by "Joan Touzet (JIRA)" <ji...@apache.org> on 2009/05/07 00:19:30 UTC

[jira] Updated: (COUCHDB-345) "High ASCII" can be inserted into db but not retrieved

     [ https://issues.apache.org/jira/browse/COUCHDB-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joan Touzet updated COUCHDB-345:
--------------------------------

    Description: 
It is possible to PUT/POST a document into CouchDB containing a "high ASCII" value (a byte above 0x7F) that cannot then be retrieved. This happens when a non-ASCII value is not escaped into \u#### before the document is PUT/POSTed.

This sample code recreates the problem using the hex value D8 (Ø in ISO-8859-1) in a possibly unsavoury test string. To run, it requires the file Couch.py in the same directory, containing the code from the "Example wrapper class" at http://wiki.apache.org/couchdb/Getting_started_with_Python.

================================================
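#!/usr/bin/python
# Python 2 script; Couch.py is the wrapper class from the wiki page above.
from Couch import Couch
db = Couch('localhost', '5984')
db.createDb('utf8_fail')
# Build a byte string from hex digits; it contains the single raw byte 0xD8.
badtext = "4E45494D454E2046D85252204641454E21".decode("hex")
# Splice the raw bytes, unescaped, into a hand-written JSON document.
doc = """
{
    "Message":\"""" + badtext + """\",
}
"""
db.saveDoc('utf8_fail', doc, 'fail')   # the save is accepted
db.openDoc('utf8_fail', 'fail')        # the read fails with bad_utf8_character_code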
================================================
Sample output against 0.9.0 is as follows:

{
    "ok": true
}
{
    "id": "fail", 
    "ok": true, 
    "rev": "1-76726372"
}
{
    "error": "ucs", 
    "reason": "{bad_utf8_character_code}"
}

================================================
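
The underlying encoding problem can be illustrated without a CouchDB server. The following is a minimal Python 3 sketch (the sample above is Python 2), not code from the original report. It assumes the test bytes were meant as ISO-8859-1 text, and the variable names are illustrative. It shows that the raw bytes are not valid UTF-8, whereas letting json.dumps escape the value produces an ASCII-only body containing \u00d8.

```python
import json

# Same hex payload as the sample above; bytes.fromhex is the Python 3
# counterpart of the Python 2 "...".decode("hex") call.
raw = bytes.fromhex("4E45494D454E2046D85252204641454E21")

# The byte 0xD8 is a UTF-8 lead byte, but the byte after it (0x52) is not
# a continuation byte, so the payload as a whole is not valid UTF-8.
try:
    raw.decode("utf-8")
    raw_is_valid_utf8 = True
except UnicodeDecodeError:
    raw_is_valid_utf8 = False

# Assumption: the bytes were intended as ISO-8859-1 (Latin-1) text.
text = raw.decode("latin-1")

# json.dumps escapes non-ASCII characters as \uXXXX by default
# (ensure_ascii=True), so the resulting body is pure ASCII.
body = json.dumps({"Message": text})

print(raw_is_valid_utf8)
print(body)
```

A body built this way is valid in any ASCII-compatible encoding, so it sidesteps the question of which encoding the server assumes for the request.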

Please note that this defect exposed a second problem: when a design document attempted to map() the bad document, the resulting bad_utf8_character_code exception caused Futon to fail silently while building the view, with no indication of the failure except in the debug log. The log showed two attempts to build the view, both failing, followed by an uncaught exception error for Futon.

Based on this, there are likely other areas in the codebase that do not handle the bad_utf8_character_code exception correctly.

I believe CouchDB should not accept this input: it should either reject the PUT/POST or escape the input itself before insertion.
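
Both suggested remedies can be sketched in a few lines. This is an illustrative Python 3 sketch only, not CouchDB's implementation (CouchDB is written in Erlang). The function names reject_if_invalid_utf8 and escape_as_latin1 are hypothetical, and treating stray bytes as ISO-8859-1 is an assumption that a real server could not safely make in general.

```python
import json


def reject_if_invalid_utf8(body: bytes) -> str:
    """Option 1: refuse a request body that is not valid UTF-8."""
    try:
        return body.decode("utf-8")
    except UnicodeDecodeError as exc:
        # A server would turn this into an HTTP error response
        # instead of storing the document.
        raise ValueError("request body is not valid UTF-8") from exc


def escape_as_latin1(body: bytes) -> str:
    """Option 2: read the bytes as Latin-1 and re-emit ASCII-only JSON."""
    doc = json.loads(body.decode("latin-1"))
    return json.dumps(doc)  # non-ASCII characters become \uXXXX escapes


# A small body containing the same raw 0xD8 byte as the report.
bad_body = b'{"Message": "F\xd8RR"}'

try:
    reject_if_invalid_utf8(bad_body)
    rejected = False
except ValueError:
    rejected = True

escaped = escape_as_latin1(bad_body)
print(rejected)
print(escaped)
```

Rejecting is the more conservative of the two, since the server cannot know which encoding the client intended; escaping only works if that guess is right.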

        Summary: "High ASCII" can be inserted into db but not retrieved  (was: High ASCII can be )

> "High ASCII" can be inserted into db but not retrieved
> ------------------------------------------------------
>
>                 Key: COUCHDB-345
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-345
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 0.9
>         Environment: OSX 10.5.6
>            Reporter: Joan Touzet

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.