Posted to dev@couchdb.apache.org by "Jan Lehnardt (JIRA)" <ji...@apache.org> on 2009/03/05 13:07:56 UTC
[jira] Updated: (COUCHDB-254) Non-Unicode characters in an
attachment name render a document unreadable.
[ https://issues.apache.org/jira/browse/COUCHDB-254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jan Lehnardt updated COUCHDB-254:
---------------------------------
Attachment: COUCHDB-254.txt
This patch sends a 400 Bad Request response with the reason "Attachment name is not UTF-8 encoded" when a client tries to save a document with an attachment whose name is not valid UTF-8. It includes test cases for inline attachments, standalone attachments, and bulk docs.
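The check described above can be sketched roughly as follows. This is only an illustration in Python, not the Erlang code from COUCHDB-254.txt: the function name `check_attachment_name` and the tuple it returns are hypothetical, and only the status code and reason string are taken from the patch description.

```python
from typing import Optional, Tuple

# Reason string as quoted in the patch description.
REASON = "Attachment name is not UTF-8 encoded"


def check_attachment_name(raw_name: bytes) -> Optional[Tuple[int, str]]:
    """Return (400, reason) if raw_name is not valid UTF-8, else None.

    Hypothetical helper: the real patch performs its check inside CouchDB.
    """
    try:
        raw_name.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return 400, REASON
    return None


print(check_attachment_name(b"foo.txt"))     # None
print(check_attachment_name(b"foo\x80txt"))  # (400, 'Attachment name is not UTF-8 encoded')
```

Rejecting the name at save time, rather than at read time, is what keeps an already stored document from becoming unreadable.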
> Non-Unicode characters in an attachment name render a document unreadable.
> --------------------------------------------------------------------------
>
> Key: COUCHDB-254
> URL: https://issues.apache.org/jira/browse/COUCHDB-254
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Affects Versions: 0.9
> Environment: Linux, erlang, 12b-5, couchdb r791265
> Reporter: Maximillian Dornseif
> Priority: Critical
> Attachments: COUCHDB-254.txt
>
>
> Attachment names containing non-Unicode characters can be created easily because URIs are (nearly) 8-bit clean. But when a document is read, the names are encoded into UTF-8, which fails. You are left with unreadable database entries.
> I was not able to generate invalid UTF-8 in JavaScript but a test case would look somewhat like this:
> --- couch_tests.js 2009-02-05 19:47:20.000000000 +0000
> +++ /usr/local/share/couchdb/www/script/couch_tests.js 2009-02-13 21:34:23.000000000 +0000
> @@ -1078,9 +1078,31 @@
> var xhr = CouchDB.request("GET", "/test_suite_db/bin_doc4/attachment.txt");
> T(xhr.status == 200);
> T(xhr.responseText == "This is a string");
> -
> },
>
> + attatchment_names : function(debug) {
> + var db = new CouchDB("test_suite_db");
> + db.deleteDb();
> + db.createDb();
> + if (debug) debugger;
> +
> + var binAttDoc = {
> + _id: "bin_doc",
> + _attachments:{
> + "foo\x80txt": {
> + content_type:"text/plain",
> + data: "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
> + }
> + }
> + }
> +
> + var save_response = db.save(binAttDoc);
> + T(save_response.ok);
> +
> + var xhr = CouchDB.request("GET", "/test_suite_db/bin_doc/foo\x80txt");
> + T(xhr.responseText == "This is a base64 encoded text");
> +},
> +
> attachment_paths : function(debug) {
> if (debug) debugger;
> var dbNames = ["test_suite_db", "test_suite_db/with_slashes"];
> A Python script (fuzzer?) for triggering the bug looks like this:
> import sys
> import couchdb.client
> COUCHSERVER = "http://localhost:5984"
> COUCHDB_NAME = "md_test"
> def _setup_couchdb():
>     """Get a connection handler to the CouchDB Database, creating it when needed."""
>     server = couchdb.client.Server(COUCHSERVER)
>     print "using %s/%s" % (COUCHSERVER, COUCHDB_NAME)
>     if COUCHDB_NAME in server:
>         return server[COUCHDB_NAME]
>     else:
>         return server.create(COUCHDB_NAME)
>
> def main():
>     db = _setup_couchdb()
>     doc_id = "doc_id"
>
>     try:
>         doc = db[doc_id]
>     except couchdb.client.ResourceNotFound:
>         doc = {}
>
>     db[doc_id] = doc
>     for i in range(256):
>         char = chr(i)
>         name = "___%s___" % (char)
>         print "checking %r (%d) " % (char, i),
>         sys.stdout.flush()
>         db.put_attachment(db[doc_id], "data", name)
>         db[doc_id]
>         print '\r',
>     print
> main()
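The byte values that the script above is expected to trip over can be listed without a running server. The sketch below is Python 3 (unlike the Python 2 script in the report) and builds the same `___<byte>___` names as raw bytes, then checks which ones fail strict UTF-8 decoding. The helper name `invalid_name_bytes` is hypothetical. The last lines show how such a byte can travel in a URI in percent-encoded form, which is an assumption about how a client would send it rather than something stated in the report.

```python
from urllib.parse import quote, unquote_to_bytes


def invalid_name_bytes():
    """Return the byte values whose '___<byte>___' name is not valid UTF-8."""
    bad = []
    for i in range(256):
        name = b"___" + bytes([i]) + b"___"
        try:
            name.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(i)
    return bad


bad = invalid_name_bytes()
print(len(bad), hex(bad[0]), hex(bad[-1]))  # 128 0x80 0xff

# A lone 0x80 byte is percent-encoded as "%80" and decodes back to the raw byte.
encoded = quote(b"foo\x80txt")
print(encoded)                    # foo%80txt
print(unquote_to_bytes(encoded))  # b'foo\x80txt'
```

Every single byte from 0x80 to 0xFF is invalid here because it is either a continuation byte with no lead byte or a lead byte followed by an ASCII underscore.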