Posted to dev@couchdb.apache.org by "Jens Alfke (Created) (JIRA)" <ji...@apache.org> on 2011/12/21 00:45:30 UTC

[jira] [Created] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

multipart/related document body doesn't identify which part is which attachment
-------------------------------------------------------------------------------

                 Key: COUCHDB-1368
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1368
             Project: CouchDB
          Issue Type: Bug
          Components: HTTP Interface
            Reporter: Jens Alfke
            Priority: Minor


If you GET a document with attachments in multipart/related format (by adding ?attachments=true and setting "Accept: multipart/related"), the MIME bodies for the attachments have no headers. This makes it difficult to tell which one is which. Damien says they're in the same order that they appear in the document's "_attachments" object, which is fine if you're Erlang, because Erlang preserves the order of keys in a JSON object, but no other JSON implementation I know of does that (because they use hashtables instead of linked lists).

The upshot is that any non-Erlang code trying to parse such a response has to parse the JSON data by hand to get the "_attachments" keys in order.

This can be fixed by adding a "Content-ID" header to each attachment body, whose value is the filename. It would be nice if other standard headers were added too, like "Content-Type", "Content-Length", "Content-Encoding", as this would make it work better with existing MIME multipart libraries.
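
To illustrate the workaround described above (recovering the order of the "_attachments" keys so that the header-less MIME parts can be matched up by position), here is a minimal Python sketch. The function name and the sample document are made up for illustration; the only thing taken from CouchDB is that attachments whose bodies follow as MIME parts carry "follows": true. It relies on the JSON parser handing over key/value pairs in textual order, which is exactly the property many JSON libraries do not offer.

```python
import json
from collections import OrderedDict


def attachment_names_in_order(doc_json):
    """Return the names of attachments sent as MIME parts, in document order."""
    # object_pairs_hook receives each object's key/value pairs in the order
    # they appear in the JSON text, so OrderedDict keeps that order.
    doc = json.loads(doc_json, object_pairs_hook=OrderedDict)
    attachments = doc.get("_attachments", {})
    # Only attachments marked "follows" have a body part in the response.
    return [name for name, meta in attachments.items() if meta.get("follows")]


# Hypothetical document body, as it might appear in the first MIME part.
sample = (
    '{"_id":"asd","_attachments":{'
    '"b.txt":{"content_type":"text/plain","follows":true},'
    '"a.png":{"content_type":"image/png","follows":true},'
    '"stub.bin":{"content_type":"application/octet-stream","stub":true}}}'
)

print(attachment_names_in_order(sample))
```

The n-th name in the returned list would then be paired with the n-th attachment part of the response. Per-part headers would make this positional matching unnecessary.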

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jan Lehnardt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497466#comment-13497466 ] 

Jan Lehnardt commented on COUCHDB-1368:
---------------------------------------

Yup, check out share/www/script/test/attachments_multipart.js :)
                

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jan Lehnardt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497290#comment-13497290 ] 

Jan Lehnardt commented on COUCHDB-1368:
---------------------------------------

The main CouchDB repo lives here: https://git-wip-us.apache.org/repos/asf?p=couchdb.git;a=summary

Branches are not mirrored to GitHub at this point (sorry), but I pushed it to my fork:

  https://github.com/janl/couchdb/tree/1368-fix-multipart-header-parts

The commit in question:

  https://github.com/janl/couchdb/commit/18971de71c93c3a00e408b3d4eb67be8c695150c

Here’s a request dump:

> curl -Nv $COUCH/test/asd?attachments=true -H "Accept: multipart/related,*/*;"
* About to connect() to 127.0.0.1 port 5984 (#0)
*   Trying 127.0.0.1...
* connected
* Connected to 127.0.0.1 (127.0.0.1) port 5984 (#0)
> GET /test/asd?attachments=true HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: 127.0.0.1:5984
> Accept: multipart/related,*/*;
> 
< HTTP/1.1 200 OK
< Server: CouchDB/1.3.0a-b9af7ea-git (Erlang OTP/R15B02)
< ETag: "9-4310e4b1fcab6822344790d37fb5ddea"
< Date: Wed, 14 Nov 2012 18:04:57 GMT
< Content-Type: multipart/related; boundary="a38b2d614bb2a8d70e31050a0e2e11a5"
< Content-Length: 493
< 
--a38b2d614bb2a8d70e31050a0e2e11a5
content-type: application/json

{"_id":"asd","_rev":"9-4310e4b1fcab6822344790d37fb5ddea","foo":"var","_attachments":{"test.txt":{"content_type":"text/plain","revpos":8,"digest":"md5-7xbQv30HNBSgLpMDsQTH7A==","length":12,"follows":true,"encoding":"gzip","encoded_length":30}}}
--a38b2d614bb2a8d70e31050a0e2e11a5
Content-ID: test.txt
Content-Type: text/plain
Content-Length: 30
Content-Transfer-Encoding: gzip

[30 bytes of gzip-compressed data, not printable]

--a38b2

The closing boundary is cut off. I seem to have a bug in the calculation of the main response’s Content-Length, but this is the direction this is going.
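
For readers who want to see how a client could consume a response shaped like the dump above, here is a rough Python sketch that splits a multipart body into parts and reads each part's headers. It is not CouchDB or TouchDB code. It assumes CRLF line endings, a boundary that never occurs inside a payload, and a complete closing delimiter (unlike the truncated one in the dump). The function name, the 12-byte attachment content and the shortened JSON part are all hypothetical.

```python
import gzip


def split_multipart(body, boundary):
    """Split a multipart body into (headers, payload) pairs.

    headers is a dict keyed by lower-cased header name; payload is raw bytes.
    """
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    # Element 0 is whatever precedes the first delimiter (the preamble).
    for chunk in body.split(delimiter)[1:]:
        if chunk.startswith(b"--"):
            break  # reached the closing delimiter "--boundary--"
        # Drop the CRLF that ends the delimiter line and the CRLF that
        # precedes the next delimiter; both are framing, not content.
        if chunk.startswith(b"\r\n"):
            chunk = chunk[2:]
        if chunk.endswith(b"\r\n"):
            chunk = chunk[:-2]
        if chunk.startswith(b"\r\n"):
            head, payload = b"", chunk[2:]  # part without any headers
        else:
            head, _, payload = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in head.split(b"\r\n"):
            if line:
                name, _, value = line.partition(b":")
                headers[name.decode("ascii").strip().lower()] = (
                    value.decode("ascii").strip()
                )
        parts.append((headers, payload))
    return parts


# Build a body that mirrors the dump; the attachment content is made up.
boundary = "a38b2d614bb2a8d70e31050a0e2e11a5"
packed = gzip.compress(b"hello world\n")
body = b"\r\n".join([
    b"--" + boundary.encode("ascii"),
    b"content-type: application/json",
    b"",
    b'{"_id":"asd","_attachments":{"test.txt":{"follows":true}}}',
    b"--" + boundary.encode("ascii"),
    b"Content-ID: test.txt",
    b"Content-Type: text/plain",
    b"Content-Length: " + str(len(packed)).encode("ascii"),
    b"Content-Transfer-Encoding: gzip",
    b"",
    packed,
    b"--" + boundary.encode("ascii") + b"--",
])

parts = split_multipart(body, boundary)
for headers, payload in parts[1:]:  # parts[0] is the JSON document
    if headers.get("content-transfer-encoding") == "gzip":
        payload = gzip.decompress(payload)
    print(headers["content-id"], payload)
```

With headers on every part, the attachment can be looked up by name instead of by position in the "_attachments" object.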
                

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jens Alfke (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497464#comment-13497464 ] 

Jens Alfke commented on COUCHDB-1368:
-------------------------------------

You mean, add some JS unit tests? I think I could do that. Hopefully there are already some tests that look at MIME responses?
                

[jira] [Assigned] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Damien Katz (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Katz reassigned COUCHDB-1368:
------------------------------------

    Assignee: Damien Katz
    

        

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jens Alfke (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497418#comment-13497418 ] 

Jens Alfke commented on COUCHDB-1368:
-------------------------------------

It turns out "Content-ID" is not the correct header to use for the filename, because according to RFC2045 sec.7, "Content-ID values must be generated to be world-unique". (I didn't know this when writing up this issue, but discovered it later on while implementing MIME support for TouchDB. I should have updated this issue too; sorry!)

The most appropriate header to use seems to be Content-Disposition (RFC1806):

    Content-Disposition: attachment; filename="test.txt"

This is what TouchDB generates, and what it will recognize in incoming MIME documents.
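
As a small illustration of the header shown above, the following Python sketch writes and reads a Content-Disposition header with the standard library's email package. It is not TouchDB's implementation; the helper name make_part_headers is hypothetical, and the filename is the one from the example.

```python
from email import message_from_string
from email.message import Message


def make_part_headers(filename, content_type):
    """Build the headers for one attachment part (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    # add_header turns keyword arguments into quoted header parameters.
    msg.add_header("Content-Disposition", "attachment", filename=filename)
    return msg


# Writing: produces  attachment; filename="test.txt"
built = make_part_headers("test.txt", "text/plain")
print(built["Content-Disposition"])

# Reading: parse a header block and pull the filename back out.
raw = 'Content-Disposition: attachment; filename="test.txt"\n\n'
part = message_from_string(raw)
print(part.get_content_disposition(), part.get_filename())
```

Unlike Content-ID, the filename parameter carries no uniqueness requirement, so using the attachment name directly is unproblematic.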
                

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jan Lehnardt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497445#comment-13497445 ] 

Jan Lehnardt commented on COUCHDB-1368:
---------------------------------------

Oops, I’ve updated the branches accordingly. I only glanced at the spec for the format, not the header names, sorry about that!

Either way though, I’d like a bit more thorough testing on this one, especially with all combinations of compressed and uncompressed, binary and plain text attachments, with and without compressed transfer encodings, just to make sure it is all correct. Is that something you can help with? (I know I asked that the last time and then nothing happened, but here we already have the fix, mostly; the other one turned out to be a bit more hairy than I could handle with my time then :)
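
One way to enumerate the combinations asked for above is to generate the test matrix mechanically. The Python sketch below does only that; it does not talk to CouchDB. The three dimensions (text or binary content, stored compressed or not, transferred compressed or not) are one reading of the request, and all names and payloads are hypothetical.

```python
import gzip
from itertools import product

# Hypothetical sample payloads; any text and binary data would do.
PAYLOADS = {
    "text": ("text/plain", b"hello world\n" * 50),
    "binary": ("application/octet-stream", bytes(range(256)) * 4),
}


def build_cases():
    """Return one test case per combination of the three dimensions."""
    cases = []
    for kind, stored_gzip, transfer_gzip in product(
        PAYLOADS, (False, True), (False, True)
    ):
        content_type, data = PAYLOADS[kind]
        cases.append({
            "name": "%s/stored_%s/transfer_%s" % (
                kind,
                "gzip" if stored_gzip else "plain",
                "gzip" if transfer_gzip else "plain",
            ),
            "content_type": content_type,
            "original": data,
            "stored_gzip": stored_gzip,
            "transfer_gzip": transfer_gzip,
            # What a test would compare the decoded MIME part against is
            # "original"; "wire" is the payload it expects to see in transit.
            "wire": gzip.compress(data) if transfer_gzip else data,
        })
    return cases


for case in build_cases():
    print(case["name"], len(case["wire"]))
```

Each case would then be uploaded, fetched back as multipart/related, and checked for the right headers and for a payload that decodes to the original bytes.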
                

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jan Lehnardt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497106#comment-13497106 ] 

Jan Lehnardt commented on COUCHDB-1368:
---------------------------------------

Fixed in the branch: 1368-fix-multipart-header-parts 

I’d love a review.
                

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jan Lehnardt (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497301#comment-13497301 ] 

Jan Lehnardt commented on COUCHDB-1368:
---------------------------------------

And fixed. I’ve updated the branches.
                

[jira] [Updated] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jan Lehnardt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Lehnardt updated COUCHDB-1368:
----------------------------------

    Assignee:     (was: Damien Katz)
    

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Jens Alfke (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497240#comment-13497240 ] 

Jens Alfke commented on COUCHDB-1368:
-------------------------------------

Where is the branch? I don't see it in the GitHub UI at https://github.com/apache/couchdb.

Also, could you post a sample of what the MIME headers look like for an attachment part?
                

[jira] [Commented] (COUCHDB-1368) multipart/related document body doesn't identify which part is which attachment

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510737#comment-13510737 ] 

Paul Joseph Davis commented on COUCHDB-1368:
--------------------------------------------

+1 on this. The only part I dislike is that length calculation, but given the current status of that function I think it's the least bad way to implement this.
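
For context on why the length calculation is delicate: the response carries a Content-Length for the whole multipart body, so once headers are added to each part, every header line and line break has to be counted before anything is sent. The Python sketch below shows that bookkeeping for one assumed layout (CRLF line endings, no preamble, nothing after the closing delimiter). It is not the code from the branch, and both function names are hypothetical.

```python
CRLF = b"\r\n"


def render_multipart(boundary, parts):
    """Serialize parts, each a (header_lines, body) pair of bytes."""
    out = []
    for header_lines, body in parts:
        out.append(b"--" + boundary + CRLF)
        for line in header_lines:
            out.append(line + CRLF)
        out.append(CRLF)         # blank line ends the part headers
        out.append(body + CRLF)  # CRLF that precedes the next delimiter
    out.append(b"--" + boundary + b"--")
    return b"".join(out)


def multipart_length(boundary, parts):
    """Compute len(render_multipart(...)) without building the body."""
    delimiter = 2 + len(boundary)  # "--" plus the boundary string
    total = 0
    for header_lines, body in parts:
        total += delimiter + 2                                # delimiter line
        total += sum(len(line) + 2 for line in header_lines)  # header lines
        total += 2                                            # blank line
        total += len(body) + 2                                # body and CRLF
    return total + delimiter + 2                              # "--boundary--"


boundary = b"a38b2d614bb2a8d70e31050a0e2e11a5"
parts = [
    ([b"content-type: application/json"], b'{"_id":"asd"}'),
    ([b'Content-Disposition: attachment; filename="test.txt"',
      b"Content-Type: text/plain"], b"hello world\n"),
]
assert multipart_length(boundary, parts) == len(render_multipart(boundary, parts))
print(multipart_length(boundary, parts))
```

A server that streams attachments would take each body length from stored metadata rather than from len(body). Forgetting any one of the terms above yields a Content-Length that is too small, which makes a client stop reading early and produces a truncated closing boundary.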
                