Posted to user@couchdb.apache.org by Daniel Gonzalez <go...@gonvaled.com> on 2014/01/03 14:52:05 UTC

Specify attachment encoding for couchdb

Hi,

I have the following test script:

# -*- coding: utf-8 -*-

import os
import couchdb

GREEK = u'ΑΒΓΔ ΕΖΗΘ ΙΚΛΜ ΝΞΟΠ ΡΣΤΥ ΦΧΨΩ αβγδ εζηθ ικλμ νξοπ ρςτυ φχψω'

# Prepare a unicode file, encoded using ENCODING
ENCODING = 'utf-8'
filename = '/tmp/test'
with open(filename, 'wb') as f:  # binary mode, since we write encoded bytes
    f.write(GREEK.encode(ENCODING))

# Create an empty document
server = couchdb.Server()
db = server['cdb-tests']
doc_id = 'testing'
doc = { }
db[doc_id] = doc

# Attach the file to the document
with open(filename, 'rb') as content:  # Open the file for reading
    db.put_attachment(doc, content, content_type='text/plain')

As you can see, the file is UTF-8 encoded, but when I attach it to
CouchDB I have no way to specify this encoding. As a result, requesting the
attachment at http://localhost:5984/cdb-tests/testing/test returns the
following response headers:

HTTP/1.1 200 OK
Server: CouchDB/1.2.0 (Erlang OTP/R15B01)
ETag: "7y85tiUeF/UX9kqpKAzQEw=="
Date: Fri, 03 Jan 2014 13:43:36 GMT
Content-Type: text/plain
Content-MD5: 7y85tiUeF/UX9kqpKAzQEw==
Content-Length: 102
Content-Encoding: gzip
Cache-Control: must-revalidate
Accept-Ranges: none

Viewing the attachment in a browser shows complete gibberish. How can I
store the encoding for CouchDB attachments?

Thanks and regards,

Daniel

PS: SO reference link: http://stackoverflow.com/q/20905157/647991

Re: Specify attachment encoding for couchdb

Posted by Daniel Gonzalez <go...@gonvaled.com>.
I do not have the code either, but based on the interface available from
Python, I guess you are right: it is not possible to specify the encoding
at all.

I can still save the encoding using the custom solution that I suggested
earlier, but I guess this somewhat defeats the idea of attachments in
CouchDB, which are supposed to be completely self-contained. But for me
that would be enough, since I am not planning on allowing direct access to
these attachments. My goal is to be able to post-process the data, and for
that I can separately access the attachment and its encoding (as set in the
"attachments-encoding" mapping) for further processing.

For more background info: the attachments that I am saving are actually
email attachments, associated with a CouchDB document. Since I want to be
able to resend the email, I need to save all the data, including
attachments, in the associated document. Whenever I want to, I can reuse
all those attachments, body, html_body, subject, and so on to recompose
the email. For that I also need the encoding of each attachment, in order
to prepare the MIME multipart message.
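To make the re-composition step concrete, here is a minimal sketch using only the Python standard library. It assumes each stored attachment is available as a (filename, raw bytes, content type) tuple and that all attachments are text/*; the function name build_message and the tuple layout are made up for illustration, not something from CouchDB or couchdb-python.

```python
# -*- coding: utf-8 -*-
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(subject, body, attachments):
    """Build a MIME multipart message (hypothetical helper).

    attachments: iterable of (filename, raw_bytes, content_type) tuples,
    where content_type may carry a charset, e.g. 'text/plain;charset=utf-8'.
    Only text/* attachments are handled in this sketch.
    """
    msg = MIMEMultipart()
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'plain', 'utf-8'))
    for filename, raw, content_type in attachments:
        # Parse the stored content type with the stdlib header parser.
        probe = Message()
        probe['Content-Type'] = content_type
        charset = probe.get_content_charset('utf-8')  # assumed fallback
        subtype = probe.get_content_subtype()
        # Decode with the recorded charset, then let MIMEText re-encode it.
        part = MIMEText(raw.decode(charset), subtype, charset)
        part.add_header('Content-Disposition', 'attachment', filename=filename)
        msg.attach(part)
    return msg


greek = u'ΑΒΓΔ αβγδ'
message = build_message('Test', u'See attachment',
                        [('test', greek.encode('utf-8'),
                          'text/plain;charset=utf-8')])
print(message.as_string())
```

Binary attachments (images, PDFs) would need a different MIME class, so treat this only as a starting point for the text case.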


On Fri, Jan 3, 2014 at 4:03 PM, Nick North <no...@gmail.com> wrote:

> I don't have the code in front of me, so someone please correct me if I'm
> wrong, but I recall that CouchDB does not take any notice of part headers
> in multipart messages. This means that you cannot currently set encoding on
> a per-attachment basis. (I haven't looked closely at the code to add an
> attachment to an existing document - does anyone know if you can set the
> encoding in that case? It doesn't feel likely to me.)
>
> Nick

Re: Specify attachment encoding for couchdb

Posted by Nick North <no...@gmail.com>.
I don't have the code in front of me, so someone please correct me if I'm
wrong, but I recall that CouchDB does not take any notice of part headers
in multipart messages. This means that you cannot currently set encoding on
a per-attachment basis. (I haven't looked closely at the code to add an
attachment to an existing document - does anyone know if you can set the
encoding in that case? It doesn't feel likely to me.)

Nick


On 3 January 2014 14:40, Daniel Gonzalez <go...@gonvaled.com> wrote:

> Thanks but, how do you set that on a per-attachment basis in a couchdb
> document? If this is not supported, I guess I will have to add a mapping
> "attachments-encoding" to the document where I can associate each
> attachment with its encoding. Any comments on this?

Re: Specify attachment encoding for couchdb

Posted by Daniel Gonzalez <go...@gonvaled.com>.
Great! That is exactly what I was looking for!

Thanks!


On Fri, Jan 3, 2014 at 5:08 PM, Nick North <no...@gmail.com> wrote:

> Cunning :-)

Re: Specify attachment encoding for couchdb

Posted by Nick North <no...@gmail.com>.
Cunning :-)


On 3 January 2014 15:48, Alexander Shorin <kx...@gmail.com> wrote:

> Erhm..just replace:
>
> > db.put_attachment(doc, content, content_type='text/plain')
>
> with
>
> > db.put_attachment(doc, content, content_type='text/plain;charset=utf-8')
>
> And CouchDB will remember it:
>
> $ http HEAD http://localhost:5984/b/testing/test
> HTTP/1.1 200 OK
> Accept-Ranges: none
> Cache-Control: must-revalidate
> Content-Encoding: gzip
> Content-Length: 102
> Content-MD5: 7y85tiUeF/UX9kqpKAzQEw==
> Content-Type: text/plain; charset=utf-8
> Date: Fri, 03 Jan 2014 14:14:27 GMT
> ETag: "7y85tiUeF/UX9kqpKAzQEw=="
> Server: CouchDB/1.6.0+build.0bf1856 (Erlang OTP/R16B01)
>
> It will also be available in the attachment stub info. So before decoding,
> just read the content_type value, get the attachment's charset, and decode
> accordingly.
>
> --
> ,,,^..^,,,

Re: Specify attachment encoding for couchdb

Posted by Alexander Shorin <kx...@gmail.com>.
Erhm..just replace:

> db.put_attachment(doc, content, content_type='text/plain')

with

> db.put_attachment(doc, content, content_type='text/plain;charset=utf-8')

And CouchDB will remember it:

$ http HEAD http://localhost:5984/b/testing/test
HTTP/1.1 200 OK
Accept-Ranges: none
Cache-Control: must-revalidate
Content-Encoding: gzip
Content-Length: 102
Content-MD5: 7y85tiUeF/UX9kqpKAzQEw==
Content-Type: text/plain; charset=utf-8
Date: Fri, 03 Jan 2014 14:14:27 GMT
ETag: "7y85tiUeF/UX9kqpKAzQEw=="
Server: CouchDB/1.6.0+build.0bf1856 (Erlang OTP/R16B01)

It will also be available in the attachment stub info. So before decoding,
just read the content_type value, get the attachment's charset, and decode
accordingly.
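A minimal sketch of that read-and-decode step, using only the standard library to parse the charset out of the stored content type. The helper names charset_from_content_type and decode_attachment are hypothetical, and the UTF-8 fallback is an assumption; the commented usage assumes the couchdb-python Database object and the 'test' attachment from the original script.

```python
# -*- coding: utf-8 -*-
from email.message import Message


def charset_from_content_type(content_type, default='utf-8'):
    """Return the charset parameter of a Content-Type value, or `default`."""
    probe = Message()
    probe['Content-Type'] = content_type
    return probe.get_content_charset(default)


def decode_attachment(raw_bytes, content_type):
    """Decode attachment bytes using the charset found in content_type."""
    return raw_bytes.decode(charset_from_content_type(content_type))


# Hypothetical usage against the document from the test script
# (requires a running CouchDB, so it is left commented out):
#   stub = db[doc_id]['_attachments']['test']
#   raw = db.get_attachment(doc_id, 'test').read()
#   text = decode_attachment(raw, stub['content_type'])

sample = u'αβγδ'.encode('utf-8')
print(decode_attachment(sample, 'text/plain;charset=utf-8') == u'αβγδ')
```

Attachments uploaded without a charset parameter fall back to the default here, which may or may not match how they were actually encoded.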

--
,,,^..^,,,


On Fri, Jan 3, 2014 at 7:43 PM, Daniel Gonzalez <go...@gonvaled.com> wrote:
> No, what I mean is "how can I keep track of the encoding used for each of
> the attachments, so that I can decode them correctly whenever I want to"

Re: Specify attachment encoding for couchdb

Posted by Daniel Gonzalez <go...@gonvaled.com>.
No, what I mean is "how can I keep track of the encoding used for each of
the attachments, so that I can decode them correctly whenever I want to"


On Fri, Jan 3, 2014 at 4:23 PM, Alexander Shorin <kx...@gmail.com> wrote:

> Not sure I follow your idea. Do you mean, how can you set such
> charset info for existing attachments? In that case you have to
> re-upload them.
> --
> ,,,^..^,,,

Re: Specify attachment encoding for couchdb

Posted by Alexander Shorin <kx...@gmail.com>.
Not sure I follow your idea. Do you mean, how can you set such
charset info for existing attachments? In that case you have to
re-upload them.
--
,,,^..^,,,


On Fri, Jan 3, 2014 at 6:40 PM, Daniel Gonzalez <go...@gonvaled.com> wrote:
> Thanks but, how do you set that on a per-attachment basis in a couchdb
> document? If this is not supported, I guess I will have to add a mapping
> "attachments-encoding" to the document where I can associate each
> attachment with its encoding. Any comments on this?

Re: Specify attachment encoding for couchdb

Posted by Daniel Gonzalez <go...@gonvaled.com>.
Thanks but, how do you set that on a per-attachment basis in a couchdb
document? If this is not supported, I guess I will have to add a mapping
"attachments-encoding" to the document where I can associate each
attachment with its encoding. Any comments on this?
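For illustration, the mapping idea could look roughly like the sketch below. It works on a plain dict, so it runs without a server; the field name "attachments-encoding" comes from the message above, while the helper names record_encoding and lookup_encoding are made up here.

```python
def record_encoding(doc, attachment_name, encoding):
    """Remember the charset of one attachment in a regular document field."""
    doc.setdefault('attachments-encoding', {})[attachment_name] = encoding
    return doc


def lookup_encoding(doc, attachment_name, default=None):
    """Return the recorded charset for an attachment, or `default`."""
    return doc.get('attachments-encoding', {}).get(attachment_name, default)


doc = {'_id': 'testing'}
record_encoding(doc, 'test', 'utf-8')
print(lookup_encoding(doc, 'test'))
```

The document would still have to be saved after the mapping is updated, and the mapping can drift out of sync with the attachments if one is changed without the other.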


On Fri, Jan 3, 2014 at 3:18 PM, Alexander Shorin <kx...@gmail.com> wrote:

> You can set the MIME type to text/plain;charset=utf-8 to help browsers
> detect the correct character encoding.
> See http://tools.ietf.org/html/rfc2068#section-3.4 for more info
> --
> ,,,^..^,,,

Re: Specify attachment encoding for couchdb

Posted by Alexander Shorin <kx...@gmail.com>.
You can set the MIME type to text/plain;charset=utf-8 to help browsers
detect the correct character encoding.
See http://tools.ietf.org/html/rfc2068#section-3.4 for more info
--
,,,^..^,,,

