Posted to user@couchdb.apache.org by Brian Candler <B....@pobox.com> on 2010/01/15 10:50:10 UTC

Binary data [was Bulk CSV import?]

A couple more thoughts about importing and exporting aggregate batches of
document-oriented data.

* A desktop-friendly way to bundle documents is in a ZIP file. 
Unfortunately that's a binary format, and _list/_external use a JSON (UTF-8)
protocol.

I see that an _external function can give back base64-encoded binary data:
http://wiki.apache.org/couchdb/ExternalProcesses

I don't think _list or _show can, and in any case you'd need a ZIP library
written in JavaScript.

Maybe this is too far out of scope for a JS-backed CouchApp. But with
erlview it makes a lot more sense: you have a binary-clean interface, and
many libraries available for handling binary formats. e.g.
http://www.erlang.org/doc/man/zip.html

* The other option I've tried is MIME multipart documents, but in my testing
I found that browsers don't handle them well. At best you just get the first
one saved. I think that option can be discarded.

Regards,

Brian.

Re: Binary data

Posted by Chris Anderson <jc...@apache.org>.
On Mon, Jan 18, 2010 at 6:39 AM, Brian Candler <B....@pobox.com> wrote:
>> > On Fri, Jan 15, 2010 at 09:32:34AM -0800, Chris Anderson wrote:
>> >> list and show should be able to return base64 data, to be decoded to
>> >> binary before sending to the client. I know there is a test that _show
>> >> can serve a favicon.ico file.
>> >
>> > Thank you for pointing that out, I found the _show test.
>> >
>> > I had already grepped the source for base64, and found only one relevant
>> > instance in couch_httpd_external.erl. Now I see that gets called from
>> > couch_httpd_show.erl too.
>> >
>> > I can't see how _list calls it though?
>>
>> that's worth writing a test for. written tests rock. -- move to dev@?
>
> Moved to dev.
>
> I've done the test for _update (and it already passes):
> https://issues.apache.org/jira/browse/COUCHDB-626
>
> However I'm not sure what you want to do with _list. At the moment you emit
> chunks of plain strings. Do you want something like
>    send({base64:"..."})
> ?

I think the send({base64:"..."}) option is easier to understand and
probably less burden on implementors.

Thanks,
Chris

>
> Or do you want to put a tag in the header which says all the chunks are
> base64? (like a Content-Transfer-Encoding: base64 which is stripped)
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Binary data

Posted by Brian Candler <B....@pobox.com>.
> > On Fri, Jan 15, 2010 at 09:32:34AM -0800, Chris Anderson wrote:
> >> list and show should be able to return base64 data, to be decoded to
> >> binary before sending to the client. I know there is a test that _show
> >> can serve a favicon.ico file.
> >
> > Thank you for pointing that out, I found the _show test.
> >
> > I had already grepped the source for base64, and found only one relevant
> > instance in couch_httpd_external.erl. Now I see that gets called from
> > couch_httpd_show.erl too.
> >
> > I can't see how _list calls it though?
> 
> that's worth writing a test for. written tests rock. -- move to dev@?

Moved to dev.

I've done the test for _update (and it already passes):
https://issues.apache.org/jira/browse/COUCHDB-626

However I'm not sure what you want to do with _list. At the moment you emit
chunks of plain strings. Do you want something like
    send({base64:"..."})
?

Or do you want to put a tag in the header which says all the chunks are
base64? (like a Content-Transfer-Encoding: base64 which is stripped)
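
To make the first option concrete, here is a minimal sketch of how per-chunk
decoding could behave. It is a model of the proposal only, not CouchDB code:
chunkToBytes and collectBody are hypothetical helper names, and it uses
Node.js Buffer purely so it can be run outside CouchDB. A chunk is assumed to
be either a plain string or an object of the form {base64: "..."}.

```javascript
// Hypothetical model of the proposed _list chunk handling; not CouchDB code.
// A plain string is sent as UTF-8 text; {base64: "..."} is decoded to raw
// bytes before sending.
function chunkToBytes(chunk) {
  if (typeof chunk === "string") {
    return Buffer.from(chunk, "utf8");
  }
  if (chunk !== null && typeof chunk === "object" &&
      typeof chunk.base64 === "string") {
    return Buffer.from(chunk.base64, "base64");
  }
  throw new TypeError("chunk must be a string or {base64: string}");
}

// Collects the bytes that a sequence of send() calls would put on the wire.
function collectBody(chunks) {
  return Buffer.concat(chunks.map(chunkToBytes));
}

// Mixed text and binary chunks: together they spell the four-byte ZIP
// local file header signature "PK\x03\x04".
const body = collectBody([
  "PK",               // plain text chunk
  { base64: "AwQ=" }  // bytes 0x03 0x04
]);
```

With the per-chunk form, text and binary chunks can be mixed in one response
and each chunk is decoded on its own. With a header tag, every chunk would
have to be base64, and each chunk would presumably still need to be decoded
separately, since independently padded chunks do not concatenate into one
valid base64 string.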

Re: Binary data [was Bulk CSV import?]

Posted by Chris Anderson <jc...@apache.org>.
On Fri, Jan 15, 2010 at 2:16 PM, Brian Candler <B....@pobox.com> wrote:
> On Fri, Jan 15, 2010 at 09:32:34AM -0800, Chris Anderson wrote:
>> list and show should be able to return base64 data, to be decoded to
>> binary before sending to the client. I know there is a test that _show
>> can serve a favicon.ico file.
>
> Thank you for pointing that out, I found the _show test.
>
> I had already grepped the source for base64, and found only one relevant
> instance in couch_httpd_external.erl. Now I see that gets called from
> couch_httpd_show.erl too.
>
> I can't see how _list calls it though?
>

that's worth writing a test for. written tests rock. -- move to dev@?

Re: Binary data [was Bulk CSV import?]

Posted by Brian Candler <B....@pobox.com>.
On Fri, Jan 15, 2010 at 09:32:34AM -0800, Chris Anderson wrote:
> list and show should be able to return base64 data, to be decoded to
> binary before sending to the client. I know there is a test that _show
> can serve a favicon.ico file.

Thank you for pointing that out, I found the _show test.

I had already grepped the source for base64, and found only one relevant
instance in couch_httpd_external.erl. Now I see that gets called from
couch_httpd_show.erl too.

I can't see how _list calls it though?

Re: Binary data [was Bulk CSV import?]

Posted by Chris Anderson <jc...@apache.org>.
On Fri, Jan 15, 2010 at 1:50 AM, Brian Candler <B....@pobox.com> wrote:
> A couple more thoughts about importing and exporting aggregate batches of
> document-oriented data.
>
> * A desktop-friendly way to bundle documents is in a ZIP file.
> Unfortunately that's a binary format, and _list/_external use a JSON (UTF-8)
> protocol.
>
> I see that an _external function can give back base64-encoded binary data:
> http://wiki.apache.org/couchdb/ExternalProcesses
>
> I don't think _list or _show can, and in any case you'd need a ZIP library
> written in JavaScript.

list and show should be able to return base64 data, to be decoded to
binary before sending to the client. I know there is a test that _show
can serve a favicon.ico file.
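
A minimal sketch of what such a _show function might look like, assuming the
response object accepts a base64 field next to headers, as the _external
protocol does. The name showFavicon, the content type and the four-byte
payload are placeholders for illustration; they are not taken from the
CouchDB test suite.

```javascript
// Sketch of a show function that returns binary data as base64.
// Assumption: the server decodes the "base64" field to raw bytes before
// sending the response to the client.
function showFavicon(doc, req) {
  return {
    headers: { "Content-Type": "image/x-icon" },
    // Placeholder payload: only the first four bytes of an ICO header
    // (00 00 01 00), not a complete icon.
    base64: "AAABAA=="
  };
}

// Outside CouchDB, the payload can be checked by decoding it in Node.js.
const bytes = Buffer.from(showFavicon(null, {}).base64, "base64");
```

In a design document the function would normally be stored anonymously, as
function(doc, req) { ... }, under the shows member; it is named here only so
that it can be called directly.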

but yes, it seems a bit much to do ZIP in JS. maybe not in... 2010!

>
> Maybe this is too far out of scope for a JS-backed CouchApp. But with
> erlview it makes a lot more sense: you have a binary-clean interface, and
> many libraries available for handling binary formats. e.g.
> http://www.erlang.org/doc/man/zip.html
>
> * The other option I've tried is MIME multipart documents, but in my testing
> I found that browsers don't handle them well. At best you just get the first
> one saved. I think that option can be discarded.
>
> Regards,
>
> Brian.
>