Posted to user@couchdb.apache.org by Norman Barker <no...@gmail.com> on 2009/10/23 19:27:57 UTC

chunked response and couch_doc_open

Hi,

Is there a way (in Erlang) to open a CouchDB document and iterate
over the document body without loading all of the document into
memory?

I would like to use a chunked response to keep the system's memory
overhead low.
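
For illustration, the kind of thing I have in mind, sketched against
mochiweb's chunked-response API (the handler and the chunk list here
are made up):

handle_doc(Req) ->
    %% 'chunked' asks mochiweb to use Transfer-Encoding: chunked
    Resp = Req:respond({200, [{"Content-Type", "application/json"}],
                        chunked}),
    %% send body pieces as they become available, not one big blob
    lists:foreach(fun(Chunk) -> Resp:write_chunk(Chunk) end,
                  [<<"{\"parts\":[">>, <<"1,2,3">>, <<"]}">>]),
    %% an empty chunk terminates the chunked response
    Resp:write_chunk(<<>>).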

Not a CouchDB-specific question, but is there a method in Erlang to
find the size (in bytes) of a particular term?

many thanks,

Norman

Re: chunked response and couch_doc_open

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Oct 23, 2009 at 2:29 PM, Norman Barker <no...@gmail.com> wrote:
> On Fri, Oct 23, 2009 at 12:19 PM, Paul Davis
> <pa...@gmail.com> wrote:
>> On Fri, Oct 23, 2009 at 2:11 PM, Norman Barker <no...@gmail.com> wrote:
>>> On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
>>> <pa...@gmail.com> wrote:
>>>> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Is there a way (in Erlang) to open a CouchDB document and iterate
>>>>> over the document body without loading all of the document into
>>>>> memory?
>>>>>
>>>>> I would like to use a chunked response to keep the system's memory
>>>>> overhead low.
>>>>>
>>>>> Not a CouchDB-specific question, but is there a method in Erlang to
>>>>> find the size (in bytes) of a particular term?
>>>>>
>>>>> many thanks,
>>>>>
>>>>> Norman
>>>>>
>>>>
>>>> Norman,
>>>>
>>>> Well, for document JSON we store Erlang term binaries on disk, so
>>>> there's no real way to stream a doc across the wire from disk without
>>>> loading the whole thing into RAM. Have you noticed CouchDB having
>>>> memory issues under read loads? It's generally pretty light in its
>>>> memory requirements for reads.
>>>>
>>>> The only way to get the size of a term in bytes that I know of is the
>>>> brute-force size(term_to_binary(Term)) method.
>>>>
>>>> Paul Davis
>>>>
>>>
>>> I am sending sizeable JSON documents (a couple of MB), and as this
>>> scales to X concurrent users the problem grows. I have crashed Erlang
>>> when the process gets up to about 1 GB of memory. (Note: this was on
>>> Windows.) The workaround is to increase the memory allocation.
>>>
>>> Erlang (and CouchDB) is fantastic in that it is so light to run
>>> compared to a J2EE server; streaming documents out would be a good
>>> optimisation. Running a CouchDB instance in < 30 MB of memory would
>>> be my ideal.
>>>
>>> If you can point me in the right direction, this is something I can
>>> contribute back; most of my Erlang code so far has been specific to
>>> my application.
>>>
>>> Many thanks,
>>>
>>> Norman
>>>
>>
>> Norman,
>>
>> Streaming JSON docs in and out would require massive amounts of work,
>> rewriting much of the core of CouchDB, right down to making the JSON
>> parsers stream-oriented. I'm not even sure where you'd get started on
>> such an undertaking.
>>
>> Though there was a bug reported earlier today with Windows doing
>> weird things with retaining memory for _bulk_docs calls; I wonder if
>> there's a connection.
>>
>> Paul Davis
>>
> Paul,
>
> I was thinking that perhaps this could be done at the mochijson2
> level, and I wonder if there is an iterator approach that could be
> used within mochijson2 on the way out, though perhaps this impacts
> the format of the disk storage within CouchDB. Certainly it is an
> optimisation, but without it scalability is limited, along with the
> premise of running on low-end commodity hardware. No criticism
> intended; I will be looking at this at some point.
>
> Norman
>

Norman,

Even with a streaming mochijson2, most of the core expects to be
working with 'materialized' documents. Rewriting to stream to disk
would require patches to at least mochijson2, couch_httpd_*.erl,
couch_db.erl, couch_db_updater.erl, and, well, pretty much all of
CouchDB really.

It's hard to say what's best in terms of app design, but really, the
alternative is quite heavy and fairly unlikely to make it into trunk
anytime soon, if ever. Beyond just making it work, the amount of
complexity it'd add would most likely be prohibitive.

Paul Davis

Re: chunked response and couch_doc_open

Posted by Norman Barker <no...@gmail.com>.
On Fri, Oct 23, 2009 at 12:19 PM, Paul Davis
<pa...@gmail.com> wrote:
> On Fri, Oct 23, 2009 at 2:11 PM, Norman Barker <no...@gmail.com> wrote:
>> On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
>> <pa...@gmail.com> wrote:
>>> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> Is there a way (in Erlang) to open a CouchDB document and iterate
>>>> over the document body without loading all of the document into
>>>> memory?
>>>>
>>>> I would like to use a chunked response to keep the system's memory
>>>> overhead low.
>>>>
>>>> Not a CouchDB-specific question, but is there a method in Erlang to
>>>> find the size (in bytes) of a particular term?
>>>>
>>>> many thanks,
>>>>
>>>> Norman
>>>>
>>>
>>> Norman,
>>>
>>> Well, for document JSON we store Erlang term binaries on disk, so
>>> there's no real way to stream a doc across the wire from disk without
>>> loading the whole thing into RAM. Have you noticed CouchDB having
>>> memory issues under read loads? It's generally pretty light in its
>>> memory requirements for reads.
>>>
>>> The only way to get the size of a term in bytes that I know of is the
>>> brute-force size(term_to_binary(Term)) method.
>>>
>>> Paul Davis
>>>
>>
>> I am sending sizeable JSON documents (a couple of MB), and as this
>> scales to X concurrent users the problem grows. I have crashed Erlang
>> when the process gets up to about 1 GB of memory. (Note: this was on
>> Windows.) The workaround is to increase the memory allocation.
>>
>> Erlang (and CouchDB) is fantastic in that it is so light to run
>> compared to a J2EE server; streaming documents out would be a good
>> optimisation. Running a CouchDB instance in < 30 MB of memory would
>> be my ideal.
>>
>> If you can point me in the right direction, this is something I can
>> contribute back; most of my Erlang code so far has been specific to
>> my application.
>>
>> Many thanks,
>>
>> Norman
>>
>
> Norman,
>
> Streaming JSON docs in and out would require massive amounts of work,
> rewriting much of the core of CouchDB, right down to making the JSON
> parsers stream-oriented. I'm not even sure where you'd get started on
> such an undertaking.
>
> Though there was a bug reported earlier today with Windows doing
> weird things with retaining memory for _bulk_docs calls; I wonder if
> there's a connection.
>
> Paul Davis
>
Paul,

I was thinking that perhaps this could be done at the mochijson2
level, and I wonder if there is an iterator approach that could be
used within mochijson2 on the way out, though perhaps this impacts
the format of the disk storage within CouchDB. Certainly it is an
optimisation, but without it scalability is limited, along with the
premise of running on low-end commodity hardware. No criticism
intended; I will be looking at this at some point.
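
As a rough sketch of what I mean (hypothetical code; mochijson2 has
no such API today, and each value is still materialized one field at
a time):

%% Walk an already-decoded {struct, Props} document and hand the
%% encoder's output to a callback chunk by chunk, rather than
%% building one large iolist for the whole document.
stream_struct({struct, Props}, Emit) ->
    Emit(<<"{">>),
    stream_props(Props, Emit, first),
    Emit(<<"}">>).

stream_props([], _Emit, _Pos) ->
    ok;
stream_props([{K, V} | Rest], Emit, Pos) ->
    case Pos of first -> ok; rest -> Emit(<<",">>) end,
    %% keys and individual values still go through the normal encoder
    Emit([mochijson2:encode(K), <<":">>, mochijson2:encode(V)]),
    stream_props(Rest, Emit, rest).

Each emitted chunk could then be fed straight to a chunked HTTP
response instead of being accumulated in memory.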

Norman

Re: chunked response and couch_doc_open

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Oct 23, 2009 at 2:11 PM, Norman Barker <no...@gmail.com> wrote:
> On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
> <pa...@gmail.com> wrote:
>> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com> wrote:
>>> Hi,
>>>
>>> Is there a way (in Erlang) to open a CouchDB document and iterate
>>> over the document body without loading all of the document into
>>> memory?
>>>
>>> I would like to use a chunked response to keep the system's memory
>>> overhead low.
>>>
>>> Not a CouchDB-specific question, but is there a method in Erlang to
>>> find the size (in bytes) of a particular term?
>>>
>>> many thanks,
>>>
>>> Norman
>>>
>>
>> Norman,
>>
>> Well, for document JSON we store Erlang term binaries on disk, so
>> there's no real way to stream a doc across the wire from disk without
>> loading the whole thing into RAM. Have you noticed CouchDB having
>> memory issues under read loads? It's generally pretty light in its
>> memory requirements for reads.
>>
>> The only way to get the size of a term in bytes that I know of is the
>> brute-force size(term_to_binary(Term)) method.
>>
>> Paul Davis
>>
>
> I am sending sizeable JSON documents (a couple of MB), and as this
> scales to X concurrent users the problem grows. I have crashed Erlang
> when the process gets up to about 1 GB of memory. (Note: this was on
> Windows.) The workaround is to increase the memory allocation.
>
> Erlang (and CouchDB) is fantastic in that it is so light to run
> compared to a J2EE server; streaming documents out would be a good
> optimisation. Running a CouchDB instance in < 30 MB of memory would
> be my ideal.
>
> If you can point me in the right direction, this is something I can
> contribute back; most of my Erlang code so far has been specific to
> my application.
>
> Many thanks,
>
> Norman
>

Norman,

Streaming JSON docs in and out would require massive amounts of work,
rewriting much of the core of CouchDB, right down to making the JSON
parsers stream-oriented. I'm not even sure where you'd get started on
such an undertaking.

Though there was a bug reported earlier today with Windows doing
weird things with retaining memory for _bulk_docs calls; I wonder if
there's a connection.

Paul Davis

Re: chunked response and couch_doc_open

Posted by Andrew Melo <an...@gmail.com>.
On Fri, Oct 23, 2009 at 1:31 PM, Norman Barker <no...@gmail.com> wrote:
> On Fri, Oct 23, 2009 at 12:27 PM, Adam Kocoloski <ko...@apache.org> wrote:
>> On Oct 23, 2009, at 2:11 PM, Norman Barker wrote:
>>
>>> On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
>>> <pa...@gmail.com> wrote:
>>>>
>>>> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Is there a way (in Erlang) to open a CouchDB document and iterate
>>>>> over the document body without loading all of the document into
>>>>> memory?
>>>>>
>>>>> I would like to use a chunked response to keep the system's memory
>>>>> overhead low.
>>>>>
>>>>> Not a CouchDB-specific question, but is there a method in Erlang to
>>>>> find the size (in bytes) of a particular term?
>>>>>
>>>>> many thanks,
>>>>>
>>>>> Norman
>>>>>
>>>>
>>>> Norman,
>>>>
>>>> Well, for document JSON we store Erlang term binaries on disk, so
>>>> there's no real way to stream a doc across the wire from disk without
>>>> loading the whole thing into RAM. Have you noticed CouchDB having
>>>> memory issues under read loads? It's generally pretty light in its
>>>> memory requirements for reads.
>>>>
>>>> The only way to get the size of a term in bytes that I know of is the
>>>> brute-force size(term_to_binary(Term)) method.
>>>>
>>>> Paul Davis
>>>>
>>>
>>> I am sending sizeable JSON documents (a couple of MB), and as this
>>> scales to X concurrent users the problem grows. I have crashed Erlang
>>> when the process gets up to about 1 GB of memory. (Note: this was on
>>> Windows.) The workaround is to increase the memory allocation.
>>>
>>> Erlang (and CouchDB) is fantastic in that it is so light to run
>>> compared to a J2EE server; streaming documents out would be a good
>>> optimisation. Running a CouchDB instance in < 30 MB of memory would
>>> be my ideal.
>>>
>>> If you can point me in the right direction, this is something I can
>>> contribute back; most of my Erlang code so far has been specific to
>>> my application.
>>>
>>> Many thanks,
>>>
>>> Norman
>>
>> Hi Norman, could your application store some of that data as an attachment
>> to the document?  Attachments can be streamed in both directions.  Best,
>>
>> Adam
>>
>>
>
> Hi Adam,
>
> It was seeing that attachments could be streamed that got me thinking
> about the JSON.  Unfortunately I couldn't use attachments, since I am
> storing metadata, all of which has to be view-able.
>
> Norman
>

Hey Norman,

I don't know if it would work with your application, but I'm storing
logfiles as attachments and pulling out the "relevant" metadata to
put into the actual document. It was kind of my compromise to avoid
ending up with ~MB-sized JSON documents.
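
Something like this shape (field names made up for illustration;
roughly what a GET of such a document returns, with the queryable
metadata in the JSON and the bulk of the data in the attachment
stub):

{
  "_id": "logfile-2009-10-23-node3",
  "type": "logfile",
  "host": "node3",
  "first_error": "2009-10-23T17:02:11Z",
  "error_count": 42,
  "_attachments": {
    "full.log": {
      "content_type": "text/plain",
      "length": 1864710,
      "stub": true
    }
  }
}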

Regards,
Andrew

Re: chunked response and couch_doc_open

Posted by Norman Barker <no...@gmail.com>.
On Fri, Oct 23, 2009 at 12:27 PM, Adam Kocoloski <ko...@apache.org> wrote:
> On Oct 23, 2009, at 2:11 PM, Norman Barker wrote:
>
>> On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
>> <pa...@gmail.com> wrote:
>>>
>>> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Is there a way (in Erlang) to open a CouchDB document and iterate
>>>> over the document body without loading all of the document into
>>>> memory?
>>>>
>>>> I would like to use a chunked response to keep the system's memory
>>>> overhead low.
>>>>
>>>> Not a CouchDB-specific question, but is there a method in Erlang to
>>>> find the size (in bytes) of a particular term?
>>>>
>>>> many thanks,
>>>>
>>>> Norman
>>>>
>>>
>>> Norman,
>>>
>>> Well, for document JSON we store Erlang term binaries on disk, so
>>> there's no real way to stream a doc across the wire from disk without
>>> loading the whole thing into RAM. Have you noticed CouchDB having
>>> memory issues under read loads? It's generally pretty light in its
>>> memory requirements for reads.
>>>
>>> The only way to get the size of a term in bytes that I know of is the
>>> brute-force size(term_to_binary(Term)) method.
>>>
>>> Paul Davis
>>>
>>
>> I am sending sizeable JSON documents (a couple of MB), and as this
>> scales to X concurrent users the problem grows. I have crashed Erlang
>> when the process gets up to about 1 GB of memory. (Note: this was on
>> Windows.) The workaround is to increase the memory allocation.
>>
>> Erlang (and CouchDB) is fantastic in that it is so light to run
>> compared to a J2EE server; streaming documents out would be a good
>> optimisation. Running a CouchDB instance in < 30 MB of memory would
>> be my ideal.
>>
>> If you can point me in the right direction, this is something I can
>> contribute back; most of my Erlang code so far has been specific to
>> my application.
>>
>> Many thanks,
>>
>> Norman
>
> Hi Norman, could your application store some of that data as an attachment
> to the document?  Attachments can be streamed in both directions.  Best,
>
> Adam
>
>

Hi Adam,

It was seeing that attachments could be streamed that got me thinking
about the JSON.  Unfortunately I couldn't use attachments, since I am
storing metadata, all of which has to be view-able.

Norman

Re: chunked response and couch_doc_open

Posted by Adam Kocoloski <ko...@apache.org>.
On Oct 23, 2009, at 2:11 PM, Norman Barker wrote:

> On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
> <pa...@gmail.com> wrote:
>> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <norman.barker@gmail.com> wrote:
>>> Hi,
>>>
>>> Is there a way (in Erlang) to open a CouchDB document and iterate
>>> over the document body without loading all of the document into
>>> memory?
>>>
>>> I would like to use a chunked response to keep the system's memory
>>> overhead low.
>>>
>>> Not a CouchDB-specific question, but is there a method in Erlang to
>>> find the size (in bytes) of a particular term?
>>>
>>> many thanks,
>>>
>>> Norman
>>>
>>
>> Norman,
>>
>> Well, for document JSON we store Erlang term binaries on disk, so
>> there's no real way to stream a doc across the wire from disk without
>> loading the whole thing into RAM. Have you noticed CouchDB having
>> memory issues under read loads? It's generally pretty light in its
>> memory requirements for reads.
>>
>> The only way to get the size of a term in bytes that I know of is the
>> brute-force size(term_to_binary(Term)) method.
>>
>> Paul Davis
>>
>
> I am sending sizeable JSON documents (a couple of MB), and as this
> scales to X concurrent users the problem grows. I have crashed Erlang
> when the process gets up to about 1 GB of memory. (Note: this was on
> Windows.) The workaround is to increase the memory allocation.
>
> Erlang (and CouchDB) is fantastic in that it is so light to run
> compared to a J2EE server; streaming documents out would be a good
> optimisation. Running a CouchDB instance in < 30 MB of memory would
> be my ideal.
>
> If you can point me in the right direction, this is something I can
> contribute back; most of my Erlang code so far has been specific to
> my application.
>
> Many thanks,
>
> Norman

Hi Norman, could your application store some of that data as an
attachment to the document?  Attachments can be streamed in both
directions.
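
For example, with curl (database and document names are made up;
curl's -T flag streams the file from disk rather than buffering it):

# stream a large file in as an attachment of an existing doc
curl -T big.log -H "Content-Type: text/plain" \
    "http://127.0.0.1:5984/mydb/mydoc/big.log?rev=1-2902191555"

# stream it back out
curl "http://127.0.0.1:5984/mydb/mydoc/big.log"

Best,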

Adam


Re: chunked response and couch_doc_open

Posted by Norman Barker <no...@gmail.com>.
On Fri, Oct 23, 2009 at 11:33 AM, Paul Davis
<pa...@gmail.com> wrote:
> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com> wrote:
>> Hi,
>>
>> Is there a way (in Erlang) to open a CouchDB document and iterate
>> over the document body without loading all of the document into
>> memory?
>>
>> I would like to use a chunked response to keep the system's memory
>> overhead low.
>>
>> Not a CouchDB-specific question, but is there a method in Erlang to
>> find the size (in bytes) of a particular term?
>>
>> many thanks,
>>
>> Norman
>>
>
> Norman,
>
> Well, for document JSON we store Erlang term binaries on disk, so
> there's no real way to stream a doc across the wire from disk without
> loading the whole thing into RAM. Have you noticed CouchDB having
> memory issues under read loads? It's generally pretty light in its
> memory requirements for reads.
>
> The only way to get the size of a term in bytes that I know of is the
> brute-force size(term_to_binary(Term)) method.
>
> Paul Davis
>

I am sending sizeable JSON documents (a couple of MB), and as this
scales to X concurrent users the problem grows. I have crashed Erlang
when the process gets up to about 1 GB of memory. (Note: this was on
Windows.) The workaround is to increase the memory allocation.

Erlang (and CouchDB) is fantastic in that it is so light to run
compared to a J2EE server; streaming documents out would be a good
optimisation. Running a CouchDB instance in < 30 MB of memory would
be my ideal.

If you can point me in the right direction, this is something I can
contribute back; most of my Erlang code so far has been specific to
my application.

Many thanks,

Norman

Re: chunked response and couch_doc_open

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Oct 23, 2009 at 3:34 PM, Zoltan Lajos Kis <ki...@tmit.bme.hu> wrote:
> Paul Davis wrote:
>>>
>>> Hello,
>>>
>>> You can use the erts_debug:size(Term) and erts_debug:flat_size(Term)
>>> functions. They are somewhat documented here:
>>> http://www.erlang.org/doc/efficiency_guide/processes.html . Note that
>>> the returned value is in words, not bytes (see
>>> erlang:system_info(wordsize)).
>>>
>>> Zoltan.
>>>
>>>
>>
>> Zachary,
>>
>> Most interesting, but I'm a bit confused by what it's reporting:
>>
>> 1> erts_debug:size("stuff") * erlang:system_info(wordsize).
>> 80
>> 2> size(term_to_binary("stuff")).
>> 9
>>
>> Is it really using 80 bytes internally to represent that?
>>
>> Paul Davis
>>
>
> Hello.
>
> "stuff" is a linked list ( [115,116,117,102,102] ) built of cons cells,
> each consisting of one handler (pointer) to the current item, and one to the
> next cons cell.
> That is ten pointers each taking a word, 10 x 8 = 80 bytes in a 64-bit OS.

Oh! I totally didn't realize that was counting the size of the
internals like that, but it makes obvious sense now that you point it
out.

> Converted to binary, it is stored as you would expect it to be (in 5
> bytes), plus you get some overhead (version info, metadata, list
> length, whatever) pumping it up to 9 bytes.
>
> Btw, I believe the erts_debug:size() functions only deal with the
> size of the term structure, so they will behave oddly if used with
> binaries.
>
> Zoltan (who is definitely not Zolton ;))

Whoops! Must've transposed that in my head.

Paul Davis

Re: chunked response and couch_doc_open

Posted by Zachary Zolton <za...@gmail.com>.
I think we have a case of mistaken Zoltons--or was it Zoltans?  (^_-)

On Oct 23, 2009, at 2:34 PM, Zoltan Lajos Kis <ki...@tmit.bme.hu> wrote:

> Paul Davis wrote:
>>> Hello,
>>>
>>> You can use the erts_debug:size(Term) and erts_debug:flat_size(Term)
>>> functions. They are somewhat documented here:
>>> http://www.erlang.org/doc/efficiency_guide/processes.html . Note that
>>> the returned value is in words, not bytes (see
>>> erlang:system_info(wordsize)).
>>>
>>> Zoltan.
>>>
>>>
>>
>> Zachary,
>>
>> Most interesting, but I'm a bit confused by what it's reporting:
>>
>> 1> erts_debug:size("stuff") * erlang:system_info(wordsize).
>> 80
>> 2> size(term_to_binary("stuff")).
>> 9
>>
>> Is it really using 80 bytes internally to represent that?
>>
>> Paul Davis
>>
> Hello.
>
> "stuff" is a linked list ( [115,116,117,102,102] ) built of cons  
> cells,
> each consisting of one handler (pointer) to the current item, and  
> one to the next cons cell.
> That is ten pointers each taking a word, 10 x 8 = 80 bytes in a 64- 
> bit OS.
>
> Converted to binary it is stored as you would expect it to be (in 5  
> bytes), plus you get some
> overhead (version info, metadata, list length, whatever) pumping it  
> up to 9 bytes.
>
> Btw I believe the erts_debug:size() functions only deal with the  
> size of the term structure, so they
> will behave odd if used with binaries.
>
> Zoltan (who is definitely not Zolton ;))
>

Re: chunked response and couch_doc_open

Posted by Zoltan Lajos Kis <ki...@tmit.bme.hu>.
Paul Davis wrote:
>> Hello,
>>
>> You can use the erts_debug:size(Term) and erts_debug:flat_size(Term)
>> functions. They are somewhat documented here:
>> http://www.erlang.org/doc/efficiency_guide/processes.html . Note that
>> the returned value is in words, not bytes (see
>> erlang:system_info(wordsize)).
>>
>> Zoltan.
>>
>>     
>
> Zachary,
>
> Most interesting, but I'm a bit confused by what it's reporting:
>
> 1> erts_debug:size("stuff") * erlang:system_info(wordsize).
> 80
> 2> size(term_to_binary("stuff")).
> 9
>
> Is it really using 80 bytes internally to represent that?
>
> Paul Davis
>   
Hello.

"stuff" is a linked list ( [115,116,117,102,102] ) built of cons cells,
each consisting of one handler (pointer) to the current item, and one to 
the next cons cell.
That is ten pointers each taking a word, 10 x 8 = 80 bytes in a 64-bit OS.

Converted to binary, it is stored as you would expect it to be (in 5
bytes), plus you get some overhead (version info, metadata, list
length, whatever) pumping it up to 9 bytes.
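
You can see those 9 bytes in the shell; they are the external term
format: a version byte (131), a string tag (107), a 2-byte length,
and then the five characters:

1> term_to_binary("stuff").
<<131,107,0,5,115,116,117,102,102>>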

Btw, I believe the erts_debug:size() functions only deal with the
size of the term structure, so they will behave oddly if used with
binaries.

Zoltan (who is definitely not Zolton ;))


Re: chunked response and couch_doc_open

Posted by Paul Davis <pa...@gmail.com>.
> Hello,
>
> You can use the erts_debug:size(Term) and erts_debug:flat_size(Term)
> functions. They are somewhat documented here:
> http://www.erlang.org/doc/efficiency_guide/processes.html . Note that
> the returned value is in words, not bytes (see
> erlang:system_info(wordsize)).
>
> Zoltan.
>

Zachary,

Most interesting, but I'm a bit confused by what it's reporting:

1> erts_debug:size("stuff") * erlang:system_info(wordsize).
80
2> size(term_to_binary("stuff")).
9

Is it really using 80 bytes internally to represent that?

Paul Davis

Re: chunked response and couch_doc_open

Posted by Zoltan Lajos Kis <ki...@tmit.bme.hu>.
Paul Davis wrote:
> On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com> wrote:
>   
>> Hi,
>>
>> Is there a way (in Erlang) to open a CouchDB document and iterate
>> over the document body without loading all of the document into
>> memory?
>>
>> I would like to use a chunked response to keep the system's memory
>> overhead low.
>>
>> Not a CouchDB-specific question, but is there a method in Erlang to
>> find the size (in bytes) of a particular term?
>>
>> many thanks,
>>
>> Norman
>>
>>     
>
> Norman,
>
> Well, for document JSON we store Erlang term binaries on disk, so
> there's no real way to stream a doc across the wire from disk without
> loading the whole thing into RAM. Have you noticed CouchDB having
> memory issues under read loads? It's generally pretty light in its
> memory requirements for reads.
>
> The only way to get the size of a term in bytes that I know of is the
> brute-force size(term_to_binary(Term)) method.
>
> Paul Davis
>   
Hello,

You can use the erts_debug:size(Term) and erts_debug:flat_size(Term)
functions. They are somewhat documented here:
http://www.erlang.org/doc/efficiency_guide/processes.html . Note that
the returned value is in words, not bytes (see
erlang:system_info(wordsize)).
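
For example (the word counts below are from a quick check and may
vary across OTP releases; the point is that size/1 counts the shared
sublist once while flat_size/1 counts it twice):

1> X = lists:seq(1, 100), ok.  %% 100 cons cells = 200 words
ok
2> erts_debug:size({X, X}).    %% 3 words for the tuple + 200
203
3> erts_debug:flat_size({X, X}).
403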

Zoltan.

Re: chunked response and couch_doc_open

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Oct 23, 2009 at 1:27 PM, Norman Barker <no...@gmail.com> wrote:
> Hi,
>
> Is there a way (in Erlang) to open a CouchDB document and iterate
> over the document body without loading all of the document into
> memory?
>
> I would like to use a chunked response to keep the system's memory
> overhead low.
>
> Not a CouchDB-specific question, but is there a method in Erlang to
> find the size (in bytes) of a particular term?
>
> many thanks,
>
> Norman
>

Norman,

Well, for document JSON we store Erlang term binaries on disk, so
there's no real way to stream a doc across the wire from disk without
loading the whole thing into RAM. Have you noticed CouchDB having
memory issues under read loads? It's generally pretty light in its
memory requirements for reads.

The only way to get the size of a term in bytes that I know of is the
brute-force size(term_to_binary(Term)) method.
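
E.g. (a made-up doc body; this measures the external-format size,
i.e. what goes over the wire, not the in-memory size):

1> Doc = {[{<<"name">>, <<"norman">>}, {<<"n">>, 42}]}.
{[{<<"name">>,<<"norman">>},{<<"n">>,42}]}
2> size(term_to_binary(Doc)).
41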

Paul Davis