Posted to user@couchdb.apache.org by James Marca <jm...@translab.its.uci.edu> on 2009/05/01 22:41:34 UTC

possible bug using bulk docs?

Hi All, 

I'm getting what looks like an out-of-memory crash when uploading lots
of documents via bulk docs, and I'm wondering whether this is a known
issue or user error.

I set the logging to debug and this is what I see:


[Fri, 01 May 2009 20:27:30 GMT] [debug] [<0.108.0>] 'POST'
/d12_june2007/_bulk_docs {1,1}
Headers: [{'Connection',"TE, close"},
          {'Content-Length',"257418477"},
          {'Host',"127.0.0.1:5984"},
          {"Te","deflate,gzip;q=0.3"},
          {'User-Agent',"libwww-perl/5.805"}]

[Fri, 01 May 2009 20:31:01 GMT] [info] [<0.108.0>] 127.0.0.1 - -
'POST' /d12_june2007/_bulk_docs 201
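
For reference, the debug output above comes from setting the log level
in the [log] section of CouchDB's ini configuration:

    [log]
    level = debug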


In top I can watch the RAM usage go up and down until finally it peaks
and the server is just gone.

That content length is from 1278 documents.  I can easily split that
pile into smaller batches (and that is what I am going to do), but I
thought I'd raise the issue here as this feels like a software bug
somewhere.
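
For what it's worth, a minimal sketch of the kind of batching I have in
mind (not my actual loader; the read-from-stdin input format and the
batch size are just for illustration):

    #!/usr/bin/perl
    # Batching sketch: read a JSON array of documents on stdin and POST
    # them to _bulk_docs in small batches instead of one huge request.
    use strict;
    use warnings;
    use JSON;
    use LWP::UserAgent;

    my $db         = 'http://127.0.0.1:5984/d12_june2007';
    my $batch_size = 10;                  # illustrative; tune as needed
    my $ua         = LWP::UserAgent->new;

    my @docs = @{ decode_json(do { local $/; <STDIN> }) };
    while (@docs) {
        my @batch = splice(@docs, 0, $batch_size);
        my $res = $ua->post(
            "$db/_bulk_docs",
            'Content-Type' => 'application/json',
            Content        => encode_json({ docs => \@batch }),
        );
        die "bulk docs failed: " . $res->status_line
            unless $res->is_success;
    }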

I am running release 0.9 on Gentoo (x86_64), installed from the Gentoo
ebuild, with the latest Erlang in the Gentoo tree (compiled yesterday).

I'm more than happy to pull new code from git or svn to test things
out too.

Regards, 
James

-- 
James E. Marca
Researcher
Institute of Transportation Studies
AIRB Suite 4000
University of California
Irvine, CA 92697-3600
jmarca@translab.its.uci.edu



Re: possible bug using bulk docs?

Posted by James Marca <jm...@translab.its.uci.edu>.
As another data point, I chunked the bulk docs call into batches of 100
docs (down from 1200), and the server crashed again on the first
_bulk_docs call.  When I split it into chunks of 10 docs, loading all
1200+ documents went off without a hitch.

Also, perhaps I was wrong in saying it was an out-of-memory error that
killed the server.  I watched while it was processing the 100 docs, and
the process just disappeared with only about 50-60% of RAM allocated to
it; my server's logs said nothing about OOM process termination.

Cheers,
James



Re: possible bug using bulk docs?

Posted by James Marca <jm...@translab.its.uci.edu>.
Paul,

Thanks for the insight!  The "three copies" explains a lot.  I have
8 GB of RAM and two more or less identical machines: one runs a Perl
script that parses the data and climbs to about 50% of RAM for each
parsed file, and the other runs the database servers (including
CouchDB) and accepts the data.  It makes sense that the server would
die: three copies at 50% of RAM apiece (roughly 3 x 4 GB = 12 GB on an
8 GB box) is more than 100%!
James



On Fri, May 01, 2009 at 03:59:49PM -0700, Paul Davis wrote:
> James,
> 
> This sounds like you're hitting Erlang's out-of-memory handling,
> which is to shut down the entire VM.  Part of the Erlang way is to
> fail fast and allow the code to be restarted.  It would probably be
> better if we tried to detect that we're running out of memory, but
> that could get 'interesting' very quickly.
> 
> If you're doing things with attachments, you might check out the
> standalone attachment API, as that should make more efficient use of
> memory.  If your JSON really is that big, you'll just have to tune
> your _bulk_docs size to fit in RAM.  And remember that there can be
> three copies of your JSON in RAM at once: the JSON string, the Erlang
> term representation, and the binary before it is written to disk.
> 
> HTH,
> Paul Davis
> 
-- 
James E. Marca, PhD
Researcher
Institute of Transportation Studies
AIRB Suite 4000
University of California
Irvine, CA 92697-3600
jmarca@translab.its.uci.edu
(949) 824-6287



Re: possible bug using bulk docs?

Posted by Paul Davis <pa...@gmail.com>.
James,

This sounds like you're hitting Erlang's out-of-memory handling, which
is to shut down the entire VM.  Part of the Erlang way is to fail fast
and allow the code to be restarted.  It would probably be better if we
tried to detect that we're running out of memory, but that could get
'interesting' very quickly.

If you're doing things with attachments, you might check out the
standalone attachment API, as that should make more efficient use of
memory.  If your JSON really is that big, you'll just have to tune your
_bulk_docs size to fit in RAM.  And remember that there can be three
copies of your JSON in RAM at once: the JSON string, the Erlang term
representation, and the binary before it is written to disk.

HTH,
Paul Davis
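
As an illustration of the batch-size tuning above, here is a rough
sketch that flushes _bulk_docs batches once the accumulated JSON
reaches a byte budget, rather than after a fixed document count.  The
budget, the database URL, and the one-JSON-document-per-line input
format are assumptions for the sketch, not anything from this thread:

    #!/usr/bin/perl
    # Sketch: flush _bulk_docs batches by accumulated JSON size, leaving
    # headroom for the roughly three in-memory copies described above.
    use strict;
    use warnings;
    use JSON;
    use LWP::UserAgent;

    my $db        = 'http://127.0.0.1:5984/d12_june2007';
    my $max_bytes = 64 * 1024 * 1024;     # example budget; tune to your RAM
    my $ua        = LWP::UserAgent->new;

    my @batch;
    my $batch_bytes = 0;

    sub flush_batch {
        return unless @batch;
        my $res = $ua->post(
            "$db/_bulk_docs",
            'Content-Type' => 'application/json',
            Content        => encode_json({ docs => \@batch }),
        );
        die "bulk docs failed: " . $res->status_line
            unless $res->is_success;
        @batch       = ();
        $batch_bytes = 0;
    }

    while (my $line = <STDIN>) {          # one JSON document per line
        my $doc = decode_json($line);
        my $len = length($line);
        flush_batch() if @batch && $batch_bytes + $len > $max_bytes;
        push @batch, $doc;
        $batch_bytes += $len;
    }
    flush_batch();

The same idea works with whatever a real parser produces; the point is
simply to bound the size of any single request.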
