Posted to user@couchdb.apache.org by Jeff Macdonald <ma...@gmail.com> on 2009/11/23 23:35:12 UTC

all_dbs_active - what should an app do when it gets this?

Hi all,

I have a simple app that is creating a lot of databases. Eventually,
when I get to some number of databases, I get the error
all_dbs_active on insert. I thought I would just sleep (ick) and try
again, but when I went to Futon many minutes after the program had
finished, I got that error in the web interface too. So I'm rather
puzzled about what to do. I'm sure there is a config var somewhere
that lets me raise that limit, but I'm more interested in what the
proper thing to do is when such a limit is met.

TIA

-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Nov 25, 2009 at 2:55 PM, Adam Kocoloski <ko...@apache.org> wrote:
> On Nov 25, 2009, at 2:23 PM, Paul Davis wrote:
>
>>> #!/usr/bin/perl
>>>
>>> use FindBin qw($Bin);
>>> use lib ("$Bin/lib","$Bin/cpan-lib");
>>>
>>> use Net::CouchDB;
>>>
>>> my $host=shift;
>>> my $secs=shift;
>>>
>>> my $couch=Net::CouchDB->new($host);
>>>
>>> my @docs=map{ { '_id' => $_, 'lang' => 'erlang' } } (1...500);
>>>
>>> foreach (1...200) {
>>>        my $dbh=$couch->create_db("event-$_");
>>>        print "Created database $_\n";
>>>        $dbh->insert(@docs);
>>>        sleep($secs);
>>> }
>>
>> IANAPM, but if $dbh is holding an open connection you could very well
>> trigger this quite easily. Can you try replacing the sleep($secs) with
>> something like $dbh->close()? Any easy way to check this is to watch
>> `netstat -tap tcp` and see if the number of sockets on either machine
>> is growing monotonically.
>>
>> HTH,
>> Paul Davis
>
> Paul, that was my first thought too, but isn't the DB only considered "active" for the lifetime of a request?  E.g. it doesn't matter if the connections are kept open or not, once couch_db:close(Db) is called the reference counter gets decremented and couch_server can drop it from the LRU cache.  At least that's my reading of the code. Best,
>
> Adam
>
>

Adam,

You make a good point there. Put that way, I do believe that would be
the expected behavior. Though I don't know if we test that condition
explicitly anywhere. The JS tests are limited by browser connections,
and the ETAP tests haven't gotten into making HTTP requests to probe
that layer yet.

Paul

Re: all_dbs_active - what should an app do when it gets this?

Posted by Adam Kocoloski <ko...@apache.org>.
On Nov 25, 2009, at 3:08 PM, Damien Katz wrote:

> I think I know what's happening. With delayed commits each database waits ~1 sec before fully committing to disk. So each database is considered "open" until that commit happens. So this looks like normal behavior to me.
> 
> We could try to tell a database to fully commit when that happens, then close it, but it can take a long time to complete a fsync and a client could reopen the database in that time, etc. There are no perfect solutions here.
> 
> -Damien

Good catch, you're quite right.  Best,

Adam

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, Nov 25, 2009 at 9:52 PM, Paul Joseph Davis
<pa...@gmail.com> wrote:
>
> On Nov 25, 2009, at 9:18 PM, Jeff Macdonald <ma...@gmail.com> wrote:
>
>> On Wed, Nov 25, 2009 at 7:35 PM, Paul Davis <pa...@gmail.com>
> Trunk is always labelled as 0.1 + last release so technically it's 0.11 but
> I prefer to think of it as trunk.
>
> Feel free to hop on irc to get build help.

Turns out building on the mac was a breeze. I had 0.10 installed via
mac ports. I hopped on IRC to report:

$ macbook-pro:to_many_active jeff$ ./all-active http://localhost:5985 0
...
Created database 200

:)

-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Joseph Davis <pa...@gmail.com>.



On Nov 25, 2009, at 9:18 PM, Jeff Macdonald <ma...@gmail.com>  
wrote:

> On Wed, Nov 25, 2009 at 7:35 PM, Paul Davis <paul.joseph.davis@gmail.com 
> > wrote:
>> Jeff,
>>
>> You rightly pointed out that once we get to the 101st db, the first  
>> db
>> should be idle which triggered my bug sleuthing mode. It turns out
>> that once a delayed commit finished it wasn't updating the state of
>> the gen_server in couch_db.erl where the idle status is checked.
>>
>> Can you `svn up` and check that your tests all pass now?
>
> I have 4 days to figure out how to do that on my Mac book. :)
> At work I was testing on a CentOS system. I'm I correct in thinking
> I'll be pulling over .11?
>
>
>
> -- 
> Jeff Macdonald
> Ayer, MA

Trunk is always labelled as 0.1 + last release so technically it's  
0.11 but I prefer to think of it as trunk.

Feel free to hop on irc to get build help.

Paul Davis

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, Nov 25, 2009 at 7:35 PM, Paul Davis <pa...@gmail.com> wrote:
> Jeff,
>
> You rightly pointed out that once we get to the 101st db, the first db
> should be idle which triggered my bug sleuthing mode. It turns out
> that once a delayed commit finished it wasn't updating the state of
> the gen_server in couch_db.erl where the idle status is checked.
>
> Can you `svn up` and check that your tests all pass now?

I have 4 days to figure out how to do that on my MacBook. :)
At work I was testing on a CentOS system. Am I correct in thinking
I'll be pulling over 0.11?



-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Nov 25, 2009 at 5:53 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Wed, Nov 25, 2009 at 3:12 PM, Paul Davis <pa...@gmail.com> wrote:
>> That does seem logical. I've duplicated this behavior in Python as well:
>>
>> #! /usr/bin/env python
>>
>> import couchdb
>> conns = []
>> for i in range(200):
>>    conns.append(couchdb.Server("http://127.0.0.1:5984/").create("test-%s" % i))
>>    docs = [{"_id": "%s" % j, "lang": "erlang"} for j in range(500)]
>>    conns[-1].update(docs)
>
> Paul, does this code create 200 connections? It's been a while since
> I've done Python. The Perl version I created only creates 1 connection
> and tries to create 200 databases with that single connection.
>
>> And that runs fine with no call to update.
>
> yeah, creating 200 databases is no issue. It is once you add docs the
> issue shows up.
>
> --
> Jeff Macdonald
> Ayer, MA
>

Jeff,

You rightly pointed out that once we get to the 101st db, the first db
should be idle, which triggered my bug-sleuthing mode. It turns out
that once a delayed commit finished, it wasn't updating the state of
the gen_server in couch_db.erl, where the idle status is checked.

Can you `svn up` and check that your tests all pass now?

Paul Davis

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, Nov 25, 2009 at 3:12 PM, Paul Davis <pa...@gmail.com> wrote:
> That does seem logical. I've duplicated this behavior in Python as well:
>
> #! /usr/bin/env python
>
> import couchdb
> conns = []
> for i in range(200):
>    conns.append(couchdb.Server("http://127.0.0.1:5984/").create("test-%s" % i))
>    docs = [{"_id": "%s" % j, "lang": "erlang"} for j in range(500)]
>    conns[-1].update(docs)

Paul, does this code create 200 connections? It's been a while since
I've done Python. The Perl version I created only creates 1 connection
and tries to create 200 databases with that single connection.

> And that runs fine with no call to update.

Yeah, creating 200 databases is no issue. The issue shows up once you
add docs.

-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Nov 25, 2009 at 3:08 PM, Damien Katz <da...@apache.org> wrote:
> I think I know what's happening. With delayed commits each database waits ~1
> sec before fully committing to disk. So each database is considered "open"
> until that commit happens. So this looks like normal behavior to me.
>
> We could try to tell a database to fully commit when that happens, then
> close it, but it can take a long time to complete a fsync and a client could
> reopen the database in that time, etc. There are no perfect solutions here.
>
> -Damien
>

That does seem logical. I've duplicated this behavior in Python as well:

#! /usr/bin/env python

import couchdb
conns = []
for i in range(200):
    conns.append(couchdb.Server("http://127.0.0.1:5984/").create("test-%s" % i))
    docs = [{"_id": "%s" % j, "lang": "erlang"} for j in range(500)]
    conns[-1].update(docs)

And that runs fine when the call to update is removed.

Paul

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, Nov 25, 2009 at 4:49 PM, Damien Katz <da...@apache.org> wrote:
> I looked again at your crash stack traces. It's caused by a timeout while
> creating the database. It looks like you are very IO bound, and creating the
> databases, which is just creating a new file a writing an initial header (<
> 4k), is timing out. We should probably add an infinite timeout there to
> prevent this crash.

While that is true, those crash dumps are not really related to the
problem of having all the databases active. :(
In other words, those crash dumps show a different problem. Oh, and I
don't recall if I said that I'm using 0.10.


-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Damien Katz <da...@apache.org>.
I looked again at your crash stack traces. It's caused by a timeout
while creating the database. It looks like you are very IO bound, and
creating the databases, which is just creating a new file and writing
an initial header (< 4k), is timing out. We should probably add an
infinite timeout there to prevent this crash.

-Damien


On Nov 25, 2009, at 4:27 PM, Jeff Macdonald wrote:

> On Wed, Nov 25, 2009 at 3:08 PM, Damien Katz <da...@apache.org>  
> wrote:
>> I think I know what's happening. With delayed commits each database  
>> waits ~1
>> sec before fully committing to disk. So each database is considered  
>> "open"
>> until that commit happens. So this looks like normal behavior to me.
>
> Yes, I saw the delayed commits, but I would think the test run of a
> sleep of 1 sec between database creations would allow couchdb to do
> commits and therefore no longer be active. In other words, the first
> database should well be inactive after getting to the creation of the
> 99th database as 99 seconds would of passed.
>
> I should of said I'm using the standard default.ini.
>
> -- 
> Jeff Macdonald
> Ayer, MA


Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, Nov 25, 2009 at 3:08 PM, Damien Katz <da...@apache.org> wrote:
> I think I know what's happening. With delayed commits each database waits ~1
> sec before fully committing to disk. So each database is considered "open"
> until that commit happens. So this looks like normal behavior to me.

Yes, I saw the delayed commits, but I would think the test run with a
sleep of 1 sec between database creations would allow couchdb to do
its commits, so those databases would no longer be active. In other
words, the first database should be inactive well before the 99th
database is created, as 99 seconds would have passed by then.

I should have said I'm using the standard default.ini.

-- 
Jeff Macdonald
Ayer, MA
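
The timing argument above can be checked with a small model. This is
only a sketch: it assumes each database stays open for a fixed `hold`
seconds after it is created (the ~1 sec delayed-commit window Damien
describes), and the function name and its parameters are made up for
illustration.

```python
def first_failure(n_dbs, interval, hold, limit=100):
    """Return the 1-based index of the first create that hits the limit.

    Database k is created at time (k - 1) * interval and is assumed to
    stay open for `hold` seconds afterwards. Returns None if the limit
    is never reached.
    """
    for k in range(1, n_dbs + 1):
        now = (k - 1) * interval
        # Count earlier databases that are still inside their hold window.
        still_open = sum(
            1 for j in range(1, k) if (j - 1) * interval + hold > now
        )
        if still_open >= limit:
            return k
    return None


print(first_failure(200, 0, 1))             # no sleep between creates
print(first_failure(200, 1, 1))             # 1 sec sleep between creates
print(first_failure(200, 2, float("inf")))  # databases never go idle
```

Under that assumption, a sleep of 1 sec or more never reaches the
limit, while a database that never goes idle fails at the 101st create
no matter how long the sleep is, which matches what the test runs with
2, 5 and 10 secs showed.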

Re: all_dbs_active - what should an app do when it gets this?

Posted by Damien Katz <da...@apache.org>.
I think I know what's happening. With delayed commits each database  
waits ~1 sec before fully committing to disk. So each database is  
considered "open" until that commit happens. So this looks like normal  
behavior to me.

We could try to tell a database to fully commit when that happens,
then close it, but it can take a long time to complete an fsync and a
client could reopen the database in that time, etc. There are no
perfect solutions here.

-Damien

On Nov 25, 2009, at 2:55 PM, Adam Kocoloski wrote:

> On Nov 25, 2009, at 2:23 PM, Paul Davis wrote:
>
>>> #!/usr/bin/perl
>>>
>>> use FindBin qw($Bin);
>>> use lib ("$Bin/lib","$Bin/cpan-lib");
>>>
>>> use Net::CouchDB;
>>>
>>> my $host=shift;
>>> my $secs=shift;
>>>
>>> my $couch=Net::CouchDB->new($host);
>>>
>>> my @docs=map{ { '_id' => $_, 'lang' => 'erlang' } } (1...500);
>>>
>>> foreach (1...200) {
>>>       my $dbh=$couch->create_db("event-$_");
>>>       print "Created database $_\n";
>>>       $dbh->insert(@docs);
>>>       sleep($secs);
>>> }
>>
>> IANAPM, but if $dbh is holding an open connection you could very well
>> trigger this quite easily. Can you try replacing the sleep($secs)  
>> with
>> something like $dbh->close()? Any easy way to check this is to watch
>> `netstat -tap tcp` and see if the number of sockets on either machine
>> is growing monotonically.
>>
>> HTH,
>> Paul Davis
>
> Paul, that was my first thought too, but isn't the DB only  
> considered "active" for the lifetime of a request?  E.g. it doesn't  
> matter if the connections are kept open or not, once  
> couch_db:close(Db) is called the reference counter gets decremented  
> and couch_server can drop it from the LRU cache.  At least that's my  
> reading of the code. Best,
>
> Adam
>


Re: all_dbs_active - what should an app do when it gets this?

Posted by Adam Kocoloski <ko...@apache.org>.
On Nov 25, 2009, at 2:23 PM, Paul Davis wrote:

>> #!/usr/bin/perl
>> 
>> use FindBin qw($Bin);
>> use lib ("$Bin/lib","$Bin/cpan-lib");
>> 
>> use Net::CouchDB;
>> 
>> my $host=shift;
>> my $secs=shift;
>> 
>> my $couch=Net::CouchDB->new($host);
>> 
>> my @docs=map{ { '_id' => $_, 'lang' => 'erlang' } } (1...500);
>> 
>> foreach (1...200) {
>>        my $dbh=$couch->create_db("event-$_");
>>        print "Created database $_\n";
>>        $dbh->insert(@docs);
>>        sleep($secs);
>> }
> 
> IANAPM, but if $dbh is holding an open connection you could very well
> trigger this quite easily. Can you try replacing the sleep($secs) with
> something like $dbh->close()? Any easy way to check this is to watch
> `netstat -tap tcp` and see if the number of sockets on either machine
> is growing monotonically.
> 
> HTH,
> Paul Davis

Paul, that was my first thought too, but isn't the DB only considered "active" for the lifetime of a request?  E.g. it doesn't matter if the connections are kept open or not, once couch_db:close(Db) is called the reference counter gets decremented and couch_server can drop it from the LRU cache.  At least that's my reading of the code. Best,

Adam
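
A toy model of the behaviour described above, for illustration only.
It is not the couch_server code: the class and method names are
invented, and it assumes a plain reference count per database with
eviction of the least recently used idle entry.

```python
from collections import OrderedDict


class AllDbsActive(Exception):
    """Raised when every cached database still has a reference held."""


class OpenDbCache:
    """Keeps at most max_open databases; evicts the oldest idle one."""

    def __init__(self, max_open=100):
        self.max_open = max_open
        self.refs = OrderedDict()  # db name -> reference count, oldest first

    def open(self, name):
        if name in self.refs:
            self.refs[name] += 1
            self.refs.move_to_end(name)  # mark as most recently used
            return
        if len(self.refs) >= self.max_open:
            self._evict_idle()
        self.refs[name] = 1

    def close(self, name):
        # The entry stays cached, but becomes evictable once it reaches 0.
        self.refs[name] -= 1

    def _evict_idle(self):
        victim = next((n for n, c in self.refs.items() if c == 0), None)
        if victim is None:
            raise AllDbsActive("all_dbs_active")
        del self.refs[victim]
```

In this model a database whose count never drops back to zero can
never be evicted, so once max_open such entries exist every further
open() fails, whether or not any client connection is still open.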


Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Wed, Nov 25, 2009 at 2:23 PM, Paul Davis <pa...@gmail.com> wrote:
> IANAPM, but if $dbh is holding an open connection you could very well
> trigger this quite easily. Can you try replacing the sleep($secs) with
> something like $dbh->close()?

I don't believe Net::CouchDB has a close method.

> Any easy way to check this is to watch
> `netstat -tap tcp` and see if the number of sockets on either machine
> is growing monotonically.

I did this and it simply held one tcp connection open, which is what I
suspected would happen since the connection is made outside the loop.


-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Davis <pa...@gmail.com>.
> #!/usr/bin/perl
>
> use FindBin qw($Bin);
> use lib ("$Bin/lib","$Bin/cpan-lib");
>
> use Net::CouchDB;
>
> my $host=shift;
> my $secs=shift;
>
> my $couch=Net::CouchDB->new($host);
>
> my @docs=map{ { '_id' => $_, 'lang' => 'erlang' } } (1...500);
>
> foreach (1...200) {
>        my $dbh=$couch->create_db("event-$_");
>        print "Created database $_\n";
>        $dbh->insert(@docs);
>        sleep($secs);
> }

IANAPM, but if $dbh is holding an open connection you could very well
trigger this quite easily. Can you try replacing the sleep($secs) with
something like $dbh->close()? An easy way to check this is to watch
`netstat -tap tcp` and see if the number of sockets on either machine
is growing monotonically.

HTH,
Paul Davis

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Mon, Nov 23, 2009 at 8:26 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Mon, Nov 23, 2009 at 6:41 PM, Damien Katz <da...@apache.org> wrote:
>> If you are hitting the limit, but don't have enough clients accessing the
>> dbs, and have no views being built and no compactions happening, it might be
>> bug in CouchDB.
>
> I have 1 client program using Net::CouchDB, no views yet, no
> compactions. I'll try to create a small test program tomorrow to see
> if I can duplicate the conditions.

This code can reproduce what I'm seeing:

#!/usr/bin/perl

use FindBin qw($Bin);
use lib ("$Bin/lib","$Bin/cpan-lib");

use Net::CouchDB;

my $host=shift;
my $secs=shift;

my $couch=Net::CouchDB->new($host);

my @docs=map{ { '_id' => $_, 'lang' => 'erlang' } } (1...500);

foreach (1...200) {
	my $dbh=$couch->create_db("event-$_");
	print "Created database $_\n";
	$dbh->insert(@docs);
	sleep($secs);
}


Net::CouchDB is done by Michael Hendricks  <mi...@ndrix.org>.

Running the code like so reproduces the error:

$ ./test-active-dbs http://localhost:5984 0
<output trimmed>
Created database 100
Unknown status code '500' while trying to create a database named
'event-101'. PUT request to http://localhost:5984/event-101/:
all_dbs_active

Using 1 for sleep resulted in the same results. Using 2 for sleep
resulted in this:
Created database 33
Unknown status code '500' while trying to create a database named
'event-34'. PUT request to http://localhost:5984/event-34/:
{gen_server,call,
            [couch_server,
             {create,<<"event-34">>,
                     [{user_ctx,{user_ctx,null,[<<"_admin">>]}}]}]}

log output is:
[info] [<0.4631.0>] 127.0.0.1 - - 'POST' /event-33/_bulk_docs 201
[error] [<0.4631.0>] Uncaught error in HTTP request: {exit,
                                 {timeout,
                                  {gen_server,call,
                                   [couch_server,
                                    {create,<<"event-34">>,
                                     [{user_ctx,
                                       {user_ctx,null,[<<"_admin">>]}}]}]}}}
[info] [<0.4631.0>] Stacktrace: [{gen_server,call,2},
             {couch_server,create,2},
             {couch_httpd_db,create_db_req,2},
             {couch_httpd,handle_request,5},
             {mochiweb_http,headers,5},
             {proc_lib,init_p_do_apply,3}]
[info] [<0.4631.0>] 127.0.0.1 - - 'PUT' /event-34/ 500


(I was looking at futon around the time this crash happened)

Running with 2 secs a 2nd time, the error happened when creating database 27. :(

I delete the DBs using this code:

$ more delete-dbs
#!/usr/bin/perl

use FindBin qw($Bin);
use lib ("$Bin/lib","$Bin/cpan-lib");

use Net::CouchDB;

use Data::Dumper;

my $host=shift;

my $couch=Net::CouchDB->new($host);
my @dbs=$couch->all_dbs();

foreach my $db (@dbs) {
	if($db->name()=~/^event-/) {
		$db->delete();
	}
}

Doing a ls in var shows the DBs are gone. Actually, I don't think the
error in these cases is couchdb's fault. I noticed the system
hesitating. I've seen this with my editor as well when saving files.
Not sure what is wrong with the system in that case.

In any case, changing sleep to 2,5 or 10 secs resulted in
all_dbs_active when it got to the 101st database.

I also tried this variation and got the same results with secs=0
(I didn't try any other values). The thinking was that couchdb might
need a new session to be created before it releases something.

#!/usr/bin/perl

use FindBin qw($Bin);
use lib ("$Bin/lib","$Bin/cpan-lib");

use Net::CouchDB;

my $host=shift;
my $secs=shift;

my @docs=map{ { '_id' => $_, 'lang' => 'erlang' } } (1...500);

foreach my $i (1...10) {
	my $couch=Net::CouchDB->new($host);
	foreach my $j (1...20) {
		my $dbh=$couch->create_db("event-$i-$j");
		print "Created database event-$i-$j\n";
		$dbh->insert(@docs);
		sleep($secs);
	}
}

Finally, I notice futon has problems when there are > 100 dbs. I'm
guessing that when futon gets stats for each DB on the overview page,
that causes each database to become active. That seems like a simple
way to cause a denial of service for other clients just by using futon.

Anyhow, for now I'm upping my database limit. I don't know if this is
a serious problem or not, but something doesn't seem right to me.

TIA


-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Davis <pa...@gmail.com>.
On Tue, Nov 24, 2009 at 10:16 AM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Mon, Nov 23, 2009 at 8:26 PM, Jeff Macdonald <ma...@gmail.com> wrote:
>> On Mon, Nov 23, 2009 at 6:41 PM, Damien Katz <da...@apache.org> wrote:
>>>
>>> Yeah, the limit is 100 open dbs at a time. Databases that are being
>>> compacted or having views built will cause the dbs to remain open even if no
>>> clients are actively access the server. Continuous replication shouldn't
>>> keep the dbs open unless it's actively replicatng changes.
>>>
>>> You can up the open db limit to whatever you want, but you might have to
>>> boost the open files allowable in the OS settings somewhere.
>>>
>>> If you are hitting the limit, but don't have enough clients accessing the
>>> dbs, and have no views being built and no compactions happening, it might be
>>> bug in CouchDB.
>>
>> I have 1 client program using Net::CouchDB, no views yet, no
>> compactions. I'll try to create a small test program tomorrow to see
>> if I can duplicate the conditions.
>
>
> hmmm.... before I do this, maybe I'm mis-understanding what an active
> db is. Does it mean there is some activity with a database, or that a
> database simply exists?
>
> --
> Jeff Macdonald
> Ayer, MA
>

An active DB is one that has an open file handle being accessed by
some part of CouchDB.

Paul Davis

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Mon, Nov 23, 2009 at 8:26 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> On Mon, Nov 23, 2009 at 6:41 PM, Damien Katz <da...@apache.org> wrote:
>>
>> Yeah, the limit is 100 open dbs at a time. Databases that are being
>> compacted or having views built will cause the dbs to remain open even if no
>> clients are actively access the server. Continuous replication shouldn't
>> keep the dbs open unless it's actively replicatng changes.
>>
>> You can up the open db limit to whatever you want, but you might have to
>> boost the open files allowable in the OS settings somewhere.
>>
>> If you are hitting the limit, but don't have enough clients accessing the
>> dbs, and have no views being built and no compactions happening, it might be
>> bug in CouchDB.
>
> I have 1 client program using Net::CouchDB, no views yet, no
> compactions. I'll try to create a small test program tomorrow to see
> if I can duplicate the conditions.


Hmmm... before I do this, maybe I'm misunderstanding what an active
db is. Does it mean there is some activity with a database, or that a
database simply exists?

-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Jeff Macdonald <ma...@gmail.com>.
On Mon, Nov 23, 2009 at 6:41 PM, Damien Katz <da...@apache.org> wrote:
>
> Yeah, the limit is 100 open dbs at a time. Databases that are being
> compacted or having views built will cause the dbs to remain open even if no
> clients are actively access the server. Continuous replication shouldn't
> keep the dbs open unless it's actively replicatng changes.
>
> You can up the open db limit to whatever you want, but you might have to
> boost the open files allowable in the OS settings somewhere.
>
> If you are hitting the limit, but don't have enough clients accessing the
> dbs, and have no views being built and no compactions happening, it might be
> bug in CouchDB.

I have 1 client program using Net::CouchDB, no views yet, no
compactions. I'll try to create a small test program tomorrow to see
if I can duplicate the conditions.


-- 
Jeff Macdonald
Ayer, MA

Re: all_dbs_active - what should an app do when it gets this?

Posted by Damien Katz <da...@apache.org>.
On Nov 23, 2009, at 5:41 PM, Paul Davis wrote:

> On Mon, Nov 23, 2009 at 5:35 PM, Jeff Macdonald <macfisherman@gmail.com 
> > wrote:
>> Hi all,
>>
>> I have a simple app that is creating a lot of databases. Eventually
>> when I get to some number of databases, I get this error
>> all_dbs_active on insert. I thought I just sleep (ick) and try again,
>> but when I went to Futon after the program was done for many minutes,
>> I get that error in the web interface too. So I'm rather puzzled at
>> what to do. I'm sure there is some config var somewhere where I can  
>> up
>> that limit, but I'm more interested in what the proper thing to do
>> when such a limit is met.
>>
>> TIA
>>
>> --
>> Jeff Macdonald
>> Ayer, MA
>>
>
> Other than increase the max_open_dbs configuration option, I can't
> think of anything 'elegant' that you could do client side other than
> backoff in your script for the db creation/access rate.
>
> The limit comes from trying to avoid throwing random file descriptor
> limit errors, so I'm not sure what the best solution would be. It'd
> theoretically be possible to create a queue that would just block
> client access to a db until there was an active slot available, but
> that could be easily pushed into starvation if the open db's remain
> active long enough which could be indefinite with things like
> continuous replication IIUC.

Yeah, the limit is 100 open dbs at a time. Databases that are being
compacted or having views built will remain open even if no clients
are actively accessing the server. Continuous replication shouldn't
keep the dbs open unless it's actively replicating changes.

You can up the open db limit to whatever you want, but you might have
to boost the number of open files allowed in the OS settings somewhere.

If you are hitting the limit, but don't have enough clients accessing
the dbs, and have no views being built and no compactions happening,
it might be a bug in CouchDB.

-Damien

>
> HTH,
> Paul Davis


Re: all_dbs_active - what should an app do when it gets this?

Posted by Paul Davis <pa...@gmail.com>.
On Mon, Nov 23, 2009 at 5:35 PM, Jeff Macdonald <ma...@gmail.com> wrote:
> Hi all,
>
> I have a simple app that is creating a lot of databases. Eventually
> when I get to some number of databases, I get this error
> all_dbs_active on insert. I thought I just sleep (ick) and try again,
> but when I went to Futon after the program was done for many minutes,
> I get that error in the web interface too. So I'm rather puzzled at
> what to do. I'm sure there is some config var somewhere where I can up
> that limit, but I'm more interested in what the proper thing to do
> when such a limit is met.
>
> TIA
>
> --
> Jeff Macdonald
> Ayer, MA
>

Other than increasing the max_dbs_open configuration option, I can't
think of anything 'elegant' that you could do client side other than
backing off the db creation/access rate in your script.

The limit comes from trying to avoid throwing random file descriptor
limit errors, so I'm not sure what the best solution would be. It'd
theoretically be possible to create a queue that would just block
client access to a db until there was an active slot available, but
that could easily be pushed into starvation if the open dbs remain
active long enough, which could be indefinitely with things like
continuous replication IIUC.

HTH,
Paul Davis
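
As a rough illustration of the client-side backoff suggested above,
here is a minimal sketch. The AllDbsActive exception and the
with_backoff helper are hypothetical names: a real client would have
to map the 500 response carrying all_dbs_active onto that exception
itself, and the retry count and delays are arbitrary.

```python
import random
import time


class AllDbsActive(Exception):
    """Hypothetical: raised by client code on an all_dbs_active reply."""


def with_backoff(operation, retries=5, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep, jitter=random.random):
    """Call operation(), retrying with exponential backoff.

    sleep and jitter are injectable so the retry schedule can be
    exercised without actually waiting.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except AllDbsActive:
            if attempt == retries:
                raise  # out of retries: let the caller decide
            # The delay doubles on each attempt, capped at max_delay,
            # and is stretched by up to 100% random jitter so several
            # clients do not retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * (1 + jitter()))
```

Backoff only helps if the open databases eventually go idle; if they
never do, the retries simply run out and the error is raised again.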