Posted to user@couchdb.apache.org by Paul Okstad <po...@gmail.com> on 2015/01/12 00:24:17 UTC

Tweaking CouchDB For DB-Per-User

Howdy All,

I would really appreciate help from anyone experienced with running a server with a database-per-user configuration.

I have a server with about 1000 users and growing, and I’m using a database per user to maintain privacy of data. I also have all databases replicating to a global database so that I can easily replicate all the data to another server and run views across users. I’ve noticed that some of the replications are failing. Here’s a sample error:

[Sun, 11 Jan 2015 21:05:05 GMT] [error] [<0.88.0>] {error_report,<0.31.0>,
                    {<0.88.0>,supervisor_report,
                     [{supervisor,{local,couch_replicator_job_sup}},
                      {errorContext,child_terminated},
                      {reason,
                       {{{case_clause,
                          {{badmatch,{error,emfile}},
                           [{couch_file,init,1,
                             [{file,"couch_file.erl"},{line,314}]},
                            {gen_server,init_it,6,
                             [{file,"gen_server.erl"},{line,306}]},
                            {proc_lib,init_p_do_apply,3,
                             [{file,"proc_lib.erl"},{line,239}]}]}},
                         [{couch_server,handle_info,2,
                           [{file,"couch_server.erl"},{line,442}]},
                          {gen_server,handle_msg,5,
                           [{file,"gen_server.erl"},{line,599}]},
                          {proc_lib,init_p_do_apply,3,
                           [{file,"proc_lib.erl"},{line,239}]}]},
                        {gen_server,call,
                         [couch_server,
                          {open,<<"user1_database">>,
                           [{user_ctx,
                             {user_ctx,<<"admin_username">>,
                              [<<"_admin">>],
                              undefined}}]},
                          infinity]}}},
                      {offender,
                       [{pid,<0.16660.0>},
                        {name,"e93a43abfd0dbe11ea68891c7cfce253+continuous"},
                        {mfargs,{gen_server,start_link,undefined}},
                        {restart_type,temporary},
                        {shutdown,250},
                        {child_type,worker}]}]}}

I was under the impression that replication between databases on the same server was a lightweight task compared to external replication. Hopefully a solution can be found because other than this issue, it is working great!
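
For readers following along, the setup described above amounts to one continuous replication per user database into a shared database. Here is a minimal sketch of the request bodies involved, assuming CouchDB's standard _replicate endpoint. The database names and the "global" target name are hypothetical, chosen only for illustration.

```python
import json


def continuous_replication_bodies(user_dbs, target="global"):
    """Build one _replicate request body per user database.

    Each body asks for a continuous replication, which stays
    registered on the server until it is cancelled.
    """
    return [
        {"source": db, "target": target, "continuous": True}
        for db in user_dbs
    ]


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    names = ["user1_database", "user2_database"]
    for body in continuous_replication_bodies(names):
        # Each body would be POSTed as JSON to <server>/_replicate.
        print(json.dumps(body))
```

With about 1000 user databases this means about 1000 long-running replication jobs on the same server.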

- Paul

Re: Tweaking CouchDB For DB-Per-User

Posted by Alexander Shorin <kx...@gmail.com>.
On Sun, Feb 8, 2015 at 11:47 PM, Paul Okstad <po...@gmail.com> wrote:
> Also, I changed the max open databases in /etc/couchdb/local.ini to 65535.
>
> Unfortunately, now I am using WAY more CPU than before. Previously CouchDB was hovering around 3-4% and now it’s about 9-22%. This is on a testbed system where none of the user databases are being updated by real users.
>
> I did a file count using this command:
>
> lsof | grep couchdb | wc
>
> And I learned that CouchDB has about 27000 files open.
>
> This doesn’t seem to be a very scalable solution unless I’m missing something here. Maybe I should be using the update notifications to trigger one time replications each time a user’s DB is modified?

Max open databases isn't about file descriptors. IIRC, CouchDB opens
two descriptors per database: one for read and one for write. Plus
there are fds used for view index files and sockets. You'd better set
up monitoring for CouchDB stats[1] to keep your finger on the pulse and
receive a notification when the number of used descriptors / open
databases is getting close to the limit.
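
As a rough illustration of the arithmetic, the sketch below estimates a lower bound on descriptor usage. The figure of two descriptors per database is taken from the recollection above and is an assumption, not a measured value. The function name and its parameters are hypothetical.

```python
def estimate_descriptors(open_dbs, per_db=2, view_index_files=0, sockets=0):
    """Rough lower bound on the descriptors CouchDB needs.

    per_db=2 follows the "one for read and one for write" recollection;
    treat it as an assumption rather than a guarantee.
    """
    return open_dbs * per_db + view_index_files + sockets


if __name__ == "__main__":
    # Roughly the number of databases mentioned in this thread.
    print(estimate_descriptors(2500))
```

For 2500 open databases this gives 5000 before counting view indexes or sockets, which would already exceed a limit of 4096.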

You can also set the system ulimit to unlimited, so you can avoid
unexpected emfile errors.

[1]: As an example of CouchDB monitoring:
http://gws.github.io/munin-plugin-couchdb

--
,,,^..^,,,

Re: Tweaking CouchDB For DB-Per-User

Posted by Paul Okstad <po...@gmail.com>.
I was able to fix the problem, but at a cost. I have about 2500 databases, one for each of my users, and I tried changing the file limit to 4096 but it wasn’t enough. I tried 65535 and it worked, no more errors. The following modifications made it possible:

I added “limit nofile 65535 65535” to the Ubuntu upstart file (/etc/init/couchdb.conf) after the author line:

# Apache CouchDB - a RESTful document oriented database

description "Start the system-wide CouchDB instance"
author "Jason Gerard DeRose <ja...@system76.com>"
limit nofile 65535 65535

start on filesystem and static-network-up
stop on deconfiguring-networking
respawn

pre-start script
    mkdir -p /var/run/couchdb || /bin/true
    chown -R couchdb:couchdb /var/run/couchdb /etc/couchdb/local.*
end script

script
  HOME=/var/lib/couchdb
  export HOME
  chdir $HOME
  exec su couchdb -c /usr/bin/couchdb
end script

post-stop script
    rm -rf /var/run/couchdb/*
end script

Also, I changed the max open databases setting (max_dbs_open) in /etc/couchdb/local.ini to 65535.

Unfortunately, now I am using WAY more CPU than before. Previously CouchDB was hovering around 3-4% and now it’s about 9-22%. This is on a testbed system where none of the user databases are being updated by real users.

I did a file count using this command:

lsof | grep couchdb | wc

And I learned that CouchDB has about 27000 files open.
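
A note on the count: lsof without filters also lists entries that are not descriptors (memory-mapped files and working directories, for example), so the figure may overstate real descriptor usage. Below is a minimal sketch of a per-process count, assuming a Linux system with /proc mounted. The function name and the proc_root parameter are hypothetical.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count the descriptors held by one process.

    Each entry in <proc_root>/<pid>/fd is one open descriptor.
    Linux only; reading another user's process usually needs root.
    """
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fd"):
        # Demonstrated on this script's own process; for CouchDB,
        # pass the PID of the running server instead.
        print(count_open_fds(os.getpid()))
```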

This doesn’t seem to be a very scalable solution unless I’m missing something here. Maybe I should be using the update notifications to trigger one-time replications each time a user’s DB is modified?


Re: Tweaking CouchDB For DB-Per-User

Posted by Alexander Shorin <kx...@gmail.com>.
CouchDB automatically closes unused file handles. However, for 1000
active databases it's hard not to hit the default 1024 limit.
You can set up monitoring for couchdb/open_os_files and have it send
you an alert when it's getting close to the limit.
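
A minimal sketch of such a check follows. It assumes the CouchDB 1.x statistics layout, where GET /_stats/couchdb/open_os_files returns a JSON object with a "current" field. The base URL, the 0.8 threshold and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def open_os_files(stats):
    """Extract the current open_os_files value from a parsed stats reply.

    Assumed layout (CouchDB 1.x):
    {"couchdb": {"open_os_files": {"current": <number>, ...}}}
    A missing or null value is treated as 0.
    """
    value = stats["couchdb"]["open_os_files"].get("current")
    return 0 if value is None else value


def near_limit(current, limit, threshold=0.8):
    """True when usage has reached the given fraction of the limit."""
    return current >= limit * threshold


def fetch_stats(base_url="http://localhost:5984"):
    """Fetch the single statistic over HTTP (the URL is an assumption)."""
    with urlopen(base_url + "/_stats/couchdb/open_os_files") as resp:
        return json.load(resp)
```

A cron job could call near_limit(open_os_files(fetch_stats()), limit=1024) and send an alert when it returns True.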
--
,,,^..^,,,


Re: Tweaking CouchDB For DB-Per-User

Posted by Paul Okstad <po...@gmail.com>.
BTW, is there a better strategy for this than brute-forcing the limit to be larger? It seems to be a bad idea to keep over 1000 files open if I don’t even need to replicate them until a change occurs. Is this a limitation of internal continuous replication? Should I be triggering one-time replications using the database update notifications?
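
The idea in the last question can be sketched as follows. This is only an illustration, not a tested setup: it assumes the server exposes a _db_updates feed that emits one JSON object per line with "db_name" and "type" fields, and it assumes a shared target database named "global". The function name is hypothetical.

```python
import json


def one_shot_bodies(update_lines, target="global"):
    """Turn update-feed lines into one-shot _replicate request bodies.

    Blank heartbeat lines, non-update events, system databases and the
    target itself are skipped. Skipping the target avoids a feedback
    loop, since replicating into it also produces update events.
    Repeated notifications for one database are collapsed.
    """
    seen = set()
    bodies = []
    for line in update_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        db = event.get("db_name")
        if db is None or event.get("type") != "updated":
            continue
        if db == target or db.startswith("_") or db in seen:
            continue
        seen.add(db)
        # No "continuous" key: the replication runs once and ends.
        bodies.append({"source": db, "target": target})
    return bodies


if __name__ == "__main__":
    sample = ['{"db_name": "user1_database", "type": "updated"}']
    print(one_shot_bodies(sample))
```

Each resulting body would be POSTed to _replicate, so a database is only opened for replication after it has actually changed.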


Re: Tweaking CouchDB For DB-Per-User

Posted by Paul Okstad <po...@gmail.com>.
Thank you for the quick reply. I am indeed using Ubuntu and indeed using SSL so this is extremely relevant. I’ll try out the fixes and get back.


Re: Tweaking CouchDB For DB-Per-User

Posted by Alexander Shorin <kx...@gmail.com>.
On Mon, Jan 12, 2015 at 2:24 AM, Paul Okstad <po...@gmail.com> wrote:
> {error,emfile}

emfile - too many open files. With a thousand databases you are likely
to hit the default ulimit of 1024 file handles.
See also this thread:
http://erlang.org/pipermail/erlang-questions/2015-January/082446.html
for other ways to solve this. For instance, on Ubuntu with upstart
there is a slightly different way to set process limits.
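
To see which limit a process actually runs with, a minimal sketch using Python's standard resource module (Unix only) is shown below. It reports the limits of the Python process itself, not of CouchDB. The helper names are hypothetical.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft:", describe(soft), "hard:", describe(hard))
```

On Linux, the limits of an already running server can be read from /proc/<pid>/limits, which is useful for confirming that an init-script change took effect.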

--
,,,^..^,,,