Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2014/03/24 14:37:11 UTC

[Couchdb Wiki] Update of "Performance" by JoanTouzet

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "Performance" page has been changed by JoanTouzet:
https://wiki.apache.org/couchdb/Performance?action=diff&rev1=20&rev2=21

  <<Include(EditTheWiki)>>
+ 
+ '''This page has been replaced by the official documentation at''' http://couchdb.readthedocs.org/en/latest/maintenance/performance.html.
  
  <<TableOfContents>>
  
@@ -9, +11 @@

  Many of the individual wiki pages mention performance when describing how to do things.  It is worthwhile refreshing your memory by revisiting them.
  
  = DELETE operation =
- When you delete a document the database will create a new revision which contains the _id and _rev fields as well as a deleted flag. This revision will remain even after a [[Compaction#Database_Compaction|database compaction]] so that the deletion can be replicated.
- Deleted documents, like non-deleted documents, can affect view build times, PUT and DELETE requests time and size of database on disk, since they increase the size of the B+Tree's. You can see the number of deleted documents in [[HTTP_database_API#Database_Information|database information]]. If your use case creates lots of deleted documents (for example, if you are storing short-term data like logfile entries, message queues, etc), you might want to periodically switch to a new database and delete the old one (once the entries in it have all expired).
+ When you delete a document, the database creates a new revision which contains the _id and _rev fields as well as a deleted flag. This revision remains even after a [[Compaction#Database_Compaction|database compaction]] so that the deletion can be replicated. Deleted documents, like non-deleted documents, affect view build times, PUT and DELETE request times, and the size of the database on disk, since they increase the size of the B+Trees. You can see the number of deleted documents in the [[HTTP_database_API#Database_Information|database information]]. If your use case creates lots of deleted documents (for example, if you are storing short-term data like logfile entries, message queues, etc.), you might want to periodically switch to a new database and delete the old one (once the entries in it have all expired).
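  For instance, the number of deleted documents appears as {{{doc_del_count}}} in the database information document. A minimal sketch (the database name {{{mydb}}} is a placeholder):
  {{{
  # Fetch the database information document; doc_del_count reports how many
  # deleted-document stubs the database is still carrying.
  curl http://localhost:5984/mydb
  # The JSON response includes fields such as "doc_count" and "doc_del_count".
  }}}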
  
  = File size =
  The smaller your file size, the fewer I/O operations there will be, the more of the file can be cached by CouchDB and the operating system, and the quicker it is to replicate, back up, etc.  Consequently you should carefully examine the data you are storing.  For example, it would be silly to use keys that are hundreds of characters long, but your program would be hard to maintain if you only used single-character keys.  Carefully consider data that you duplicate by putting it in views.
@@ -31, +32 @@

  [httpd]
  socket_options = [{nodelay, true}]
  }}}
- 
- 
  = View generation =
  Views with the JavaScript view server (the default) are extremely slow to generate when there is a non-trivial number of documents to process.  The generation process won't even saturate a single CPU, let alone your I/O.  The cause is the latency of the round trips between the CouchDB server and the separate couchjs view server, which shows dramatically how important it is to take latency out of your implementation.
  
  You can let view access be "stale" to get a quick response, but it isn't practical to determine when a request will be served from the existing index and when the view will be updated, which can take a long time.  (A 10 million document database took about 10 minutes to load into CouchDB but about 4 hours to generate its views.)
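  A stale query under these constraints might look like the following sketch (the database, design document, and view names are placeholders):
  {{{
  # Return whatever index already exists rather than waiting for it to update.
  curl 'http://localhost:5984/mydb/_design/stats/_view/total?stale=ok'
  # Newer releases also accept stale=update_after, which responds immediately
  # and then triggers an index update in the background.
  }}}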
  
- View information isn't replicated - it is rebuilt on each database so you can't do the view generation on a separate sever.  The only useful mechanism I have found is to generate the view on a separate machine together with data updates, shut down your target server, copy the couchdb raw database file across and then restart the target server.
+ View information isn't replicated - it is rebuilt on each database, so you can't do the view generation on a separate server.
  
  == Erlang implementations of common JavaScript functions ==
  If you’re using a very simple view function that only performs a sum or count reduction, you can call the native Erlang implementations by simply writing "_sum" or "_count" in place of your function declaration. This speeds things up dramatically, as it cuts down on I/O between CouchDB and the server-side JavaScript process. For example, as [[http://mail-archives.apache.org/mod_mbox/couchdb-user/201003.mbox/%3c5E07E00E-3D69-4A8C-ADA3-1B20CF0BA4C8@julianstahnke.com%3e|mentioned on the mailing list]], the time for outputting an (already indexed and cached) view with about 78,000 items went down from 60 seconds to 4 seconds.
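  A minimal sketch of such a view (the database name {{{mydb}}}, design document name {{{stats}}}, and the {{{amount}}} field are illustrative placeholders, not taken from the example above):
  {{{
  # Create a design document whose reduce step uses the built-in Erlang _sum
  # instead of a JavaScript reduce function.
  curl -X PUT http://localhost:5984/mydb/_design/stats \
       -H "Content-Type: application/json" \
       -d '{
             "views": {
               "total": {
                 "map": "function(doc) { if (doc.amount) { emit(doc._id, doc.amount); } }",
                 "reduce": "_sum"
               }
             }
           }'
  }}}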
@@ -103, +102 @@

  {{{
  export ERL_MAX_PORTS=4096
  }}}
- 
  CouchDB versions up to 1.1.x also create Erlang Term Storage (ETS) tables for each replication. If you are using a version of CouchDB older than 1.2 and must support many replications, also set the {{{ERL_MAX_ETS_TABLES}}} variable. The default is approximately 1400 tables.
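  For example, a sketch that raises the limit (the value 10000 is an arbitrary illustration, not a recommendation from this page):
  {{{
  export ERL_MAX_ETS_TABLES=10000
  }}}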
  
  Note that on Mac OS X, Erlang will not actually increase the file descriptor limit past 1024 (i.e. the system header–defined value of FD_SETSIZE.) See [[http://erlang.org/pipermail/erlang-questions/2011-December/063119.html|this tip for a possible workaround]] and [[http://erlang.org/pipermail/erlang-questions/2011-October/061971.html|this thread for a deeper explanation]].
- 
  
  === PAM and ulimit ===
  Finally, most *nix operating systems impose various resource limits on every process. If your system is set up to use the Pluggable Authentication Modules (PAM) system, increasing this limit is straightforward. For example, creating a file named {{{/etc/security/limits.d/100-couchdb.conf}}} with the following contents will ensure that CouchDB can open enough file descriptors to service your increased maximum open databases and Erlang ports:
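  A sketch of such a file, assuming CouchDB runs as the {{{couchdb}}} user and that a limit of 4096 open files is sufficient (adjust both to your setup):
  {{{
  #<domain>   <type>   <item>    <value>
  couchdb     soft     nofile    4096
  couchdb     hard     nofile    4096
  }}}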