Posted to notifications@couchdb.apache.org by "KangTheTerrible (via GitHub)" <gi...@apache.org> on 2023/08/10 11:58:36 UTC

[GitHub] [couchdb] KangTheTerrible opened a new issue, #4725: Long running erlang map/reduce can block compaction from completion, leaking erlang procs

KangTheTerrible opened a new issue, #4725:
URL: https://github.com/apache/couchdb/issues/4725

   ## Description
   
   A long-running, slow Erlang map/reduce, caused by a new shard deployment, appears to be blocking that shard's compaction from completing. It also appears to be leaking Erlang procs at a steady rate, between 5k and 10k per hour.
   
   ## Steps to Reproduce
   
   1. Start compaction.
   2. Start a long-running Erlang map/reduce.
   3. Compaction tries to complete, but is unable to until the reduce completes.
   4. Observe a steady increase in Erlang procs (may require continued insertion/interaction with the shard).
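
To put a number on the "steady increase" in step 4, a minimal sketch like the one below can turn two or more process-count readings into an hourly growth rate and a rough time until the process limit is reached. The readings are assumed to come from wherever you already collect them (for example, a `process_count` value reported by the node's `_system` endpoint; treat that field name as an assumption to verify on your version). The helper names and the constant-rate model are illustrative and not part of CouchDB.

```python
from typing import Sequence, Tuple


def procs_per_hour(samples: Sequence[Tuple[float, int]]) -> float:
    """Average growth in process count per hour between the first and last sample.

    Each sample is (unix_timestamp_seconds, process_count).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    elapsed_hours = (t1 - t0) / 3600.0
    return (c1 - c0) / elapsed_hours


def hours_until_limit(current: int, limit: int, rate_per_hour: float) -> float:
    """Hours until `current` reaches `limit`, assuming a constant growth rate."""
    if rate_per_hour <= 0:
        return float("inf")  # not growing, so the limit is never reached
    return max(limit - current, 0) / rate_per_hour
```

For instance, with hypothetical readings of 40,000 and 43,750 procs taken 30 minutes apart, `procs_per_hour` reports 7,500 procs per hour, which is inside the 5k-10k range described above.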
   
   ## Expected Behaviour
   
   Compaction should not be blocked
   Erlang procs should not continue to increase until the node hits the process limit and crashes
   
   ## Your Environment
   AWS c6i.32xlarge, 5 nodes, q=3, n=5
   
   * CouchDB version used: 3.2.2
   * Operating system and version: Debian Buster
   
   ## Additional Context
   We resharded, which resulted in the Erlang map/reduce running much longer than it should have (the index build was not incremental).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Long running erlang map/reduce can block view compaction from completion, leaking erlang procs [couchdb]

Posted by "fr2lancer (via GitHub)" <gi...@apache.org>.
fr2lancer commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1802874933

   Hi, actually I can't see any outstanding lines in the log in debug mode.
   There have simply been no logs since yesterday, the process cannot be identified, and top shows 5.0.
   It is not consuming much memory.
   
   Do you know how to flush debug output from Erlang?
   
   




[GitHub] [couchdb] nickva commented on issue #4725: Long running erlang map/reduce can block view compaction from completion, leaking erlang procs

Posted by "nickva (via GitHub)" <gi...@apache.org>.
nickva commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1687272441

   One strategy could be to periodically poll the https://docs.couchdb.org/en/stable/api/ddoc/common.html#db-design-design-doc-info endpoint and wait until the index has finished building before querying it. That avoids piling up too many client requests while a large index builds.
   
   Using a larger Q (resharding) could also help parallelize index building if you have the compute and disk throughput resources.
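
The polling strategy above could be sketched roughly as follows. This is a minimal illustration, not an official client: it assumes the `_info` response carries a `view_index.updater_running` flag, omits authentication, and uses hypothetical helper names and poll intervals. Note that an idle updater only means no build is in progress at that moment; it does not by itself prove the index has caught up with every recent write.

```python
import json
import time
import urllib.request
from typing import Callable, Dict


def fetch_ddoc_info(base_url: str, db: str, ddoc: str) -> Dict:
    """GET /{db}/_design/{ddoc}/_info and return the decoded JSON body."""
    url = f"{base_url}/{db}/_design/{ddoc}/_info"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def index_is_idle(info: Dict) -> bool:
    """True when the info document reports that no index update is running."""
    view_index = info.get("view_index", {})
    return not view_index.get("updater_running", False)


def wait_for_index(
    get_info: Callable[[], Dict],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the updater is idle; return False if max_polls is exhausted."""
    for attempt in range(max_polls):
        if index_is_idle(get_info()):
            return True
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # back off before asking again
    return False
```

A caller would pass something like `lambda: fetch_ddoc_info("http://localhost:5984", "mydb", "myddoc")` as `get_info` (all three values are placeholders) and only start querying the view once `wait_for_index` returns `True`.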




[GitHub] [couchdb] KangTheTerrible commented on issue #4725: Long running erlang map/reduce can block view compaction from completion, leaking erlang procs

Posted by "KangTheTerrible (via GitHub)" <gi...@apache.org>.
KangTheTerrible commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1682670259

   This does eventually resolve gracefully, given enough Erlang procs and storage. One additional change we had to make to keep on top of storage was to increase the smoosh concurrency values for the view ratio channel, since the stuck compactions prevented other compactions from running.
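
As an illustration of that kind of change, the sketch below builds (but does not send) a request that sets a smoosh channel's `concurrency` through the node-local `_config` HTTP API. The channel name `ratio_views`, the value `4`, and the helper name are assumptions for the example, not the settings actually used in this deployment; authentication is omitted, and a real call needs admin credentials.

```python
import json
import urllib.request


def smoosh_concurrency_request(
    base_url: str, channel: str, concurrency: int, node: str = "_local"
) -> urllib.request.Request:
    """Build a PUT request that sets [smoosh.<channel>] concurrency on one node."""
    url = f"{base_url}/_node/{node}/_config/smoosh.{channel}/concurrency"
    # Config values are sent as JSON strings, so the integer is quoted.
    body = json.dumps(str(concurrency)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical usage: raise the view ratio channel to 4 concurrent compactions.
req = smoosh_concurrency_request("http://localhost:5984", "ratio_views", 4)
```

Passing `req` to `urllib.request.urlopen` would apply the change on that node only; each node in the cluster would need the same setting.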




Re: [I] Long running erlang map/reduce can block view compaction from completion, leaking erlang procs [couchdb]

Posted by "nickva (via GitHub)" <gi...@apache.org>.
nickva commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1842062495

   * I'll second @rnewson's proposal to try an old-ddoc/new-ddoc strategy to deploy new views.
   
   * Clients could use `stable=false&update=false` and let ken (the index auto-builder) build the indices for you in the background. Monitor progress with `_active_tasks`.
   
   * There is an undocumented `[smoosh.ignore] $shard = true` setting to allow the auto-compactor to ignore specific shards. For example:
   ```
   [smoosh.ignore]
   shards/e0000000-ffffffff/dbname.1660859921 = true
   ```
   
   * @fr2lancer if you're asking about debug logging for compaction/auto-compaction, see issue https://github.com/apache/couchdb/issues/4815#issuecomment-1791518288. That's a bit tricky to set up, but it should work.
   
   * In your version of CouchDB, 3.2.2, we had a bug in calculating the slack and ratio that ended up triggering the auto-compactor too often. That was fixed in 3.3.0 (https://github.com/apache/couchdb/pull/4264). Consider upgrading to 3.3.3 if possible; you might find that some of the compactions no longer trigger as often.
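
The `stable=false&update=false` suggestion and the `_active_tasks` monitoring could look roughly like this on the client side. It is a sketch under assumptions: the helper names are made up, and the task fields `type` and `design_document` are assumed to match what `_active_tasks` returns for indexer tasks on your version.

```python
from typing import Dict, Iterable, List
from urllib.parse import quote, urlencode


def stale_view_url(base_url: str, db: str, ddoc: str, view: str) -> str:
    """View URL that reads the existing index without triggering an update."""
    query = urlencode({"stable": "false", "update": "false"})
    path = "/".join(
        [quote(db, safe=""), "_design", quote(ddoc, safe=""), "_view", quote(view, safe="")]
    )
    return f"{base_url}/{path}?{query}"


def indexer_tasks(tasks: Iterable[Dict], ddoc: str) -> List[Dict]:
    """Keep only the indexer tasks that belong to the given design document."""
    wanted = f"_design/{ddoc}"
    return [
        t
        for t in tasks
        if t.get("type") == "indexer" and t.get("design_document") == wanted
    ]
```

Feeding the decoded JSON list from `GET /_active_tasks` into `indexer_tasks` gives the per-shard build tasks for one design document; an empty result suggests no build is currently running for it.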
   




[GitHub] [couchdb] KangTheTerrible commented on issue #4725: Long running erlang map/reduce can block view compaction from completion, leaking erlang procs

Posted by "KangTheTerrible (via GitHub)" <gi...@apache.org>.
KangTheTerrible commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1688317559

   Yeah Nick, unfortunately in our case this was a live production server, so we had no trivial means of blocking users from accessing the view. Worth noting: none of these clients were waiting; all requests to this view use `stable=false&update=lazy`.




[GitHub] [couchdb] rnewson commented on issue #4725: Long running erlang map/reduce can block view compaction from completion, leaking erlang procs

Posted by "rnewson (via GitHub)" <gi...@apache.org>.
rnewson commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1691429128

   https://docs.couchdb.org/en/stable/best-practices/views.html#deploying-a-view-change-in-a-live-environment




[GitHub] [couchdb] KangTheTerrible commented on issue #4725: Long running erlang map/reduce can block view compaction from completion, leaking erlang procs

Posted by "KangTheTerrible (via GitHub)" <gi...@apache.org>.
KangTheTerrible commented on issue #4725:
URL: https://github.com/apache/couchdb/issues/4725#issuecomment-1675144766

   One more useful piece of information: while the index was building for the first time, the Erlang view's metadata showed the following:
   
   ```
   _design/erlangstatsstats Metadata
   Index Information
   Language: Erlang
   Currently being updated? Yes
   Currently running compaction? Yes
   Waiting for a commit? Yes
   Clients waiting for the index: 719422
   Update sequence on DB: 257926611
   Processed purge sequence: 0
   Actual data size (bytes): 602,563,809,246
   Data size on disk (bytes): 1,187,591,035,418
   MD5 Signature:
   ```

