Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2021/05/19 15:28:02 UTC

[GitHub] [couchdb] schneuwlym opened a new issue #3571: Compaction dies constantly after a certain amount of documents

schneuwlym opened a new issue #3571:
URL: https://github.com/apache/couchdb/issues/3571


   ## Description
   
   We have an issue with our CouchDB 3.1.1. We are using the default compaction configuration, and this seems to work fine until the database reaches a certain number of documents (~76K). Then the compaction dies and is no longer able to finish the task. The compaction is restarted every 2 seconds and always dies immediately. So far the problem is consistent, and I haven't found any way to fix it (except deleting the database).
   
   I read some other compaction-related issues, but here I have only ever used version 3.1.1, so there was no upgrade, migration, or anything similar.
   
   What I tried so far:
   * I removed the compaction files manually and restarted CouchDB. Compaction fails again.
   * I rebooted the node. Compaction fails again.
   * After reading issues #3292 and #2941, I built my own version based on 3.1.1 that includes the following two changes. Compaction still fails.
     * fix race condition (#3150)
     * add remonitor code to DOWN message (#3144)
   * At first the slack compactor always failed. I then disabled it, but the ratio_dbs compactor failed as well.
   
   This is the log, which is repeated every two seconds:
   ```
   [notice] 2021-05-19T14:42:38.848090Z couchdb@127.0.0.1 <0.460.0> -------- ratio_dbs: adding <<"shards/80000000-ffffffff/directory.1621404274">> to internal compactor queue with priority 2.100073355455779
   [info] 2021-05-19T14:42:38.848533Z couchdb@127.0.0.1 <0.5146.0> -------- Starting compaction for db "shards/80000000-ffffffff/directory.1621404274" at 40726
   [notice] 2021-05-19T14:42:38.848615Z couchdb@127.0.0.1 <0.460.0> -------- ratio_dbs: Starting compaction for shards/80000000-ffffffff/directory.1621404274 (priority 2.100073355455779)
   [notice] 2021-05-19T14:42:38.849705Z couchdb@127.0.0.1 <0.460.0> -------- ratio_dbs: Started compaction for shards/80000000-ffffffff/directory.1621404274
   [warning] 2021-05-19T14:42:38.893633Z couchdb@127.0.0.1 <0.460.0> -------- exit for compaction of ["shards/80000000-ffffffff/directory.1621404274"]: {undef,[{math,ceil,[1.6],[]},{couch_emsort,num_merges,2,[{file,"src/couch_emsort.erl"},{line,366}]},{couch_bt_engine_compactor,sort_meta_data,1,[{file,"src/couch_bt_engine_compactor.erl"},{line,508}]},{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},{couch_bt_engine_compactor,start,4,[{file,"src/couch_bt_engine_compactor.erl"},{line,75}]}]}
   [error] 2021-05-19T14:42:38.894691Z couchdb@127.0.0.1 emulator -------- Error in process <0.5148.0> on node 'couchdb@127.0.0.1' with exit value:
   {undef,[{math,ceil,[1.6],[]},{couch_emsort,num_merges,2,[{file,"src/couch_emsort.erl"},{line,366}]},{couch_bt_engine_compactor,sort_meta_data,1,[{file,"src/couch_bt_engine_compactor.erl"},{line,508}]},{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},{couch_bt_engine_compactor,start,4,[{file,"src/couch_bt_engine_compactor.erl"},{line,75}]}]}
   
   [info] 2021-05-19T14:42:38.894453Z couchdb@127.0.0.1 <0.226.0> -------- db shards/80000000-ffffffff/directory.1621404274 died with reason {undef,[{math,ceil,[1.6],[]},{couch_emsort,num_merges,2,[{file,"src/couch_emsort.erl"},{line,366}]},{couch_bt_engine_compactor,sort_meta_data,1,[{file,"src/couch_bt_engine_compactor.erl"},{line,508}]},{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},{couch_bt_engine_compactor,start,4,[{file,"src/couch_bt_engine_compactor.erl"},{line,75}]}]}
   
   ```
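
For readers unfamiliar with Erlang crash reports: in an `undef` exit reason, the first entry of the stack trace names the function that could not be found, here `math:ceil/1`. The following is a minimal Python 3 sketch for pulling that out of a log line. The helper name `missing_function` and the regular expression are illustrative assumptions, not part of CouchDB, and the arity count is naive (it assumes flat arguments).

```python
import re

# Head of an Erlang `undef` exit reason: {undef,[{Module,Function,[Args],...
UNDEF_HEAD = re.compile(r"\{undef,\[\{(\w+),(\w+),\[([^\]]*)\]")


def missing_function(log_line):
    """Return 'module:function/arity' for the first undef in log_line, or None."""
    match = UNDEF_HEAD.search(log_line)
    if match is None:
        return None
    module, function, args = match.groups()
    # Naive arity count: assumes flat arguments with no nested lists or tuples.
    arity = len([a for a in args.split(",") if a.strip()])
    return "{}:{}/{}".format(module, function, arity)


line = ('exit for compaction of ["shards/80000000-ffffffff/directory.1621404274"]: '
        '{undef,[{math,ceil,[1.6],[]},{couch_emsort,num_merges,2,'
        '[{file,"src/couch_emsort.erl"},{line,366}]}]}')
print(missing_function(line))
```

The remaining stack entries show the call path that led to the missing function, so the second entry (`couch_emsort:num_merges/2`) is where to look in the source.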
   
   Once the problem occurs, inserting data is still possible, but I often get the following error message (I'm using python-cloudant):
   ```
   500 Server Error: Internal Server Error unknown_error undefined for url: http://localhost:5984/directory
   ```
   
   ## Steps to Reproduce
   
   1. Start with a clean database.
   2. Create a script that creates documents in an endless loop (pure JSON, no attachments, just one revision).
   3. After around 76K documents, the compactor starts to fail.
   4. Inserts are still possible, but every so often an insert fails with the 500 Server Error shown above.
   
   I ran the stress test described above on 3 nodes in parallel. All 3 nodes started to fail at around the same number of documents (70K-80K).
   * On the first node, I created the documents single-threaded
   * On the second node, I created the documents using two threads
   * On the third node, I created the documents using four threads
   
   The following is the script I used to reproduce the issue in my setup:
   ```
   #!/usr/bin/env python
   
   import signal
   import sys
   from cloudant.client import CouchDB
   from cloudant.document import Document
   from copy import deepcopy
   from threading import Thread
   
   
   USERNAME = 'admin'
   PASSWORD = 'admin'
   COUCHDB_URL = 'http://localhost:5984'
   DB_NAME = 'directory'
   
   
   cdb = CouchDB(USERNAME, PASSWORD, url=COUCHDB_URL, connect=True, auto_renew=True)
   
   account_skeletton = { 'parameter 1': 0,
                         'parameter 2': True,
                         'parameter 3': '',
                         'parameter 4': '',
                         'parameter 5': [],
                         'parameter 6': [],
                         'description': '',
                         'enabled': True,
                         'firstname': '',
                         'parameter 7': False,
                         'lastname': '',
                         'parameter 8': '',
                         'number': '',
                         'parameter 9': '9301162291d5a0480270d97d6c4a6da3edd75aa5',
                         'parameter 10': 'cos02',
                         'parameter 11': '112233',
                         'parameter 12': 1620118266.572422,
                         'parameter 13': 0,
                         'parameter 14': 0.0,
                         'parameter 15': False,
                         'parameter 16': 4,
                         'parameter 17': '',
                         'parameter 18': '',
                         'parameter 19': 'user',
                         'userid': '',
                         'parameter 20': '',
                         'parameter 21': '',
                         'parameter 22': True}
   
   
   if DB_NAME not in cdb.all_dbs():
       cdb.create_database(DB_NAME)
   
   
   def signal_handler(sig, frame):
       print('You pressed Ctrl+C!')
       sys.exit(0)
   
   
   def create_documents(start=0, thread_id=0):
       try:
           for i in xrange(start, 999999):
               number = '{}{:06}'.format(thread_id, i)
               print('create_documents: Creating document {}'.format(number))
               with Document(cdb[DB_NAME], number) as document:
                   document.update(deepcopy(account_skeletton))
                   document['firstname'] = 'FN {}'.format(number)
                   document['lastname'] = 'LN {}'.format(number)
                   document['number'] = number
                   document['userid'] = number
       except Exception as err:
           print('create_documents: {}'.format(err))
   
   
   def create_documents_threaded(threads=2):
       for i in xrange(threads):
           t = Thread(target=create_documents, args=(0, i))
           t.daemon = True
           t.start()
       
       signal.signal(signal.SIGINT, signal_handler)
       print('Press Ctrl+C')
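       signal.pause()
   
   
   if __name__ == '__main__':
       # Added so the script actually starts the writer threads when run.
       create_documents_threaded()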
   ```
   
   ## Expected Behaviour
   
   Compaction doesn't fail :-)
   
   ## Your Environment
   
   * CouchDB version used:
     `{"couchdb":"Welcome","version":"3.1.1","git_sha":"ce596c65d","uuid":"08fb7cd0a10f35f6215a531742f7b356","features":["access-ready","partitioned","pluggable-storage-engines","reshard","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}`
   * python-cloudant: 2.14.0
   * python2.7
   * Operating system and version:
     * Custom Linux distribution
   * CouchDB running in a VM
     * Single Core (also changed to 2 cores, no difference)
     * 1GB Ram (also increased it to 1GB, no difference)
   * To trigger this issue, I used an isolated node, no replication, no clustering
   
   ## Additional Context
   
   The configuration follows. Most of it is default:
   ```
   curl http://admin:admin@localhost:5984/_node/couchdb@127.0.0.1/_config | python -m json.tool
     % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                    Dload  Upload   Total   Spent    Left  Speed
   100  2823  100  2823    0     0   310k      0 --:--:-- --:--:-- --:--:--  344k
   {
       "admins": {
           "admin": "-pbkdf2-d5b128e39ebe61b4f50fb9c2e3241c0ea1bc28f9,6b6e6d21c67f685f753d8fa1fe72db71,10"
       },
       "attachments": {
           "compressible_types": "text/*, application/javascript, application/json, application/xml",
           "compression_level": "8"
       },
       "chttpd": {
           "backlog": "512",
           "bind_address": "0.0.0.0",
           "max_db_number_for_dbs_info_req": "100",
           "port": "5984",
           "prefer_minimal": "Cache-Control, Content-Length, Content-Range, Content-Type, ETag, Server, Transfer-Encoding, Vary",
           "require_valid_user": "false",
           "server_options": "[{recbuf, undefined}]",
           "socket_options": "[{sndbuf, 262144}, {nodelay, true}]"
       },
       "cluster": {
           "n": "3",
           "q": "2"
       },
       "cors": {
           "credentials": "false"
       },
       "couch_httpd_auth": {
           "allow_persistent_cookies": "true",
           "auth_cache_size": "50",
           "authentication_db": "_users",
           "authentication_redirect": "/_utils/session.html",
           "iterations": "10",
           "require_valid_user": "false",
           "secret": "a0ec90afc5f896e3cf90e8c4adc9dafa",
           "timeout": "600"
       },
       "couch_peruser": {
           "database_prefix": "userdb-",
           "delete_dbs": "false",
           "enable": "false"
       },
       "couchdb": {
           "attachment_stream_buffer_size": "4096",
           "changes_doc_ids_optimization_threshold": "100",
           "database_dir": "/var/crypt/couchdb/couchdb",
           "default_engine": "couch",
           "default_security": "everyone",
           "file_compression": "snappy",
           "max_dbs_open": "500",
           "max_document_size": "8000000",
           "os_process_timeout": "5000",
           "single_node": "true",
           "users_db_security_editable": "false",
           "uuid": "08fb7cd0a10f35f6215a531742f7b356",
           "view_index_dir": "/var/crypt/couchdb/couchdb"
       },
       "couchdb_engines": {
           "couch": "couch_bt_engine"
       },
       "csp": {
           "enable": "true"
       },
       "feature_flags": {
           "partitioned||*": "true"
       },
       "httpd": {
           "allow_jsonp": "false",
           "authentication_handlers": "{couch_httpd_auth, cookie_authentication_handler}, {couch_httpd_auth, default_authentication_handler}",
           "bind_address": "127.0.0.1",
           "enable_cors": "false",
           "enable_xframe_options": "false",
           "max_http_request_size": "4294967296",
           "port": "5986",
           "secure_rewrites": "true",
           "socket_options": "[{sndbuf, 262144}]"
       },
       "indexers": {
           "couch_mrview": "true"
       },
       "ioq": {
           "concurrency": "10",
           "ratio": "0.01"
       },
       "ioq.bypass": {
           "compaction": "false",
           "os_process": "true",
           "read": "true",
           "shard_sync": "false",
           "view_update": "true",
           "write": "true"
       },
       "log": {
           "file": "/var/log/couchdb/couchdb.log",
           "level": "info",
           "writer": "file"
       },
       "query_server_config": {
           "os_process_limit": "100",
           "reduce_limit": "true"
       },
       "replicator": {
           "connection_timeout": "30000",
           "http_connections": "20",
           "interval": "60000",
           "max_churn": "20",
           "max_jobs": "500",
           "retries_per_request": "5",
           "socket_options": "[{keepalive, true}, {nodelay, false}]",
           "ssl_certificate_max_depth": "3",
           "startup_jitter": "5000",
           "verify_ssl_certificates": "true",
           "worker_batch_size": "500",
           "worker_processes": "4"
       },
       "smoosh": {
           "db_channels": "upgrade_dbs,ratio_dbs",
           "view_channels": "upgrade_views,ratio_views"
       },
       "ssl": {
           "port": "6984"
       },
       "uuids": {
           "algorithm": "sequential",
           "max_count": "1000"
       },
       "vendor": {
           "name": "The Apache Software Foundation"
       }
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] schneuwlym closed issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
schneuwlym closed issue #3571:
URL: https://github.com/apache/couchdb/issues/3571


   





[GitHub] [couchdb] nickva commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-856164099


   @wohali good point, thanks for clarifying





[GitHub] [couchdb] schneuwlym commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
schneuwlym commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-857495689


   Hi
   
   Updating the Erlang compiler to 22 definitely seems to fix our issue!
   
   Thank you very much for your help!
   
   Best regards
   Mathias





[GitHub] [couchdb] schneuwlym commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
schneuwlym commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-855207246


   Hi nickva
   
   Thanks for your reply. We are using Version 19.3.
   ```
   erl -eval '{ok, Version} = file:read_file(filename:join([code:root_dir(), "releases", erlang:system_info(otp_release), "OTP_VERSION"])), io:fwrite(Version), halt().' -noshell
   19.3
   ```





[GitHub] [couchdb] schneuwlym commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
schneuwlym commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-855716978


   Hi nickva, thanks for your reply.
   
   Indeed, it seems that our packager patched the source to build CouchDB:
   ```
   --- a/src/couch/src/couch_emsort.erl.clean      2021-01-14 17:18:40.436549175 +0000
   +++ a/src/couch/src/couch_emsort.erl    2021-01-14 17:17:39.128103923 +0000
   @@ -133,6 +133,8 @@
    -export([add/2, merge/1, merge/2, sort/1, iter/1, next/1]).
    -export([num_kvs/1, num_merges/1]).
   
   +-import(math, [ceil/1]).
   +
    -record(ems, {
        fd,
        root,
   --- a/src/couch/rebar.config.script.clean       2021-01-14 17:28:34.570100193 +0000
   +++ b/src/couch/rebar.config.script     2021-01-14 19:06:56.523186136 +0000
   @@ -107,7 +107,7 @@
            };
        {unix, _} when SMVsn == "1.8.5" ->
            {
   -            "-DXP_UNIX -I/usr/include/js -I/usr/local/include/js",
   +            "-DXP_UNIX " ++ os:getenv("JS_CFLAGS"),
                "-L/usr/local/lib -lmozjs185 -lm"
            };
        {win32, _} when SMVsn == "60" ->
   @@ -164,7 +164,7 @@
    CouchJSEnv = case SMVsn of
        "1.8.5" ->
            [
   -            {"CFLAGS", JS_CFLAGS ++ " " ++ CURL_CFLAGS},
   +            {"CFLAGS", JS_CFLAGS ++ " " ++ CURL_CFLAGS ++ os:getenv("JS_CFLAGS")},
                {"LDFLAGS", JS_LDFLAGS ++ " " ++ CURL_LDFLAGS}
            ];
        _ ->
   
   ```
   
   But what I don't understand is that the dependency page (https://docs.couchdb.org/en/3.1.1/install/unix.html#dependencies) lists Erlang OTP 19.x as a requirement. Am I misreading the line "Erlang OTP (19.x, 20.x >= 21.3.8.5, 21.x >= 21.2.3, 22.x >= 22.0.5)", or is this information wrong? Since the list is comma-separated, I assumed that 19.x is fully supported...
   
   Which Erlang version should we try? 22, or should we already try the latest, e.g. 24? Since 24 is not mentioned in the list, I guess we should go with 22, right?
   
   Regards
   Mathias





[GitHub] [couchdb] oldrich-svec edited a comment on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
oldrich-svec edited a comment on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-845800181


   We have a similar issue (Ubuntu Server 20.04, Docker version of CouchDB 3.1.1).
   
   We have a 30GB database which is being replicated from another machine. Looking at the files, I can see that the database files take something like 100GB, plus roughly another 30GB for the compaction files.
   
   The compaction starts but always dies before it finishes, so the database never gets compacted and the compaction files stay there forever.
   
   I would also add that the source machine (where the replication comes from) runs CouchDB 3.1.0 on Windows Server 2019, and there the compaction seems to work just fine.
   
   Some logs:
   
   ```
   couchdb-backup-service-01 | [info] 2021-05-21T08:27:57.466850Z nonode@nohost <0.222.0> -------- db shards/80000000-ffffffff/yoda_filesystem.1614838752 died with reason {{badarg,[{erlang,monitor,[process,{main,'clouseau@127.0.0.1'}],[]},{ioq,submit_request,2,[{file,"src/ioq.erl"},{line,187}]},{ioq,maybe_submit_request,1,[{file,"src/ioq.erl"},{line,150}]},{ioq,handle_info,2,[{file,"src/ioq.erl"},{line,123}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,616}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,686}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]},{gen_server,call,[ioq,{request,<0.24352.1>,{append_bin,[<<0,0,32,0>>,[<<6,66,160,149,155,150,189,119,66,50,121,52,14,190,9,143...}}
   couchdb-backup-service-01 | [warning] 2021-05-21T08:27:57.467050Z nonode@nohost <0.428.0> -------- exit for compaction of ["shards/00000000-7fffffff/yoda_filesystem.1614838752"]: {{badarg,[{erlang,monitor,[process,{main,'clouseau@127.0.0.1'}],[]},{ioq,submit_request,2,[{file,"src/ioq.erl"},{line,187}]},{ioq,maybe_submit_request,1,[{file,"src/ioq.erl"},{line,150}]},{ioq,handle_info,2,[{file,"src/ioq.erl"},{line,123}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,616}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,686}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]},{gen_server,call,[ioq,{request,<0.22588.1>,{pread_iolist,40340132370},compaction,<0.22589.1>,undefined},infinity]}}
   couchdb-backup-service-01 | [info] 2021-05-21T08:27:57.467190Z nonode@nohost <0.222.0> -------- db shards/00000000-7fffffff/yoda_filesystem.1614838752 died with reason {{badarg,[{erlang,monitor,[process,{main,'clouseau@127.0.0.1'}],[]},{ioq,submit_request,2,[{file,"src/ioq.erl"},{line,187}]},{ioq,maybe_submit_request,1,[{file,"src/ioq.erl"},{line,150}]},{ioq,handle_info,2,[{file,"src/ioq.erl"},{line,123}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,616}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,686}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]},{gen_server,call,[ioq,{request,<0.22588.1>,{pread_iolist,40340132370},compaction,<0.22589.1>,undefined},infinity]}}
   
   couchdb-backup-service-01 | [info] 2021-05-21T08:27:57.468479Z nonode@nohost <0.28448.1> -------- Starting compaction for db "shards/80000000-ffffffff/yoda_filesystem.1614838752" at 125776
   couchdb-backup-service-01 | [notice] 2021-05-21T08:27:57.468708Z nonode@nohost <0.428.0> -------- ratio_dbs: Started compaction for shards/80000000-ffffffff/yoda_filesystem.1614838752
   couchdb-backup-service-01 | [error] 2021-05-21T08:27:57.469504Z nonode@nohost <0.22508.1> -------- gen_server ioq terminated with reason: bad argument in call to erlang:monitor(process, {main,'clouseau@127.0.0.1'}) at ioq:submit_request/2(line:187) <= ioq:maybe_submit_request/1(line:150) <= ioq:handle_info/2(line:123) <= gen_server:try_dispatch/4(line:616) <= gen_server:handle_msg/6(line:686) <= proc_lib:init_p_do_apply/3(line:247)
   couchdb-backup-service-01 |   last msg: timeout
   couchdb-backup-service-01 |      state: {state,10,0.01,{[{request,{main,'clouseau@127.0.0.1'},{open,<0.28430.1>,<<"shards/80000000-ffffffff/yodadev_features.1614849634/de6fcb4cc39fbd3e0c1d621d441e9057">>,<<"standard">>},other,{<0.28430.1>,#Ref<0.2791382983.2
   ```





[GitHub] [couchdb] nickva edited a comment on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
nickva edited a comment on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-855272434


   @schneuwlym  Erlang 19 would explain why you got an `undef` error there. The `ceil/1` function is not present in Erlang 19. Unfortunately, Erlang 19 is no longer supported for CouchDB 3.x releases.
   
   From the error message, it seems as if someone "patched" the CouchDB release to compile on 19.x by replacing the undefined `ceil/1` function (which would have prevented compiling on releases older than 20.0) with `math:ceil/1`. However, `math:ceil/1` is also not defined in releases older than 20.0, and we would only find that out at runtime.
   
   ```
   4> catch ceil(1.6).    
   {'EXIT',{{shell_undef,ceil,1,[]},
            [{shell,shell_undef,2,[{file,"shell.erl"},{line,1061}]},
   
   5> catch math:ceil(1.6).
   {'EXIT',{undef,[{math,ceil,[1.6],[]},
                   {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
   ```





[GitHub] [couchdb] AdrianTute commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
AdrianTute commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-852952031


   A short addendum to what @schneuwlym wrote.
   I was able to turn off auto-compaction (smoosh) completely.
   After inserting ~70k records, I manually triggered a compaction task.
   It crashed at the very end (Fauxton showed progress of ~99%) with the above-mentioned error.
   CPU and memory usage were reasonable the whole time.
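
For anyone wanting to repeat the manual compaction outside Fauxton, here is a minimal Python 3 sketch. It relies on CouchDB's documented `POST /{db}/_compact` endpoint (which expects a `Content-Type: application/json` header) and on the `compact_running` field returned by `GET /{db}`. The helper names are hypothetical, the URL, database name, and credentials are taken from the reproduction script in this thread, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def compaction_request(base_url, db, user, password):
    """Build (but do not send) the POST that asks CouchDB to compact `db`."""
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    return urllib.request.Request(
        "{}/{}/_compact".format(base_url.rstrip("/"), db),
        data=b"",
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
    )


def compact_running(db_info_json):
    """Read the compact_running flag from the JSON body of GET /{db}."""
    return bool(json.loads(db_info_json).get("compact_running", False))


req = compaction_request("http://localhost:5984", "directory", "admin", "admin")
print(req.get_method(), req.full_url)
# To send it against a live node: urllib.request.urlopen(req)
```

Polling `GET /directory` and passing the response body to `compact_running` gives a rough way to see when the task stops, whether it finished or crashed.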





[GitHub] [couchdb] nickva commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-856052942


   @schneuwlym that was a mistake on our part. We documented it as "soft" supported in the release notes for 3.0:  https://docs.couchdb.org/en/3.1.1/whatsnew/3.0.html
   
   ```
   19.x - “soft” support only. No longer tested, but should work.
   ```
   
   Basically, that says we're not going to go out of our way to break it, but it may break accidentally at some point, and we're not testing it. With `ceil`, I think the idea was that a compilation failure would make it obvious that it won't build. In retrospect, we should have taken a firmer stance and explicitly stated in the documentation at that point that we do not support Erlang versions < 20.
   
   I already updated the rebar config file to disallow Erlang 19 and will update the dependencies list in the unix.html docs file too.
   
   As for which version to try: the binary packages we release ship with the latest version of 20. In production at Cloudant I have seen 20 run for a few years without any issues, so you could pick 20.3.8.26, for example. The downside, however, is that the Erlang developers promise to support only the last two versions behind the current one. If that's a concern, perhaps pick the latest patch version of 23 and make sure to check for fixes periodically.





[GitHub] [couchdb] wohali commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
wohali commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-856153924


   Note that 23.x and 24.x are not yet supported in CouchDB 3, unless you are building from the `3.x` branch directly. For `3.1.1` you cannot go any newer than 22.x.
   
   See: https://docs.couchdb.org/en/3.1.1/install/unix.html#installation-from-source for the versions supported at the time `3.1.1` was released. We acknowledge 19.x was incorrectly included there, and will support 23.x and 24.x with the forthcoming 3.2 release.
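
Putting the two constraints from this thread together (`ceil/1` needs Erlang 20 or newer, and `3.1.1` cannot go newer than 22.x), a build script could guard against an unsuitable Erlang with a check along these lines. This is a Python 3 sketch with hypothetical helper names. It compares the major release only and deliberately ignores the minimum patch levels that the install docs list for each series.

```python
def otp_major(version):
    """Extract the major release number from a version string such as '19.3'."""
    return int(version.strip().split(".")[0])


def usable_for_couchdb_3_1_1(version, lowest=20, highest=22):
    """True when the OTP major release lies in the range discussed in this thread."""
    return lowest <= otp_major(version) <= highest


for candidate in ("19.3", "20.3.8.26", "22.0.5", "24.0"):
    print(candidate, usable_for_couchdb_3_1_1(candidate))
```

The version string can come from the `erl -eval ...` one-liner shown earlier in the thread, which prints the contents of the `OTP_VERSION` file.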





[GitHub] [couchdb] nickva commented on issue #3571: Compaction dies constantly after a certain amount of documents

Posted by GitBox <gi...@apache.org>.
nickva commented on issue #3571:
URL: https://github.com/apache/couchdb/issues/3571#issuecomment-854989033


   `{undef,[{math,ceil,[1.6],[]},{couch_emsort,num_merges,2,` is quite odd. I can't figure out where it is coming from.
   
   I see that the Erlang `ceil` function in https://github.com/apache/couchdb/blob/ce596c65d9d7f0bc5d9937bcaf6253b343015690/src/couch/src/couch_emsort.erl#L363-L366 calls `erlang:ceil` in https://github.com/erlang/otp/blob/8b29b1ca870e6b31a0f3da067ebf4b1b4ceaa969/erts/preloaded/src/erlang.erl#L566-L570, which seems to call a C NIF function, not `math:ceil` as the error indicates.
   
   @schneuwlym what version of Erlang are you running? I wonder if this is related to that. `ceil` is a fairly new function, available in Erlang 20+ only.

