Posted to notifications@couchdb.apache.org by "jcoglan (via GitHub)" <gi...@apache.org> on 2023/05/25 09:23:49 UTC

[GitHub] [couchdb] jcoglan opened a new issue, #4624: "not_in_range" failure while resharding database

jcoglan opened a new issue, #4624:
URL: https://github.com/apache/couchdb/issues/4624

   While attempting to shard-split a q=16 database on a 3-node cluster, we found that all reshard jobs failed, and `GET /_reshard/jobs` stopped responding to requests. The logs reveal a `not_in_range` failure in `mem3_reshard_job`.
   
   ## Description
   
   We are attempting to reshard a database from q=8 to q=32 using the following script: https://gist.github.com/jcoglan/ad2b631664bc436c48e4274718a0acd6. This worked for the first step, from q=8 to q=16, but failed on the second step, from q=16 to q=32.
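
   For reference (the gist above is the authoritative version), a pass like this can be driven through the documented `/_reshard` HTTP API. The sketch below is only a minimal illustration with an assumed admin endpoint in `$COUCH_URL` and `jq` available; the actual script may create and monitor jobs differently:

   ```sh
   #!/bin/bash
   # Minimal sketch (not the gist): submit one split job per current shard range
   # of $DB, then wait until no jobs are running any more.
   COUCH_URL=${COUCH_URL:-http://admin:password@127.0.0.1:5984}   # assumed admin endpoint
   DB=${DB:-some-db}

   # One job per range; CouchDB creates a job on every node hosting that range.
   for range in $(curl -s "$COUCH_URL/$DB/_shards" | jq -r '.shards | keys[]'); do
     curl -s -X POST "$COUCH_URL/_reshard/jobs" \
          -H 'Content-Type: application/json' \
          -d "{\"type\": \"split\", \"db\": \"$DB\", \"range\": \"$range\"}"
   done

   # Poll the overall summary until nothing is running, then print the counts.
   while [ "$(curl -s "$COUCH_URL/_reshard" | jq -r '.running')" != "0" ]; do
     sleep 30
   done
   curl -s "$COUCH_URL/_reshard" | jq '{total, completed, failed}'
   ```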
   
   `/_reshard` shows that all jobs failed:
   
   ```json
   {
     "state": "running",
     "state_reason": null,
     "completed": 0,
     "failed": 48,
     "running": 0,
     "stopped": 0,
     "total": 48
   }
   ```
   
   Also, `/_reshard/jobs` does not respond at all; the request hangs with no activity visible in the logs.
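
   For context, these checks go against the standard reshard endpoints of the CouchDB 3.x API; a sketch with the same assumed `$COUCH_URL` as above:

   ```sh
   # Summary counters for all resharding jobs (this is what returned the JSON above).
   curl -s "$COUCH_URL/_reshard"

   # Detailed listing of every job; in our case this request hangs indefinitely.
   curl -s "$COUCH_URL/_reshard/jobs"

   # A single job can also be queried by its id (job ids appear in the logs below), e.g.:
   # curl -s "$COUCH_URL/_reshard/jobs/001-.../state"
   ```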
   
   We observed many messages like the following while the jobs were running:
   
   ```
   [error] 2023-05-24T20:42:03.508280Z couchdb@node.example.com <0.22158.5016>
   -------- mem3_reshard_job worker error "#job{001-2ce218fc7c061f935757654236cbb4b25aceab927d7c433d377c72f0dee42bca shards/c0000000-cfffffff/some-db.1566089635 /2 job_state:running split_state:initial_copy pid:<0.22158.5016>}"
   {{badkey,not_in_range},[{erlang,map_get,[not_in_range,#{[3221225472,3355443199] =>
   {target,{db,1,<<"shards/c0000000-c7ffffff/some-db.1566089635">>,
   "/mnt/couchdb/data/shards/c0000000-c7ffffff/some-db.1566089635.couch",
   {couch_bt_engine,{st,"/mnt/couchdb/data/shards/c0000000-c7ffffff/some-db.1566089635.couch",
   <0.11973.5023>,#Ref<0.1898309387.670826504.2396>,undefined,
   {db_header,8,2343426,0,{27701359082,{320334,10,{size_info,27636410613,166933365080}},34564353},
   {27708861596,320344,31829599},{27708882839,[],20290},nil,nil,27708887320,1000,
   <<"47f4490fe284680faaa83e44186c9037">>,[{'node.example.com',0}],0,1000,undefined},true,
   {btree,<0.11973.5023>,{27701359082,{320334,10,{size_info,27636410613,166933365080}},34564353},
   fun couch_bt_engine:id_tree_split/1,fun couch_bt_engine:id_tree_join/2,undefined,
   fun couch_bt_engine:id_tree_reduce/2,snappy},
   {btree,<0.11973.5023>,{27708861596,320344,31829599},
   fun couch_bt_engine:seq_tree_split/1,fun couch_bt_engine:seq_tree_join/2,undefined,
   fun couch_bt_engine:seq_tree_reduce/2,snappy},
   {btree,<0.11973.5023>,{27708882839,[],20290},
   fun couch_bt_engine:local_tree_split/1,fun couch_bt_engine:local_tree_join/2,undefined,nil,snappy},snappy,
   {btree,<0.11973.5023>,nil,fun couch_bt_engine:purge_tree_split/1,
   fun couch_bt_engine:purge_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy},
   {btree,<0.11973.5023>,nil,fun couch_bt_engine:purge_seq_tree_split/1,
   fun couch_bt_engine:purge_seq_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy}}},
   <0.13348.5022>,nil,0,<<"1684957391682422">>,{user_ctx,null,[],undefined},
   [{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}],
   undefined,nil,nil,undefined,
   [{default_security_object,[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},
   {<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},
   {user_ctx,{user_ctx,null,[<<"_admin">>],undefined}},{props,[]}],undefined},undefined,
   [{43468,<<"48794d4718e1452be1b90573fdbf8bd8">>,<<"-L-3nWkJHMoh8BCsOK31">>,
   [{2,<<58,60,181,83,213,161,108,4,33,6,123,250,199,83,148,163>>}]},
   {43467,<<"48794d4718e1452be1b90573fdbf853e">>,
   ... followed by a lot of binary data
   ```
   
   ```
   [error] 2023-05-24T20:42:03.483993Z couchdb@node.example.com emulator
   -------- Error in process <0.23943.5023> on node 'couchdb@node.example.com' with exit value:
   {{badkey,not_in_range},[{erlang,map_get,[not_in_range,#{[3221225472,3355443199] =>
   {target,{db,1,<<"shards/c0000000-c7ffffff/some-db.1566089635">>,
   "/mnt/couchdb/data/shards/c0000000-c7ffffff/some-db.1566089635.couch",
   {couch_bt_engine,{st,"/mnt/couchdb/data/shards/c0000000-c7ffffff/some-db.1566089635.couch",
   <0.11973.5023>,#Ref<0.1898309387.670826504.2396>,undefined,
   {db_header,8,2343426,0,{27701359082,{320334,10,{size_info,27636410613,166933365080}},34564353},
   {27708861596,320344,31829599},{27708882839,[],20290},nil,nil,27708887320,1000,
   <<"47f4490fe284680faaa83e44186c9037">>,[{'couchdb@node.example.com',0}],0,1000,undefined},true,
   {btree,<0.11973.5023>,{27701359082,{320334,10,{size_info,27636410613,166933365080}},34564353},
   fun couch_bt_engine:id_tree_split/1,fun couch_bt_engine:id_tree_join/2,undefined,
   fun couch_bt_engine:id_tree_reduce/2,snappy},
   {btree,<0.11973.5023>,{27708861596,320344,31829599},
   fun couch_bt_engine:seq_tree_split/1,fun couch_bt_engine:seq_tree_join/2,undefined,
   fun couch_bt_engine:seq_tree_reduce/2,snappy},
   {btree,<0.11973.5023>,{27708882839,[],20290},
   fun couch_bt_engine:local_tree_split/1,fun couch_bt_engine:local_tree_join/2,undefined,nil,snappy},snappy,
   {btree,<0.11973.5023>,nil,fun couch_bt_engine:purge_tree_split/1,
   fun couch_bt_engine:purge_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy},
   {btree,<0.11973.5023>,nil,fun couch_bt_engine:purge_seq_tree_split/1,
   fun couch_bt_engine:purge_seq_tree_join/2,undefined,fun couch_bt_engine:purge_tree_reduce/2,snappy}}},
   <0.13348.5022>,nil,0,<<"1684957391682422">>,{user_ctx,null,[],undefined},
   [{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},{<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}],
   undefined,nil,nil,undefined,
   [{default_security_object,[{<<"members">>,{[{<<"roles">>,[<<"_admin">>]}]}},
   {<<"admins">>,{[{<<"roles">>,[<<"_admin">>]}]}}]},
   {user_ctx,{user_ctx,null,[<<"_admin">>],undefined}},{props,[]}],undefined},undefined,
   [{43468,<<"48794d4718e1452be1b90573fdbf8bd8">>
   ... followed by a lot of binary data
   ```
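
   For reference, the map key shown in the exception (`[3221225472,3355443199]`) is a target range given as decimal bounds, while the failing lookup key is the atom `not_in_range`, which suggests the entry being copied did not fall into any target range. Converting the bounds to hex identifies the target shard:

   ```sh
   # 3221225472 = 0xc0000000 and 3355443199 = 0xc7ffffff, i.e. the c0000000-c7ffffff
   # target of the c0000000-cfffffff source shard named in the job above.
   printf '%08x-%08x\n' 3221225472 3355443199   # prints: c0000000-c7ffffff
   ```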
   
   The database's current shards are as follows:
   
   ```json
   {
     "shards": {
       "00000000-0fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "10000000-1fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "20000000-2fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "30000000-3fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "40000000-4fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "50000000-5fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "60000000-6fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "70000000-7fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "80000000-8fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "90000000-9fffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "a0000000-afffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "b0000000-bfffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "c0000000-cfffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "d0000000-dfffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "e0000000-efffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ],
       "f0000000-ffffffff": [
         "couchdb@node-9.example.com",
         "couchdb@node-8.example.com",
         "couchdb@node-7.example.com"
       ]
     }
   }
   ```
   
   And the database info looks like this:
   
   ```json
   {
     "instance_start_time": "1566089635",
     "db_name": "some-db",
     "purge_seq": "1032970-<redacted>",
     "update_seq": "37489526-<redacted>",
     "sizes": {
       "file": 890080785312,
       "external": 5343640921800,
       "active": 887177503725
     },
     "props": {},
     "doc_del_count": 405,
     "doc_count": 10255536,
     "disk_format_version": 8,
     "compact_running": false,
     "cluster": {
       "q": 16,
       "n": 3,
       "w": 2,
       "r": 2
     }
   }
   ```
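
   For reference, the two JSON documents above come from the standard clustered endpoints; a sketch with the same assumed `$COUCH_URL`:

   ```sh
   # Shard map for the database (the "shards" object shown above).
   curl -s "$COUCH_URL/some-db/_shards" | jq .

   # Clustered database info (the doc counts, sizes and q/n/w/r shown above).
   curl -s "$COUCH_URL/some-db" | jq .
   ```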
   
   
   ## Your Environment
   
   * CouchDB version used: 3.3.1
   * Operating system and version: Debian 11.7
   




[GitHub] [couchdb] nickva commented on issue #4624: "not_in_range" failure while resharding database

Posted by "nickva (via GitHub)" <gi...@apache.org>.
nickva commented on issue #4624:
URL: https://github.com/apache/couchdb/issues/4624#issuecomment-1565262678

   This PR should fix the issue: https://github.com/apache/couchdb/pull/4626




[GitHub] [couchdb] nickva closed issue #4624: "not_in_range" failure while resharding database

Posted by "nickva (via GitHub)" <gi...@apache.org>.
nickva closed issue #4624: "not_in_range" failure while resharding database
URL: https://github.com/apache/couchdb/issues/4624




[GitHub] [couchdb] nickva commented on issue #4624: "not_in_range" failure while resharding database

Posted by "nickva (via GitHub)" <gi...@apache.org>.
nickva commented on issue #4624:
URL: https://github.com/apache/couchdb/issues/4624#issuecomment-1563127467

   Thanks for the detailed report @jcoglan. Would you be able to share a few more logs, starting a bit before the first error and including a bit more after the ones shown, in case they contain any stack trace function names and line numbers?
   
   Do you see the string `not in any target ranges` or `not_in_target_ranges` anywhere in the logs? That would indicate that the initial split didn't go well and that during the second one we're just seeing the effects of that.
   
   In remsh, what does `config:get("mem3").` return?
   
   If possible, share the full `_dbs` document (`curl $URL/_node/_local/_dbs/some_db`); it should contain the shard ranges as well as the change records from the first split.
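
   A rough sketch of those checks as shell commands; the log path and the admin endpoint in `$COUCH_URL` are assumptions, adjust them to your setup:

   ```sh
   # 1. Evidence that the first split already misplaced entries.
   grep -E 'not in any target ranges|not_in_target_ranges' /var/log/couchdb/couchdb.log

   # 2. Context around the not_in_range errors (stack function names / line numbers).
   grep -n -B 5 -A 20 'not_in_range' /var/log/couchdb/couchdb.log | head -200

   # 3. The shard map document for the database ("some-db" in this report), which
   #    includes the change records from the first split.
   curl -s "$COUCH_URL/_node/_local/_dbs/some-db" | jq .

   # 4. And in a remsh session on one of the nodes: config:get("mem3").
   ```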
   
   




[GitHub] [couchdb] nickva commented on issue #4624: "not_in_range" failure while resharding database

Posted by "nickva (via GitHub)" <gi...@apache.org>.
nickva commented on issue #4624:
URL: https://github.com/apache/couchdb/issues/4624#issuecomment-1563495430

   I wonder if this happened: after the first split (to Q=16), we ran the internal replicator to top off changes from the source to the targets. When doing so, we also pushed purges to the new targets; however, the internal replicator didn't pick targets by the hash function of the target range but copied the purges as-is. So in an S -> T1|T2 split, both T1 and T2 might have gotten a purge_info with DocID=<<"a">>, even though the hashing would have placed it only on T1.
   
   https://github.com/apache/couchdb/blob/c75c31d38c51c4b2c6c2c59103c204b2294c6895/src/mem3/src/mem3_rep.erl#L317-L344
   
   On the next split (16 -> 32) we start by doing the initial copy in https://github.com/apache/couchdb/blob/c75c31d38c51c4b2c6c2c59103c204b2294c6895/src/couch/src/couch_db_split.erl#L340-L344, and when we get to purge infos we do the right thing and pick a target according to the hash function; however, we now find that some purge infos do not belong to any of the targets. So if T2 is split into T21|T22 and purge info <<"a">> didn't belong on T2 to start with, we'd get that exception.
   
   To confirm this theory, the first step would be to double-check that the actual document IDs are on the shards they are supposed to be on; that is, we'd assert that the Q=16 document placement looks sane. Then we'd do the same for each purge info, and there we should find that some purge infos do not belong on those shards.

