Posted to user@couchdb.apache.org by Luca Morandini <lu...@gmail.com> on 2022/07/07 07:28:47 UTC

Shards cannot be read after move to a different cluster

Dear All,

I moved some CouchDB 3.1.0 databases to a new 4-node cluster by
copying the shard files.

The operation worked for 5 out of 6 databases; the biggest database
(about 200GB, 12 shards, 2 replicas) did not come online on the new
cluster.

I suspect high disk latency, but... could someone shed some light on this?

The relevant logs are:

[info] 2022-07-06T04:30:44.697901Z couchdb@10.0.0.80
<0.228.0> -------- db
shards/95555553-aaaaaaa7/twitter.1657067184 died with reason
{timeout,{gen_server,call,[<0.26790.5>,find_header]}}
[error] 2022-07-06T04:30:44.698269Z couchdb@10.0.0.80
<0.26789.5> -------- CRASH REPORT Process
(<0.26789.5>) with 2 neighbors exited with reason:
{timeout,{gen_server,call,[<0.26790.5>,find_header]}} at
gen_server:call/2(line:206) <= couch_file:read_header/1(line:378)
<= couch_bt_engine:init/2(line:157) <=
couch_db_engine:init/3(line:775) <=
couch_db_updater:init/1(line:43) <=
proc_lib:init_p_do_apply/3(line:247); initial_call:
{couch_db_updater,init,['Argument__1']}, ancestors:
[<0.26784.5>], message_queue_len: 0, messages: [], links:
[<0.26784.5>,<0.26790.5>], dictionary:
[{io_priority,{db_update,<<"shards/95555553-aaaaaaa7/twitter.16570671...">>}},...],
trap_exit: false, status: running, heap_size: 610, stack_size: 27,
reductions: 250
[error] 2022-07-06T04:56:10.077664Z couchdb@10.0.0.80
<0.6591.6> -------- CRASH REPORT Process
(<0.6591.6>) with 2 neighbors exited with reason:
{timeout,{gen_server,call,[<0.6593.6>,find_header]}} at
gen_server:call/2(line:206) <= couch_file:read_header/1(line:378)
<= couch_bt_engine:init/2(line:157) <=
couch_db_engine:init/3(line:775) <=
couch_db_updater:init/1(line:43) <=
proc_lib:init_p_do_apply/3(line:247); initial_call:
{couch_db_updater,init,['Argument__1']}, ancestors:
[<0.6584.6>], message_queue_len: 0, messages: [], links:
[<0.6584.6>,<0.6593.6>], dictionary:
[{io_priority,{db_update,<<"shards/95555553-aaaaaaa7/twitter.16570671...">>}},...],
trap_exit: false, status: running, heap_size: 610, stack_size: 27,
reductions: 250
[info] 2022-07-06T04:56:10.077711Z couchdb@10.0.0.80
<0.228.0> -------- db
shards/95555553-aaaaaaa7/twitter.1657067184 died with reason
{timeout,{gen_server,call,[<0.6593.6>,find_header]}}
[info] 2022-07-07T06:44:13.863950Z couchdb@10.0.0.80
<0.228.0> -------- db
shards/95555553-aaaaaaa7/twitter.1657067184 died with reason
{timeout,{gen_server,call,[<0.9139.29>,find_header]}}
[error] 2022-07-07T06:44:13.864516Z couchdb@10.0.0.80
<0.9152.29> -------- CRASH REPORT Process
(<0.9152.29>) with 2 neighbors exited with reason:
{timeout,{gen_server,call,[<0.9139.29>,find_header]}} at
gen_server:call/2(line:206) <= couch_file:read_header/1(line:378)
<= couch_bt_engine:init/2(line:157) <=
couch_db_engine:init/3(line:775) <=
couch_db_updater:init/1(line:43) <=
proc_lib:init_p_do_apply/3(line:247); initial_call:
{couch_db_updater,init,['Argument__1']}, ancestors:
[<0.9136.29>], message_queue_len: 0, messages: [], links:
[<0.9136.29>,<0.9139.29>], dictionary:
[{io_priority,{db_update,<<"shards/95555553-aaaaaaa7/twitter.16570671...">>}},...],
trap_exit: false, status: running, heap_size: 610, stack_size: 27,
reductions: 250

Cheers,

Luca Morandini

Re: Shards cannot be read after move to a different cluster

Posted by Luca Morandini <lu...@gmail.com>.
On Fri, 8 Jul 2022 at 22:04, Robert Newson <rn...@apache.org> wrote:

> Hi,
>
> The config file isn't monitored, so just changing the file won't help;
> you'd need to restart couchdb.


I did not change the config file; I used the config API instead, and checked
that the files were changed afterwards.


> Did you have anything bypassed in the first place, though?

Yes (a sketch of turning them off again via the config API follows):
- os_process
- read
- write
- view_update
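
For reference, turning them off again looked roughly like this against
each node (host, credentials, and node name are placeholders):

    curl -X PUT 'http://admin:password@127.0.0.1:5984/_node/_local/_config/ioq.bypass/os_process' -d '"false"'
    curl -X PUT 'http://admin:password@127.0.0.1:5984/_node/_local/_config/ioq.bypass/read' -d '"false"'
    curl -X PUT 'http://admin:password@127.0.0.1:5984/_node/_local/_config/ioq.bypass/write' -d '"false"'
    curl -X PUT 'http://admin:password@127.0.0.1:5984/_node/_local/_config/ioq.bypass/view_update' -d '"false"'

The config API also persists the change to the ini files, which is how I
checked it took effect.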


> Could you explain the replication problems you encountered?


A few of these:

[error] 2022-04-27T00:14:26.027209Z
couchdb@couchdb-310-couchdb-2.couchdb-310-couchdb.couchdb.svc.cluster.local
<0.28525.7700> 734f3d6415 req_err(50502807) unknown_error : normal
[<<"chttpd:catch_error/3 L358">>,<<"chttpd:handle_req_after_auth/2
L324">>,<<"chttpd:process_request/1
L305">>,<<"chttpd:handle_request_int/1
L243">>,<<"mochiweb_http:headers/6
L150">>,<<"proc_lib:init_p_do_apply/3 L247">>]

In addition to this, the source cluster was constantly compacting
views and shards, even when there were no transactions.
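
(Compaction activity like that is visible through the active tasks
endpoint, e.g., with placeholder credentials:

    curl -s 'http://admin:password@127.0.0.1:5984/_active_tasks'

which lists the running database_compaction and view_compaction tasks
and their progress.)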


> I can say for sure that it is generally unsafe to modify shard files
> out-of-band while couchdb is running, as it appears you did at step 5.
> Couchdb may well have opened the shard files (created at step 3) and it
> holds them open in a cache.
>

The renaming of the files to match the database id happened on a shared
volume that was not used by either cluster; only after setting the target
cluster to maintenance mode did I copy the shard files.



> I don't think we have written up how to do this properly (we strongly
> advise replication instead) but I did write an SO post a while ago:
> https://stackoverflow.com/questions/6676972/moving-a-shard-from-one-bigcouch-server-to-another-for-balancing.
> The sharding scheme for BigCouch is the same as for CouchDB 3.x.
>
> The essential difference is to _not_ create the clustered database at the
> target cluster until _after_ you've copied the shard files over. You then
> create the '_dbs' doc yourself. (Note that in BigCouch this database was
> called "dbs").


I based my procedure on the CouchDB doc (4.4.3. Moving a shard).

Cheers,
Luca Morandini

Re: Shards cannot be read after move to a different cluster

Posted by Robert Newson <rn...@apache.org>.
Hi,

The config file isn't monitored, so just changing the file won't help; you'd need to restart couchdb.

Did you have anything bypassed in the first place, though?

Could you explain the replication problems you encountered?

I can say for sure that it is generally unsafe to modify shard files out-of-band while couchdb is running, as it appears you did at step 5. Couchdb may well have opened the shard files (created at step 3) and it holds them open in a cache.

I don't think we have written up how to do this properly (we strongly advise replication instead) but I did write an SO post a while ago: https://stackoverflow.com/questions/6676972/moving-a-shard-from-one-bigcouch-server-to-another-for-balancing. The sharding scheme for BigCouch is the same as for CouchDB 3.x.

The essential difference is to _not_ create the clustered database at the target cluster until _after_ you've copied the shard files over. You then create the '_dbs' doc yourself. (Note that in BigCouch this database was called "dbs").
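
For illustration only, a shard-map doc in '_dbs' looks roughly like this;
the node names and revision below are made up, and shard_suffix is the
list of character codes of the file-name suffix (here ".1657067184"):

    {
      "_id": "twitter",
      "_rev": "1-...",
      "shard_suffix": [46, 49, 54, 53, 55, 48, 54, 55, 49, 56, 52],
      "changelog": [
        ["add", "00000000-7fffffff", "couchdb@node1.example.com"],
        ["add", "00000000-7fffffff", "couchdb@node2.example.com"],
        ["add", "80000000-ffffffff", "couchdb@node1.example.com"],
        ["add", "80000000-ffffffff", "couchdb@node2.example.com"]
      ],
      "by_node": {
        "couchdb@node1.example.com": ["00000000-7fffffff", "80000000-ffffffff"],
        "couchdb@node2.example.com": ["00000000-7fffffff", "80000000-ffffffff"]
      },
      "by_range": {
        "00000000-7fffffff": ["couchdb@node1.example.com", "couchdb@node2.example.com"],
        "80000000-ffffffff": ["couchdb@node1.example.com", "couchdb@node2.example.com"]
      }
    }

The by_node and by_range maps must agree with each other and with where
the shard files actually sit on disk.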

B.

> On 8 Jul 2022, at 09:08, Luca Morandini <lu...@gmail.com> wrote:
> 
> On Fri, 8 Jul 2022 at 17:17, Robert Newson <rn...@apache.org> wrote:
>> 
>> Hi,
>> 
>> There's a bug in 3.1.0 that affects you: namely, the default 5-second gen_server timeout is used for some requests if the ioq bypass is enabled. Please check whether your config has an [ioq.bypass] section and try again without bypasses for a time.
> 
> Thanks for taking the time to answer me.
> 
> I set all the settings of the [ioq.bypass] section to false, set the
> cluster in maintenance mode, waited a couple of minutes, then set
> maintenance to false... but no joy.
> 
> 
>> If you could explain your migration process in more detail, perhaps we can find other explanations. I note that such migrations are better done online using replication; moving the files around is a bit more challenging.
> 
> I tried replication, but it failed; hence the shard-file copy.
> 
> The procedure I followed (a tad simplified):
> - set the source cluster in maintenance mode;
> - copied the shard files to a shared disk;
> - created a database with the same name on the target cluster;
> - changed the database id on the copied shard files to match the
> newly-created one on the target cluster;
> - set the target cluster to maintenance mode;
> - copied the shard files from the shared disk to the target cluster
> data directories, making sure to get the shard directories right;
> - unset the maintenance mode on the target cluster.
> 
> The procedure above worked for a few databases (including one that,
> with replicas, was 6GB) but failed with the 200GB database.
> 
> Cheers,
> 
> Luca Morandini


Re: Shards cannot be read after move to a different cluster

Posted by Luca Morandini <lu...@gmail.com>.
On Fri, 8 Jul 2022 at 17:17, Robert Newson <rn...@apache.org> wrote:
>
> Hi,
>
> There's a bug in 3.1.0 that affects you: namely, the default 5-second gen_server timeout is used for some requests if the ioq bypass is enabled. Please check whether your config has an [ioq.bypass] section and try again without bypasses for a time.

Thanks for taking the time to answer me.

I set all the settings of the [ioq.bypass] section to false, set the
cluster in maintenance mode, waited a couple of minutes, then set
maintenance to false... but no joy.
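
Maintenance mode, for reference, is a config-API toggle as well (host
and credentials below are placeholders):

    curl -X PUT 'http://admin:password@127.0.0.1:5984/_node/_local/_config/couchdb/maintenance_mode' -d '"true"'
    # ... a couple of minutes later ...
    curl -X PUT 'http://admin:password@127.0.0.1:5984/_node/_local/_config/couchdb/maintenance_mode' -d '"false"'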


> If you could explain your migration process in more detail, perhaps we can find other explanations. I note that such migrations are better done online using replication; moving the files around is a bit more challenging.

I tried replication, but it failed; hence the shard-file copy.

The procedure I followed (a tad simplified; a rough shell sketch follows the list):
- set the source cluster in maintenance mode;
- copied the shard files to a shared disk;
- created a database with the same name on the target cluster;
- changed the database id on the copied shard files to match the
newly-created one on the target cluster;
- set the target cluster to maintenance mode;
- copied the shard files from the shared disk to the target cluster
data directories, making sure to get the shard directories right;
- unset the maintenance mode on the target cluster.
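
In shell terms the copy steps were roughly as follows; the paths, the
target suffix (1660000000), and the exact commands are illustrative,
not a transcript:

    # source cluster in maintenance mode: copy the shard files to the
    # shared disk, preserving the shards/<range>/ layout
    cd /opt/couchdb/data
    find shards -name 'twitter.1657067184.couch' -exec cp --parents {} /mnt/shared/ \;

    # on the shared disk: rename the copies to the suffix (creation
    # timestamp) of the database just created on the target cluster
    cd /mnt/shared
    for f in shards/*/twitter.1657067184.couch; do
      mv "$f" "${f%.1657067184.couch}.1660000000.couch"
    done

    # target cluster in maintenance mode: copy each renamed file into the
    # matching shards/<range>/ directory on the right target node
    find shards -name 'twitter.1660000000.couch' -exec cp --parents {} /opt/couchdb/data/ \;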

The procedure above worked for a few databases (including one that,
with replicas, was 6GB) but failed with the 200GB database.

Cheers,

Luca Morandini

Re: Shards cannot be read after move to a different cluster

Posted by Robert Newson <rn...@apache.org>.
Hi,

There's a bug in 3.1.0 that affects you: namely, the default 5-second gen_server timeout is used for some requests if the ioq bypass is enabled. Please check whether your config has an [ioq.bypass] section and try again without bypasses for a time.

If you could explain your migration process in more detail, perhaps we can find other explanations. I note that such migrations are better done online using replication; moving the files around is a bit more challenging.
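
For reference, that would be an ordinary replication request; the hosts
and credentials here are placeholders:

    curl -X POST 'http://admin:password@target:5984/_replicate' \
         -H 'Content-Type: application/json' \
         -d '{"source": "http://admin:password@source:5984/twitter",
              "target": "http://admin:password@target:5984/twitter",
              "create_target": true}'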

B.

> On 7 Jul 2022, at 08:28, Luca Morandini <lu...@gmail.com> wrote:
> 
> Dear All,
> 
> I moved some CouchDB 3.1.0 databases to a new 4-node cluster by
> copying the shard files.
> 
> The operation worked for 5 out of 6 databases; the biggest database
> (about 200GB, 12 shards, 2 replicas) did not come online on the new
> cluster.
> 
> I suspect high disk latency, but... could someone shed some light on this?
> 
> The relevant logs are:
> 
> [snip: log excerpt identical to the original message above]
> 
> Cheers,
> 
> Luca Morandini


Re: Shards cannot be read after move to a different cluster

Posted by Jeff Marshall <ma...@gmail.com>.
unsubscribe

> On Jul 7, 2022, at 3:29 AM, Luca Morandini <lu...@gmail.com> wrote:
> 
> [snip: original message quoted in full above]