Posted to user@cassandra.apache.org by Timo Ahokas <ti...@gmail.com> on 2016/09/22 13:47:17 UTC

Rebuild failing when adding new datacenter (3.0.8)

Hi,

We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15) currently
running in two data centers (13 and 19 nodes, RF3 in both). We are adding a
third data center before decommissioning one of the earlier ones.
Installing Cassandra (3.0.8) goes fine and all the nodes join the cluster
(not set to bootstrap, as documented in
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).

When trying to rebuild nodes in the new DC from a previous DC (nodetool
rebuild -- DC1), we get the following error:

Unable to find sufficient sources for streaming range
(597769692463489739,597931451954862346] in keyspace system_distributed

The same error occurs whichever of the 2 existing DCs we try to rebuild
from.

We run primary-range repairs (nodetool repair -pr) on all nodes twice a week
via cron.

Any advice on how to get the rebuild started?

Best regards,
Timo

Re: Rebuild failing when adding new datacenter (3.0.8)

Posted by Timo Ahokas <ti...@gmail.com>.
Hi Yabin/Alain,

I changed the replication strategy for system_distributed, system_auth
and system_traces to NetworkTopologyStrategy and repaired the affected
keyspaces. Now the rebuild process starts up OK without errors.
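
For readers following along, the change described above can be sketched as a
small script that prints the statements to run. This is only an illustration:
the helper names are made up here, the datacenter names and replication
factors are copied from the ALTER KEYSPACE example quoted later in this
thread, and the exact repair options used are not stated, so plain
"nodetool repair <keyspace>" is an assumption.

```python
# Illustrative sketch only: prints CQL and nodetool commands, runs nothing.
# Keyspace names come from the message above; DC names and RFs are taken
# from the ALTER KEYSPACE example in the thread and must match your cluster.
SYSTEM_KEYSPACES = ["system_distributed", "system_auth", "system_traces"]
DATACENTERS = {"DC1": 3, "DC2": 3, "DC3": 3}


def alter_statement(keyspace, datacenters):
    """Build an ALTER KEYSPACE statement that uses NetworkTopologyStrategy."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += ["'%s': '%s'" % (dc, rf) for dc, rf in datacenters.items()]
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (
        keyspace,
        ", ".join(options),
    )


def repair_command(keyspace):
    """Build the nodetool command that repairs one keyspace (flags assumed)."""
    return "nodetool repair %s" % keyspace


if __name__ == "__main__":
    # First the schema changes (run once, via cqlsh) ...
    for ks in SYSTEM_KEYSPACES:
        print(alter_statement(ks, DATACENTERS))
    # ... then the repairs (run per node, as appropriate for the cluster).
    for ks in SYSTEM_KEYSPACES:
        print(repair_command(ks))
```

The script only generates text, so the output can be reviewed before anything
is pasted into cqlsh or a shell.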

Thanks a lot for your help!

Best regards,
Timo

On 22 September 2016 at 21:16, Yabin Meng <ya...@gmail.com> wrote:

> It is a Cassandra bug. The workaround is to change the system_distributed
> keyspace replication strategy to something like the following:
>
>       alter keyspace  system_distributed with replication = {'class':
> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
>
> You may see a similar problem with other system keyspaces; do the same for them.
>
> Cheers,
>
> Yabin
>
> On Thu, Sep 22, 2016 at 1:44 PM, Timo Ahokas <ti...@gmail.com>
> wrote:
>
>> Hi Alain,
>>
>> Our normal user keyspaces have RF3 in all DCs, e.g:
>>
>> create keyspace reporting with replication = {'class':
>> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
>>
>> Any idea whether it would be safe to change the system_distributed
>> keyspace to match this?
>>
>> -Timo
>>
>> On 22 September 2016 at 19:23, Timo Ahokas <ti...@gmail.com> wrote:
>>
>>> Hi Alain,
>>>
>>> Thanks a lot for helping out!
>>>
>>> Some of the basic keyspace / cluster info you requested:
>>>
>>> # echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>>
>>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>>> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>>>
>>> CREATE TABLE system_distributed.repair_history (
>>>    keyspace_name text,
>>>    columnfamily_name text,
>>>    id timeuuid,
>>>    coordinator inet,
>>>    exception_message text,
>>>    exception_stacktrace text,
>>>    finished_at timestamp,
>>>    parent_id timeuuid,
>>>    participants set<inet>,
>>>    range_begin text,
>>>    range_end text,
>>>    started_at timestamp,
>>>    status text,
>>>    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
>>> ) WITH CLUSTERING ORDER BY (id ASC)
>>>    AND bloom_filter_fp_chance = 0.01
>>>    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>    AND comment = 'Repair history'
>>>    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>>>    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>    AND crc_check_chance = 1.0
>>>    AND dclocal_read_repair_chance = 0.0
>>>    AND default_time_to_live = 0
>>>    AND gc_grace_seconds = 0
>>>    AND max_index_interval = 2048
>>>    AND memtable_flush_period_in_ms = 3600000
>>>    AND min_index_interval = 128
>>>    AND read_repair_chance = 0.0
>>>    AND speculative_retry = '99PERCENTILE';
>>>
>>> CREATE TABLE system_distributed.parent_repair_history (
>>>    parent_id timeuuid PRIMARY KEY,
>>>    columnfamily_names set<text>,
>>>    exception_message text,
>>>    exception_stacktrace text,
>>>    finished_at timestamp,
>>>    keyspace_name text,
>>>    requested_ranges set<text>,
>>>    started_at timestamp,
>>>    successful_ranges set<text>
>>> ) WITH bloom_filter_fp_chance = 0.01
>>>    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>    AND comment = 'Repair history'
>>>    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>>>    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>    AND crc_check_chance = 1.0
>>>    AND dclocal_read_repair_chance = 0.0
>>>    AND default_time_to_live = 0
>>>    AND gc_grace_seconds = 0
>>>    AND max_index_interval = 2048
>>>    AND memtable_flush_period_in_ms = 3600000
>>>    AND min_index_interval = 128
>>>    AND read_repair_chance = 0.0
>>>    AND speculative_retry = '99PERCENTILE';
>>>
>>>
>>>
>>> # nodetool status
>>>
>>> Datacenter: DC1
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> --  Address         Load       Tokens       Owns    Host ID
>>>                               Rack
>>>
>>> UN  xxx.xxx.145.5    693,63 GB  256          ?
>>>       6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
>>>
>>> UN  xxx.xxx.145.225  648,55 GB  256          ?
>>>       f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
>>>
>>> UN  xxx.xxx.145.160  608,31 GB  256          ?
>>>       d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
>>>
>>> UN  xxx.xxx.145.67   552,93 GB  256          ?
>>>       1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
>>>
>>> UN  xxx.xxx.145.227  636,68 GB  256          ?
>>>       47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
>>>
>>> UN  xxx.xxx.146.105  610,9 GB   256          ?
>>>       8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
>>>
>>> UN  xxx.xxx.147.136  666,82 GB  256          ?
>>>       bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
>>>
>>> UN  xxx.xxx.146.213  609,79 GB  256          ?
>>>       6416275c-7570-48a9-957f-2daca71d31aa  RAC1
>>>
>>> UN  xxx.xxx.146.20   664,44 GB  256          ?
>>>       b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
>>>
>>> UN  xxx.xxx.146.209  615,44 GB  256          ?
>>>       898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
>>>
>>> UN  xxx.xxx.146.241  668,91 GB  256          ?
>>>       0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
>>>
>>> UN  xxx.xxx.147.211  641,33 GB  256          ?
>>>       16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
>>>
>>> UN  xxx.xxx.147.125  647,03 GB  256          ?
>>>       2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
>>>
>>> Datacenter: DC2
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> --  Address         Load       Tokens       Owns    Host ID
>>>                               Rack
>>>
>>> UN  xxx.xxx.7.99     18,76 MB   256          ?
>>>       d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
>>>
>>> UN  xxx.xxx.6.135    16,04 MB   256          ?
>>>       463f480a-baf3-4230-86b7-1106251ebfad  RAC1
>>>
>>> UN  xxx.xxx.7.229    17,36 MB   256          ?
>>>       9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
>>>
>>> UN  xxx.xxx.7.5      14,01 MB   256          ?
>>>       ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
>>>
>>> UN  xxx.xxx.7.4      14,93 MB   256          ?
>>>       122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
>>>
>>> UN  xxx.xxx.6.10     16,77 MB   256          ?
>>>       bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
>>>
>>> UN  xxx.xxx.6.15     14,95 MB   256          ?
>>>       668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
>>>
>>> UN  xxx.xxx.7.140    17,38 MB   256          ?
>>>       7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
>>>
>>> UN  xxx.xxx.7.113    19,14 MB   256          ?
>>>       46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
>>>
>>> UN  xxx.xxx.6.118    16,7 MB    256          ?
>>>       9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
>>>
>>> UN  xxx.xxx.6.248    17,29 MB   256          ?
>>>       35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
>>>
>>> UN  xxx.xxx.5.24     16,55 MB   256          ?
>>>       5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
>>>
>>> UN  xxx.xxx.7.189    16,63 MB   256          ?
>>>       be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
>>>
>>> UN  xxx.xxx.5.124    20,37 MB   256          ?
>>>       638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
>>>
>>> UN  xxx.xxx.6.60     24,57 MB   256          ?
>>>       cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
>>>
>>> Datacenter: DC3
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> --  Address         Load       Tokens       Owns    Host ID
>>>                               Rack
>>>
>>> UN  xxx.xxx.151.102  389,41 GB  256          ?
>>>       1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
>>>
>>> UN  xxx.xxx.149.161  367,82 GB  256          ?
>>>       3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
>>>
>>> UN  xxx.xxx.149.226  390,88 GB  256          ?
>>>       b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
>>>
>>> UN  xxx.xxx.151.162  408,35 GB  256          ?
>>>       54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
>>>
>>> UN  xxx.xxx.149.109  369,33 GB  256          ?
>>>       9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
>>>
>>> UN  xxx.xxx.150.172  362,32 GB  256          ?
>>>       ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
>>>
>>> UN  xxx.xxx.149.238  388,98 GB  256          ?
>>>       a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
>>>
>>> UN  xxx.xxx.151.232  435,31 GB  256          ?
>>>       500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
>>>
>>> UN  xxx.xxx.151.43   410,69 GB  256          ?
>>>       b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
>>>
>>> UN  xxx.xxx.151.139  407,47 GB  256          ?
>>>       ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
>>>
>>> UN  xxx.xxx.151.213  375,05 GB  256          ?
>>>       9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
>>>
>>> UN  xxx.xxx.149.177  401,91 GB  256          ?
>>>       b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
>>>
>>> UN  xxx.xxx.150.145  388,76 GB  256          ?
>>>       1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
>>>
>>> UN  xxx.xxx.149.48   385,43 GB  256          ?
>>>       ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
>>>
>>> UN  xxx.xxx.150.189  384,52 GB  256          ?
>>>       f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
>>>
>>> UN  xxx.xxx.151.220  357,56 GB  256          ?
>>>       feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
>>>
>>> UN  xxx.xxx.149.121  355,64 GB  256          ?
>>>       47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
>>>
>>> UN  xxx.xxx.151.218  416,57 GB  256          ?
>>>       bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
>>>
>>> UN  xxx.xxx.150.26   383,06 GB  256          ?
>>>       1ca0085d-93a5-4650-891a-b45f988150a4  RAC1
>>>
>>> Note: Non-system keyspaces don't have the same replication settings,
>>> effective ownership information is meaningless
>>>
>>>
>>>
>>> # nodetool status system_distributed
>>>
>>> Datacenter: DC1
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> --  Address         Load       Tokens       Owns (effective)  Host ID
>>>                               Rack
>>>
>>> UN  xxx.xxx.145.5    693,63 GB  256          6,2%
>>>              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
>>>
>>> UN  xxx.xxx.145.225  648,55 GB  256          6,8%
>>>              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
>>>
>>> UN  xxx.xxx.145.160  608,31 GB  256          6,5%
>>>              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
>>>
>>> UN  xxx.xxx.145.67   552,93 GB  256          6,1%
>>>              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
>>>
>>> UN  xxx.xxx.145.227  636,68 GB  256          6,0%
>>>              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
>>>
>>> UN  xxx.xxx.146.105  610,9 GB   256          6,1%
>>>              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
>>>
>>> UN  xxx.xxx.147.136  666,82 GB  256          6,3%
>>>              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
>>>
>>> UN  xxx.xxx.146.213  609,79 GB  256          6,0%
>>>              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
>>>
>>> UN  xxx.xxx.146.20   664,44 GB  256          7,0%
>>>              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
>>>
>>> UN  xxx.xxx.146.209  615,44 GB  256          6,6%
>>>              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
>>>
>>> UN  xxx.xxx.146.241  668,91 GB  256          6,2%
>>>              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
>>>
>>> UN  xxx.xxx.147.211  641,33 GB  256          6,5%
>>>              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
>>>
>>> UN  xxx.xxx.147.125  647,03 GB  256          6,3%
>>>              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
>>>
>>> Datacenter: DC2
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> --  Address         Load       Tokens       Owns (effective)  Host ID
>>>                               Rack
>>>
>>> UN  xxx.xxx.7.99     18,76 MB   256          6,3%
>>>              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
>>>
>>> UN  xxx.xxx.6.135    16,04 MB   256          6,1%
>>>              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
>>>
>>> UN  xxx.xxx.7.229    17,36 MB   256          5,9%
>>>              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
>>>
>>> UN  xxx.xxx.7.5      14,01 MB   256          6,2%
>>>              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
>>>
>>> UN  xxx.xxx.7.4      14,93 MB   256          6,4%
>>>              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
>>>
>>> UN  xxx.xxx.6.10     16,77 MB   256          6,4%
>>>              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
>>>
>>> UN  xxx.xxx.6.15     14,95 MB   256          6,1%
>>>              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
>>>
>>> UN  xxx.xxx.7.140    17,38 MB   256          6,7%
>>>              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
>>>
>>> UN  xxx.xxx.7.113    19,14 MB   256          6,8%
>>>              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
>>>
>>> UN  xxx.xxx.6.118    16,7 MB    256          6,7%
>>>              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
>>>
>>> UN  xxx.xxx.6.248    17,29 MB   256          6,9%
>>>              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
>>>
>>> UN  xxx.xxx.5.24     16,55 MB   256          6,8%
>>>              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
>>>
>>> UN  xxx.xxx.7.189    16,63 MB   256          6,2%
>>>              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
>>>
>>> UN  xxx.xxx.5.124    20,37 MB   256          6,3%
>>>              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
>>>
>>> UN  xxx.xxx.6.60     24,57 MB   256          6,4%
>>>              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
>>>
>>> Datacenter: DC3
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> --  Address         Load       Tokens       Owns (effective)  Host ID
>>>                               Rack
>>>
>>> UN  xxx.xxx.151.102  389,41 GB  256          6,4%
>>>              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
>>>
>>> UN  xxx.xxx.149.161  367,82 GB  256          6,3%
>>>              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
>>>
>>> UN  xxx.xxx.149.226  390,88 GB  256          6,2%
>>>              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
>>>
>>> UN  xxx.xxx.151.162  408,35 GB  256          6,4%
>>>              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
>>>
>>> UN  xxx.xxx.149.109  369,33 GB  256          6,2%
>>>              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
>>>
>>> UN  xxx.xxx.150.172  362,32 GB  256          6,0%
>>>              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
>>>
>>> UN  xxx.xxx.149.238  388,98 GB  256          6,4%
>>>              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
>>>
>>> UN  xxx.xxx.151.232  435,31 GB  256          6,6%
>>>              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
>>>
>>> UN  xxx.xxx.151.43   410,69 GB  256          6,2%
>>>              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
>>>
>>> UN  xxx.xxx.151.139  407,47 GB  256          6,2%
>>>              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
>>>
>>> UN  xxx.xxx.151.213  375,05 GB  256          6,5%
>>>              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
>>>
>>> UN  xxx.xxx.149.177  401,91 GB  256          6,6%
>>>              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
>>>
>>> UN  xxx.xxx.150.145  388,76 GB  256          7,1%
>>>              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
>>>
>>> UN  xxx.xxx.149.48   385,43 GB  256          6,2%
>>>              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
>>>
>>> UN  xxx.xxx.150.189  384,52 GB  256          6,4%
>>>              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
>>>
>>> UN  xxx.xxx.151.220  357,56 GB  256          6,1%
>>>              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
>>>
>>> UN  xxx.xxx.149.121  355,64 GB  256          6,4%
>>>              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
>>>
>>> UN  xxx.xxx.151.218  416,57 GB  256          6,3%
>>>              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
>>> UN  xxx.xxx.150.26   383,06 GB  256          6,7%
>>>              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1
>>>
>>> DC1 and DC3 are the old data centers. DC2 is the new one being added (as
>>> seen from the data loads).
>>>
>>> For the snitch we are using GossipingPropertyFileSnitch and a
>>> cassandra-rackdc.properties with config such as:
>>> dc=DC1
>>> rack=RAC1
>>>
>>> Just noticed that we also have cassandra-topology.properties present on
>>> the nodes, but it's up-to-date with all the nodes from the 3 data centers.
>>>
>>> I was wondering whether the replication settings for the
>>> system_distributed keyspace might need a change, but haven't found any
>>> documentation pointing to that yet.
>>>
>>> Best regards,
>>> Timo
>>>
>>> On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com>
>>> wrote:
>>>
>>>> It could be a bug.
>>>>
>>>> I am not very familiar with this system_distributed keyspace yet, but
>>>> from what I see, it is using SimpleStrategy:
>>>>
>>>> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
>>>> cqlsh $(hostname -I | awk '{print $1}')
>>>>
>>>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>>>> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>>>>
>>>> Let's first check some stuff. Could you share the output of:
>>>>
>>>>
>>>>    - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>>>    [ip_address_of_the_server]
>>>>    - nodetool status
>>>>    - nodetool status system_distributed
>>>>    - Let us know about the snitch you are using and the corresponding
>>>>    configuration.
>>>>
>>>>
>>>> I am trying to make sure the command you used is expected to work,
>>>> given your setup.
>>>>
>>>> My guess is that you might need to alter this keyspace according to
>>>> your cluster setup.
>>>>
>>>> Just guessing, hope that helps.
>>>>
>>>> C*heers,
>>>> -----------------------
>>>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>>>> France
>>>>
>>>> The Last Pickle - Apache Cassandra Consulting
>>>> http://www.thelastpickle.com
>>>>
>>>> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>>>>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>>>>> are adding a third data center before decommissioning one of the earlier
>>>>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>>>>> cluster (not set to bootstrap, as documented in
>>>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).
>>>>>
>>>>> When trying to rebuild nodes in the new DC from a previous DC
>>>>> (nodetool rebuild -- DC1), we get the following error:
>>>>>
>>>>> Unable to find sufficient sources for streaming range
>>>>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>>>>
>>>>> The same error occurs whichever of the 2 existing DCs we try to
>>>>> rebuild from.
>>>>>
>>>>> We run primary-range repairs (nodetool repair -pr) on all nodes twice a
>>>>> week via cron.
>>>>>
>>>>> Any advice on how to get the rebuild started?
>>>>>
>>>>> Best regards,
>>>>> Timo
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Rebuild failing when adding new datacenter (3.0.8)

Posted by Yabin Meng <ya...@gmail.com>.
It is a Cassandra bug. The workaround is to change the system_distributed
keyspace replication strategy to something like the following:

      alter keyspace  system_distributed with replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

You may see a similar problem with other system keyspaces; do the same for them.
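
A possible explanation for why the replication strategy matters here,
sketched under simplifying assumptions: SimpleStrategy picks replicas by
walking the ring and ignores datacenters, so with nodes from several DCs
interleaved, some ranges can end up with no replica in the DC named in
nodetool rebuild. The toy ring below is hypothetical (one token per node,
seven made-up nodes) and is a simplified model, not Cassandra's actual code.

```python
from bisect import bisect_left

# Hypothetical ring of (token, node, datacenter), sorted by token.
# Real clusters in this thread use 256 vnodes per node; this is a toy model.
RING = [
    (10, "a1", "DC1"),
    (20, "b1", "DC2"),
    (30, "c1", "DC3"),
    (40, "c2", "DC3"),
    (50, "b2", "DC2"),
    (60, "c3", "DC3"),
    (70, "a2", "DC1"),
]


def simple_strategy_replicas(token, ring, rf):
    """Walk the ring clockwise from the token and take the first rf distinct
    nodes, ignoring which datacenter each node belongs to."""
    tokens = [t for t, _, _ in ring]
    # The owner of a token is the first node whose token is >= that token.
    start = bisect_left(tokens, token) % len(ring)
    replicas = []
    seen = set()
    for i in range(len(ring)):
        _, node, dc = ring[(start + i) % len(ring)]
        if node not in seen:
            seen.add(node)
            replicas.append((node, dc))
        if len(replicas) == rf:
            break
    return replicas


def ranges_without_source(ring, rf, source_dc):
    """Return the end tokens of ranges that have no replica in source_dc."""
    missing = []
    for token, _, _ in ring:
        dcs = {dc for _, dc in simple_strategy_replicas(token, ring, rf)}
        if source_dc not in dcs:
            missing.append(token)
    return missing


if __name__ == "__main__":
    # Ranges a rebuild restricted to DC1 could not stream in this toy model.
    print(ranges_without_source(RING, 3, "DC1"))
```

With NetworkTopologyStrategy and a replication factor set per DC, every
range has replicas in each listed DC, which is consistent with the rebuild
starting once the keyspaces were altered.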

Cheers,

Yabin

On Thu, Sep 22, 2016 at 1:44 PM, Timo Ahokas <ti...@gmail.com> wrote:

> Hi Alain,
>
> Our normal user keyspaces have RF3 in all DCs, e.g:
>
> create keyspace reporting with replication = {'class':
> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
>
> Any idea whether it would be safe to change the system_distributed
> keyspace to match this?
>
> -Timo
>
> On 22 September 2016 at 19:23, Timo Ahokas <ti...@gmail.com> wrote:
>
>> Hi Alain,
>>
>> Thanks a lot for helping out!
>>
>> Some of the basic keyspace / cluster info you requested:
>>
>> # echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>
>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>>
>> CREATE TABLE system_distributed.repair_history (
>>    keyspace_name text,
>>    columnfamily_name text,
>>    id timeuuid,
>>    coordinator inet,
>>    exception_message text,
>>    exception_stacktrace text,
>>    finished_at timestamp,
>>    parent_id timeuuid,
>>    participants set<inet>,
>>    range_begin text,
>>    range_end text,
>>    started_at timestamp,
>>    status text,
>>    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
>> ) WITH CLUSTERING ORDER BY (id ASC)
>>    AND bloom_filter_fp_chance = 0.01
>>    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>    AND comment = 'Repair history'
>>    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>>    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>    AND crc_check_chance = 1.0
>>    AND dclocal_read_repair_chance = 0.0
>>    AND default_time_to_live = 0
>>    AND gc_grace_seconds = 0
>>    AND max_index_interval = 2048
>>    AND memtable_flush_period_in_ms = 3600000
>>    AND min_index_interval = 128
>>    AND read_repair_chance = 0.0
>>    AND speculative_retry = '99PERCENTILE';
>>
>> CREATE TABLE system_distributed.parent_repair_history (
>>    parent_id timeuuid PRIMARY KEY,
>>    columnfamily_names set<text>,
>>    exception_message text,
>>    exception_stacktrace text,
>>    finished_at timestamp,
>>    keyspace_name text,
>>    requested_ranges set<text>,
>>    started_at timestamp,
>>    successful_ranges set<text>
>> ) WITH bloom_filter_fp_chance = 0.01
>>    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>    AND comment = 'Repair history'
>>    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>>    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>    AND crc_check_chance = 1.0
>>    AND dclocal_read_repair_chance = 0.0
>>    AND default_time_to_live = 0
>>    AND gc_grace_seconds = 0
>>    AND max_index_interval = 2048
>>    AND memtable_flush_period_in_ms = 3600000
>>    AND min_index_interval = 128
>>    AND read_repair_chance = 0.0
>>    AND speculative_retry = '99PERCENTILE';
>>
>>
>>
>> # nodetool status
>>
>> Datacenter: DC1
>>
>> ===============
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  Address         Load       Tokens       Owns    Host ID
>>                               Rack
>>
>> UN  xxx.xxx.145.5    693,63 GB  256          ?
>>       6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
>>
>> UN  xxx.xxx.145.225  648,55 GB  256          ?
>>       f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
>>
>> UN  xxx.xxx.145.160  608,31 GB  256          ?
>>       d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
>>
>> UN  xxx.xxx.145.67   552,93 GB  256          ?
>>       1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
>>
>> UN  xxx.xxx.145.227  636,68 GB  256          ?
>>       47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
>>
>> UN  xxx.xxx.146.105  610,9 GB   256          ?
>>       8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
>>
>> UN  xxx.xxx.147.136  666,82 GB  256          ?
>>       bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
>>
>> UN  xxx.xxx.146.213  609,79 GB  256          ?
>>       6416275c-7570-48a9-957f-2daca71d31aa  RAC1
>>
>> UN  xxx.xxx.146.20   664,44 GB  256          ?
>>       b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
>>
>> UN  xxx.xxx.146.209  615,44 GB  256          ?
>>       898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
>>
>> UN  xxx.xxx.146.241  668,91 GB  256          ?
>>       0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
>>
>> UN  xxx.xxx.147.211  641,33 GB  256          ?
>>       16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
>>
>> UN  xxx.xxx.147.125  647,03 GB  256          ?
>>       2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
>>
>> Datacenter: DC2
>>
>> ===============
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  Address         Load       Tokens       Owns    Host ID
>>                               Rack
>>
>> UN  xxx.xxx.7.99     18,76 MB   256          ?
>>       d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
>>
>> UN  xxx.xxx.6.135    16,04 MB   256          ?
>>       463f480a-baf3-4230-86b7-1106251ebfad  RAC1
>>
>> UN  xxx.xxx.7.229    17,36 MB   256          ?
>>       9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
>>
>> UN  xxx.xxx.7.5      14,01 MB   256          ?
>>       ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
>>
>> UN  xxx.xxx.7.4      14,93 MB   256          ?
>>       122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
>>
>> UN  xxx.xxx.6.10     16,77 MB   256          ?
>>       bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
>>
>> UN  xxx.xxx.6.15     14,95 MB   256          ?
>>       668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
>>
>> UN  xxx.xxx.7.140    17,38 MB   256          ?
>>       7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
>>
>> UN  xxx.xxx.7.113    19,14 MB   256          ?
>>       46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
>>
>> UN  xxx.xxx.6.118    16,7 MB    256          ?
>>       9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
>>
>> UN  xxx.xxx.6.248    17,29 MB   256          ?
>>       35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
>>
>> UN  xxx.xxx.5.24     16,55 MB   256          ?
>>       5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
>>
>> UN  xxx.xxx.7.189    16,63 MB   256          ?
>>       be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
>>
>> UN  xxx.xxx.5.124    20,37 MB   256          ?
>>       638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
>>
>> UN  xxx.xxx.6.60     24,57 MB   256          ?
>>       cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
>>
>> Datacenter: DC3
>>
>> ===============
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  Address         Load       Tokens       Owns    Host ID
>>                               Rack
>>
>> UN  xxx.xxx.151.102  389,41 GB  256          ?
>>       1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
>>
>> UN  xxx.xxx.149.161  367,82 GB  256          ?
>>       3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
>>
>> UN  xxx.xxx.149.226  390,88 GB  256          ?
>>       b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
>>
>> UN  xxx.xxx.151.162  408,35 GB  256          ?
>>       54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
>>
>> UN  xxx.xxx.149.109  369,33 GB  256          ?
>>       9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
>>
>> UN  xxx.xxx.150.172  362,32 GB  256          ?
>>       ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
>>
>> UN  xxx.xxx.149.238  388,98 GB  256          ?
>>       a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
>>
>> UN  xxx.xxx.151.232  435,31 GB  256          ?
>>       500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
>>
>> UN  xxx.xxx.151.43   410,69 GB  256          ?
>>       b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
>>
>> UN  xxx.xxx.151.139  407,47 GB  256          ?
>>       ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
>>
>> UN  xxx.xxx.151.213  375,05 GB  256          ?
>>       9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
>>
>> UN  xxx.xxx.149.177  401,91 GB  256          ?
>>       b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
>>
>> UN  xxx.xxx.150.145  388,76 GB  256          ?
>>       1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
>>
>> UN  xxx.xxx.149.48   385,43 GB  256          ?
>>       ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
>>
>> UN  xxx.xxx.150.189  384,52 GB  256          ?
>>       f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
>>
>> UN  xxx.xxx.151.220  357,56 GB  256          ?
>>       feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
>>
>> UN  xxx.xxx.149.121  355,64 GB  256          ?
>>       47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
>>
>> UN  xxx.xxx.151.218  416,57 GB  256          ?
>>       bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
>>
>> UN  xxx.xxx.150.26   383,06 GB  256          ?
>>       1ca0085d-93a5-4650-891a-b45f988150a4  RAC1
>>
>> Note: Non-system keyspaces don't have the same replication settings,
>> effective ownership information is meaningless
>>
>>
>>
>> # nodetool status system_distributed
>>
>> Datacenter: DC1
>>
>> ===============
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  Address         Load       Tokens       Owns (effective)  Host ID
>>                               Rack
>>
>> UN  xxx.xxx.145.5    693,63 GB  256          6,2%
>>              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
>>
>> UN  xxx.xxx.145.225  648,55 GB  256          6,8%
>>              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
>>
>> UN  xxx.xxx.145.160  608,31 GB  256          6,5%
>>              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
>>
>> UN  xxx.xxx.145.67   552,93 GB  256          6,1%
>>              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
>>
>> UN  xxx.xxx.145.227  636,68 GB  256          6,0%
>>              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
>>
>> UN  xxx.xxx.146.105  610,9 GB   256          6,1%
>>              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
>>
>> UN  xxx.xxx.147.136  666,82 GB  256          6,3%
>>              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
>>
>> UN  xxx.xxx.146.213  609,79 GB  256          6,0%
>>              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
>>
>> UN  xxx.xxx.146.20   664,44 GB  256          7,0%
>>              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
>>
>> UN  xxx.xxx.146.209  615,44 GB  256          6,6%
>>              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
>>
>> UN  xxx.xxx.146.241  668,91 GB  256          6,2%
>>              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
>>
>> UN  xxx.xxx.147.211  641,33 GB  256          6,5%
>>              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
>>
>> UN  xxx.xxx.147.125  647,03 GB  256          6,3%
>>              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
>>
>> Datacenter: DC2
>>
>> ===============
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  Address         Load       Tokens       Owns (effective)  Host ID
>>                               Rack
>>
>> UN  xxx.xxx.7.99     18,76 MB   256          6,3%
>>              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
>>
>> UN  xxx.xxx.6.135    16,04 MB   256          6,1%
>>              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
>>
>> UN  xxx.xxx.7.229    17,36 MB   256          5,9%
>>              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
>>
>> UN  xxx.xxx.7.5      14,01 MB   256          6,2%
>>              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
>>
>> UN  xxx.xxx.7.4      14,93 MB   256          6,4%
>>              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
>>
>> UN  xxx.xxx.6.10     16,77 MB   256          6,4%
>>              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
>>
>> UN  xxx.xxx.6.15     14,95 MB   256          6,1%
>>              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
>>
>> UN  xxx.xxx.7.140    17,38 MB   256          6,7%
>>              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
>>
>> UN  xxx.xxx.7.113    19,14 MB   256          6,8%
>>              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
>>
>> UN  xxx.xxx.6.118    16,7 MB    256          6,7%
>>              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
>>
>> UN  xxx.xxx.6.248    17,29 MB   256          6,9%
>>              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
>>
>> UN  xxx.xxx.5.24     16,55 MB   256          6,8%
>>              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
>>
>> UN  xxx.xxx.7.189    16,63 MB   256          6,2%
>>              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
>>
>> UN  xxx.xxx.5.124    20,37 MB   256          6,3%
>>              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
>>
>> UN  xxx.xxx.6.60     24,57 MB   256          6,4%
>>              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
>>
>> Datacenter: DC3
>>
>> ===============
>>
>> Status=Up/Down
>>
>> |/ State=Normal/Leaving/Joining/Moving
>>
>> --  Address         Load       Tokens       Owns (effective)  Host ID
>>                               Rack
>>
>> UN  xxx.xxx.151.102  389,41 GB  256          6,4%
>>              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
>>
>> UN  xxx.xxx.149.161  367,82 GB  256          6,3%
>>              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
>>
>> UN  xxx.xxx.149.226  390,88 GB  256          6,2%
>>              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
>>
>> UN  xxx.xxx.151.162  408,35 GB  256          6,4%
>>              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
>>
>> UN  xxx.xxx.149.109  369,33 GB  256          6,2%
>>              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
>>
>> UN  xxx.xxx.150.172  362,32 GB  256          6,0%
>>              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
>>
>> UN  xxx.xxx.149.238  388,98 GB  256          6,4%
>>              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
>>
>> UN  xxx.xxx.151.232  435,31 GB  256          6,6%
>>              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
>>
>> UN  xxx.xxx.151.43   410,69 GB  256          6,2%
>>              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
>>
>> UN  xxx.xxx.151.139  407,47 GB  256          6,2%
>>              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
>>
>> UN  xxx.xxx.151.213  375,05 GB  256          6,5%
>>              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
>>
>> UN  xxx.xxx.149.177  401,91 GB  256          6,6%
>>              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
>>
>> UN  xxx.xxx.150.145  388,76 GB  256          7,1%
>>              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
>>
>> UN  xxx.xxx.149.48   385,43 GB  256          6,2%
>>              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
>>
>> UN  xxx.xxx.150.189  384,52 GB  256          6,4%
>>              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
>>
>> UN  xxx.xxx.151.220  357,56 GB  256          6,1%
>>              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
>>
>> UN  xxx.xxx.149.121  355,64 GB  256          6,4%
>>              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
>>
>> UN  xxx.xxx.151.218  416,57 GB  256          6,3%
>>              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
>> UN  xxx.xxx.150.26   383,06 GB  256          6,7%
>>              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1
>>
>> DC1 and DC3 are the old data centers. DC2 is the new one being added (as
>> seen from the data loads).
>>
>> For the snitch we are using GossipingPropertyFileSnitch and a
>> cassandra-rackdc.properties with config such as:
>> dc=DC1
>> rack=RAC1
>>
>> Just noticed that we also have cassandra-topology.properties present on
>> the nodes, but it's up-to-date with all the nodes from the 3 data centers.
>>
>> I was wondering whether the replication settings for the
>> system_distributed keyspace might need a change, but haven't found any
>> documentation pointing to that yet.
>>
>> Best regards,
>> Timo
>>
>> On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com>
>> wrote:
>>
>>> It could be a bug.
>>>
>>> I am not very familiar with this system_distributed keyspace yet, but
>>> from what I see, it is using a simple strategy:
>>>
>>> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
>>> cqlsh $(hostname -I | awk '{print $1}')
>>>
>>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>>> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>>>
>>> Let's first check some stuff. Could you share the output of:
>>>
>>>
>>>    - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>>    [ip_address_of_the_server]
>>>    - nodetool status
>>>    - nodetool status system_distributed
>>>    - Let us know about the snitch you are using and the corresponding
>>>    configuration.
>>>
>>>
>>> I am trying to make sure the command you used is expected to work, given
>>> your setup.
>>>
>>> My guess is that you might need to alter this keyspace according to
>>> your cluster setup.
>>>
>>> Just guessing, hope that helps.
>>>
>>> C*heers,
>>> -----------------------
>>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>>> France
>>>
>>> The Last Pickle - Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>>>
>>>> Hi,
>>>>
>>>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>>>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>>>> are adding a third data center before decommissioning one of the earlier
>>>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>>>> cluster (not set to bootstrap, as documented in
>>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).
>>>>
>>>> When trying to rebuild nodes in the new DC from a previous DC (nodetool
>>>> rebuild -- DC1), we get the following error:
>>>>
>>>> Unable to find sufficient sources for streaming range
>>>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>>>
>>>> The same error occurs whichever of the 2 existing DCs we try to
>>>> rebuild from.
>>>>
>>>> We run pr repairs (nodetool repair -pr) on all nodes twice a week via
>>>> cron.
>>>>
>>>> Any advice on how to get the rebuild started?
>>>>
>>>> Best regards,
>>>> Timo
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>

Re: Rebuild failing when adding new datacenter (3.0.8)

Posted by Timo Ahokas <ti...@gmail.com>.
Hi Alain,

Our normal user keyspaces have RF3 in all DCs, e.g:

create keyspace reporting with replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

Any idea whether it would be safe to change the system_distributed keyspace
to match this?
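
A minimal sketch of what such a change might look like, assuming the same
data center names (DC1, DC2, DC3) and RF 3 as the reporting keyspace above.
The helper name nts_alter is hypothetical, not part of any Cassandra tool; it
only prints the CQL so the statement can be reviewed before it is piped to
cqlsh.

```shell
#!/bin/sh
# Hypothetical helper: print an ALTER KEYSPACE statement that switches a
# keyspace to NetworkTopologyStrategy with the same RF in every listed DC.
# Usage: nts_alter KEYSPACE RF DC [DC...]
nts_alter() {
    ks=$1
    rf=$2
    shift 2
    # Start the replication map with the strategy class, then add one
    # 'DC': 'RF' entry per data center name given on the command line.
    map="'class': 'NetworkTopologyStrategy'"
    for dc in "$@"; do
        map="$map, '$dc': '$rf'"
    done
    printf "ALTER KEYSPACE %s WITH replication = {%s};\n" "$ks" "$map"
}

# Print the statement for review; nothing is sent to the cluster here.
nts_alter system_distributed 3 DC1 DC2 DC3
```

If the change is applied, rows already in the keyspace are not moved to the
new replicas automatically, so a repair of that keyspace (for example
nodetool repair system_distributed) would normally be run afterwards.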

-Timo

On 22 September 2016 at 19:23, Timo Ahokas <ti...@gmail.com> wrote:

> Hi Alain,
>
> Thanks a lot for helping out!
>
> Some of the basic keyspace / cluster info you requested:
>
> # echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>
> CREATE KEYSPACE system_distributed WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>
> CREATE TABLE system_distributed.repair_history (
>
>    keyspace_name text,
>
>    columnfamily_name text,
>
>    id timeuuid,
>
>    coordinator inet,
>
>    exception_message text,
>
>    exception_stacktrace text,
>
>    finished_at timestamp,
>
>    parent_id timeuuid,
>
>    participants set<inet>,
>
>    range_begin text,
>
>    range_end text,
>
>    started_at timestamp,
>
>    status text,
>
>    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
>
> ) WITH CLUSTERING ORDER BY (id ASC)
>
>    AND bloom_filter_fp_chance = 0.01
>
>    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
>    AND comment = 'Repair history'
>
>    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>
>    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>
>    AND crc_check_chance = 1.0
>
>    AND dclocal_read_repair_chance = 0.0
>
>    AND default_time_to_live = 0
>
>    AND gc_grace_seconds = 0
>
>    AND max_index_interval = 2048
>
>    AND memtable_flush_period_in_ms = 3600000
>
>    AND min_index_interval = 128
>
>    AND read_repair_chance = 0.0
>
>    AND speculative_retry = '99PERCENTILE';
>
> CREATE TABLE system_distributed.parent_repair_history (
>
>    parent_id timeuuid PRIMARY KEY,
>
>    columnfamily_names set<text>,
>
>    exception_message text,
>
>    exception_stacktrace text,
>
>    finished_at timestamp,
>
>    keyspace_name text,
>
>    requested_ranges set<text>,
>
>    started_at timestamp,
>
>    successful_ranges set<text>
>
> ) WITH bloom_filter_fp_chance = 0.01
>
>    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
>    AND comment = 'Repair history'
>
>    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>
>    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>
>    AND crc_check_chance = 1.0
>
>    AND dclocal_read_repair_chance = 0.0
>
>    AND default_time_to_live = 0
>
>    AND gc_grace_seconds = 0
>
>    AND max_index_interval = 2048
>
>    AND memtable_flush_period_in_ms = 3600000
>
>    AND min_index_interval = 128
>
>    AND read_repair_chance = 0.0
>
>    AND speculative_retry = '99PERCENTILE';
>
>
>
>
> # nodetool status
>
> Datacenter: DC1
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens       Owns    Host ID
>                               Rack
>
> UN  xxx.xxx.145.5    693,63 GB  256          ?
>       6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
>
> UN  xxx.xxx.145.225  648,55 GB  256          ?
>       f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
>
> UN  xxx.xxx.145.160  608,31 GB  256          ?
>       d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
>
> UN  xxx.xxx.145.67   552,93 GB  256          ?
>       1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
>
> UN  xxx.xxx.145.227  636,68 GB  256          ?
>       47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
>
> UN  xxx.xxx.146.105  610,9 GB   256          ?
>       8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
>
> UN  xxx.xxx.147.136  666,82 GB  256          ?
>       bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
>
> UN  xxx.xxx.146.213  609,79 GB  256          ?
>       6416275c-7570-48a9-957f-2daca71d31aa  RAC1
>
> UN  xxx.xxx.146.20   664,44 GB  256          ?
>       b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
>
> UN  xxx.xxx.146.209  615,44 GB  256          ?
>       898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
>
> UN  xxx.xxx.146.241  668,91 GB  256          ?
>       0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
>
> UN  xxx.xxx.147.211  641,33 GB  256          ?
>       16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
>
> UN  xxx.xxx.147.125  647,03 GB  256          ?
>       2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
>
> Datacenter: DC2
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens       Owns    Host ID
>                               Rack
>
> UN  xxx.xxx.7.99     18,76 MB   256          ?
>       d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
>
> UN  xxx.xxx.6.135    16,04 MB   256          ?
>       463f480a-baf3-4230-86b7-1106251ebfad  RAC1
>
> UN  xxx.xxx.7.229    17,36 MB   256          ?
>       9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
>
> UN  xxx.xxx.7.5      14,01 MB   256          ?
>       ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
>
> UN  xxx.xxx.7.4      14,93 MB   256          ?
>       122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
>
> UN  xxx.xxx.6.10     16,77 MB   256          ?
>       bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
>
> UN  xxx.xxx.6.15     14,95 MB   256          ?
>       668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
>
> UN  xxx.xxx.7.140    17,38 MB   256          ?
>       7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
>
> UN  xxx.xxx.7.113    19,14 MB   256          ?
>       46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
>
> UN  xxx.xxx.6.118    16,7 MB    256          ?
>       9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
>
> UN  xxx.xxx.6.248    17,29 MB   256          ?
>       35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
>
> UN  xxx.xxx.5.24     16,55 MB   256          ?
>       5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
>
> UN  xxx.xxx.7.189    16,63 MB   256          ?
>       be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
>
> UN  xxx.xxx.5.124    20,37 MB   256          ?
>       638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
>
> UN  xxx.xxx.6.60     24,57 MB   256          ?
>       cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
>
> Datacenter: DC3
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens       Owns    Host ID
>                               Rack
>
> UN  xxx.xxx.151.102  389,41 GB  256          ?
>       1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
>
> UN  xxx.xxx.149.161  367,82 GB  256          ?
>       3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
>
> UN  xxx.xxx.149.226  390,88 GB  256          ?
>       b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
>
> UN  xxx.xxx.151.162  408,35 GB  256          ?
>       54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
>
> UN  xxx.xxx.149.109  369,33 GB  256          ?
>       9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
>
> UN  xxx.xxx.150.172  362,32 GB  256          ?
>       ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
>
> UN  xxx.xxx.149.238  388,98 GB  256          ?
>       a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
>
> UN  xxx.xxx.151.232  435,31 GB  256          ?
>       500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
>
> UN  xxx.xxx.151.43   410,69 GB  256          ?
>       b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
>
> UN  xxx.xxx.151.139  407,47 GB  256          ?
>       ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
>
> UN  xxx.xxx.151.213  375,05 GB  256          ?
>       9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
>
> UN  xxx.xxx.149.177  401,91 GB  256          ?
>       b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
>
> UN  xxx.xxx.150.145  388,76 GB  256          ?
>       1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
>
> UN  xxx.xxx.149.48   385,43 GB  256          ?
>       ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
>
> UN  xxx.xxx.150.189  384,52 GB  256          ?
>       f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
>
> UN  xxx.xxx.151.220  357,56 GB  256          ?
>       feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
>
> UN  xxx.xxx.149.121  355,64 GB  256          ?
>       47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
>
> UN  xxx.xxx.151.218  416,57 GB  256          ?
>       bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
>
> UN  xxx.xxx.150.26   383,06 GB  256          ?
>       1ca0085d-93a5-4650-891a-b45f988150a4  RAC1
>
> Note: Non-system keyspaces don't have the same replication settings,
> effective ownership information is meaningless
>
>
>
> # nodetool status system_distributed
>
> Datacenter: DC1
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens       Owns (effective)  Host ID
>                               Rack
>
> UN  xxx.xxx.145.5    693,63 GB  256          6,2%
>              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
>
> UN  xxx.xxx.145.225  648,55 GB  256          6,8%
>              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
>
> UN  xxx.xxx.145.160  608,31 GB  256          6,5%
>              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
>
> UN  xxx.xxx.145.67   552,93 GB  256          6,1%
>              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
>
> UN  xxx.xxx.145.227  636,68 GB  256          6,0%
>              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
>
> UN  xxx.xxx.146.105  610,9 GB   256          6,1%
>              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
>
> UN  xxx.xxx.147.136  666,82 GB  256          6,3%
>              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
>
> UN  xxx.xxx.146.213  609,79 GB  256          6,0%
>              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
>
> UN  xxx.xxx.146.20   664,44 GB  256          7,0%
>              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
>
> UN  xxx.xxx.146.209  615,44 GB  256          6,6%
>              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
>
> UN  xxx.xxx.146.241  668,91 GB  256          6,2%
>              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
>
> UN  xxx.xxx.147.211  641,33 GB  256          6,5%
>              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
>
> UN  xxx.xxx.147.125  647,03 GB  256          6,3%
>              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
>
> Datacenter: DC2
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens       Owns (effective)  Host ID
>                               Rack
>
> UN  xxx.xxx.7.99     18,76 MB   256          6,3%
>              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
>
> UN  xxx.xxx.6.135    16,04 MB   256          6,1%
>              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
>
> UN  xxx.xxx.7.229    17,36 MB   256          5,9%
>              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
>
> UN  xxx.xxx.7.5      14,01 MB   256          6,2%
>              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
>
> UN  xxx.xxx.7.4      14,93 MB   256          6,4%
>              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
>
> UN  xxx.xxx.6.10     16,77 MB   256          6,4%
>              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
>
> UN  xxx.xxx.6.15     14,95 MB   256          6,1%
>              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
>
> UN  xxx.xxx.7.140    17,38 MB   256          6,7%
>              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
>
> UN  xxx.xxx.7.113    19,14 MB   256          6,8%
>              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
>
> UN  xxx.xxx.6.118    16,7 MB    256          6,7%
>              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
>
> UN  xxx.xxx.6.248    17,29 MB   256          6,9%
>              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
>
> UN  xxx.xxx.5.24     16,55 MB   256          6,8%
>              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
>
> UN  xxx.xxx.7.189    16,63 MB   256          6,2%
>              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
>
> UN  xxx.xxx.5.124    20,37 MB   256          6,3%
>              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
>
> UN  xxx.xxx.6.60     24,57 MB   256          6,4%
>              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
>
> Datacenter: DC3
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address         Load       Tokens       Owns (effective)  Host ID
>                               Rack
>
> UN  xxx.xxx.151.102  389,41 GB  256          6,4%
>              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
>
> UN  xxx.xxx.149.161  367,82 GB  256          6,3%
>              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
>
> UN  xxx.xxx.149.226  390,88 GB  256          6,2%
>              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
>
> UN  xxx.xxx.151.162  408,35 GB  256          6,4%
>              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
>
> UN  xxx.xxx.149.109  369,33 GB  256          6,2%
>              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
>
> UN  xxx.xxx.150.172  362,32 GB  256          6,0%
>              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
>
> UN  xxx.xxx.149.238  388,98 GB  256          6,4%
>              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
>
> UN  xxx.xxx.151.232  435,31 GB  256          6,6%
>              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
>
> UN  xxx.xxx.151.43   410,69 GB  256          6,2%
>              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
>
> UN  xxx.xxx.151.139  407,47 GB  256          6,2%
>              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
>
> UN  xxx.xxx.151.213  375,05 GB  256          6,5%
>              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
>
> UN  xxx.xxx.149.177  401,91 GB  256          6,6%
>              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
>
> UN  xxx.xxx.150.145  388,76 GB  256          7,1%
>              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
>
> UN  xxx.xxx.149.48   385,43 GB  256          6,2%
>              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
>
> UN  xxx.xxx.150.189  384,52 GB  256          6,4%
>              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
>
> UN  xxx.xxx.151.220  357,56 GB  256          6,1%
>              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
>
> UN  xxx.xxx.149.121  355,64 GB  256          6,4%
>              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
>
> UN  xxx.xxx.151.218  416,57 GB  256          6,3%
>              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
> UN  xxx.xxx.150.26   383,06 GB  256          6,7%
>              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1
>
> DC1 and DC3 are the old data centers. DC2 is the new one being added (as
> seen from the data loads).
>
> For the snitch we are using GossipingPropertyFileSnitch and a
> cassandra-rackdc.properties with config such as:
> dc=DC1
> rack=RAC1
>
> Just noticed that we also have cassandra-topology.properties present on
> the nodes, but it's up-to-date with all the nodes from the 3 data centers.
>
> I was wondering whether the replication settings for the
> system_distributed keyspace might need a change, but haven't found any
> documentation pointing to that yet.
>
> Best regards,
> Timo
>
> On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com> wrote:
>
>> It could be a bug.
>>
>> I am not very familiar with this system_distributed keyspace yet, but
>> from what I see, it is using a simple strategy:
>>
>> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
>> cqlsh $(hostname -I | awk '{print $1}')
>>
>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>> 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
>>
>> Let's first check some stuff. Could you share the output of:
>>
>>
>>    - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>    [ip_address_of_the_server]
>>    - nodetool status
>>    - nodetool status system_distributed
>>    - Let us know about the snitch you are using and the corresponding
>>    configuration.
>>
>>
>> I am trying to make sure the command you used is expected to work, given
>> your setup.
>>
>> My guess is that you might need to alter this keyspace according to
>> your cluster setup.
>>
>> Just guessing, hope that helps.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>>
>>> Hi,
>>>
>>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>>> are adding a third data center before decommissioning one of the earlier
>>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>>> cluster (not set to bootstrap, as documented in
>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).
>>>
>>> When trying to rebuild nodes in the new DC from a previous DC (nodetool
>>> rebuild -- DC1), we get the following error:
>>>
>>> Unable to find sufficient sources for streaming range
>>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>>
>>> The same error occurs which ever of the 2 existing DCs we try to rebuild
>>> from.
>>>
>>> We run pr repairs (nodetool repair -pr) on all nodes twice a week via
>>> cron.
>>>
>>> Any advice on how to get the rebuild started?
>>>
>>> Best regards,
>>> Timo
>>>
>>>
>>>
>>>
>>>
>>
>

Re: Rebuild failing when adding new datacenter (3.0.8)

Posted by Timo Ahokas <ti...@gmail.com>.
Hi Alain,

Thanks a lot for helping out!

Some of the basic keyspace / cluster info you requested:

# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh

CREATE KEYSPACE system_distributed WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;

CREATE TABLE system_distributed.repair_history (
    keyspace_name text,
    columnfamily_name text,
    id timeuuid,
    coordinator inet,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    parent_id timeuuid,
    participants set<inet>,
    range_begin text,
    range_end text,
    started_at timestamp,
    status text,
    PRIMARY KEY ((keyspace_name, columnfamily_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE system_distributed.parent_repair_history (
    parent_id timeuuid PRIMARY KEY,
    columnfamily_names set<text>,
    exception_message text,
    exception_stacktrace text,
    finished_at timestamp,
    keyspace_name text,
    requested_ranges set<text>,
    started_at timestamp,
    successful_ranges set<text>
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'Repair history'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 3600000
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';


# nodetool status

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     ?     6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     ?     f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     ?     d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     ?     1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     ?     47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     ?     8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     ?     bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     ?     6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     ?     b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     ?     898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     ?     0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     ?     16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     ?     2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.7.99     18,76 MB   256     ?     d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135    16,04 MB   256     ?     463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229    17,36 MB   256     ?     9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5      14,01 MB   256     ?     ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4      14,93 MB   256     ?     122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10     16,77 MB   256     ?     bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15     14,95 MB   256     ?     668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140    17,38 MB   256     ?     7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113    19,14 MB   256     ?     46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118    16,7 MB    256     ?     9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248    17,29 MB   256     ?     35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24     16,55 MB   256     ?     5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189    16,63 MB   256     ?     be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124    20,37 MB   256     ?     638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60     24,57 MB   256     ?     cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     ?     1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     ?     3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     ?     b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     ?     54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     ?     9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     ?     ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     ?     a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     ?     500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     ?     b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     ?     ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     ?     9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     ?     b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     ?     1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     ?     ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     ?     f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     ?     feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     ?     47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     ?     bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     ?     1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

Note: Non-system keyspaces don't have the same replication settings,
effective ownership information is meaningless



# nodetool status system_distributed

Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.145.5    693,63 GB  256     6,2%              6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56  RAC1
UN  xxx.xxx.145.225  648,55 GB  256     6,8%              f900847a-63e4-44c5-b4d7-e439c7cb6a8e  RAC1
UN  xxx.xxx.145.160  608,31 GB  256     6,5%              d257e76d-9e40-4215-94c7-3076c8ff4b7f  RAC1
UN  xxx.xxx.145.67   552,93 GB  256     6,1%              1d47cbdd-cdf1-45b6-aa0e-0c6123899dca  RAC1
UN  xxx.xxx.145.227  636,68 GB  256     6,0%              47e5f207-f9fd-4a86-be8a-66e7630d1baa  RAC1
UN  xxx.xxx.146.105  610,9 GB   256     6,1%              8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2  RAC1
UN  xxx.xxx.147.136  666,82 GB  256     6,3%              bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6  RAC1
UN  xxx.xxx.146.213  609,79 GB  256     6,0%              6416275c-7570-48a9-957f-2daca71d31aa  RAC1
UN  xxx.xxx.146.20   664,44 GB  256     7,0%              b016df7e-f694-4ef3-928c-8783853e9a07  RAC1
UN  xxx.xxx.146.209  615,44 GB  256     6,6%              898e6d98-1b92-4e86-b52c-f851fd4fda71  RAC1
UN  xxx.xxx.146.241  668,91 GB  256     6,2%              0b5d4c6c-4b7c-4265-92bc-ad74464d85cc  RAC1
UN  xxx.xxx.147.211  641,33 GB  256     6,5%              16cdc4a7-b694-4125-91d6-05b9099cb765  RAC1
UN  xxx.xxx.147.125  647,03 GB  256     6,3%              2e97ed0a-039c-413b-9693-a87fadf40f82  RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.7.99     18,76 MB   256     6,3%              d7b907ad-15f5-4c79-962c-c604a5723a7b  RAC1
UN  xxx.xxx.6.135    16,04 MB   256     6,1%              463f480a-baf3-4230-86b7-1106251ebfad  RAC1
UN  xxx.xxx.7.229    17,36 MB   256     5,9%              9487a975-6183-43b8-9208-cd8e09a0ae18  RAC1
UN  xxx.xxx.7.5      14,01 MB   256     6,2%              ae039e49-4d79-4e4e-87bd-921cd6b3291a  RAC1
UN  xxx.xxx.7.4      14,93 MB   256     6,4%              122a47fb-b5ca-46d1-aae9-e6993ab58b66  RAC1
UN  xxx.xxx.6.10     16,77 MB   256     6,4%              bbb66068-bf06-438d-81ee-965e201e8fff  RAC1
UN  xxx.xxx.6.15     14,95 MB   256     6,1%              668a864d-9fd3-41b7-88fb-824e75e71953  RAC1
UN  xxx.xxx.7.140    17,38 MB   256     6,7%              7b016c96-eaa1-4ee1-8657-f4260c70ed37  RAC1
UN  xxx.xxx.7.113    19,14 MB   256     6,8%              46c06c44-ce2f-4ab6-9597-a1314cecf9bc  RAC1
UN  xxx.xxx.6.118    16,7 MB    256     6,7%              9c3c3107-a1d3-4254-ad10-909713a38f8c  RAC1
UN  xxx.xxx.6.248    17,29 MB   256     6,9%              35ff4d3d-d993-468b-9a54-88b40ceec6d4  RAC1
UN  xxx.xxx.5.24     16,55 MB   256     6,8%              5f1f34bd-110f-4d60-9af5-a3abd01b55a5  RAC1
UN  xxx.xxx.7.189    16,63 MB   256     6,2%              be7cbf84-5838-487a-8bd4-b340a1c70fab  RAC1
UN  xxx.xxx.5.124    20,37 MB   256     6,3%              638f2656-fb92-4b70-ba2a-251a749c4c58  RAC1
UN  xxx.xxx.6.60     24,57 MB   256     6,4%              cf16209a-a9a0-4f27-9341-c76d47e50261  RAC1
Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  xxx.xxx.151.102  389,41 GB  256     6,4%              1740a473-e304-467c-a682-d1b4b0595ffa  RAC1
UN  xxx.xxx.149.161  367,82 GB  256     6,3%              3a5322d4-e49f-45ed-85b5-fd658502859c  RAC1
UN  xxx.xxx.149.226  390,88 GB  256     6,2%              b8ca4576-2632-4198-ac87-10243c0c554e  RAC1
UN  xxx.xxx.151.162  408,35 GB  256     6,4%              54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a  RAC1
UN  xxx.xxx.149.109  369,33 GB  256     6,2%              9172c7d8-0c55-4e8e-a17b-89fdb0dce878  RAC1
UN  xxx.xxx.150.172  362,32 GB  256     6,0%              ba394a29-1a0c-4f50-ab85-4db19011b190  RAC1
UN  xxx.xxx.149.238  388,98 GB  256     6,4%              a3d7228c-ccb4-4787-a4bb-f7720aeedc8e  RAC1
UN  xxx.xxx.151.232  435,31 GB  256     6,6%              500a43ab-ae77-4a07-876c-171cb34c549b  RAC1
UN  xxx.xxx.151.43   410,69 GB  256     6,2%              b8bc80e2-2107-447a-85e4-57a39dc9c595  RAC1
UN  xxx.xxx.151.139  407,47 GB  256     6,2%              ecfa4ba7-7783-47a4-8b17-aadc91a3e776  RAC1
UN  xxx.xxx.151.213  375,05 GB  256     6,5%              9bf53ee1-53d4-4d18-a58e-0b0a17e18a69  RAC1
UN  xxx.xxx.149.177  401,91 GB  256     6,6%              b903faf1-1ae9-45ad-bdce-3c9377458a03  RAC1
UN  xxx.xxx.150.145  388,76 GB  256     7,1%              1c4e4232-db27-4cc1-9985-9eb7f0b984d1  RAC1
UN  xxx.xxx.149.48   385,43 GB  256     6,2%              ad3ea388-203c-4b26-a368-934a6105cc6e  RAC1
UN  xxx.xxx.150.189  384,52 GB  256     6,4%              f361ebad-b0a6-47b7-a55c-245c98f84508  RAC1
UN  xxx.xxx.151.220  357,56 GB  256     6,1%              feb814e6-6d2f-4cef-ae3b-4924c1cbac60  RAC1
UN  xxx.xxx.149.121  355,64 GB  256     6,4%              47fbb104-6a5a-49c0-b086-3f14c853c83b  RAC1
UN  xxx.xxx.151.218  416,57 GB  256     6,3%              bbb21d16-da85-4cfd-87d4-2333c8b02dad  RAC1
UN  xxx.xxx.150.26   383,06 GB  256     6,7%              1ca0085d-93a5-4650-891a-b45f988150a4  RAC1

DC1 and DC3 are the old data centers. DC2 is the new one being added (as
seen from the data loads).
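
To make the load comparison explicit, here is a small sketch that sums the
Load column per data center from `nodetool status` text. It assumes the
column layout, the KB/MB/GB/TB units and the decimal commas seen in the
output pasted in this mail; `load_per_dc` is a hypothetical helper, not part
of nodetool.

```python
import re
from collections import defaultdict

# Assumption: binary multiples for the unit suffixes printed by nodetool.
UNIT_BYTES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}

# Matches rows such as "UN  xxx.xxx.145.5    693,63 GB  256 ..."
ROW = re.compile(r"^[UD][NLJM]\s+\S+\s+([\d.,]+)\s+(KB|MB|GB|TB)\b")


def load_per_dc(status_output):
    """Return {data center name: total load in bytes}."""
    totals = defaultdict(float)
    dc = None
    for raw in status_output.splitlines():
        line = raw.strip()
        if line.startswith("Datacenter:"):
            dc = line.split(":", 1)[1].strip()
            continue
        match = ROW.match(line)
        if match and dc is not None:
            # The pasted output uses a decimal comma (locale dependent).
            value = float(match.group(1).replace(",", "."))
            totals[dc] += value * UNIT_BYTES[match.group(2)]
    return dict(totals)


sample = """Datacenter: DC1
===============
UN  xxx.xxx.145.5    693,63 GB  256          ?
UN  xxx.xxx.145.225  648,55 GB  256          ?
Datacenter: DC2
===============
UN  xxx.xxx.7.99     18,76 MB   256          ?
UN  xxx.xxx.6.135    16,04 MB   256          ?
"""

totals = load_per_dc(sample)
for name in sorted(totals):
    print(name, round(totals[name] / 1024 ** 3, 2), "GB")
```

A data center whose total is orders of magnitude smaller than the others
has most likely not been rebuilt yet, which matches DC2 here.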

For the snitch we are using GossipingPropertyFileSnitch and a
cassandra-rackdc.properties with config such as:
dc=DC1
rack=RAC1

Just noticed that we also have cassandra-topology.properties present on the
nodes, but it's up-to-date with all the nodes from the 3 data centers.

I was wondering whether the replication settings for the
system_distributed keyspace might need a change, but haven't yet found any
documentation pointing to that.
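
As a rough illustration of what such a change could look like, the sketch
below builds `ALTER KEYSPACE` statements that switch a keyspace to
NetworkTopologyStrategy. The helper name `alter_keyspace_cql` is
hypothetical, and the replication factor of 3 per data center is an
assumption based on the RF3 setup described in this thread; the keyspace
names are the ones reported elsewhere in the thread as needing the change.

```python
def alter_keyspace_cql(keyspace, rf_by_dc):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    rf_by_dc maps data center name -> replication factor (assumed values).
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in rf_by_dc.items():
        options.append("'" + dc + "': '" + str(rf) + "'")
    return ("ALTER KEYSPACE " + keyspace + " WITH replication = {"
            + ", ".join(options) + "};")


# Assumed topology: three data centers, RF 3 in each.
rf_by_dc = {"DC1": 3, "DC2": 3, "DC3": 3}

for ks in ("system_distributed", "system_auth", "system_traces"):
    print(alter_keyspace_cql(ks, rf_by_dc))
```

The printed statements would still have to be run through cqlsh. According
to the resolution posted in this thread, the affected keyspaces were also
repaired after the change, before retrying `nodetool rebuild`.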

Best regards,
Timo


Re: Rebuild failing when adding new datacenter (3.0.8)

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
It could be a bug.

I am not very familiar with the system_distributed keyspace yet, but from
what I see, it is using SimpleStrategy:

root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
cqlsh $(hostname -I | awk '{print $1}')

CREATE KEYSPACE system_distributed WITH replication = {'class':
'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;

Let's first check some stuff. Could you share the output of:


   - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
   [ip_address_of_the_server]
   - nodetool status
   - nodetool status system_distributed
   - Let us know about the snitch you are using and the corresponding
   configuration.
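
The first check in that list can also be scripted. A minimal sketch,
assuming the `DESCRIBE KEYSPACE` output has already been captured as text
(for example from the cqlsh pipe shown above); `replication_settings` and
`is_dc_aware` are hypothetical helper names, not part of any Cassandra tool.

```python
import ast
import re


def replication_settings(create_stmt):
    """Extract the replication map from a CREATE KEYSPACE statement."""
    match = re.search(r"replication\s*=\s*(\{.*?\})", create_stmt,
                      re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("no replication map found")
    # The CQL map literal uses the same syntax as a Python dict literal.
    return ast.literal_eval(match.group(1))


def is_dc_aware(settings):
    """True if the keyspace uses NetworkTopologyStrategy."""
    return settings.get("class", "").endswith("NetworkTopologyStrategy")


stmt = """CREATE KEYSPACE system_distributed WITH replication = {'class':
'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;"""

settings = replication_settings(stmt)
print(settings["class"], "is data-center aware:", is_dc_aware(settings))
```

A keyspace that is not data-center aware in a multi-DC cluster is the kind
of setup this check is meant to surface.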


I am trying to make sure the command you used is expected to work, given
your setup.

My guess is that you might need to alter this keyspace according to your
cluster setup.

Just guessing, hope that helps.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com
