Posted to user@cassandra.apache.org by Timo Ahokas <ti...@gmail.com> on 2016/09/22 13:47:17 UTC
Rebuild failing when adding new datacenter (3.0.8)
Hi,
We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15) currently
running in two data centers (13 and 19 nodes, RF3 in both). We are adding a
third data center before decommissioning one of the earlier ones.
Installing Cassandra (3.0.8) goes fine and all the nodes join the cluster
(not set to bootstrap, as documented in
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
).
When trying to rebuild nodes in the new DC from a previous DC (nodetool
rebuild -- DC1), we get the following error:
Unable to find sufficient sources for streaming range
(597769692463489739,597931451954862346] in keyspace system_distributed
The same error occurs whichever of the two existing DCs we try to rebuild
from.
We run primary-range repairs (nodetool repair -pr) on all nodes twice a week via cron.
Any advice on how to get the rebuild started?
Best regards,
Timo
Re: Rebuild failing when adding new datacenter (3.0.8)
Posted by Timo Ahokas <ti...@gmail.com>.
Hi Yabin/Alain,
I changed the replication strategy for system_distributed, system_auth
and system_traces to NetworkTopologyStrategy and repaired the
affected keyspaces. Now the rebuild process starts up without errors.
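For reference, the changes were along these lines (the DC names and per-DC
replication factors below are from our cluster; adjust them to match yours
before running anything):

```sql
-- Run once via cqlsh from any node; the new DC must be in the list
-- before "nodetool rebuild" can find streaming sources in it.
ALTER KEYSPACE system_distributed WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
ALTER KEYSPACE system_auth WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
ALTER KEYSPACE system_traces WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
```

After the ALTERs I ran nodetool repair on each of those keyspaces in the
existing DCs, and then retried nodetool rebuild -- DC1 on the new nodes.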
Thanks a lot for your help!
Best regards,
Timo
On 22 September 2016 at 21:16, Yabin Meng <ya...@gmail.com> wrote:
> It is a Cassandra bug. The workaround is to change the system_distributed
> keyspace replication strategy to something like the following:
>
> alter keyspace system_distributed with replication = {'class':
> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
>
> You may see a similar problem with other system keyspaces; do the same thing for those.
>
> Cheers,
>
> Yabin
>
> On Thu, Sep 22, 2016 at 1:44 PM, Timo Ahokas <ti...@gmail.com>
> wrote:
>
>> Hi Alain,
>>
>> Our normal user keyspaces have RF3 in all DCs, e.g:
>>
>> create keyspace reporting with replication = {'class':
>> 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
>>
>> Any idea whether it would be safe to change the system_distributed keyspace
>> to match this?
>>
>> -Timo
>>
>> On 22 September 2016 at 19:23, Timo Ahokas <ti...@gmail.com> wrote:
>>
>>> Hi Alain,
>>>
>>> Thanks a lot for helping out!
>>>
>>> Some of the basic keyspace / cluster info you requested:
>>>
>>> # echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>>
>>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>>> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>>>
>>> CREATE TABLE system_distributed.repair_history (
>>>
>>> keyspace_name text,
>>>
>>> columnfamily_name text,
>>>
>>> id timeuuid,
>>>
>>> coordinator inet,
>>>
>>> exception_message text,
>>>
>>> exception_stacktrace text,
>>>
>>> finished_at timestamp,
>>>
>>> parent_id timeuuid,
>>>
>>> participants set<inet>,
>>>
>>> range_begin text,
>>>
>>> range_end text,
>>>
>>> started_at timestamp,
>>>
>>> status text,
>>>
>>> PRIMARY KEY ((keyspace_name, columnfamily_name), id)
>>>
>>> ) WITH CLUSTERING ORDER BY (id ASC)
>>>
>>> AND bloom_filter_fp_chance = 0.01
>>>
>>> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>
>>> AND comment = 'Repair history'
>>>
>>> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>>> 'max_threshold': '32', 'min_threshold': '4'}
>>>
>>> AND compression = {'chunk_length_in_kb': '64',
>>> 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>
>>> AND crc_check_chance = 1.0
>>>
>>> AND dclocal_read_repair_chance = 0.0
>>>
>>> AND default_time_to_live = 0
>>>
>>> AND gc_grace_seconds = 0
>>>
>>> AND max_index_interval = 2048
>>>
>>> AND memtable_flush_period_in_ms = 3600000
>>>
>>> AND min_index_interval = 128
>>>
>>> AND read_repair_chance = 0.0
>>>
>>> AND speculative_retry = '99PERCENTILE';
>>>
>>> CREATE TABLE system_distributed.parent_repair_history (
>>>
>>> parent_id timeuuid PRIMARY KEY,
>>>
>>> columnfamily_names set<text>,
>>>
>>> exception_message text,
>>>
>>> exception_stacktrace text,
>>>
>>> finished_at timestamp,
>>>
>>> keyspace_name text,
>>>
>>> requested_ranges set<text>,
>>>
>>> started_at timestamp,
>>>
>>> successful_ranges set<text>
>>>
>>> ) WITH bloom_filter_fp_chance = 0.01
>>>
>>> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>
>>> AND comment = 'Repair history'
>>>
>>> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
>>> 'max_threshold': '32', 'min_threshold': '4'}
>>>
>>> AND compression = {'chunk_length_in_kb': '64',
>>> 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>>>
>>> AND crc_check_chance = 1.0
>>>
>>> AND dclocal_read_repair_chance = 0.0
>>>
>>> AND default_time_to_live = 0
>>>
>>> AND gc_grace_seconds = 0
>>>
>>> AND max_index_interval = 2048
>>>
>>> AND memtable_flush_period_in_ms = 3600000
>>>
>>> AND min_index_interval = 128
>>>
>>> AND read_repair_chance = 0.0
>>>
>>> AND speculative_retry = '99PERCENTILE';
>>>
>>>
>>>
>>>
>>> # nodetool status
>>>
>>> Datacenter: DC1
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> -- Address Load Tokens Owns Host ID
>>> Rack
>>>
>>> UN xxx.xxx.145.5 693,63 GB 256 ?
>>> 6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56 RAC1
>>>
>>> UN xxx.xxx.145.225 648,55 GB 256 ?
>>> f900847a-63e4-44c5-b4d7-e439c7cb6a8e RAC1
>>>
>>> UN xxx.xxx.145.160 608,31 GB 256 ?
>>> d257e76d-9e40-4215-94c7-3076c8ff4b7f RAC1
>>>
>>> UN xxx.xxx.145.67 552,93 GB 256 ?
>>> 1d47cbdd-cdf1-45b6-aa0e-0c6123899dca RAC1
>>>
>>> UN xxx.xxx.145.227 636,68 GB 256 ?
>>> 47e5f207-f9fd-4a86-be8a-66e7630d1baa RAC1
>>>
>>> UN xxx.xxx.146.105 610,9 GB 256 ?
>>> 8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2 RAC1
>>>
>>> UN xxx.xxx.147.136 666,82 GB 256 ?
>>> bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6 RAC1
>>>
>>> UN xxx.xxx.146.213 609,79 GB 256 ?
>>> 6416275c-7570-48a9-957f-2daca71d31aa RAC1
>>>
>>> UN xxx.xxx.146.20 664,44 GB 256 ?
>>> b016df7e-f694-4ef3-928c-8783853e9a07 RAC1
>>>
>>> UN xxx.xxx.146.209 615,44 GB 256 ?
>>> 898e6d98-1b92-4e86-b52c-f851fd4fda71 RAC1
>>>
>>> UN xxx.xxx.146.241 668,91 GB 256 ?
>>> 0b5d4c6c-4b7c-4265-92bc-ad74464d85cc RAC1
>>>
>>> UN xxx.xxx.147.211 641,33 GB 256 ?
>>> 16cdc4a7-b694-4125-91d6-05b9099cb765 RAC1
>>>
>>> UN xxx.xxx.147.125 647,03 GB 256 ?
>>> 2e97ed0a-039c-413b-9693-a87fadf40f82 RAC1
>>>
>>> Datacenter: DC2
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> -- Address Load Tokens Owns Host ID
>>> Rack
>>>
>>> UN xxx.xxx.7.99 18,76 MB 256 ?
>>> d7b907ad-15f5-4c79-962c-c604a5723a7b RAC1
>>>
>>> UN xxx.xxx.6.135 16,04 MB 256 ?
>>> 463f480a-baf3-4230-86b7-1106251ebfad RAC1
>>>
>>> UN xxx.xxx.7.229 17,36 MB 256 ?
>>> 9487a975-6183-43b8-9208-cd8e09a0ae18 RAC1
>>>
>>> UN xxx.xxx.7.5 14,01 MB 256 ?
>>> ae039e49-4d79-4e4e-87bd-921cd6b3291a RAC1
>>>
>>> UN xxx.xxx.7.4 14,93 MB 256 ?
>>> 122a47fb-b5ca-46d1-aae9-e6993ab58b66 RAC1
>>>
>>> UN xxx.xxx.6.10 16,77 MB 256 ?
>>> bbb66068-bf06-438d-81ee-965e201e8fff RAC1
>>>
>>> UN xxx.xxx.6.15 14,95 MB 256 ?
>>> 668a864d-9fd3-41b7-88fb-824e75e71953 RAC1
>>>
>>> UN xxx.xxx.7.140 17,38 MB 256 ?
>>> 7b016c96-eaa1-4ee1-8657-f4260c70ed37 RAC1
>>>
>>> UN xxx.xxx.7.113 19,14 MB 256 ?
>>> 46c06c44-ce2f-4ab6-9597-a1314cecf9bc RAC1
>>>
>>> UN xxx.xxx.6.118 16,7 MB 256 ?
>>> 9c3c3107-a1d3-4254-ad10-909713a38f8c RAC1
>>>
>>> UN xxx.xxx.6.248 17,29 MB 256 ?
>>> 35ff4d3d-d993-468b-9a54-88b40ceec6d4 RAC1
>>>
>>> UN xxx.xxx.5.24 16,55 MB 256 ?
>>> 5f1f34bd-110f-4d60-9af5-a3abd01b55a5 RAC1
>>>
>>> UN xxx.xxx.7.189 16,63 MB 256 ?
>>> be7cbf84-5838-487a-8bd4-b340a1c70fab RAC1
>>>
>>> UN xxx.xxx.5.124 20,37 MB 256 ?
>>> 638f2656-fb92-4b70-ba2a-251a749c4c58 RAC1
>>>
>>> UN xxx.xxx.6.60 24,57 MB 256 ?
>>> cf16209a-a9a0-4f27-9341-c76d47e50261 RAC1
>>>
>>> Datacenter: DC3
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> -- Address Load Tokens Owns Host ID
>>> Rack
>>>
>>> UN xxx.xxx.151.102 389,41 GB 256 ?
>>> 1740a473-e304-467c-a682-d1b4b0595ffa RAC1
>>>
>>> UN xxx.xxx.149.161 367,82 GB 256 ?
>>> 3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
>>>
>>> UN xxx.xxx.149.226 390,88 GB 256 ?
>>> b8ca4576-2632-4198-ac87-10243c0c554e RAC1
>>>
>>> UN xxx.xxx.151.162 408,35 GB 256 ?
>>> 54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
>>>
>>> UN xxx.xxx.149.109 369,33 GB 256 ?
>>> 9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
>>>
>>> UN xxx.xxx.150.172 362,32 GB 256 ?
>>> ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
>>>
>>> UN xxx.xxx.149.238 388,98 GB 256 ?
>>> a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
>>>
>>> UN xxx.xxx.151.232 435,31 GB 256 ?
>>> 500a43ab-ae77-4a07-876c-171cb34c549b RAC1
>>>
>>> UN xxx.xxx.151.43 410,69 GB 256 ?
>>> b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
>>>
>>> UN xxx.xxx.151.139 407,47 GB 256 ?
>>> ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
>>>
>>> UN xxx.xxx.151.213 375,05 GB 256 ?
>>> 9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
>>>
>>> UN xxx.xxx.149.177 401,91 GB 256 ?
>>> b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
>>>
>>> UN xxx.xxx.150.145 388,76 GB 256 ?
>>> 1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
>>>
>>> UN xxx.xxx.149.48 385,43 GB 256 ?
>>> ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
>>>
>>> UN xxx.xxx.150.189 384,52 GB 256 ?
>>> f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
>>>
>>> UN xxx.xxx.151.220 357,56 GB 256 ?
>>> feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
>>>
>>> UN xxx.xxx.149.121 355,64 GB 256 ?
>>> 47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
>>>
>>> UN xxx.xxx.151.218 416,57 GB 256 ?
>>> bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
>>>
>>> UN xxx.xxx.150.26 383,06 GB 256 ?
>>> 1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
>>>
>>> Note: Non-system keyspaces don't have the same replication settings,
>>> effective ownership information is meaningless
>>>
>>>
>>>
>>> # nodetool status system_distributed
>>>
>>> Datacenter: DC1
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> -- Address Load Tokens Owns (effective) Host ID
>>> Rack
>>>
>>> UN xxx.xxx.145.5 693,63 GB 256 6,2%
>>> 6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56 RAC1
>>>
>>> UN xxx.xxx.145.225 648,55 GB 256 6,8%
>>> f900847a-63e4-44c5-b4d7-e439c7cb6a8e RAC1
>>>
>>> UN xxx.xxx.145.160 608,31 GB 256 6,5%
>>> d257e76d-9e40-4215-94c7-3076c8ff4b7f RAC1
>>>
>>> UN xxx.xxx.145.67 552,93 GB 256 6,1%
>>> 1d47cbdd-cdf1-45b6-aa0e-0c6123899dca RAC1
>>>
>>> UN xxx.xxx.145.227 636,68 GB 256 6,0%
>>> 47e5f207-f9fd-4a86-be8a-66e7630d1baa RAC1
>>>
>>> UN xxx.xxx.146.105 610,9 GB 256 6,1%
>>> 8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2 RAC1
>>>
>>> UN xxx.xxx.147.136 666,82 GB 256 6,3%
>>> bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6 RAC1
>>>
>>> UN xxx.xxx.146.213 609,79 GB 256 6,0%
>>> 6416275c-7570-48a9-957f-2daca71d31aa RAC1
>>>
>>> UN xxx.xxx.146.20 664,44 GB 256 7,0%
>>> b016df7e-f694-4ef3-928c-8783853e9a07 RAC1
>>>
>>> UN xxx.xxx.146.209 615,44 GB 256 6,6%
>>> 898e6d98-1b92-4e86-b52c-f851fd4fda71 RAC1
>>>
>>> UN xxx.xxx.146.241 668,91 GB 256 6,2%
>>> 0b5d4c6c-4b7c-4265-92bc-ad74464d85cc RAC1
>>>
>>> UN xxx.xxx.147.211 641,33 GB 256 6,5%
>>> 16cdc4a7-b694-4125-91d6-05b9099cb765 RAC1
>>>
>>> UN xxx.xxx.147.125 647,03 GB 256 6,3%
>>> 2e97ed0a-039c-413b-9693-a87fadf40f82 RAC1
>>>
>>> Datacenter: DC2
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> -- Address Load Tokens Owns (effective) Host ID
>>> Rack
>>>
>>> UN xxx.xxx.7.99 18,76 MB 256 6,3%
>>> d7b907ad-15f5-4c79-962c-c604a5723a7b RAC1
>>>
>>> UN xxx.xxx.6.135 16,04 MB 256 6,1%
>>> 463f480a-baf3-4230-86b7-1106251ebfad RAC1
>>>
>>> UN xxx.xxx.7.229 17,36 MB 256 5,9%
>>> 9487a975-6183-43b8-9208-cd8e09a0ae18 RAC1
>>>
>>> UN xxx.xxx.7.5 14,01 MB 256 6,2%
>>> ae039e49-4d79-4e4e-87bd-921cd6b3291a RAC1
>>>
>>> UN xxx.xxx.7.4 14,93 MB 256 6,4%
>>> 122a47fb-b5ca-46d1-aae9-e6993ab58b66 RAC1
>>>
>>> UN xxx.xxx.6.10 16,77 MB 256 6,4%
>>> bbb66068-bf06-438d-81ee-965e201e8fff RAC1
>>>
>>> UN xxx.xxx.6.15 14,95 MB 256 6,1%
>>> 668a864d-9fd3-41b7-88fb-824e75e71953 RAC1
>>>
>>> UN xxx.xxx.7.140 17,38 MB 256 6,7%
>>> 7b016c96-eaa1-4ee1-8657-f4260c70ed37 RAC1
>>>
>>> UN xxx.xxx.7.113 19,14 MB 256 6,8%
>>> 46c06c44-ce2f-4ab6-9597-a1314cecf9bc RAC1
>>>
>>> UN xxx.xxx.6.118 16,7 MB 256 6,7%
>>> 9c3c3107-a1d3-4254-ad10-909713a38f8c RAC1
>>>
>>> UN xxx.xxx.6.248 17,29 MB 256 6,9%
>>> 35ff4d3d-d993-468b-9a54-88b40ceec6d4 RAC1
>>>
>>> UN xxx.xxx.5.24 16,55 MB 256 6,8%
>>> 5f1f34bd-110f-4d60-9af5-a3abd01b55a5 RAC1
>>>
>>> UN xxx.xxx.7.189 16,63 MB 256 6,2%
>>> be7cbf84-5838-487a-8bd4-b340a1c70fab RAC1
>>>
>>> UN xxx.xxx.5.124 20,37 MB 256 6,3%
>>> 638f2656-fb92-4b70-ba2a-251a749c4c58 RAC1
>>>
>>> UN xxx.xxx.6.60 24,57 MB 256 6,4%
>>> cf16209a-a9a0-4f27-9341-c76d47e50261 RAC1
>>>
>>> Datacenter: DC3
>>>
>>> ===============
>>>
>>> Status=Up/Down
>>>
>>> |/ State=Normal/Leaving/Joining/Moving
>>>
>>> -- Address Load Tokens Owns (effective) Host ID
>>> Rack
>>>
>>> UN xxx.xxx.151.102 389,41 GB 256 6,4%
>>> 1740a473-e304-467c-a682-d1b4b0595ffa RAC1
>>>
>>> UN xxx.xxx.149.161 367,82 GB 256 6,3%
>>> 3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
>>>
>>> UN xxx.xxx.149.226 390,88 GB 256 6,2%
>>> b8ca4576-2632-4198-ac87-10243c0c554e RAC1
>>>
>>> UN xxx.xxx.151.162 408,35 GB 256 6,4%
>>> 54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
>>>
>>> UN xxx.xxx.149.109 369,33 GB 256 6,2%
>>> 9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
>>>
>>> UN xxx.xxx.150.172 362,32 GB 256 6,0%
>>> ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
>>>
>>> UN xxx.xxx.149.238 388,98 GB 256 6,4%
>>> a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
>>>
>>> UN xxx.xxx.151.232 435,31 GB 256 6,6%
>>> 500a43ab-ae77-4a07-876c-171cb34c549b RAC1
>>>
>>> UN xxx.xxx.151.43 410,69 GB 256 6,2%
>>> b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
>>>
>>> UN xxx.xxx.151.139 407,47 GB 256 6,2%
>>> ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
>>>
>>> UN xxx.xxx.151.213 375,05 GB 256 6,5%
>>> 9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
>>>
>>> UN xxx.xxx.149.177 401,91 GB 256 6,6%
>>> b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
>>>
>>> UN xxx.xxx.150.145 388,76 GB 256 7,1%
>>> 1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
>>>
>>> UN xxx.xxx.149.48 385,43 GB 256 6,2%
>>> ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
>>>
>>> UN xxx.xxx.150.189 384,52 GB 256 6,4%
>>> f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
>>>
>>> UN xxx.xxx.151.220 357,56 GB 256 6,1%
>>> feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
>>>
>>> UN xxx.xxx.149.121 355,64 GB 256 6,4%
>>> 47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
>>>
>>> UN xxx.xxx.151.218 416,57 GB 256 6,3%
>>> bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
>>> UN xxx.xxx.150.26 383,06 GB 256 6,7%
>>> 1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
>>>
>>> DC1 and DC3 are the old data centers. DC2 is the new one being added (as
>>> seen from the data loads).
>>>
>>> For the snitch we are using GossipingPropertyFileSnitch and a
>>> cassandra-rackdc.properties with config such as:
>>> dc=DC1
>>> rack=RAC1
>>>
>>> Just noticed that we also have cassandra-topology.properties present on
>>> the nodes, but it's up-to-date with all the nodes from the 3 data centers.
>>>
>>> I was wondering whether the replication settings for the
>>> system_distributed keyspace might need a change, but haven't yet found any
>>> documentation pointing to that.
>>>
>>> Best regards,
>>> Timo
>>>
>>> On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com>
>>> wrote:
>>>
>>>> It could be a bug.
>>>>
>>>> I am not very familiar with this system_distributed keyspace, but from
>>>> what I see, it is using SimpleStrategy:
>>>>
>>>> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
>>>> cqlsh $(hostname -I | awk '{print $1}')
>>>>
>>>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>>>> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>>>>
>>>> Let's first check some stuff. Could you share the output of:
>>>>
>>>>
>>>> - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>>> [ip_address_of_the_server]
>>>> - nodetool status
>>>> - nodetool status system_distributed
>>>> - Let us know about the snitch you are using and the corresponding
>>>> configuration.
>>>>
>>>>
>>>> I am trying to make sure the command you used is expected to work,
>>>> given your setup.
>>>>
>>>> My guess is that you might need to alter this keyspace to match your
>>>> cluster setup.
>>>>
>>>> Just guessing, hope that helps.
>>>>
>>>> C*heers,
>>>> -----------------------
>>>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>>>> France
>>>>
>>>> The Last Pickle - Apache Cassandra Consulting
>>>> http://www.thelastpickle.com
>>>>
>>>> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>>>>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>>>>> are adding a third data center before decommissioning one of the earlier
>>>>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>>>>> cluster (not set to bootstrap, as documented in
>>>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operati
>>>>> ons/opsAddDCToCluster.html).
>>>>>
>>>>> When trying to rebuild nodes in the new DC from a previous DC
>>>>> (nodetool rebuild -- DC1), we get the following error:
>>>>>
>>>>> Unable to find sufficient sources for streaming range
>>>>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>>>>
>>>>> The same error occurs which ever of the 2 existing DCs we try to
>>>>> rebuild from.
>>>>>
>>>>> We run pr repairs (nodetool repair -pr) on all nodes twice a week via
>>>>> cron.
>>>>>
>>>>> Any advice on how to get the rebuild started?
>>>>>
>>>>> Best regards,
>>>>> Timo
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>> 3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
>>
>> UN xxx.xxx.149.226 390,88 GB 256 6,2%
>> b8ca4576-2632-4198-ac87-10243c0c554e RAC1
>>
>> UN xxx.xxx.151.162 408,35 GB 256 6,4%
>> 54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
>>
>> UN xxx.xxx.149.109 369,33 GB 256 6,2%
>> 9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
>>
>> UN xxx.xxx.150.172 362,32 GB 256 6,0%
>> ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
>>
>> UN xxx.xxx.149.238 388,98 GB 256 6,4%
>> a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
>>
>> UN xxx.xxx.151.232 435,31 GB 256 6,6%
>> 500a43ab-ae77-4a07-876c-171cb34c549b RAC1
>>
>> UN xxx.xxx.151.43 410,69 GB 256 6,2%
>> b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
>>
>> UN xxx.xxx.151.139 407,47 GB 256 6,2%
>> ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
>>
>> UN xxx.xxx.151.213 375,05 GB 256 6,5%
>> 9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
>>
>> UN xxx.xxx.149.177 401,91 GB 256 6,6%
>> b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
>>
>> UN xxx.xxx.150.145 388,76 GB 256 7,1%
>> 1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
>>
>> UN xxx.xxx.149.48 385,43 GB 256 6,2%
>> ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
>>
>> UN xxx.xxx.150.189 384,52 GB 256 6,4%
>> f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
>>
>> UN xxx.xxx.151.220 357,56 GB 256 6,1%
>> feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
>>
>> UN xxx.xxx.149.121 355,64 GB 256 6,4%
>> 47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
>>
>> UN xxx.xxx.151.218 416,57 GB 256 6,3%
>> bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
>> UN xxx.xxx.150.26 383,06 GB 256 6,7%
>> 1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
>>
>> DC1 and DC3 are the old data centers. DC2 is the new one being added (as
>> seen from the data loads).
>>
>> For the snitch we are using GossipingPropertyFileSnitch and a
>> cassandra-rackdc.properties with config such as:
>> dc=DC1
>> rack=RAC1
>>
>> Just noticed that we also have cassandra-topology.properties present on
>> the nodes, but it's up-to-date with all the nodes from the 3 data centers.
>>
>> I was wondering whether the replication settings for the
>> system_distributed keyspace might need a change, but haven't yet found
>> any documentation pointing to that.
>>
>> Best regards,
>> Timo
>>
>> On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com>
>> wrote:
>>
>>> It could be a bug.
>>>
>>> Yet I am not very aware of this system_distributed keyspace, but from
>>> what I see, it is using a simple strategy:
>>>
>>> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
>>> cqlsh $(hostname -I | awk '{print $1}')
>>>
>>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>>> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>>>
>>> Let's first check some stuff. Could you share the output of:
>>>
>>>
>>> - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>>> [ip_address_of_the_server]
>>> - nodetool status
>>> - nodetool status system_distributed
>>> - Let us know about the snitch you are using and the corresponding
>>> configuration.
>>>
>>>
>>> I am trying to make sure the command you used is expected to work, given
>>> your setup.
>>>
>>> My guess is that you might need to alter this keyspace according to
>>> your cluster setup.
>>>
>>> Just guessing, hope that helps.
>>>
>>> C*heers,
>>> -----------------------
>>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>>> France
>>>
>>> The Last Pickle - Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>>>
>>>> Hi,
>>>>
>>>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>>>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>>>> are adding a third data center before decommissioning one of the earlier
>>>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>>>> cluster (not set to bootstrap, as documented in
>>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operati
>>>> ons/opsAddDCToCluster.html).
>>>>
>>>> When trying to rebuild nodes in the new DC from a previous DC (nodetool
>>>> rebuild -- DC1), we get the following error:
>>>>
>>>> Unable to find sufficient sources for streaming range
>>>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>>>
>>>> The same error occurs whichever of the 2 existing DCs we try to
>>>> rebuild from.
>>>>
>>>> We run pr repairs (nodetool repair -pr) on all nodes twice a week via
>>>> cron.
>>>>
>>>> Any advice on how to get the rebuild started?
>>>>
>>>> Best regards,
>>>> Timo
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
Re: Rebuild failing when adding new datacenter (3.0.8)
Posted by Timo Ahokas <ti...@gmail.com>.
Hi Alain,
Our normal user keyspaces have RF3 in all DCs, e.g:
create keyspace reporting with replication = {'class':
'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
Any idea whether it would be safe to change the system_distributed keyspace
to match this?
-Timo
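For reference, the fix that resolved this later in the thread was to switch the affected system keyspaces to NetworkTopologyStrategy and repair them. A sketch of the CQL, assuming the DC names and RF from this cluster (adjust to your own topology, and run a full repair of each keyspace afterwards so the data is redistributed):

```sql
ALTER KEYSPACE system_distributed WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
ALTER KEYSPACE system_auth WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
ALTER KEYSPACE system_traces WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
```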
On 22 September 2016 at 19:23, Timo Ahokas <ti...@gmail.com> wrote:
> Hi Alain,
>
> Thanks a lot for helping out!
>
> Some of the basic keyspace / cluster info you requested:
>
> # echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>
> CREATE KEYSPACE system_distributed WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>
> CREATE TABLE system_distributed.repair_history (
>
> keyspace_name text,
>
> columnfamily_name text,
>
> id timeuuid,
>
> coordinator inet,
>
> exception_message text,
>
> exception_stacktrace text,
>
> finished_at timestamp,
>
> parent_id timeuuid,
>
> participants set<inet>,
>
> range_begin text,
>
> range_end text,
>
> started_at timestamp,
>
> status text,
>
> PRIMARY KEY ((keyspace_name, columnfamily_name), id)
>
> ) WITH CLUSTERING ORDER BY (id ASC)
>
> AND bloom_filter_fp_chance = 0.01
>
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
> AND comment = 'Repair history'
>
> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>
> AND compression = {'chunk_length_in_kb': '64', 'class': '
> org.apache.cassandra.io.compress.LZ4Compressor'}
>
> AND crc_check_chance = 1.0
>
> AND dclocal_read_repair_chance = 0.0
>
> AND default_time_to_live = 0
>
> AND gc_grace_seconds = 0
>
> AND max_index_interval = 2048
>
> AND memtable_flush_period_in_ms = 3600000
>
> AND min_index_interval = 128
>
> AND read_repair_chance = 0.0
>
> AND speculative_retry = '99PERCENTILE';
>
> CREATE TABLE system_distributed.parent_repair_history (
>
> parent_id timeuuid PRIMARY KEY,
>
> columnfamily_names set<text>,
>
> exception_message text,
>
> exception_stacktrace text,
>
> finished_at timestamp,
>
> keyspace_name text,
>
> requested_ranges set<text>,
>
> started_at timestamp,
>
> successful_ranges set<text>
>
> ) WITH bloom_filter_fp_chance = 0.01
>
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>
> AND comment = 'Repair history'
>
> AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>
> AND compression = {'chunk_length_in_kb': '64', 'class': '
> org.apache.cassandra.io.compress.LZ4Compressor'}
>
> AND crc_check_chance = 1.0
>
> AND dclocal_read_repair_chance = 0.0
>
> AND default_time_to_live = 0
>
> AND gc_grace_seconds = 0
>
> AND max_index_interval = 2048
>
> AND memtable_flush_period_in_ms = 3600000
>
> AND min_index_interval = 128
>
> AND read_repair_chance = 0.0
>
> AND speculative_retry = '99PERCENTILE';
>
> # nodetool status
>
> Datacenter: DC1
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> -- Address Load Tokens Owns Host ID
> Rack
>
> UN xxx.xxx.145.5 693,63 GB 256 ?
> 6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56 RAC1
>
> UN xxx.xxx.145.225 648,55 GB 256 ?
> f900847a-63e4-44c5-b4d7-e439c7cb6a8e RAC1
>
> UN xxx.xxx.145.160 608,31 GB 256 ?
> d257e76d-9e40-4215-94c7-3076c8ff4b7f RAC1
>
> UN xxx.xxx.145.67 552,93 GB 256 ?
> 1d47cbdd-cdf1-45b6-aa0e-0c6123899dca RAC1
>
> UN xxx.xxx.145.227 636,68 GB 256 ?
> 47e5f207-f9fd-4a86-be8a-66e7630d1baa RAC1
>
> UN xxx.xxx.146.105 610,9 GB 256 ?
> 8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2 RAC1
>
> UN xxx.xxx.147.136 666,82 GB 256 ?
> bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6 RAC1
>
> UN xxx.xxx.146.213 609,79 GB 256 ?
> 6416275c-7570-48a9-957f-2daca71d31aa RAC1
>
> UN xxx.xxx.146.20 664,44 GB 256 ?
> b016df7e-f694-4ef3-928c-8783853e9a07 RAC1
>
> UN xxx.xxx.146.209 615,44 GB 256 ?
> 898e6d98-1b92-4e86-b52c-f851fd4fda71 RAC1
>
> UN xxx.xxx.146.241 668,91 GB 256 ?
> 0b5d4c6c-4b7c-4265-92bc-ad74464d85cc RAC1
>
> UN xxx.xxx.147.211 641,33 GB 256 ?
> 16cdc4a7-b694-4125-91d6-05b9099cb765 RAC1
>
> UN xxx.xxx.147.125 647,03 GB 256 ?
> 2e97ed0a-039c-413b-9693-a87fadf40f82 RAC1
>
> Datacenter: DC2
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> -- Address Load Tokens Owns Host ID
> Rack
>
> UN xxx.xxx.7.99 18,76 MB 256 ?
> d7b907ad-15f5-4c79-962c-c604a5723a7b RAC1
>
> UN xxx.xxx.6.135 16,04 MB 256 ?
> 463f480a-baf3-4230-86b7-1106251ebfad RAC1
>
> UN xxx.xxx.7.229 17,36 MB 256 ?
> 9487a975-6183-43b8-9208-cd8e09a0ae18 RAC1
>
> UN xxx.xxx.7.5 14,01 MB 256 ?
> ae039e49-4d79-4e4e-87bd-921cd6b3291a RAC1
>
> UN xxx.xxx.7.4 14,93 MB 256 ?
> 122a47fb-b5ca-46d1-aae9-e6993ab58b66 RAC1
>
> UN xxx.xxx.6.10 16,77 MB 256 ?
> bbb66068-bf06-438d-81ee-965e201e8fff RAC1
>
> UN xxx.xxx.6.15 14,95 MB 256 ?
> 668a864d-9fd3-41b7-88fb-824e75e71953 RAC1
>
> UN xxx.xxx.7.140 17,38 MB 256 ?
> 7b016c96-eaa1-4ee1-8657-f4260c70ed37 RAC1
>
> UN xxx.xxx.7.113 19,14 MB 256 ?
> 46c06c44-ce2f-4ab6-9597-a1314cecf9bc RAC1
>
> UN xxx.xxx.6.118 16,7 MB 256 ?
> 9c3c3107-a1d3-4254-ad10-909713a38f8c RAC1
>
> UN xxx.xxx.6.248 17,29 MB 256 ?
> 35ff4d3d-d993-468b-9a54-88b40ceec6d4 RAC1
>
> UN xxx.xxx.5.24 16,55 MB 256 ?
> 5f1f34bd-110f-4d60-9af5-a3abd01b55a5 RAC1
>
> UN xxx.xxx.7.189 16,63 MB 256 ?
> be7cbf84-5838-487a-8bd4-b340a1c70fab RAC1
>
> UN xxx.xxx.5.124 20,37 MB 256 ?
> 638f2656-fb92-4b70-ba2a-251a749c4c58 RAC1
>
> UN xxx.xxx.6.60 24,57 MB 256 ?
> cf16209a-a9a0-4f27-9341-c76d47e50261 RAC1
>
> Datacenter: DC3
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> -- Address Load Tokens Owns Host ID
> Rack
>
> UN xxx.xxx.151.102 389,41 GB 256 ?
> 1740a473-e304-467c-a682-d1b4b0595ffa RAC1
>
> UN xxx.xxx.149.161 367,82 GB 256 ?
> 3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
>
> UN xxx.xxx.149.226 390,88 GB 256 ?
> b8ca4576-2632-4198-ac87-10243c0c554e RAC1
>
> UN xxx.xxx.151.162 408,35 GB 256 ?
> 54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
>
> UN xxx.xxx.149.109 369,33 GB 256 ?
> 9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
>
> UN xxx.xxx.150.172 362,32 GB 256 ?
> ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
>
> UN xxx.xxx.149.238 388,98 GB 256 ?
> a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
>
> UN xxx.xxx.151.232 435,31 GB 256 ?
> 500a43ab-ae77-4a07-876c-171cb34c549b RAC1
>
> UN xxx.xxx.151.43 410,69 GB 256 ?
> b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
>
> UN xxx.xxx.151.139 407,47 GB 256 ?
> ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
>
> UN xxx.xxx.151.213 375,05 GB 256 ?
> 9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
>
> UN xxx.xxx.149.177 401,91 GB 256 ?
> b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
>
> UN xxx.xxx.150.145 388,76 GB 256 ?
> 1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
>
> UN xxx.xxx.149.48 385,43 GB 256 ?
> ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
>
> UN xxx.xxx.150.189 384,52 GB 256 ?
> f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
>
> UN xxx.xxx.151.220 357,56 GB 256 ?
> feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
>
> UN xxx.xxx.149.121 355,64 GB 256 ?
> 47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
>
> UN xxx.xxx.151.218 416,57 GB 256 ?
> bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
>
> UN xxx.xxx.150.26 383,06 GB 256 ?
> 1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
>
> Note: Non-system keyspaces don't have the same replication settings,
> effective ownership information is meaningless
>
>
>
> # nodetool status system_distributed
>
> Datacenter: DC1
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> -- Address Load Tokens Owns (effective) Host ID
> Rack
>
> UN xxx.xxx.145.5 693,63 GB 256 6,2%
> 6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56 RAC1
>
> UN xxx.xxx.145.225 648,55 GB 256 6,8%
> f900847a-63e4-44c5-b4d7-e439c7cb6a8e RAC1
>
> UN xxx.xxx.145.160 608,31 GB 256 6,5%
> d257e76d-9e40-4215-94c7-3076c8ff4b7f RAC1
>
> UN xxx.xxx.145.67 552,93 GB 256 6,1%
> 1d47cbdd-cdf1-45b6-aa0e-0c6123899dca RAC1
>
> UN xxx.xxx.145.227 636,68 GB 256 6,0%
> 47e5f207-f9fd-4a86-be8a-66e7630d1baa RAC1
>
> UN xxx.xxx.146.105 610,9 GB 256 6,1%
> 8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2 RAC1
>
> UN xxx.xxx.147.136 666,82 GB 256 6,3%
> bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6 RAC1
>
> UN xxx.xxx.146.213 609,79 GB 256 6,0%
> 6416275c-7570-48a9-957f-2daca71d31aa RAC1
>
> UN xxx.xxx.146.20 664,44 GB 256 7,0%
> b016df7e-f694-4ef3-928c-8783853e9a07 RAC1
>
> UN xxx.xxx.146.209 615,44 GB 256 6,6%
> 898e6d98-1b92-4e86-b52c-f851fd4fda71 RAC1
>
> UN xxx.xxx.146.241 668,91 GB 256 6,2%
> 0b5d4c6c-4b7c-4265-92bc-ad74464d85cc RAC1
>
> UN xxx.xxx.147.211 641,33 GB 256 6,5%
> 16cdc4a7-b694-4125-91d6-05b9099cb765 RAC1
>
> UN xxx.xxx.147.125 647,03 GB 256 6,3%
> 2e97ed0a-039c-413b-9693-a87fadf40f82 RAC1
>
> Datacenter: DC2
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> -- Address Load Tokens Owns (effective) Host ID
> Rack
>
> UN xxx.xxx.7.99 18,76 MB 256 6,3%
> d7b907ad-15f5-4c79-962c-c604a5723a7b RAC1
>
> UN xxx.xxx.6.135 16,04 MB 256 6,1%
> 463f480a-baf3-4230-86b7-1106251ebfad RAC1
>
> UN xxx.xxx.7.229 17,36 MB 256 5,9%
> 9487a975-6183-43b8-9208-cd8e09a0ae18 RAC1
>
> UN xxx.xxx.7.5 14,01 MB 256 6,2%
> ae039e49-4d79-4e4e-87bd-921cd6b3291a RAC1
>
> UN xxx.xxx.7.4 14,93 MB 256 6,4%
> 122a47fb-b5ca-46d1-aae9-e6993ab58b66 RAC1
>
> UN xxx.xxx.6.10 16,77 MB 256 6,4%
> bbb66068-bf06-438d-81ee-965e201e8fff RAC1
>
> UN xxx.xxx.6.15 14,95 MB 256 6,1%
> 668a864d-9fd3-41b7-88fb-824e75e71953 RAC1
>
> UN xxx.xxx.7.140 17,38 MB 256 6,7%
> 7b016c96-eaa1-4ee1-8657-f4260c70ed37 RAC1
>
> UN xxx.xxx.7.113 19,14 MB 256 6,8%
> 46c06c44-ce2f-4ab6-9597-a1314cecf9bc RAC1
>
> UN xxx.xxx.6.118 16,7 MB 256 6,7%
> 9c3c3107-a1d3-4254-ad10-909713a38f8c RAC1
>
> UN xxx.xxx.6.248 17,29 MB 256 6,9%
> 35ff4d3d-d993-468b-9a54-88b40ceec6d4 RAC1
>
> UN xxx.xxx.5.24 16,55 MB 256 6,8%
> 5f1f34bd-110f-4d60-9af5-a3abd01b55a5 RAC1
>
> UN xxx.xxx.7.189 16,63 MB 256 6,2%
> be7cbf84-5838-487a-8bd4-b340a1c70fab RAC1
>
> UN xxx.xxx.5.124 20,37 MB 256 6,3%
> 638f2656-fb92-4b70-ba2a-251a749c4c58 RAC1
>
> UN xxx.xxx.6.60 24,57 MB 256 6,4%
> cf16209a-a9a0-4f27-9341-c76d47e50261 RAC1
>
> Datacenter: DC3
>
> ===============
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> -- Address Load Tokens Owns (effective) Host ID
> Rack
>
> UN xxx.xxx.151.102 389,41 GB 256 6,4%
> 1740a473-e304-467c-a682-d1b4b0595ffa RAC1
>
> UN xxx.xxx.149.161 367,82 GB 256 6,3%
> 3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
>
> UN xxx.xxx.149.226 390,88 GB 256 6,2%
> b8ca4576-2632-4198-ac87-10243c0c554e RAC1
>
> UN xxx.xxx.151.162 408,35 GB 256 6,4%
> 54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
>
> UN xxx.xxx.149.109 369,33 GB 256 6,2%
> 9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
>
> UN xxx.xxx.150.172 362,32 GB 256 6,0%
> ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
>
> UN xxx.xxx.149.238 388,98 GB 256 6,4%
> a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
>
> UN xxx.xxx.151.232 435,31 GB 256 6,6%
> 500a43ab-ae77-4a07-876c-171cb34c549b RAC1
>
> UN xxx.xxx.151.43 410,69 GB 256 6,2%
> b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
>
> UN xxx.xxx.151.139 407,47 GB 256 6,2%
> ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
>
> UN xxx.xxx.151.213 375,05 GB 256 6,5%
> 9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
>
> UN xxx.xxx.149.177 401,91 GB 256 6,6%
> b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
>
> UN xxx.xxx.150.145 388,76 GB 256 7,1%
> 1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
>
> UN xxx.xxx.149.48 385,43 GB 256 6,2%
> ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
>
> UN xxx.xxx.150.189 384,52 GB 256 6,4%
> f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
>
> UN xxx.xxx.151.220 357,56 GB 256 6,1%
> feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
>
> UN xxx.xxx.149.121 355,64 GB 256 6,4%
> 47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
>
> UN xxx.xxx.151.218 416,57 GB 256 6,3%
> bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
> UN xxx.xxx.150.26 383,06 GB 256 6,7%
> 1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
>
> DC1 and DC3 are the old data centers. DC2 is the new one being added (as
> seen from the data loads).
>
> For the snitch we are using GossipingPropertyFileSnitch and a
> cassandra-rackdc.properties with config such as:
> dc=DC1
> rack=RAC1
>
> Just noticed that we also have cassandra-topology.properties present on
> the nodes, but it's up-to-date with all the nodes from the 3 data centers.
>
> I was wondering whether the replication settings for the
> system_distributed keyspace might need a change, but haven't yet found any
> documentation pointing to that.
>
> Best regards,
> Timo
>
> On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com> wrote:
>
>> It could be a bug.
>>
>> Yet I am not very aware of this system_distributed keyspace, but from
>> what I see, it is using a simple strategy:
>>
>> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
>> cqlsh $(hostname -I | awk '{print $1}')
>>
>> CREATE KEYSPACE system_distributed WITH replication = {'class':
>> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>>
>> Let's first check some stuff. Could you share the output of:
>>
>>
>> - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
>> [ip_address_of_the_server]
>> - nodetool status
>> - nodetool status system_distributed
>> - Let us know about the snitch you are using and the corresponding
>> configuration.
>>
>>
>> I am trying to make sure the command you used is expected to work, given
>> your setup.
>>
>> My guess is that you might need to alter this keyspace according to
>> your cluster setup.
>>
>> Just guessing, hope that helps.
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - alain@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>>
>>> Hi,
>>>
>>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>>> are adding a third data center before decommissioning one of the earlier
>>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>>> cluster (not set to bootstrap, as documented in
>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operati
>>> ons/opsAddDCToCluster.html).
>>>
>>> When trying to rebuild nodes in the new DC from a previous DC (nodetool
>>> rebuild -- DC1), we get the following error:
>>>
>>> Unable to find sufficient sources for streaming range
>>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>>
>>> The same error occurs whichever of the 2 existing DCs we try to rebuild
>>> from.
>>>
>>> We run pr repairs (nodetool repair -pr) on all nodes twice a week via
>>> cron.
>>>
>>> Any advice on how to get the rebuild started?
>>>
>>> Best regards,
>>> Timo
>>>
>>>
>>>
>>>
>>>
>>
>
Re: Rebuild failing when adding new datacenter (3.0.8)
Posted by Timo Ahokas <ti...@gmail.com>.
Hi Alain,
Thanks a lot for helping out!
Some of the basic keyspace / cluster info you requested:
# echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
CREATE KEYSPACE system_distributed WITH replication = {'class':
'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE system_distributed.repair_history (
keyspace_name text,
columnfamily_name text,
id timeuuid,
coordinator inet,
exception_message text,
exception_stacktrace text,
finished_at timestamp,
parent_id timeuuid,
participants set<inet>,
range_begin text,
range_end text,
started_at timestamp,
status text,
PRIMARY KEY ((keyspace_name, columnfamily_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = 'Repair history'
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE TABLE system_distributed.parent_repair_history (
parent_id timeuuid PRIMARY KEY,
columnfamily_names set<text>,
exception_message text,
exception_stacktrace text,
finished_at timestamp,
keyspace_name text,
requested_ranges set<text>,
started_at timestamp,
successful_ranges set<text>
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = 'Repair history'
AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 3600000
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
# nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID
Rack
UN xxx.xxx.145.5 693,63 GB 256 ?
6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56 RAC1
UN xxx.xxx.145.225 648,55 GB 256 ?
f900847a-63e4-44c5-b4d7-e439c7cb6a8e RAC1
UN xxx.xxx.145.160 608,31 GB 256 ?
d257e76d-9e40-4215-94c7-3076c8ff4b7f RAC1
UN xxx.xxx.145.67 552,93 GB 256 ?
1d47cbdd-cdf1-45b6-aa0e-0c6123899dca RAC1
UN xxx.xxx.145.227 636,68 GB 256 ?
47e5f207-f9fd-4a86-be8a-66e7630d1baa RAC1
UN xxx.xxx.146.105 610,9 GB 256 ?
8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2 RAC1
UN xxx.xxx.147.136 666,82 GB 256 ?
bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6 RAC1
UN xxx.xxx.146.213 609,79 GB 256 ?
6416275c-7570-48a9-957f-2daca71d31aa RAC1
UN xxx.xxx.146.20 664,44 GB 256 ?
b016df7e-f694-4ef3-928c-8783853e9a07 RAC1
UN xxx.xxx.146.209 615,44 GB 256 ?
898e6d98-1b92-4e86-b52c-f851fd4fda71 RAC1
UN xxx.xxx.146.241 668,91 GB 256 ?
0b5d4c6c-4b7c-4265-92bc-ad74464d85cc RAC1
UN xxx.xxx.147.211 641,33 GB 256 ?
16cdc4a7-b694-4125-91d6-05b9099cb765 RAC1
UN xxx.xxx.147.125 647,03 GB 256 ?
2e97ed0a-039c-413b-9693-a87fadf40f82 RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID
Rack
UN xxx.xxx.7.99 18,76 MB 256 ?
d7b907ad-15f5-4c79-962c-c604a5723a7b RAC1
UN xxx.xxx.6.135 16,04 MB 256 ?
463f480a-baf3-4230-86b7-1106251ebfad RAC1
UN xxx.xxx.7.229 17,36 MB 256 ?
9487a975-6183-43b8-9208-cd8e09a0ae18 RAC1
UN xxx.xxx.7.5 14,01 MB 256 ?
ae039e49-4d79-4e4e-87bd-921cd6b3291a RAC1
UN xxx.xxx.7.4 14,93 MB 256 ?
122a47fb-b5ca-46d1-aae9-e6993ab58b66 RAC1
UN xxx.xxx.6.10 16,77 MB 256 ?
bbb66068-bf06-438d-81ee-965e201e8fff RAC1
UN xxx.xxx.6.15 14,95 MB 256 ?
668a864d-9fd3-41b7-88fb-824e75e71953 RAC1
UN xxx.xxx.7.140 17,38 MB 256 ?
7b016c96-eaa1-4ee1-8657-f4260c70ed37 RAC1
UN xxx.xxx.7.113 19,14 MB 256 ?
46c06c44-ce2f-4ab6-9597-a1314cecf9bc RAC1
UN xxx.xxx.6.118 16,7 MB 256 ?
9c3c3107-a1d3-4254-ad10-909713a38f8c RAC1
UN xxx.xxx.6.248 17,29 MB 256 ?
35ff4d3d-d993-468b-9a54-88b40ceec6d4 RAC1
UN xxx.xxx.5.24 16,55 MB 256 ?
5f1f34bd-110f-4d60-9af5-a3abd01b55a5 RAC1
UN xxx.xxx.7.189 16,63 MB 256 ?
be7cbf84-5838-487a-8bd4-b340a1c70fab RAC1
UN xxx.xxx.5.124 20,37 MB 256 ?
638f2656-fb92-4b70-ba2a-251a749c4c58 RAC1
UN xxx.xxx.6.60 24,57 MB 256 ?
cf16209a-a9a0-4f27-9341-c76d47e50261 RAC1
Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID
Rack
UN xxx.xxx.151.102 389,41 GB 256 ?
1740a473-e304-467c-a682-d1b4b0595ffa RAC1
UN xxx.xxx.149.161 367,82 GB 256 ?
3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
UN xxx.xxx.149.226 390,88 GB 256 ?
b8ca4576-2632-4198-ac87-10243c0c554e RAC1
UN xxx.xxx.151.162 408,35 GB 256 ?
54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
UN xxx.xxx.149.109 369,33 GB 256 ?
9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
UN xxx.xxx.150.172 362,32 GB 256 ?
ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
UN xxx.xxx.149.238 388,98 GB 256 ?
a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
UN xxx.xxx.151.232 435,31 GB 256 ?
500a43ab-ae77-4a07-876c-171cb34c549b RAC1
UN xxx.xxx.151.43 410,69 GB 256 ?
b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
UN xxx.xxx.151.139 407,47 GB 256 ?
ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
UN xxx.xxx.151.213 375,05 GB 256 ?
9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
UN xxx.xxx.149.177 401,91 GB 256 ?
b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
UN xxx.xxx.150.145 388,76 GB 256 ?
1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
UN xxx.xxx.149.48 385,43 GB 256 ?
ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
UN xxx.xxx.150.189 384,52 GB 256 ?
f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
UN xxx.xxx.151.220 357,56 GB 256 ?
feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
UN xxx.xxx.149.121 355,64 GB 256 ?
47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
UN xxx.xxx.151.218 416,57 GB 256 ?
bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
UN xxx.xxx.150.26 383,06 GB 256 ?
1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
# nodetool status system_distributed
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN xxx.xxx.145.5 693,63 GB 256 6,2% 6f1a0fdd-e3f9-474d-9a49-7bfeeadb3f56 RAC1
UN xxx.xxx.145.225 648,55 GB 256 6,8% f900847a-63e4-44c5-b4d7-e439c7cb6a8e RAC1
UN xxx.xxx.145.160 608,31 GB 256 6,5% d257e76d-9e40-4215-94c7-3076c8ff4b7f RAC1
UN xxx.xxx.145.67 552,93 GB 256 6,1% 1d47cbdd-cdf1-45b6-aa0e-0c6123899dca RAC1
UN xxx.xxx.145.227 636,68 GB 256 6,0% 47e5f207-f9fd-4a86-be8a-66e7630d1baa RAC1
UN xxx.xxx.146.105 610,9 GB 256 6,1% 8edf1aaa-49d1-4e4b-9f09-99c4ab6136c2 RAC1
UN xxx.xxx.147.136 666,82 GB 256 6,3% bafbf6a2-cff9-489f-a2dd-fc6e8cb08ff6 RAC1
UN xxx.xxx.146.213 609,79 GB 256 6,0% 6416275c-7570-48a9-957f-2daca71d31aa RAC1
UN xxx.xxx.146.20 664,44 GB 256 7,0% b016df7e-f694-4ef3-928c-8783853e9a07 RAC1
UN xxx.xxx.146.209 615,44 GB 256 6,6% 898e6d98-1b92-4e86-b52c-f851fd4fda71 RAC1
UN xxx.xxx.146.241 668,91 GB 256 6,2% 0b5d4c6c-4b7c-4265-92bc-ad74464d85cc RAC1
UN xxx.xxx.147.211 641,33 GB 256 6,5% 16cdc4a7-b694-4125-91d6-05b9099cb765 RAC1
UN xxx.xxx.147.125 647,03 GB 256 6,3% 2e97ed0a-039c-413b-9693-a87fadf40f82 RAC1
Datacenter: DC2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN xxx.xxx.7.99 18,76 MB 256 6,3% d7b907ad-15f5-4c79-962c-c604a5723a7b RAC1
UN xxx.xxx.6.135 16,04 MB 256 6,1% 463f480a-baf3-4230-86b7-1106251ebfad RAC1
UN xxx.xxx.7.229 17,36 MB 256 5,9% 9487a975-6183-43b8-9208-cd8e09a0ae18 RAC1
UN xxx.xxx.7.5 14,01 MB 256 6,2% ae039e49-4d79-4e4e-87bd-921cd6b3291a RAC1
UN xxx.xxx.7.4 14,93 MB 256 6,4% 122a47fb-b5ca-46d1-aae9-e6993ab58b66 RAC1
UN xxx.xxx.6.10 16,77 MB 256 6,4% bbb66068-bf06-438d-81ee-965e201e8fff RAC1
UN xxx.xxx.6.15 14,95 MB 256 6,1% 668a864d-9fd3-41b7-88fb-824e75e71953 RAC1
UN xxx.xxx.7.140 17,38 MB 256 6,7% 7b016c96-eaa1-4ee1-8657-f4260c70ed37 RAC1
UN xxx.xxx.7.113 19,14 MB 256 6,8% 46c06c44-ce2f-4ab6-9597-a1314cecf9bc RAC1
UN xxx.xxx.6.118 16,7 MB 256 6,7% 9c3c3107-a1d3-4254-ad10-909713a38f8c RAC1
UN xxx.xxx.6.248 17,29 MB 256 6,9% 35ff4d3d-d993-468b-9a54-88b40ceec6d4 RAC1
UN xxx.xxx.5.24 16,55 MB 256 6,8% 5f1f34bd-110f-4d60-9af5-a3abd01b55a5 RAC1
UN xxx.xxx.7.189 16,63 MB 256 6,2% be7cbf84-5838-487a-8bd4-b340a1c70fab RAC1
UN xxx.xxx.5.124 20,37 MB 256 6,3% 638f2656-fb92-4b70-ba2a-251a749c4c58 RAC1
UN xxx.xxx.6.60 24,57 MB 256 6,4% cf16209a-a9a0-4f27-9341-c76d47e50261 RAC1
Datacenter: DC3
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN xxx.xxx.151.102 389,41 GB 256 6,4% 1740a473-e304-467c-a682-d1b4b0595ffa RAC1
UN xxx.xxx.149.161 367,82 GB 256 6,3% 3a5322d4-e49f-45ed-85b5-fd658502859c RAC1
UN xxx.xxx.149.226 390,88 GB 256 6,2% b8ca4576-2632-4198-ac87-10243c0c554e RAC1
UN xxx.xxx.151.162 408,35 GB 256 6,4% 54d3dd90-f9ab-47c2-ae31-5f3e87b91e2a RAC1
UN xxx.xxx.149.109 369,33 GB 256 6,2% 9172c7d8-0c55-4e8e-a17b-89fdb0dce878 RAC1
UN xxx.xxx.150.172 362,32 GB 256 6,0% ba394a29-1a0c-4f50-ab85-4db19011b190 RAC1
UN xxx.xxx.149.238 388,98 GB 256 6,4% a3d7228c-ccb4-4787-a4bb-f7720aeedc8e RAC1
UN xxx.xxx.151.232 435,31 GB 256 6,6% 500a43ab-ae77-4a07-876c-171cb34c549b RAC1
UN xxx.xxx.151.43 410,69 GB 256 6,2% b8bc80e2-2107-447a-85e4-57a39dc9c595 RAC1
UN xxx.xxx.151.139 407,47 GB 256 6,2% ecfa4ba7-7783-47a4-8b17-aadc91a3e776 RAC1
UN xxx.xxx.151.213 375,05 GB 256 6,5% 9bf53ee1-53d4-4d18-a58e-0b0a17e18a69 RAC1
UN xxx.xxx.149.177 401,91 GB 256 6,6% b903faf1-1ae9-45ad-bdce-3c9377458a03 RAC1
UN xxx.xxx.150.145 388,76 GB 256 7,1% 1c4e4232-db27-4cc1-9985-9eb7f0b984d1 RAC1
UN xxx.xxx.149.48 385,43 GB 256 6,2% ad3ea388-203c-4b26-a368-934a6105cc6e RAC1
UN xxx.xxx.150.189 384,52 GB 256 6,4% f361ebad-b0a6-47b7-a55c-245c98f84508 RAC1
UN xxx.xxx.151.220 357,56 GB 256 6,1% feb814e6-6d2f-4cef-ae3b-4924c1cbac60 RAC1
UN xxx.xxx.149.121 355,64 GB 256 6,4% 47fbb104-6a5a-49c0-b086-3f14c853c83b RAC1
UN xxx.xxx.151.218 416,57 GB 256 6,3% bbb21d16-da85-4cfd-87d4-2333c8b02dad RAC1
UN xxx.xxx.150.26 383,06 GB 256 6,7% 1ca0085d-93a5-4650-891a-b45f988150a4 RAC1
DC1 and DC3 are the old data centers. DC2 is the new one being added (as
seen from the data loads).
For the snitch we are using GossipingPropertyFileSnitch and a
cassandra-rackdc.properties with config such as:
dc=DC1
rack=RAC1
Just noticed that we also have cassandra-topology.properties present on the
nodes, but it's up-to-date with all the nodes from the 3 data centers.
I was wondering whether the replication settings for the
system_distributed keyspace might need a change, but haven't yet found any
documentation pointing to that.
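If the keyspace replication does turn out to need changing, I guess it would be something along these lines (switching to NetworkTopologyStrategy with our actual DC names explicitly listed; the RF of 3 per DC is just my assumption, matching our user keyspaces):

```sql
-- Hypothetical sketch, not yet tried: move the non-local system keyspaces
-- off SimpleStrategy so replica placement becomes DC-aware.
ALTER KEYSPACE system_distributed WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};

-- Presumably the same would apply to the other replicated system keyspaces:
ALTER KEYSPACE system_auth WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
ALTER KEYSPACE system_traces WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'DC2': '3', 'DC3': '3'};
```

followed by a repair of each altered keyspace (e.g. nodetool repair system_distributed) before retrying the rebuild.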
Best regards,
Timo
On 22 September 2016 at 18:00, Alain RODRIGUEZ <ar...@gmail.com> wrote:
> It could be a bug.
>
> Yet I am not very aware of this system_distributed keyspace, but from what
> I see, it is using a simple strategy:
>
> root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
> cqlsh $(hostname -I | awk '{print $1}')
>
> CREATE KEYSPACE system_distributed WITH replication = {'class':
> 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
>
> Let's first check some stuff. Could you share the output of:
>
>
> - echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
> [ip_address_of_the_server]
> - nodetool status
> - nodetool status system_distributed
> - Let us know about the snitch you are using and the corresponding
> configuration.
>
>
> I am trying to make sure the command you used is expected to work, given
> your setup.
>
> My guess is that you might need to alter this keyspace according to your
> cluster setup.
>
> Just guessing, hope that helps.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - alain@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
>
>> Hi,
>>
>> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
>> currently running in two data centers (13 and 19 nodes, RF3 in both). We
>> are adding a third data center before decommissioning one of the earlier
>> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
>> cluster (not set to bootstrap, as documented in
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).
>>
>> When trying to rebuild nodes in the new DC from a previous DC (nodetool
>> rebuild -- DC1), we get the following error:
>>
>> Unable to find sufficient sources for streaming range
>> (597769692463489739,597931451954862346] in keyspace system_distributed
>>
>> The same error occurs whichever of the two existing DCs we try to rebuild
>> from.
>>
>> We run primary-range repairs (nodetool repair -pr) on all nodes twice a
>> week via cron.
>>
>> Any advice on how to get the rebuild started?
>>
>> Best regards,
>> Timo
Re: Rebuild failing when adding new datacenter (3.0.8)
Posted by Alain RODRIGUEZ <ar...@gmail.com>.
It could be a bug.
Yet I am not very aware of this system_distributed keyspace, but from what
I see, it is using a simple strategy:
root@tlp-cassandra-2:~# echo "DESCRIBE KEYSPACE system_distributed;" |
cqlsh $(hostname -I | awk '{print $1}')
CREATE KEYSPACE system_distributed WITH replication = {'class':
'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
Let's first check some stuff. Could you share the output of:
- echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh
[ip_address_of_the_server]
- nodetool status
- nodetool status system_distributed
- Let us know about the snitch you are using and the corresponding
configuration.
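As a quick sketch, the checks above could be run like this (the address below is a placeholder for any reachable node in your cluster; adjust it and run against a live node):

```shell
NODE_IP=xxx.xxx.145.5   # placeholder: any reachable node

# 1. Current replication settings for the suspect keyspace
echo "DESCRIBE KEYSPACE system_distributed;" | cqlsh "$NODE_IP"

# 2. Cluster-wide view, then the ownership view scoped to the keyspace
nodetool -h "$NODE_IP" status
nodetool -h "$NODE_IP" status system_distributed

# 3. Snitch in use (also worth checking cassandra-rackdc.properties on each node)
nodetool -h "$NODE_IP" describecluster | grep -i snitch
```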
I am trying to make sure the command you used is expected to work, given
your setup.
My guess is that you might need to alter this keyspace according to your
cluster setup.
Just guessing, hope that helps.
C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France
The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com
2016-09-22 15:47 GMT+02:00 Timo Ahokas <ti...@gmail.com>:
> Hi,
>
> We have a Cassandra 3.0.8 cluster (recently upgraded from 2.1.15)
> currently running in two data centers (13 and 19 nodes, RF3 in both). We
> are adding a third data center before decommissioning one of the earlier
> ones. Installing Cassandra (3.0.8) goes fine and all the nodes join the
> cluster (not set to bootstrap, as documented in
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html).
>
> When trying to rebuild nodes in the new DC from a previous DC (nodetool
> rebuild -- DC1), we get the following error:
>
> Unable to find sufficient sources for streaming range (597769692463489739,597931451954862346]
> in keyspace system_distributed
>
> The same error occurs whichever of the two existing DCs we try to rebuild
> from.
>
> We run primary-range repairs (nodetool repair -pr) on all nodes twice a week via cron.
>
> Any advice on how to get the rebuild started?
>
> Best regards,
> Timo