Posted to commits@cassandra.apache.org by "Saranya Krishnakumar (Jira)" <ji...@apache.org> on 2022/11/04 20:10:00 UTC

[jira] [Updated] (CASSANDRA-17870) nodetool/rebuild: Add flag to exclude nodes from local datacenter

     [ https://issues.apache.org/jira/browse/CASSANDRA-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saranya Krishnakumar updated CASSANDRA-17870:
---------------------------------------------
    Description: 
During expansion by DC, we issue nodetool/rebuild from a node in the new DC to rebuild its data from the other DCs. If the source DC is not passed explicitly, C* may try to rebuild the data from the same (new) DC.

We don’t exclude other nodes in the same DC. Only down sources and the local node itself are excluded.
```
// We're _always_ filtering out a local node and down sources
addSourceFilter(new RangeStreamer.FailureDetectorSourceFilter(failureDetector));
addSourceFilter(new RangeStreamer.ExcludeLocalNodeFilter());
```

We should fix nodetool/rebuild so that it can exclude the local DC (the DC from which the command is executed) when nodetool/rebuild is issued without a source DC.
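
To illustrate the idea, here is a minimal sketch of how an extra "exclude local DC" filter could sit next to the two existing ones. It is not the attached patch and does not use Cassandra's actual RangeStreamer API: `Endpoint`, `excludeLocalDatacenter`, `eligibleAddresses` and the other names are hypothetical stand-ins chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class RebuildSourceFilterSketch {
    // Hypothetical stand-in for a candidate source replica; not a Cassandra class.
    public static final class Endpoint {
        public final String address;
        public final String datacenter;
        public final boolean alive;

        public Endpoint(String address, String datacenter, boolean alive) {
            this.address = address;
            this.datacenter = datacenter;
            this.alive = alive;
        }
    }

    // Mirrors the intent of the failure-detector filter: keep only live sources.
    public static Predicate<Endpoint> excludeDownSources() {
        return e -> e.alive;
    }

    // Mirrors the intent of the local-node filter: never stream from ourselves.
    public static Predicate<Endpoint> excludeLocalNode(String localAddress) {
        return e -> !e.address.equals(localAddress);
    }

    // The proposed addition: drop every node in the DC where rebuild is running.
    public static Predicate<Endpoint> excludeLocalDatacenter(String localDc) {
        return e -> !e.datacenter.equals(localDc);
    }

    // Returns the addresses of candidates that pass every active filter.
    public static List<String> eligibleAddresses(List<Endpoint> candidates,
                                                 String localAddress,
                                                 String localDc,
                                                 boolean excludeLocalDc) {
        List<Predicate<Endpoint>> filters = new ArrayList<>();
        filters.add(excludeDownSources());
        filters.add(excludeLocalNode(localAddress));
        if (excludeLocalDc) {
            filters.add(excludeLocalDatacenter(localDc));
        }

        List<String> result = new ArrayList<>();
        for (Endpoint e : candidates) {
            if (filters.stream().allMatch(f -> f.test(e))) {
                result.add(e.address);
            }
        }
        return result;
    }
}
```

With the flag off, a live peer in the new DC still passes both existing filters; with the flag on, only nodes from the other DCs remain.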

 

Example:
In a 3-DC cluster:
ks1 is replicated to DC1 and DC2
ks2 is replicated to DC1, DC2 and DC3
ks3 is replicated to DC2 only

Now we add a new DC (DC4) and configure it for all 3 keyspaces.

If we run rebuild with DC1 as the source DC, ks3 will fail because it has no replicas in DC1.

Without a source DC, the expectation is that rebuild automatically picks a source DC for each keyspace (say ks1: DC1, ks2: DC1, ks3: DC2) and never fails because of under-replicated keyspaces (such as ks3, which is missing from DC1).

The problem with this approach (no source DC) is that DC4 also gets picked as a source during rebuild, even though DC4 does not have any data yet.

With the patch (a flag to ignore the local DC), DC4 can be filtered out, which lets the database pick the right DC for each keyspace from the existing 3 DCs. This is the expected behavior after the patch.
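
The example above can be modeled with a small sketch. It only works at the DC level (real source selection is per token range and per replica), and every name in it (`RebuildExampleSketch`, `candidateSourceDcs`, `exampleKeyspaces`) is hypothetical rather than taken from Cassandra or from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RebuildExampleSketch {
    // Keyspace -> DCs it is replicated to, after DC4 has been added to all three.
    public static Map<String, List<String>> exampleKeyspaces() {
        Map<String, List<String>> keyspaces = new LinkedHashMap<>();
        keyspaces.put("ks1", Arrays.asList("DC1", "DC2", "DC4"));
        keyspaces.put("ks2", Arrays.asList("DC1", "DC2", "DC3", "DC4"));
        keyspaces.put("ks3", Arrays.asList("DC2", "DC4"));
        return keyspaces;
    }

    // For each keyspace, list the DCs that rebuild could use as a source.
    // When excludeLocalDc is true, the DC running the rebuild is dropped.
    public static Map<String, List<String>> candidateSourceDcs(
            Map<String, List<String>> replicationDcs, String localDc, boolean excludeLocalDc) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> ks : replicationDcs.entrySet()) {
            List<String> dcs = new ArrayList<>();
            for (String dc : ks.getValue()) {
                if (excludeLocalDc && dc.equals(localDc)) {
                    continue; // the new DC holds no data yet, so skip it
                }
                dcs.add(dc);
            }
            result.put(ks.getKey(), dcs);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> keyspaces = exampleKeyspaces();
        System.out.println("flag off: " + candidateSourceDcs(keyspaces, "DC4", false));
        System.out.println("flag on:  " + candidateSourceDcs(keyspaces, "DC4", true));
    }
}
```

With the flag on, each keyspace keeps only the DCs that already hold its data, so ks3 resolves to DC2 without the operator having to name a source DC.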

  was:
During expansion by DC, we issue nodetool/rebuild from a node in the new DC to rebuild its data from the other DCs. If the source DC is not passed explicitly, C* may try to rebuild the data from the same (new) DC.

We don’t exclude other nodes in the same DC. Only down sources and the local node itself are excluded.
```
// We're _always_ filtering out a local node and down sources
addSourceFilter(new RangeStreamer.FailureDetectorSourceFilter(failureDetector));
addSourceFilter(new RangeStreamer.ExcludeLocalNodeFilter());
```

We should fix nodetool/rebuild so that it can exclude the local DC (the DC from which the command is executed) when nodetool/rebuild is issued without a source DC.


> nodetool/rebuild: Add flag to exclude nodes from local datacenter
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-17870
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17870
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Tool/nodetool
>            Reporter: Saranya Krishnakumar
>            Assignee: Saranya Krishnakumar
>            Priority: Normal
>         Attachments: fix_nodetool_rebuild.diff
>
>          Time Spent: 40m
>  Remaining Estimate: 0h



