Posted to user@cassandra.apache.org by Reynald Bourtembourg <re...@esrf.fr> on 2015/11/18 16:45:49 UTC

Migrating to incremental repairs

Hi,

We currently have a 3-node Cassandra cluster with RF = 3.
We are using Cassandra 2.1.7.

We would like to start using incremental repairs.
We have some tables using LCS compaction strategy and some others using 
STCS.

Here is the procedure written in the documentation:

To migrate to incremental repair, one node at a time:

 1. Disable compaction on the node using nodetool disableautocompaction.
 2. Run the default full, sequential repair.
 3. Stop the node.
 4. Use the tool sstablerepairedset to mark all the SSTables that were
    created before you disabled compaction.
 5. Restart cassandra
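
As I read them, the five steps above translate into roughly the following commands for one node. This is only a sketch under assumptions of mine: the helper name migration_commands is hypothetical, the "service cassandra stop/start" commands assume a service-managed install, and as far as I can tell sstablerepairedset also expects --really-set as its first argument (please check the usage output of your version). Nothing is executed; the commands are only listed.

```python
from typing import List


def migration_commands(sstable_list_file: str) -> List[List[str]]:
    """Return the five documented steps for ONE node as argv lists.

    Nothing is executed here; the caller decides how and where to run them.
    """
    return [
        # 1. Disable compaction on the node.
        ["nodetool", "disableautocompaction"],
        # 2. Default repair; on 2.1 this should be the full, sequential one.
        ["nodetool", "repair"],
        # 3. Stop the node (ASSUMPTION: a service-managed install).
        ["sudo", "service", "cassandra", "stop"],
        # 4. Mark the listed SSTables as repaired while the node is down.
        ["sstablerepairedset", "--really-set", "--is-repaired",
         "-f", sstable_list_file],
        # 5. Restart Cassandra (same ASSUMPTION as step 3).
        ["sudo", "service", "cassandra", "start"],
    ]


if __name__ == "__main__":
    for argv in migration_commands("list_of_sstable_names.txt"):
        print(" ".join(argv))
```

Printing the plan first makes it easy to review each step before running anything on a production node.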


In our case, a full sequential repair takes about 5 days.
If I follow the procedure described above, and if my understanding is
correct, it will take at least 15 days (3 repairs of 5 days each) before
we can start using incremental repairs, since we need to do it one node
at a time (one full sequential repair per node). Is that right?

If my understanding is correct, what is the rationale for running a full
sequential repair once for each node?
I understood that a full sequential repair would repair all the SSTables
on all the nodes. So doing it only once should be enough, right?
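
To make the arithmetic explicit, here is a minimal sketch comparing the two scenarios. The function name total_repair_days is mine, and it assumes the full repairs run strictly one after another and each takes about the same time; it says nothing about whether a single repair is actually sufficient.

```python
def total_repair_days(days_per_full_repair: float, full_repairs: int) -> float:
    """Total time spent in full repairs, assuming they run back to back."""
    if full_repairs < 0 or days_per_full_repair < 0:
        raise ValueError("inputs must be non-negative")
    return days_per_full_repair * full_repairs


NODES = 3            # our cluster size
DAYS_PER_REPAIR = 5  # observed duration of one full sequential repair

# Documented procedure: one full repair per node.
per_node = total_repair_days(DAYS_PER_REPAIR, NODES)
# Hoped-for shortcut: a single full repair for the whole cluster.
single = total_repair_days(DAYS_PER_REPAIR, 1)

if __name__ == "__main__":
    print(f"one repair per node: {per_node} days, single repair: {single} days")
```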

Is it possible to do the following instead of what is written in the
documentation?
  - Run disableautocompaction on all nodes at the same time
  - Run the full sequential repair
  - For each node:
         - Stop one node
         - Use the tool sstablerepairedset to mark all the SSTables that
           were created before you disabled compaction
         - Restart Cassandra
That is, without having to run the full sequential repair 3 times?

The documentation states that if we don't execute this migration
procedure, the first time we run an incremental repair, Cassandra will
perform size-tiering on all SSTables because their repaired/unrepaired
status is unknown, and this operation can take a long time.
Do you think this operation could take more than 15 days in our case?

I understood that we only need to use sstablerepairedset on the SSTables 
related to the tables using LCS compaction strategy and which were 
created before the auto compaction was disabled.
Is my understanding correct?

The documentation is not very explicit but I suppose the following sentence:
"4. Use the tool sstablerepairedset to mark all the SSTables that were 
created before you disabled compaction."
means we need to invoke "sstablerepairedset --is-repaired -f 
list_of_sstable_names.txt" on the LCS SSTables that were created before 
the compaction was disabled.

Is this correct?
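
For what it's worth, here is how I would try to build list_of_sstable_names.txt. It is a sketch under my own assumptions, not something taken from the documentation: the function names are hypothetical, SSTable data files are assumed to end in "-Data.db", the file modification time is used as a stand-in for the creation time, and the snapshots and backups subdirectories are skipped.

```python
import os
from pathlib import Path
from typing import Iterable, List

SKIPPED_DIRS = {"snapshots", "backups"}  # ASSUMPTION: not live SSTables


def sstables_created_before(data_dir: str, cutoff_epoch: float) -> List[str]:
    """Return sorted paths of *-Data.db files last modified before the cutoff.

    cutoff_epoch is the time (seconds since the epoch) at which
    autocompaction was disabled on this node.
    """
    root = Path(data_dir)
    found = []
    for path in root.rglob("*-Data.db"):
        # Ignore snapshot and backup copies below the data directory.
        if SKIPPED_DIRS.intersection(path.relative_to(root).parts):
            continue
        if path.is_file() and path.stat().st_mtime < cutoff_epoch:
            found.append(str(path))
    return sorted(found)


def write_sstable_list(paths: Iterable[str], list_file: str) -> None:
    """Write one SSTable path per line, as input for the -f option."""
    with open(list_file, "w") as handle:
        for sstable in paths:
            handle.write(sstable + os.linesep)
```

To restrict the list to the LCS tables, the function could be called once per table directory instead of on the whole data directory.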

Do we need to enableautocompaction again after the Cassandra restart or 
is it done automatically?

Would you recommend that we upgrade our Cassandra version before
starting the incremental repair migration?

Thank you for your help and sorry for the long e-mail.

Reynald






Re: [Marketing Mail] Migrating to incremental repairs

Posted by Reynald Bourtembourg <re...@esrf.fr>.
Dear Lorina,

Thank you and your support team very much for your answer.
I understand why you won't document the special case RF = N = 3. Thank 
you for clarifying this special case.

I've found this answer from Marcus Eriksson stating that autocompaction
should be re-enabled after the migration:
https://www.mail-archive.com/user@cassandra.apache.org/msg40305.html

It would be nice to add this step to the documentation if it doesn't
happen automatically after the Cassandra restart.

What about Stefano's question? (Thanks, by the way, Stefano!)
Can anyone confirm that recent versions of Cassandra (and from which
version onwards) require nothing beyond the -inc switch when migrating
to incremental repair?
I understand that the migration procedure is not mandatory, especially
when there is not much data to migrate, but the documentation seems to
imply that if the migration steps are skipped, the first incremental
repair could take a very long time.
Can anyone clarify this point, please?
Has anyone tried incremental repairs without the migration procedure on
a significant amount of data?
How much longer did it take?

Thank you very much for your help

Kind regards

Reynald

On 19/11/2015 22:43, Lorina Poland wrote:
> Hi Reynald,
>
> I asked Support about your email. They said that since your RF = N (3
> nodes, 3 replicas), one full repair on any node will suffice. You
> are an edge case, so we won't be documenting this solution.
> As for your other questions, I'm guessing you are not a DataStax 
> customer. I would recommend the Apache Cassandra user forum 
> (http://www.mail-archive.com/user@cassandra.apache.org/), where you 
> can ask your questions and receive feedback. Stackoverflow is another 
> good forum for Cassandra questions 
> (http://stackoverflow.com/questions/tagged/cassandra).
>
> Sincerely,
> Lorina Poland
>

Re: [Marketing Mail] Migrating to incremental repairs

Posted by Stefano Ortolani <os...@gmail.com>.
As far as I know, the docs are quite inconsistent on the matter.
Based on some research here and on IRC, recent versions of Cassandra do not
require anything specific when migrating to incremental repairs other than
the -inc switch, even on LCS.

Any confirmation on the matter is more than welcome.

Regards,
Stefano


Re: [Marketing Mail] Migrating to incremental repairs

Posted by Reynald Bourtembourg <re...@esrf.fr>.
Well,

On re-reading my e-mail, I now understand the rationale behind doing a
full sequential repair for each node.
I was confused by the fact that in our case we have 3 nodes with RF = 3,
so every node stores a replica of all the data.
So we are in a special case.
As soon as you have more than 3 nodes (with RF = 3), this is no longer
the case.
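
To convince myself, I sketched a toy model of replica placement. It rests on simplifying assumptions of mine (one token per node, no vnodes, a single data center, and each range stored on its owner plus the next RF - 1 nodes around the ring), and the function names are hypothetical. It only illustrates why RF = N is special; it is not how Cassandra computes anything internally.

```python
from typing import List, Set


def replicas_for_range(range_index: int, nodes: int, rf: int) -> Set[int]:
    """Nodes storing a range: its owner plus the next rf - 1 nodes clockwise."""
    return {(range_index + k) % nodes for k in range(min(rf, nodes))}


def ranges_covered_by_repair(node: int, nodes: int, rf: int) -> List[int]:
    """Ranges touched by a full repair on `node`: every range it replicates."""
    return [r for r in range(nodes)
            if node in replicas_for_range(r, nodes, rf)]


def one_repair_covers_everything(nodes: int, rf: int) -> bool:
    """True if a full repair on a single node touches every range."""
    return len(ranges_covered_by_repair(0, nodes, rf)) == nodes


if __name__ == "__main__":
    for n in (3, 6):
        print(n, "nodes, RF 3:", one_repair_covers_everything(n, 3))
```

With 3 nodes and RF = 3 every node replicates every range, so a repair on one node touches all the data; with 6 nodes and RF = 3 a single node only replicates half of the ranges in this model.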

In any case, in our special case (3 nodes and RF = 3), could we apply the
following migration procedure?
  - Run disableautocompaction on all nodes at the same time
  - Run the full sequential repair
  - For each node:
         - Stop the node
         - Use the tool sstablerepairedset to mark all the SSTables that
           were created before you disabled compaction
         - Restart Cassandra

I'd be glad if someone could answer my other questions in any case ;-).

Thanks in advance for your help

Reynald

