Posted to user@cassandra.apache.org by Jai Bheemsen Rao Dhanwada <ja...@gmail.com> on 2022/08/15 15:46:37 UTC

Cassandra 4.0 upgrade - Upgradesstables

Hello,

I am evaluating the upgrade from 3.11.x to 4.0.x, and as per CASSANDRA-14197
<https://issues.apache.org/jira/browse/CASSANDRA-14197> we no longer need to
run upgradesstables. We have tested this in a test environment and see that
setting "-Dcassandra.automatic_sstable_upgrade=true" takes care of upgrading
the sstables to the new format. Since my actual cluster has a very large
data size and more datacenters, I would like to know the general guidance on
this option. Is it guaranteed to upgrade all sstables, or is it best-effort,
with users still expected to run upgradesstables (I see some posts that
still run upgradesstables for 4.x upgrades)? If it is taken care of
automatically, is there a way to see the progress and the time it takes? Or
can the user ignore how long it takes and simply proceed with the upgrade on
all the datacenters sequentially, one after another?

Thanks in advance.

Re: Cassandra 4.0 upgrade - Upgradesstables

Posted by Jim Shaw <jx...@gmail.com>.
Though it is not required to run upgradesstables, upgradesstables -a will
rewrite the files and kick out tombstones. With size-tiered compaction, the
largest files may otherwise wait a long time for the next compaction to
kick out their tombstones.
So whether to run it or not really depends. Upgrades usually have a change
window when applications have no load or less load, so why not take the
chance to run it?

Regards,

Jim

On Tue, Aug 16, 2022 at 3:17 PM Jai Bheemsen Rao Dhanwada <
jaibheemsen@gmail.com> wrote:

> Thank you

Re: Cassandra 4.0 upgrade - Upgradesstables

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
Thank you

On Tue, Aug 16, 2022 at 11:48 AM C. Scott Andreas <sc...@paradoxica.net>
wrote:

> No downside at all for 3.x -> 4.x (however, Cassandra 3.x reading 2.1
> SSTables incurred a performance hit).
>
> Many users of Cassandra don't run upgradesstables after 3.x -> 4.x
> upgrades at all. It's not necessary to run until a hypothetical future time
> if/when support for reading Cassandra 3.x SSTables is removed from
> Cassandra. One of the most common reasons to avoid running upgradesstables
> is because doing so causes 100% churn of the data files, meaning your
> backup processes will need to upload a full copy of the data. Allowing
> SSTables to organically churn into the new version via compaction avoids
> this.
>
> If you're upgrading from 3.x to 4.x, don't feel like you have to - but it
> does avoid the need to run upgradesstables in a hypothetical distant future.
>
> – Scott

Re: Cassandra 4.0 upgrade - Upgradesstables

Posted by "C. Scott Andreas" <sc...@paradoxica.net>.
No downside at all for 3.x -> 4.x (however, Cassandra 3.x reading 2.1
SSTables incurred a performance hit).

Many users of Cassandra don't run upgradesstables after 3.x -> 4.x
upgrades at all. It's not necessary to run until a hypothetical future time
if/when support for reading Cassandra 3.x SSTables is removed from
Cassandra. One of the most common reasons to avoid running upgradesstables
is because doing so causes 100% churn of the data files, meaning your
backup processes will need to upload a full copy of the data. Allowing
SSTables to organically churn into the new version via compaction avoids
this.

If you're upgrading from 3.x to 4.x, don't feel like you have to - but it
does avoid the need to run upgradesstables in a hypothetical distant future.

– Scott

On Aug 16, 2022, at 6:32 AM, Jai Bheemsen Rao Dhanwada <ja...@gmail.com>
wrote:

> Thank you Erick,
>
> > it is going to be single-threaded by default so it will take a while to
> > get through all the sstables on dense nodes
> Is there any downside if the upgradesstables take longer (example 1-2
> days), other than I/O?
>
> Also when is the upgradesstable get triggered? after every node is
> upgraded or it will kick in only when all the nodes in the cluster upgraded
> to 4.0.x?
>
> On Tue, Aug 16, 2022 at 2:12 AM Erick Ramirez <er...@apache.org> wrote:
>
>> As convenient as it is, there are a few caveats and it isn't a silver
>> bullet. The automatic feature will only kick in if there are no other
>> compactions scheduled. Also, it is going to be single-threaded by default
>> so it will take a while to get through all the sstables on dense nodes.
>>
>> In contrast, you'll have a bit more control if you manually upgrade the
>> sstables. For example, you can schedule the upgrade during low traffic
>> periods so reads are not competing with compactions for IO. Cheers!
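
For anyone who wants to watch SSTables churn into the new format (whether
organically through compaction or through the automatic upgrade), here is a
minimal sketch that counts Data.db files per on-disk format version. It is
not an official tool. It assumes the usual "big" format file naming
("<version>-<generation>-big-Data.db"), and it assumes that 3.x versions
start with "m" (for example "md") while 4.0 writes "na"/"nb", so anything
sorting before "na" is treated as not yet upgraded. The function names and
the data directory in the usage note are illustrative only.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed naming: "<version>-<generation>-big-Data.db",
# e.g. "md-12-big-Data.db" (3.11) or "nb-3-big-Data.db" (4.0).
DATA_FILE = re.compile(r"^([a-z]{2})-[^-]+-big-Data\.db$")


def sstable_version(filename):
    """Return the two-letter format version of a Data.db file, else None."""
    match = DATA_FILE.match(filename)
    return match.group(1) if match else None


def count_versions(data_dir):
    """Count live Data.db files under data_dir, grouped by format version."""
    counts = Counter()
    for path in Path(data_dir).rglob("*-Data.db"):
        # Snapshot and backup hard links are never rewritten, so skip them.
        if "snapshots" in path.parts or "backups" in path.parts:
            continue
        version = sstable_version(path.name)
        if version:
            counts[version] += 1
    return counts


def pending_upgrade(counts):
    """Number of SSTables in a pre-4.0 format (assumption: version < 'na')."""
    return sum(n for version, n in counts.items() if version < "na")
```

Calling count_versions("/var/lib/cassandra/data") on a node (the path is an
assumption, adjust it to your data_file_directories) and passing the result
to pending_upgrade() gives a rough "how many files are left" number that
can be sampled periodically.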

Re: Cassandra 4.0 upgrade - Upgradesstables

Posted by Jai Bheemsen Rao Dhanwada <ja...@gmail.com>.
Thank you Erick,

> it is going to be single-threaded by default so it will take a while to
> get through all the sstables on dense nodes

Is there any downside if upgradesstables takes longer (for example, 1-2
days), other than I/O?

Also, when does upgradesstables get triggered? After each node is
upgraded, or only once all the nodes in the cluster have been upgraded
to 4.0.x?

On Tue, Aug 16, 2022 at 2:12 AM Erick Ramirez <er...@apache.org>
wrote:

> As convenient as it is, there are a few caveats and it isn't a silver
> bullet. The automatic feature will only kick in if there are no other
> compactions scheduled. Also, it is going to be single-threaded by default
> so it will take a while to get through all the sstables on dense nodes.
>
> In contrast, you'll have a bit more control if you manually upgrade the
> sstables. For example, you can schedule the upgrade during low traffic
> periods so reads are not competing with compactions for IO. Cheers!

Re: Cassandra 4.0 upgrade - Upgradesstables

Posted by Erick Ramirez <er...@apache.org>.
As convenient as it is, there are a few caveats and it isn't a silver
bullet. The automatic feature will only kick in if there are no other
compactions scheduled. Also, it is going to be single-threaded by default
so it will take a while to get through all the sstables on dense nodes.

In contrast, you'll have a bit more control if you manually upgrade the
sstables. For example, you can schedule the upgrade during low traffic
periods so reads are not competing with compactions for IO. Cheers!
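
As a rough illustration of the manual route, the sketch below builds a
"nodetool upgradesstables" command line so it can be scheduled in a
low-traffic window. It assumes nodetool is on the PATH of the node it runs
on; "-a" (rewrite all SSTables, including ones already on the current
version) and "-j" (number of SSTables to upgrade concurrently) are the
nodetool options as I understand them, so check "nodetool help
upgradesstables" on your version. The helper names and the keyspace and
table names in the usage note are hypothetical.

```python
import shlex
import subprocess


def upgradesstables_cmd(keyspace=None, tables=(), jobs=None, include_all=False):
    """Build a `nodetool upgradesstables` command as an argument list."""
    cmd = ["nodetool", "upgradesstables"]
    if include_all:
        cmd.append("-a")          # also rewrite SSTables already on the new version
    if jobs is not None:
        cmd += ["-j", str(jobs)]  # concurrency of the upgrade
    if keyspace:
        cmd += ["--", keyspace, *tables]
    return cmd


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
```

For example, run(upgradesstables_cmd("my_ks", ["my_table"], jobs=1)) only
prints the command until dry_run=False is passed, which makes it easy to
review what a cron job or change-window script would execute on each node.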
