Posted to dev@cassandra.apache.org by "Claude Warren, Jr via dev" <de...@cassandra.apache.org> on 2024/03/18 14:39:41 UTC

Default table compression defined in yaml.

After much work by several people, I have pulled together the changes to
define the default compression in the cassandra.yaml file and have created
a pull request [1].

If you are interested in this topic, please take a look at the changes and
give at least a cursory review.

[1]  https://github.com/apache/cassandra/pull/3168

Thanks,
Claude

Re: Default table compression defined in yaml.

Posted by "Claude Warren, Jr via dev" <de...@cassandra.apache.org>.
The earlier request was to be able to take the CQL statement and (with very
little modification) use it in the YAML.  While I agree that introducing
a new setting only to remove it later is a bit wonky, it is necessary to
support the current CQL statement, unless the CQL statement has already
changed.

On Tue, Mar 19, 2024 at 10:52 AM Bowen Song via dev <
dev@cassandra.apache.org> wrote:

> I believe the `foobar_in_kb: 123` format in the cassandra.yaml file is
> deprecated, and the new format is `foobar: 123KiB`. Is there a need to
> introduce new settings entries with the deprecated format only to be
> removed at a later version?

Re: Default table compression defined in yaml.

Posted by "Claude Warren, Jr via dev" <de...@cassandra.apache.org>.
The text I posted above is taken directly from the yaml.  Is "sstable:"
intended to be the first segment of the yaml key for "default_compaction"?
If so, it won't work, because column_index_cache_size starts in the first
column.

I am happy to move the new configuration section, but I don't follow how
this is to work.



On Thu, Mar 21, 2024 at 1:23 PM Jacek Lewandowski <
lewandowski.jacek@gmail.com> wrote:

> Only indented items below "sstable" belong to "sstable". It is commented
> out by default to make it clear that it is not required and the default
> values apply.
>
> There are a number of sstable parameters which are historically spread
> across the yaml with no structure. The point is that we should not add to
> that mess and should instead try to group the new stuff.
>
> "default_compression" under the "sstable" key sounds good to me.
>
> - - -- --- ----- -------- -------------
> Jacek Lewandowski

Re: Default table compression defined in yaml.

Posted by Jacek Lewandowski <le...@gmail.com>.
Only indented items below "sstable" belong to "sstable". It is commented
out by default to make it clear that it is not required and the default
values apply.
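
A minimal sketch of that indentation rule, with the comment markers
removed. The values are illustrative. Placing default_compression under
"sstable", with an LZ4Compressor class and a chunk_length parameter, is the
layout proposed in this thread and is shown here as an assumption, not as
the final yaml:

```yaml
# Hypothetical cassandra.yaml fragment; placement and values are illustrative.
sstable:
  selected_format: big          # indented, so it is a child of "sstable"
  default_compression:          # proposed location (assumption, not final)
    class_name: LZ4Compressor
    parameters:
      chunk_length: 16KiB
column_index_cache_size: 2KiB   # starts in the first column, so it is a top-level key
```

Any key that starts in the first column ends the "sstable" mapping, so it
is read as a top-level setting regardless of where it sits in the file.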

There are a number of sstable parameters which are historically spread
across the yaml with no structure. The point is that we should not add to
that mess and should instead try to group the new stuff.

"default_compression" under the "sstable" key sounds good to me.

- - -- --- ----- -------- -------------
Jacek Lewandowski


On Thu, 21 Mar 2024 at 08:32, Claude Warren, Jr via dev <
dev@cassandra.apache.org> wrote:

> Jacek,
>
> I am a bit confused here.  I find a key for "sstable" in the yaml but it
> is commented out by default.  There are a number of options under it that
> are commented out and then one that is not and then the
> "default_compaction" section, which I assume is supposed to apply to the
> "sstable" section.  Are you saying that the "sstable_compression" section
> that we introduced should be placed as a child of the "sstable" key (and
> probably renamed to "default_compression")?
>
> I have included the keys from the trunk yaml below with non-key comments
> excluded.  The way I read it either the "sstable" key is not required and a
> user can just uncomment "column_index_size"; or "column_index_cache_size"
> is not really used because it would be under
> "sstable/column_index_cache_size" in the Config; or the "sstable:" is only
> intended to be a visual break / section for the human editor.
>
> Can you or someone clarify this for me?
>
> #sstable:
> #  selected_format: big
> # column_index_size: 4KiB
> column_index_cache_size: 2KiB
> # default_compaction:
> #   class_name: SizeTieredCompactionStrategy
> #   parameters:
> #     min_threshold: 4
> #     max_threshold: 32

Re: Default table compression defined in yaml.

Posted by "Claude Warren, Jr via dev" <de...@cassandra.apache.org>.
Jacek,

I am a bit confused here.  I find a key for "sstable" in the yaml but it is
commented out by default.  There are a number of options under it that are
commented out and then one that is not and then the "default_compaction"
section, which I assume is supposed to apply to the "sstable" section.  Are
you saying that the "sstable_compression" section that we introduced should
be placed as a child of the "sstable" key (and probably renamed to
"default_compression")?

I have included the keys from the trunk yaml below with non-key comments
excluded.  The way I read it either the "sstable" key is not required and a
user can just uncomment "column_index_size"; or "column_index_cache_size"
is not really used because it would be under
"sstable/column_index_cache_size" in the Config; or the "sstable:" is only
intended to be a visual break / section for the human editor.

Can you or someone clarify this for me?

#sstable:
#  selected_format: big
# column_index_size: 4KiB
column_index_cache_size: 2KiB
# default_compaction:
#   class_name: SizeTieredCompactionStrategy
#   parameters:
#     min_threshold: 4
#     max_threshold: 32

On Wed, Mar 20, 2024 at 10:31 PM Jacek Lewandowski <
lewandowski.jacek@gmail.com> wrote:

> Compression params for sstables should be under the "sstable" key.
>
>
> - - -- --- ----- -------- -------------
> Jacek Lewandowski

Re: Default table compression defined in yaml.

Posted by Jacek Lewandowski <le...@gmail.com>.
Compression params for sstables should be under the "sstable" key.


- - -- --- ----- -------- -------------
Jacek Lewandowski


On Tue, 19 Mar 2024 at 13:10, Ekaterina Dimitrova <e....@gmail.com>
wrote:

> Any new settings are expected to be added in the new format.

Re: Default table compression defined in yaml.

Posted by "Claude Warren, Jr via dev" <de...@cassandra.apache.org>.
We cannot support both the "rule" that new settings have the new format
and the design goal that the CQL statement format be accepted.

We came to a compromise:
- We introduced the new chunk_length parameter, which requires a
  DataStorageSpec.
- We reused the CQL chunk_length_in_kb parameter to accept the CQL format.

I think this is a reasonable compromise.  We could emphasize the
chunk_length parameter in documentation changes and leave the
chunk_length_in_kb parameter to a discussion of converting CQL to YAML
configuration.
We could add a log message that recommends the equivalent chunk_length
value based on the chunk_length_in_kb value.  But I do not see a way to
satisfy both the new-format requirement and the CQL-format requirement.

We can deprecate chunk_length_in_kb as soon as CQL changes to use a
DataStorageSpec for its parameter, but I do not know of any proposals to
change CQL.
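
A rough sketch of the two forms this compromise allows. The section name
default_compression, the class_name/parameters nesting (borrowed from the
default_compaction entry quoted elsewhere in this thread), and the 16 KiB
value are assumptions for illustration and may not match the pull request:

```yaml
# CQL form, for comparison:
#   WITH compression = {'class': 'LZ4Compressor', 'chunk_length_in_kb': 16}

# Hypothetical YAML equivalents; the section name and nesting are assumptions.
# Preferred form: chunk_length takes a DataStorageSpec value.
default_compression:
  class_name: LZ4Compressor
  parameters:
    chunk_length: 16KiB

# CQL-compatible form: the same setting, copied from CQL almost verbatim.
# default_compression:
#   class_name: LZ4Compressor
#   parameters:
#     chunk_length_in_kb: 16
```

Only one of the two chunk-length keys would be set at a time. The second
form exists so that a CQL compression map can be carried over with little
more than a change of punctuation.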


On Tue, Mar 19, 2024 at 1:09 PM Ekaterina Dimitrova <e....@gmail.com>
wrote:

> Any new settings are expected to be added in the new format.

Re: Default table compression defined in yaml.

Posted by Ekaterina Dimitrova <e....@gmail.com>.
Any new settings are expected to be added in the new format.

On Tue, 19 Mar 2024 at 5:52, Bowen Song via dev <de...@cassandra.apache.org>
wrote:

> I believe the `foobar_in_kb: 123` format in the cassandra.yaml file is
> deprecated, and the new format is `foobar: 123KiB`. Is there a need to
> introduce new settings entries with the deprecated format only to be
> removed at a later version?

Re: Default table compression defined in yaml.

Posted by Bowen Song via dev <de...@cassandra.apache.org>.
I believe the `foobar_in_kb: 123` format in the cassandra.yaml file is 
deprecated, and the new format is `foobar: 123KiB`. Is there a need to 
introduce new settings entries with the deprecated format only to be 
removed at a later version?
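
For a concrete instance of the two styles, the fragment below uses the
column_index_cache_size setting that appears elsewhere in this thread. The
older key name, column_index_cache_size_in_kb, is assumed here to be the
pre-4.1 spelling of the same setting:

```yaml
# Older style: the unit is encoded in the key name and the value is a bare number.
# column_index_cache_size_in_kb: 2

# Newer style: the unit is part of the value.
column_index_cache_size: 2KiB
```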


On 18/03/2024 14:39, Claude Warren, Jr via dev wrote:
> After much work by several people, I have pulled together the changes 
> to define the default compression in the cassandra.yaml file and have 
> created a pull request [1].
>
> If you are interested in this topic, please take a look at the changes
> and give at least a cursory review.
>
> [1] https://github.com/apache/cassandra/pull/3168
>
> Thanks,
> Claude