You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Ekaterina Dimitrova (Jira)" <ji...@apache.org> on 2020/05/12 16:24:00 UTC

[jira] [Comment Edited] (CASSANDRA-15234) Standardise config and JVM parameters

    [ https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086118#comment-17086118 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-15234 at 5/12/20, 4:23 PM:
---------------------------------------------------------------------------

Done but with one caveat which was caught thankfully with the DTests. Unit tests were even on 100% green at some runs. (And my unit test was not covering the problem conversion explained below, I should add it!)
I am new to snakeyaml so I might be wrong, I would appreciate if someone who has the experience proves me wrong.
From the DTest failures I realized that turning into Strings the values from the YAML we hit one issue. To me it looks like we already have it actually with Integer and String parameters, just they are not many so it didn't bite us up to now. Not sure whether this was considered when the Config class was built or not.

So the issue with int parameters in Config in my solution is:
Currently in trunk, when we parse to int variable, If a parameter is commented in the cassandra.yaml, this is observed as null and it will raise an exception in some cases. (i.e. commitlog_sync_period_in_ms) But if it is not commented, just not assigned a value, then this is considered as 0 (as we parse to int). 
The situation is different with String, Integer. They will be in both cases null. No differentiator. But currently we don't have many parameters being Integer and String and this wasn't a problem.
Now with the units attached to the parameters' values in the cassandra.yaml file, we need to parse strings with snakeyaml and then convert to the proper int values in Config.  

A way to cope with this issue is to make the users use "". In this way, if we have a parameter not commented, but not also having an assigned value in cassandra.yaml, we can identify it by checking a string for being empty. If the parameter is commented or not presented, it will be null. But this would be one more new thing for the users to consider. That's why I came back to snakeyaml again and tried to think of a different way utilizing it. Unfortunately, up to now I didn't find any different solution. Also, the links to the examples in the official snakeyaml documentation are broken. Not sure whether there they had some interesting stuff. Maybe something with customized constructor but also, I try to make as less as possible disruptive changes in the codebase in order not to mess up something with the Config which might be extremely unpleasant.

I will be happy to hear an experienced opinion here. Does anyone know a better solution? Or are we ok to introduce  "" in the yaml for empty parameters/Strings recognition?

[Part1|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-15234-2] (This part was already revised by [~dcapwell] but I paste the explanation again here for completeness).
 - New cassandra.yaml ---> currently cassandra-new.yaml in the branch as I wanted to test the backward compatibility on the old version (will have to rename properly also these yaml files with version probably)
reorganization of the file renames to make the names more consistent
- Backward compatibility with the old names - my approach is:
Cassandra comes with the new version of the yaml to ensure any new builds will be using it
During load of the yaml file I check whether old names are identified(means someone upgraded to 4 and wants to still use the old yaml). If this is the case, we load the values from the old yaml to the new variables (the already renamed parameters). Also, there is a warning to the user that there is new cassandra.yaml version and he/she can consider an update. 

[Part2|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-15234-3] - units added to the values in the cassandra.yaml file. Parsing and unit conversion in place. One unit test added, should be further developed. 

[DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest] updated accordingly for v.4

[Part3|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-15234-5] - SysProperties class, accessors, hope I didn't miss anything... most of them are actually just using System.getProperty method.
TO DO: Extract the parser/converter in a separate class from Config

[~mck]? [~dcapwell]? [~benedict]?  Anyone else any thoughts here?

NOTE: The branches are not fully cleaned, this is still work in progress but we need to take a decision on the String issue first, I think.


was (Author: e.dimitrova):
Done but with one caveat which was caught thankfully with the DTests. Unit tests were even on 100% green at some runs. (And my unit test was not covering the problem conversion explained below, I should add it!)
I am new to snakeyaml so I might be wrong, I would appreciate if someone who has the experience proves me wrong.
From the DTest failures I realized that turning into Strings the values from the YAML we hit one issue. To me it looks like we already have it actually with Integer and String parameters, just they are not many so it didn't byte us up to now. Not sure whether this was considered when the Config class was built or not.

So the issue with int parameters in Config in my solution is:
Currently in trunk, when we parse to int variable, If a parameter is commented in the cassandra.yaml, this is observed as null and it will raise an exception in some cases. (i.e. commitlog_sync_period_in_ms) But if it is not commented, just not assigned a value, then this is considered as 0 (as we parse to int). 
The situation is different with String, Integer. They will be in both cases null. No differentiator. But currently we don't have many parameters being Integer and String and this wasn't a problem.
Now with the units attached to the parameters' values in the cassandra.yaml file, we need to parse strings with snakeyaml and then convert to the proper int values in Config.  

A way to cope with this issue is to make the users use "". In this way, if we have a parameter not commented, but not also having an assigned value in cassandra.yaml, we can identify it by checking a string for being empty. If the parameter is commented or not presented, it will be null. But this would be one more new thing for the users to consider. That's why I came back to snakeyaml again and tried to think of a different way utilizing it. Unfortunately, up to now I didn't find any different solution. Also, the links to the examples in the official snakeyaml documentation are broken. Not sure whether there they had some interesting stuff. Maybe something with customized constructor but also, I try to make as less as possible disruptive changes in the codebase in order not to mess up something with the Config which might be extremely unpleasant.

I will be happy to hear an experienced opinion here. Does anyone know a better solution? Or are we ok to introduce  "" in the yaml for empty parameters/Strings recognition?

[Part1|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-15234-2] (This part was already revised by [~dcapwell] but I paste the explanation again here for completeness).
 - New cassandra.yaml ---> currently cassandra-new.yaml in the branch as I wanted to test the backward compatibility on the old version (will have to rename properly also these yaml files with version probably)
reorganization of the file renames to make the names more consistent
- Backward compatibility with the old names - my approach is:
Cassandra comes with the new version of the yaml to ensure any new builds will be using it
During load of the yaml file I check whether old names are identified(means someone upgraded to 4 and wants to still use the old yaml). If this is the case, we load the values from the old yaml to the new variables (the already renamed parameters). Also, there is a warning to the user that there is new cassandra.yaml version and he/she can consider an update. 

[Part2|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-15234-3] - units added to the values in the cassandra.yaml file. Parsing and unit conversion in place. One unit test added, should be further developed. 

[DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest] updated accordingly for v.4

[Part3|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-15234-5] - SysProperties class, accessors, hope I didn't miss anything... most of them are actually just using System.getProperty method.
TO DO: Extract the parser/converter in a separate class from Config

[~mck]? [~dcapwell]? [~benedict]?  Anyone else any thoughts here?

NOTE: The branches are not fully cleaned, this is still work in progress but we need to take a decision on the String issue first, I think.

> Standardise config and JVM parameters
> -------------------------------------
>
>                 Key: CASSANDRA-15234
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Config
>            Reporter: Benedict Elliott Smith
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0, 4.0-beta
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, both from the yams and JVM properties.  It would be nice to standardise the naming (such as otc_ vs internode_) as well as the provision of values with units - while maintaining perpetual backwards compatibility with the old parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or powers of 1000 such as {{KB/s}}, given these are regularly used for either their old or new definition e.g. {{KiB/s}}, or we could support them and simply log the value in bytes/s.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org