You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cassandra.apache.org by "benedict@apache.org" <be...@apache.org> on 2022/03/14 09:41:35 UTC

Updating our Code Contribution/Style Guide

Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.

Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:

https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo

Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:


  *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
  *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
  *   Modified new-line rules for multi-line function calls
  *   External dependency rules (require DISCUSS thread before introducing)




Re: Updating our Code Contribution/Style Guide

Posted by Jacek Lewandowski <le...@gmail.com>.
Regarding `instance()` method / `instance` field - to clarify my point - we
usually use that in many places. While it is quite easy to access by method
rather than by a field from the beginning, regardless if there is a need
for a mock immediately or not, it would be a much bigger change in terms of
lines/conflicts when we decide to change that later. This is just a
suggestion to make it more testable in our world full of singleton without
mass refactoring.

Obviously, we can formulate the requirement in different words - do have
80%+ coverage of unit tests (not in-jvm dtests) for the new / changed code.

thanks
- - -- --- ----- -------- -------------
Jacek Lewandowski


On Sun, Mar 20, 2022 at 8:55 PM Mick Semb Wever <mc...@apache.org> wrote:

>
>
> On Tue, 15 Mar 2022 at 11:46, Ruslan Fomkin <ru...@gmail.com>
> wrote:
>
>> …
>> I support Jacek’s request to have each argument on a separate line when
>> they are many and need to be placed on multiple lines. For me it takes less
>> effort to grasp arguments on separate lines than when several arguments are
>> combined on the same line. IMHO the root cause is having too many
>> arguments, which is common issue for non-OOP languages.
>>
>
>
> I support this too. It has always bugged me that by having the first
> parameter not on a newline, then when the method (or variable assigned)
> name changes it causes all subsequent wrapped parameter lines have to be
> re-indented, which leads to more noise in, and less readability of, the
> patch.
>
> For example,
>
> method(
>     "aaaaaaaaaaaaa",
>     "bbbbbbbbbbbbb",
>     "ccccccccccccc"
> )
>
> is better than
>
> *method*("aaaaaaaaaaaaa",
>
>        "bbbbbbbbbbbbb",
>
>        "ccccccccccccc"
> )
>
> I also agree that several arguments on the one line should be avoided, that too many method parameters is the problem here.
>
> I would also like to suggest that an operator should always carry on line wraps. This makes it faster to read the difference between arguments per line wrapping, and operations over multiple lines.
> For example, ensuring it looks like
>
> var = bar == null
>
>       ? doFoo()
>
>       : doBar();
>
> and not
>
> var = bar == null ?
>
>       doFoo() :
>
>       doBar();
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by Mick Semb Wever <mc...@apache.org>.
> It looks like the doc already specified this behaviour for ternary
> operator line wrapping. For your proposal I’ve also added the following:
>
>
>
> It is usually preferable to carry the operator for multiline expressions,
> with the exception of some multiline string literals.
>
>
>
> Does that work for you? The “usually” at least leaves some wiggle room, as
> ultimately I would prefer this decision to be made by an author (even if a
> general norm of carrying the operator is preferable).
>
>
>
> I am concerned that this starts leaning towards being too specific,
> though. It opens up questions like whether we should also be specifying
> spacing for loop guards, conditions, casts, etc?
>
>

Yes I'm good with that, thanks.

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
It looks like the doc already specified this behaviour for ternary operator line wrapping. For your proposal I’ve also added the following:

It is usually preferable to carry the operator for multiline expressions, with the exception of some multiline string literals.

Does that work for you? The “usually” at least leaves some wiggle room, as ultimately I would prefer this decision to be made by an author (even if a general norm of carrying the operator is preferable).

I am concerned that this starts leaning towards being too specific, though. It opens up questions like whether we should also be specifying spacing for loop guards, conditions, casts, etc?


From: benedict@apache.org <be...@apache.org>
Date: Sunday, 20 March 2022 at 21:37
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide
> We are talking about one extra line, not a dozen or more.

I think you are confused about the context. The case I was discussing often means 10+ additional lines at each call-site.

> Once the code gets more real, it is faster to read the difference between (a) and (c)

This isn’t a great example, as if you are line-wrapping with multiple parameters you should be assigning any computation to a clearly named local variable before passing the result to the constructor (amongst other things). We can perhaps highlight this in the style guide.

We also do not only produce multi-line computations in this context. If constructing into a variable (where there is no such ambiguity), it is much easier to parse parameters first without the concatenation operator preceding them.

My point is simply that legislating on this kind of detail is a waste of our time, and probably counter-productive. I don’t want to enumerate all the possible ways we might construct multi-line computations.

Ternary operators are pretty clear, so maybe we can just agree to define those, and leave the rest to the judgement of the authors?


From: Mick Semb Wever <mc...@apache.org>
Date: Sunday, 20 March 2022 at 20:56
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide


> I support this too… leads to more noise in, and less readability of, the patch.

Readability of the patch is not harmed with modern tooling (with whitespace being highlighted differently to content changes).

Legibility of the code (not patch) should always be preferred IMO. To aid code comprehension, we should aim for density of useful information for the reader; wasting a dozen or more lines on zero information density, solely to solve a problem already handled by modern diff tools, is a false economy.


We are talking about one extra line, not a dozen or more.
It also improves the readability of the code IMHO.


> I would also like to suggest that an operator should always carry on line wraps

For the ternary operator I agree, however I am less convinced in other cases. String concatenation is probably cleaner with the opposite norm, so that string literals are aligned.


IMHO it works for string concatenation too.
The example that comes to mind is
a)

method(
    "aaaaaaaaaaaaa",
    "bbbbbbbbbbbbb",
    "ccccccccccccc"
)
b)

method(
    "aaaaaaaaaaaaa" +
    "bbbbbbbbbbbbb" +
    "ccccccccccccc"
)
c)

method(
    "aaaaaaaaaaaaa"
    + "bbbbbbbbbbbbb"
    + "ccccccccccccc"
)

Once the code gets more real, it is faster to read the difference between (a) and (c) than it (a) and (b).




Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
> We are talking about one extra line, not a dozen or more.

I think you are confused about the context. The case I was discussing often means 10+ additional lines at each call-site.

> Once the code gets more real, it is faster to read the difference between (a) and (c)

This isn’t a great example, as if you are line-wrapping with multiple parameters you should be assigning any computation to a clearly named local variable before passing the result to the constructor (amongst other things). We can perhaps highlight this in the style guide.

We also do not only produce multi-line computations in this context. If constructing into a variable (where there is no such ambiguity), it is much easier to parse parameters first without the concatenation operator preceding them.

My point is simply that legislating on this kind of detail is a waste of our time, and probably counter-productive. I don’t want to enumerate all the possible ways we might construct multi-line computations.

Ternary operators are pretty clear, so maybe we can just agree to define those, and leave the rest to the judgement of the authors?


From: Mick Semb Wever <mc...@apache.org>
Date: Sunday, 20 March 2022 at 20:56
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide


> I support this too… leads to more noise in, and less readability of, the patch.

Readability of the patch is not harmed with modern tooling (with whitespace being highlighted differently to content changes).

Legibility of the code (not patch) should always be preferred IMO. To aid code comprehension, we should aim for density of useful information for the reader; wasting a dozen or more lines on zero information density, solely to solve a problem already handled by modern diff tools, is a false economy.


We are talking about one extra line, not a dozen or more.
It also improves the readability of the code IMHO.


> I would also like to suggest that an operator should always carry on line wraps

For the ternary operator I agree, however I am less convinced in other cases. String concatenation is probably cleaner with the opposite norm, so that string literals are aligned.


IMHO it works for string concatenation too.
The example that comes to mind is
a)

method(
    "aaaaaaaaaaaaa",
    "bbbbbbbbbbbbb",
    "ccccccccccccc"
)
b)

method(
    "aaaaaaaaaaaaa" +
    "bbbbbbbbbbbbb" +
    "ccccccccccccc"
)
c)

method(
    "aaaaaaaaaaaaa"
    + "bbbbbbbbbbbbb"
    + "ccccccccccccc"
)

Once the code gets more real, it is faster to read the difference between (a) and (c) than it (a) and (b).




Re: Updating our Code Contribution/Style Guide

Posted by Mick Semb Wever <mc...@apache.org>.
> I support this too… leads to more noise in, and less readability of, the
> patch.
>
>
>
> Readability of the patch is not harmed with modern tooling (with
> whitespace being highlighted differently to content changes).
>
>
>
> Legibility of the code (not patch) should always be preferred IMO. To aid
> code comprehension, we should aim for density of useful information for the
> reader; wasting a dozen or more lines on zero information density, solely
> to solve a problem already handled by modern diff tools, is a false economy.
>


We are talking about one extra line, not a dozen or more.
It also improves the readability of the code IMHO.



> > I would also like to suggest that an operator should always carry on
> line wraps
>
>
>
> For the ternary operator I agree, however I am less convinced in other
> cases. String concatenation is probably cleaner with the opposite norm, so
> that string literals are aligned.
>


IMHO it works for string concatenation too.
The example that comes to mind is
a)

*method*(
    "aaaaaaaaaaaaa",
    "bbbbbbbbbbbbb",
    "ccccccccccccc"
)

b)

*method*(
    "aaaaaaaaaaaaa" +
    "bbbbbbbbbbbbb" +
    "ccccccccccccc"
)

c)

*method*(
    "aaaaaaaaaaaaa"
    + "bbbbbbbbbbbbb"
    + "ccccccccccccc"
)


Once the code gets more real, it is faster to read the difference between
(a) and (c) than it (a) and (b).

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
> I support this too… leads to more noise in, and less readability of, the patch.

Readability of the patch is not harmed with modern tooling (with whitespace being highlighted differently to content changes).

Legibility of the code (not patch) should always be preferred IMO. To aid code comprehension, we should aim for density of useful information for the reader; wasting a dozen or more lines on zero information density, solely to solve a problem already handled by modern diff tools, is a false economy.


> I also agree that several arguments on the one line should be avoided, that too many method parameters is the problem here.

Method parameters aren’t a problem if they are strongly typed, parameters are cleanly grouped, and all call-sites utilise approximately the same behaviour. The alternatives are builders or mutability, the latter of which we broadly avoid. Builders make code navigation clunkier (creating more indirection to navigate when searching callers), as well as potentially creating additional heap pressure and code pollution (both in the call sites and the builder itself).

Builders are helpful when there are lot of different ways to configure an object, but the more common case of simply propagating a relevant subset of existing parameters (plus perhaps a couple of new but required parameters), at just a handful of equivalent call-sites, they are IMO unhelpful.

Note also that builders have the exact same legibility concerns as parameter-per-line: a significant amount of screen real-estate is taken up by scaffolding/noise. This is only useful if the builder call-sites communicate some unique configuration details about the particular call-site.

> I would also like to suggest that an operator should always carry on line wraps

For the ternary operator I agree, however I am less convinced in other cases. String concatenation is probably cleaner with the opposite norm, so that string literals are aligned.



From: Mick Semb Wever <mc...@apache.org>
Date: Sunday, 20 March 2022 at 19:55
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide


On Tue, 15 Mar 2022 at 11:46, Ruslan Fomkin <ru...@gmail.com>> wrote:
…
I support Jacek’s request to have each argument on a separate line when they are many and need to be placed on multiple lines. For me it takes less effort to grasp arguments on separate lines than when several arguments are combined on the same line. IMHO the root cause is having too many arguments, which is common issue for non-OOP languages.


I support this too. It has always bugged me that by having the first parameter not on a newline, then when the method (or variable assigned) name changes it causes all subsequent wrapped parameter lines have to be re-indented, which leads to more noise in, and less readability of, the patch.

For example,

method(
    "aaaaaaaaaaaaa",
    "bbbbbbbbbbbbb",
    "ccccccccccccc"
)

is better than

method("aaaaaaaaaaaaa",

       "bbbbbbbbbbbbb",

       "ccccccccccccc"
)

I also agree that several arguments on the one line should be avoided, that too many method parameters is the problem here.

I would also like to suggest that an operator should always carry on line wraps. This makes it faster to read the difference between arguments per line wrapping, and operations over multiple lines.
For example, ensuring it looks like

var = bar == null

      ? doFoo()

      : doBar();

and not

var = bar == null ?

      doFoo() :

      doBar();



Re: Updating our Code Contribution/Style Guide

Posted by Mick Semb Wever <mc...@apache.org>.
On Tue, 15 Mar 2022 at 11:46, Ruslan Fomkin <ru...@gmail.com> wrote:

> …
> I support Jacek’s request to have each argument on a separate line when
> they are many and need to be placed on multiple lines. For me it takes less
> effort to grasp arguments on separate lines than when several arguments are
> combined on the same line. IMHO the root cause is having too many
> arguments, which is common issue for non-OOP languages.
>


I support this too. It has always bugged me that by having the first
parameter not on a newline, then when the method (or variable assigned)
name changes it causes all subsequent wrapped parameter lines have to be
re-indented, which leads to more noise in, and less readability of, the
patch.

For example,

method(
    "aaaaaaaaaaaaa",
    "bbbbbbbbbbbbb",
    "ccccccccccccc"
)

is better than

*method*("aaaaaaaaaaaaa",

       "bbbbbbbbbbbbb",

       "ccccccccccccc"
)

I also agree that several arguments on the one line should be avoided,
that too many method parameters is the problem here.

I would also like to suggest that an operator should always carry on
line wraps. This makes it faster to read the difference between
arguments per line wrapping, and operations over multiple lines.
For example, ensuring it looks like

var = bar == null

      ? doFoo()

      : doBar();

and not

var = bar == null ?

      doFoo() :

      doBar();

Re: Updating our Code Contribution/Style Guide

Posted by Yifan Cai <yc...@gmail.com>.
+1 to the guideline.


> > For the instance() / getInstance() methods - I know it is an additional
> effort, but on the other hand it has many advantages because you can
> replace the singleton for testing
>
> Again, do this as necessary. I think for public instances this is a fine
> recommendation, but for private uses it should not be prescribed, only used
> if there is an explicit benefit.


It is regarding testability. Where mock is desired, there should be
getter methods, instead of 'public final'. Otherwise, `public final` is
preferred for its simplicity.
It is more tricky in terms of singletons though. I feel there is no good
use of private singleton, which is ugly and makes the referencing code
difficult to test. So probably for singleton, we want to declare the
'instance()' method.
It is good that the guideline is not rigid.


I don’t think it is good idea to prohibit or discourage to use final, which
> is a tool to guard immutability.

Ruslan,
What is proposed is to prohibit or discourage the use of `final` *within a
method body*. I think it is less useful to mark a variable's *reference* as
being immutable within such scope. In the other scenario, e.g. class member
fields, `final` should be used when reference/primitive immutability is
desired.

- Yifan

On Tue, Mar 15, 2022 at 3:46 AM Ruslan Fomkin <ru...@gmail.com>
wrote:

> Hi,
>
> I hope it’s OK I jump to the discussion.
>
> I find it is important to automate code formatting and have a build check
> to verify it, otherwise there are many examples in other projects that
> formatting is not followed. To make formatting to be not painful for
> contributors it will be good to setup git commit hooks (which will require
> to have a command line formatting tool) in addition to IDE support. In such
> case the main task for the formatting CI build check will be to fail
> environments, which are not yet set.
> For example, cassandra-dtest already has a CI formatting check in place
> for Python code, which runs on each PR. There is a Python formatting
> command line tool, which can be easily run locally, and if I don’t mistake
> it is easy to setup git commit hook with it. (also works to setup the
> formatting in VScode)
>
> I don’t think it is good idea to prohibit or discourage to use final,
> which is a tool to guard immutability. As mentioned unfortunately Java is
> not designed to be safe by default and thus makes code more noisy by
> requiring to use the keyword.
>
> I noticed an issue with current formatting that there is no indentation if
> an assignment statement is split to multiple lines before or without using
> parenthesis. For example:
> ImmutableMap.Builder<String, ImmutableMultimap<String,
> InetAddressAndPort>> dcRackBuilder =
> ImmutableMap.builder();
> It would be nice if the next line is intended to understand that it is
> part of the previous line.
>
> I support Jacek’s request to have each argument on a separate line when
> they are many and need to be placed on multiple lines. For me it takes less
> effort to grasp arguments on separate lines than when several arguments are
> combined on the same line. IMHO the root cause is having too many
> arguments, which is common issue for non-OOP languages.
>
> Best regards,
> Ruslan Fomkin
>
> On 15 Mar 2022, at 10:04, Stefan Miklosovic <
> stefan.miklosovic@instaclustr.com> wrote:
>
> I agree with the single commit approach to fix it all. TBH Javadocs
> are a little bit messy as well, warnings on generating them,
> incomplete, in a lot of cases obsolete or they do not reflect the code
> anymore etc.
>
> On Tue, 15 Mar 2022 at 09:44, benedict@apache.org <be...@apache.org>
> wrote:
>
>
> I’d be fine with that, though I think if we want to start enforcing
> imports we probably want to mass correct them first. It’s not like other
> style requirements in that there should not be unintended consequences. A
> single (huge) commit to standardise the orders and introduce a build-time
> check would be fine IMO.
>
>
>
> I also don’t really think it is that important.
>
>
>
> From: Jacek Lewandowski <le...@gmail.com>
> Date: Tuesday, 15 March 2022 at 05:18
> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
> Subject: Re: Updating our Code Contribution/Style Guide
>
> I do think that we should at least enforce the import order. What is now
> is a complete mess and causes a lot of conflicts during rebasing / merging.
> Perhaps we could start enforcing such rules only on modified files, this
> way we could gradually go towards consistency... wdyt?
>
>
> - - -- --- ----- -------- -------------
> Jacek Lewandowski
>
>
>
>
>
> On Tue, Mar 15, 2022 at 1:52 AM Dinesh Joshi <dj...@apache.org> wrote:
>
> Benedict, I agree. We should not be rigid about applying any style.
> stylechecks are meant to bring uniformity in the codebase. I assure you
> what I am proposing is neither rigid nor curbs the ability to apply the
> rules flexibly.
>
>
>
> On Mar 14, 2022, at 4:52 PM, benedict@apache.org wrote:
>
>
>
> I’m a strong -1 on strictly enforcing any style guide. It is there to help
> shape contributions, review feedback and responding to said feedback. It
> can also be used to setup IntelliJ’s code formatter to configure default
> behaviours.
>
>
>
> It is not meant to be turned into a linter. Plenty of the rules are stated
> in a flexible manner, so as to permit breaches where overall legibility and
> aesthetics are improved.
>
>
>
>
>
> From: Dinesh Joshi <dj...@apache.org>
> Date: Monday, 14 March 2022 at 23:44
> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
> Subject: Re: Updating our Code Contribution/Style Guide
>
> I am also in favor of updating the style guide. We should ideally have
> custom checkstyle configuration that can ensure adherence to the style
> guide.
>
>
>
> I also don't think this is a contended topic. We want to explicitly codify
> our current practices so new contributors have an easier time writing code.
>
>
>
> It is also important to note that the current codebase is not consistent
> since it was written over a long period of time so it tends to confuse
> folks who are working in different parts of the codebase. So this style
> guide would be very helpful.
>
>
>
> On Mar 14, 2022, at 2:41 AM, benedict@apache.org wrote:
>
>
>
> Our style guide hasn’t been updated in about a decade, and I think it is
> overdue some improvements that address some shortcomings as well as modern
> facilities such as streams and lambdas.
>
>
>
> Most of this was put together for an effort Dinesh started a few years
> ago, but has languished since, in part because the project has always
> seemed to have other priorities. I figure there’s never a good time to
> raise a contended topic, so here is my suggested update to contributor
> guidelines:
>
>
>
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
>
>
> Many of these suggestions codify norms already widely employed, sometimes
> in spite of the style guide, but some likely remain contentious. Some
> potentially contentious things to draw your attention to:
>
>
>
> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and
> more succinct simple x() to retrieve where clear
> Avoid implementing methods, incl. equals(), hashCode() and toString(),
> unless actually used
> Modified new-line rules for multi-line function calls
> External dependency rules (require DISCUSS thread before introducing)
>
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by Ruslan Fomkin <ru...@gmail.com>.
Hi,

I hope it’s OK I jump to the discussion.

I find it is important to automate code formatting and have a build check to verify it, otherwise there are many examples in other projects that formatting is not followed. To make formatting to be not painful for contributors it will be good to setup git commit hooks (which will require to have a command line formatting tool) in addition to IDE support. In such case the main task for the formatting CI build check will be to fail environments, which are not yet set.
For example, cassandra-dtest already has a CI formatting check in place for Python code, which runs on each PR. There is a Python formatting command line tool, which can be easily run locally, and if I don’t mistake it is easy to setup git commit hook with it. (also works to setup the formatting in VScode)

I don’t think it is good idea to prohibit or discourage to use final, which is a tool to guard immutability. As mentioned unfortunately Java is not designed to be safe by default and thus makes code more noisy by requiring to use the keyword.

I noticed an issue with current formatting that there is no indentation if an assignment statement is split to multiple lines before or without using parenthesis. For example:
ImmutableMap.Builder<String, ImmutableMultimap<String, InetAddressAndPort>> dcRackBuilder = 
ImmutableMap.builder();
It would be nice if the next line is intended to understand that it is part of the previous line.

I support Jacek’s request to have each argument on a separate line when they are many and need to be placed on multiple lines. For me it takes less effort to grasp arguments on separate lines than when several arguments are combined on the same line. IMHO the root cause is having too many arguments, which is common issue for non-OOP languages.

Best regards,
Ruslan Fomkin

> On 15 Mar 2022, at 10:04, Stefan Miklosovic <st...@instaclustr.com> wrote:
> 
> I agree with the single commit approach to fix it all. TBH Javadocs
> are a little bit messy as well, warnings on generating them,
> incomplete, in a lot of cases obsolete or they do not reflect the code
> anymore etc.
> 
> On Tue, 15 Mar 2022 at 09:44, benedict@apache.org <be...@apache.org> wrote:
>> 
>> I’d be fine with that, though I think if we want to start enforcing imports we probably want to mass correct them first. It’s not like other style requirements in that there should not be unintended consequences. A single (huge) commit to standardise the orders and introduce a build-time check would be fine IMO.
>> 
>> 
>> 
>> I also don’t really think it is that important.
>> 
>> 
>> 
>> From: Jacek Lewandowski <le...@gmail.com>
>> Date: Tuesday, 15 March 2022 at 05:18
>> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
>> Subject: Re: Updating our Code Contribution/Style Guide
>> 
>> I do think that we should at least enforce the import order. What is now is a complete mess and causes a lot of conflicts during rebasing / merging. Perhaps we could start enforcing such rules only on modified files, this way we could gradually go towards consistency... wdyt?
>> 
>> 
>> - - -- --- ----- -------- -------------
>> Jacek Lewandowski
>> 
>> 
>> 
>> 
>> 
>> On Tue, Mar 15, 2022 at 1:52 AM Dinesh Joshi <dj...@apache.org> wrote:
>> 
>> Benedict, I agree. We should not be rigid about applying any style. stylechecks are meant to bring uniformity in the codebase. I assure you what I am proposing is neither rigid nor curbs the ability to apply the rules flexibly.
>> 
>> 
>> 
>> On Mar 14, 2022, at 4:52 PM, benedict@apache.org wrote:
>> 
>> 
>> 
>> I’m a strong -1 on strictly enforcing any style guide. It is there to help shape contributions, review feedback and responding to said feedback. It can also be used to setup IntelliJ’s code formatter to configure default behaviours.
>> 
>> 
>> 
>> It is not meant to be turned into a linter. Plenty of the rules are stated in a flexible manner, so as to permit breaches where overall legibility and aesthetics are improved.
>> 
>> 
>> 
>> 
>> 
>> From: Dinesh Joshi <dj...@apache.org>
>> Date: Monday, 14 March 2022 at 23:44
>> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
>> Subject: Re: Updating our Code Contribution/Style Guide
>> 
>> I am also in favor of updating the style guide. We should ideally have custom checkstyle configuration that can ensure adherence to the style guide.
>> 
>> 
>> 
>> I also don't think this is a contended topic. We want to explicitly codify our current practices so new contributors have an easier time writing code.
>> 
>> 
>> 
>> It is also important to note that the current codebase is not consistent since it was written over a long period of time so it tends to confuse folks who are working in different parts of the codebase. So this style guide would be very helpful.
>> 
>> 
>> 
>> On Mar 14, 2022, at 2:41 AM, benedict@apache.org wrote:
>> 
>> 
>> 
>> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>> 
>> 
>> 
>> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>> 
>> 
>> 
>> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>> 
>> 
>> 
>> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>> 
>> 
>> 
>> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>> Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>> Modified new-line rules for multi-line function calls
>> External dependency rules (require DISCUSS thread before introducing)
>> 
>> 


Re: Updating our Code Contribution/Style Guide

Posted by Stefan Miklosovic <st...@instaclustr.com>.
I agree with the single commit approach to fix it all. TBH Javadocs
are a little bit messy as well, warnings on generating them,
incomplete, in a lot of cases obsolete or they do not reflect the code
anymore etc.

On Tue, 15 Mar 2022 at 09:44, benedict@apache.org <be...@apache.org> wrote:
>
> I’d be fine with that, though I think if we want to start enforcing imports we probably want to mass correct them first. It’s not like other style requirements in that there should not be unintended consequences. A single (huge) commit to standardise the orders and introduce a build-time check would be fine IMO.
>
>
>
> I also don’t really think it is that important.
>
>
>
> From: Jacek Lewandowski <le...@gmail.com>
> Date: Tuesday, 15 March 2022 at 05:18
> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
> Subject: Re: Updating our Code Contribution/Style Guide
>
> I do think that we should at least enforce the import order. What is now is a complete mess and causes a lot of conflicts during rebasing / merging. Perhaps we could start enforcing such rules only on modified files, this way we could gradually go towards consistency... wdyt?
>
>
> - - -- --- ----- -------- -------------
> Jacek Lewandowski
>
>
>
>
>
> On Tue, Mar 15, 2022 at 1:52 AM Dinesh Joshi <dj...@apache.org> wrote:
>
> Benedict, I agree. We should not be rigid about applying any style. stylechecks are meant to bring uniformity in the codebase. I assure you what I am proposing is neither rigid nor curbs the ability to apply the rules flexibly.
>
>
>
> On Mar 14, 2022, at 4:52 PM, benedict@apache.org wrote:
>
>
>
> I’m a strong -1 on strictly enforcing any style guide. It is there to help shape contributions, review feedback and responding to said feedback. It can also be used to setup IntelliJ’s code formatter to configure default behaviours.
>
>
>
> It is not meant to be turned into a linter. Plenty of the rules are stated in a flexible manner, so as to permit breaches where overall legibility and aesthetics are improved.
>
>
>
>
>
> From: Dinesh Joshi <dj...@apache.org>
> Date: Monday, 14 March 2022 at 23:44
> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
> Subject: Re: Updating our Code Contribution/Style Guide
>
> I am also in favor of updating the style guide. We should ideally have custom checkstyle configuration that can ensure adherence to the style guide.
>
>
>
> I also don't think this is a contended topic. We want to explicitly codify our current practices so new contributors have an easier time writing code.
>
>
>
> It is also important to note that the current codebase is not consistent since it was written over a long period of time so it tends to confuse folks who are working in different parts of the codebase. So this style guide would be very helpful.
>
>
>
> On Mar 14, 2022, at 2:41 AM, benedict@apache.org wrote:
>
>
>
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>
>
>
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>
>
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
>
>
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>
>
>
> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
> Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
> Modified new-line rules for multi-line function calls
> External dependency rules (require DISCUSS thread before introducing)
>
>

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
I’d be fine with that, though I think if we want to start enforcing imports we probably want to mass correct them first. It’s not like other style requirements in that there should not be unintended consequences. A single (huge) commit to standardise the orders and introduce a build-time check would be fine IMO.

I also don’t really think it is that important.

From: Jacek Lewandowski <le...@gmail.com>
Date: Tuesday, 15 March 2022 at 05:18
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide
I do think that we should at least enforce the import order. What is now is a complete mess and causes a lot of conflicts during rebasing / merging. Perhaps we could start enforcing such rules only on modified files, this way we could gradually go towards consistency... wdyt?

- - -- --- ----- -------- -------------
Jacek Lewandowski


On Tue, Mar 15, 2022 at 1:52 AM Dinesh Joshi <dj...@apache.org>> wrote:
Benedict, I agree. We should not be rigid about applying any style. stylechecks are meant to bring uniformity in the codebase. I assure you what I am proposing is neither rigid nor curbs the ability to apply the rules flexibly.


On Mar 14, 2022, at 4:52 PM, benedict@apache.org<ma...@apache.org> wrote:

I’m a strong -1 on strictly enforcing any style guide. It is there to help shape contributions, review feedback and responding to said feedback. It can also be used to setup IntelliJ’s code formatter to configure default behaviours.

It is not meant to be turned into a linter. Plenty of the rules are stated in a flexible manner, so as to permit breaches where overall legibility and aesthetics are improved.


From: Dinesh Joshi <dj...@apache.org>>
Date: Monday, 14 March 2022 at 23:44
To: dev@cassandra.apache.org<ma...@cassandra.apache.org> <de...@cassandra.apache.org>>
Subject: Re: Updating our Code Contribution/Style Guide
I am also in favor of updating the style guide. We should ideally have custom checkstyle configuration that can ensure adherence to the style guide.

I also don't think this is a contended topic. We want to explicitly codify our current practices so new contributors have an easier time writing code.

It is also important to note that the current codebase is not consistent since it was written over a long period of time so it tends to confuse folks who are working in different parts of the codebase. So this style guide would be very helpful.

On Mar 14, 2022, at 2:41 AM, benedict@apache.org<ma...@apache.org> wrote:

Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.

Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:

https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo

Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:


  *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
  *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
  *   Modified new-line rules for multi-line function calls
  *   External dependency rules (require DISCUSS thread before introducing)


Re: Updating our Code Contribution/Style Guide

Posted by Jacek Lewandowski <le...@gmail.com>.
I do think that we should at least enforce the import order. What is now is
a complete mess and causes a lot of conflicts during rebasing / merging.
Perhaps we could start enforcing such rules only on modified files, this
way we could gradually go towards consistency... wdyt?

- - -- --- ----- -------- -------------
Jacek Lewandowski


On Tue, Mar 15, 2022 at 1:52 AM Dinesh Joshi <dj...@apache.org> wrote:

> Benedict, I agree. We should not be rigid about applying any style.
> stylechecks are meant to bring uniformity in the codebase. I assure you
> what I am proposing is neither rigid nor curbs the ability to apply the
> rules flexibly.
>
> On Mar 14, 2022, at 4:52 PM, benedict@apache.org wrote:
>
> I’m a strong -1 on strictly enforcing any style guide. It is there to help
> shape contributions, review feedback and responding to said feedback. It
> can also be used to setup IntelliJ’s code formatter to configure default
> behaviours.
>
> It is not meant to be turned into a linter. Plenty of the rules are stated
> in a flexible manner, so as to permit breaches where overall legibility and
> aesthetics are improved.
>
>
>
> *From: *Dinesh Joshi <dj...@apache.org>
> *Date: *Monday, 14 March 2022 at 23:44
> *To: *dev@cassandra.apache.org <de...@cassandra.apache.org>
> *Subject: *Re: Updating our Code Contribution/Style Guide
> I am also in favor of updating the style guide. We should ideally have
> custom checkstyle configuration that can ensure adherence to the style
> guide.
>
> I also don't think this is a contended topic. We want to explicitly codify
> our current practices so new contributors have an easier time writing code.
>
> It is also important to note that the current codebase is not consistent
> since it was written over a long period of time so it tends to confuse
> folks who are working in different parts of the codebase. So this style
> guide would be very helpful.
>
>
> On Mar 14, 2022, at 2:41 AM, benedict@apache.org wrote:
>
> Our style guide hasn’t been updated in about a decade, and I think it is
> overdue some improvements that address some shortcomings as well as modern
> facilities such as streams and lambdas.
>
> Most of this was put together for an effort Dinesh started a few years
> ago, but has languished since, in part because the project has always
> seemed to have other priorities. I figure there’s never a good time to
> raise a contended topic, so here is my suggested update to contributor
> guidelines:
>
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
> Many of these suggestions codify norms already widely employed, sometimes
> in spite of the style guide, but some likely remain contentious. Some
> potentially contentious things to draw your attention to:
>
>
>    - Deemphasis of getX() nomenclature, in favour of richer set of
>    prefixes and more succinct simple x() to retrieve where clear
>    - Avoid implementing methods, incl. equals(), hashCode() and
>    toString(), unless actually used
>    - Modified new-line rules for multi-line function calls
>    - External dependency rules (require DISCUSS thread before introducing)
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by Dinesh Joshi <dj...@apache.org>.
Benedict, I agree. We should not be rigid about applying any style. stylechecks are meant to bring uniformity in the codebase. I assure you what I am proposing is neither rigid nor curbs the ability to apply the rules flexibly.

> On Mar 14, 2022, at 4:52 PM, benedict@apache.org wrote:
> 
> I’m a strong -1 on strictly enforcing any style guide. It is there to help shape contributions, review feedback and responding to said feedback. It can also be used to setup IntelliJ’s code formatter to configure default behaviours. 
>  
> It is not meant to be turned into a linter. Plenty of the rules are stated in a flexible manner, so as to permit breaches where overall legibility and aesthetics are improved.
>  
>  
> From: Dinesh Joshi <dj...@apache.org>
> Date: Monday, 14 March 2022 at 23:44
> To: dev@cassandra.apache.org <de...@cassandra.apache.org>
> Subject: Re: Updating our Code Contribution/Style Guide
> 
> I am also in favor of updating the style guide. We should ideally have custom checkstyle configuration that can ensure adherence to the style guide.
>  
> I also don't think this is a contended topic. We want to explicitly codify our current practices so new contributors have an easier time writing code.
>  
> It is also important to note that the current codebase is not consistent since it was written over a long period of time so it tends to confuse folks who are working in different parts of the codebase. So this style guide would be very helpful.
> 
> 
> On Mar 14, 2022, at 2:41 AM, benedict@apache.org <ma...@apache.org> wrote:
>  
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>  
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>  
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo <https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo>
>  
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>  
> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
> Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
> Modified new-line rules for multi-line function calls
> External dependency rules (require DISCUSS thread before introducing)


Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
I’m a strong -1 on strictly enforcing any style guide. It is there to help shape contributions, review feedback and responding to said feedback. It can also be used to setup IntelliJ’s code formatter to configure default behaviours.

It is not meant to be turned into a linter. Plenty of the rules are stated in a flexible manner, so as to permit breaches where overall legibility and aesthetics are improved.


From: Dinesh Joshi <dj...@apache.org>
Date: Monday, 14 March 2022 at 23:44
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide
I am also in favor of updating the style guide. We should ideally have custom checkstyle configuration that can ensure adherence to the style guide.

I also don't think this is a contended topic. We want to explicitly codify our current practices so new contributors have an easier time writing code.

It is also important to note that the current codebase is not consistent since it was written over a long period of time so it tends to confuse folks who are working in different parts of the codebase. So this style guide would be very helpful.


On Mar 14, 2022, at 2:41 AM, benedict@apache.org<ma...@apache.org> wrote:

Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.

Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:

https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo

Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:


  *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
  *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
  *   Modified new-line rules for multi-line function calls
  *   External dependency rules (require DISCUSS thread before introducing)


Re: Updating our Code Contribution/Style Guide

Posted by Dinesh Joshi <dj...@apache.org>.
I am also in favor of updating the style guide. We should ideally have custom checkstyle configuration that can ensure adherence to the style guide.

I also don't think this is a contended topic. We want to explicitly codify our current practices so new contributors have an easier time writing code.

It is also important to note that the current codebase is not consistent since it was written over a long period of time so it tends to confuse folks who are working in different parts of the codebase. So this style guide would be very helpful.

> On Mar 14, 2022, at 2:41 AM, benedict@apache.org wrote:
> 
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>  
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>  
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo <https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo>
>  
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>  
> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
> Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
> Modified new-line rules for multi-line function calls
> External dependency rules (require DISCUSS thread before introducing)


Re: Updating our Code Contribution/Style Guide

Posted by Bowen Song <bo...@bso.ng>.
I think there's two separate issues, the style guide for Python code, 
and fixing the existing code style. In my opinion, the style guide 
should come first, and we can follow that to fix the existing code's style.

BTW, I can see the changes you made in CASSANDRA-17413 has already been 
merged into trunk. However, latest code in trunk still has all the 
issues I mentioned in the previous email. Some of the issues are purely 
code styling, such as visual indentation and white spaces around 
operators, but some are a little more than that, such as unused imports 
which can slightly impact performance and memory usage.

I think there's two valid approaches to address the issue. We could 
create a style guide first, and then fix them all in one go; or split 
the issues to two categories, pure style issue to be fixed after the 
code style guides is published, and other issues which can be fixed now. 
I personally prefer the former, because it involves less amount of work 
- no need to spend time to triage the issues reported by tools such as 
"flake8".

On 14/03/2022 11:11, Stefan Miklosovic wrote:
> Hi Bowen,
>
> we were working on that recently, like CASSANDRA-17413 + a lot of
> improvements around Python stuff are coming. If you identify more
> places for improvements we are definitely interested.
>
> Regards
>
> On Mon, 14 Mar 2022 at 11:53, Bowen Song <bo...@bso.ng> wrote:
>> I found there's no mentioning of Python code style at all. If we are going to update the style guide, can this be addressed too?
>>
>> FYI, a quick "flake8" style check shows many existing issues in the Python code, including libraries imported but unused, redefinition of unused imports and invalid escape sequence in strings.
>>
>>
>> On 14/03/2022 09:41, benedict@apache.org wrote:
>>
>> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>>
>>
>>
>> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>>
>>
>>
>> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>>
>>
>>
>> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>>
>>
>>
>> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>> Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>> Modified new-line rules for multi-line function calls
>> External dependency rules (require DISCUSS thread before introducing)
>>
>>
>>
>>

Re: Updating our Code Contribution/Style Guide

Posted by Stefan Miklosovic <st...@instaclustr.com>.
Hi Bowen,

we were working on that recently, like CASSANDRA-17413 + a lot of
improvements around Python stuff are coming. If you identify more
places for improvements we are definitely interested.

Regards

On Mon, 14 Mar 2022 at 11:53, Bowen Song <bo...@bso.ng> wrote:
>
> I found there's no mentioning of Python code style at all. If we are going to update the style guide, can this be addressed too?
>
> FYI, a quick "flake8" style check shows many existing issues in the Python code, including libraries imported but unused, redefinition of unused imports and invalid escape sequence in strings.
>
>
> On 14/03/2022 09:41, benedict@apache.org wrote:
>
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>
>
>
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>
>
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
>
>
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>
>
>
> Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
> Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
> Modified new-line rules for multi-line function calls
> External dependency rules (require DISCUSS thread before introducing)
>
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by Bowen Song <bo...@bso.ng>.
Oh, I certainly don't mean to block the purposed update. I'm sorry if it 
sounded that way. All I'm saying is we should add Python code style 
guides to it, and a follow up addendum can surely do the job.

On 14/03/2022 11:41, Josh McKenzie wrote:
> +1 to the doc as written. A good portion of it also applies to python 
> code style (structure, clarity in naming, etc).
>
> Perhaps a python specific addendum as a follow up might make sense Bowen?
>
> On Mon, Mar 14, 2022, at 7:21 AM, benedict@apache.org wrote:
>>
>> I think the community would be happy to introduce a python style 
>> guide, but I am not well placed to do so, having chosen throughout my 
>> career to limit my exposure to python. Probably a parallel effort 
>> would be best - perhaps you could work with Stefan and others to 
>> produce such a proposal?
>>
>>
>>
>> *From: *Bowen Song <bo...@bso.ng>
>> *Date: *Monday, 14 March 2022 at 10:53
>> *To: *dev@cassandra.apache.org <de...@cassandra.apache.org>
>> *Subject: *Re: Updating our Code Contribution/Style Guide
>>
>> I found there's no mentioning of Python code style at all. If we are 
>> going to update the style guide, can this be addressed too?
>>
>> FYI, a quick "flake8" style check shows many existing issues in the 
>> Python code, including libraries imported but unused, redefinition of 
>> unused imports and invalid escape sequence in strings.
>>
>>
>> On 14/03/2022 09:41, benedict@apache.org wrote:
>>
>>     Our style guide hasn’t been updated in about a decade, and I
>>     think it is overdue some improvements that address some
>>     shortcomings as well as modern facilities such as streams and
>>     lambdas.
>>
>>
>>     Most of this was put together for an effort Dinesh started a few
>>     years ago, but has languished since, in part because the project
>>     has always seemed to have other priorities. I figure there’s
>>     never a good time to raise a contended topic, so here is my
>>     suggested update to contributor guidelines:
>>
>>
>>     https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>>
>>
>>     Many of these suggestions codify norms already widely employed,
>>     sometimes in spite of the style guide, but some likely remain
>>     contentious. Some potentially contentious things to draw your
>>     attention to:
>>
>>
>>      1. Deemphasis of getX() nomenclature, in favour of richer set of
>>         prefixes and more succinct simple x() to retrieve where clear
>>      2. Avoid implementing methods, incl. equals(), hashCode() and
>>         toString(), unless actually used
>>      3. Modified new-line rules for multi-line function calls
>>      4. External dependency rules (require DISCUSS thread before
>>         introducing)
>>
>>
>>
>

Re: Updating our Code Contribution/Style Guide

Posted by Josh McKenzie <jm...@apache.org>.
+1 to the doc as written. A good portion of it also applies to python code style (structure, clarity in naming, etc).

Perhaps a python specific addendum as a follow up might make sense Bowen?

On Mon, Mar 14, 2022, at 7:21 AM, benedict@apache.org wrote:
> I think the community would be happy to introduce a python style guide, but I am not well placed to do so, having chosen throughout my career to limit my exposure to python. Probably a parallel effort would be best - perhaps you could work with Stefan and others to produce such a proposal?
>  
>  
> *From: *Bowen Song <bo...@bso.ng>
> *Date: *Monday, 14 March 2022 at 10:53
> *To: *dev@cassandra.apache.org <de...@cassandra.apache.org>
> *Subject: *Re: Updating our Code Contribution/Style Guide
> I found there's no mentioning of Python code style at all. If we are going to update the style guide, can this be addressed too?
> 
> FYI, a quick "flake8" style check shows many existing issues in the Python code, including libraries imported but unused, redefinition of unused imports and invalid escape sequence in strings.
> 
>  
> 
> On 14/03/2022 09:41, benedict@apache.org wrote:
>> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>>  
>> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>>  
>> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>>  
>> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>>  
>>  1. Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>>  2. Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>>  3. Modified new-line rules for multi-line function calls
>>  4. External dependency rules (require DISCUSS thread before introducing)
>>  
>> 
>>  

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
I think the community would be happy to introduce a python style guide, but I am not well placed to do so, having chosen throughout my career to limit my exposure to python. Probably a parallel effort would be best - perhaps you could work with Stefan and others to produce such a proposal?


From: Bowen Song <bo...@bso.ng>
Date: Monday, 14 March 2022 at 10:53
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide

I found there's no mentioning of Python code style at all. If we are going to update the style guide, can this be addressed too?

FYI, a quick "flake8" style check shows many existing issues in the Python code, including libraries imported but unused, redefinition of unused imports and invalid escape sequence in strings.


On 14/03/2022 09:41, benedict@apache.org<ma...@apache.org> wrote:
Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.

Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:

https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo

Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:


  1.  Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
  2.  Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
  3.  Modified new-line rules for multi-line function calls
  4.  External dependency rules (require DISCUSS thread before introducing)




Re: Updating our Code Contribution/Style Guide

Posted by Bowen Song <bo...@bso.ng>.
I found there's no mentioning of Python code style at all. If we are 
going to update the style guide, can this be addressed too?

FYI, a quick "flake8" style check shows many existing issues in the 
Python code, including libraries imported but unused, redefinition of 
unused imports and invalid escape sequence in strings.


On 14/03/2022 09:41, benedict@apache.org wrote:
>
> Our style guide hasn’t been updated in about a decade, and I think it 
> is overdue some improvements that address some shortcomings as well as 
> modern facilities such as streams and lambdas.
>
> Most of this was put together for an effort Dinesh started a few years 
> ago, but has languished since, in part because the project has always 
> seemed to have other priorities. I figure there’s never a good time to 
> raise a contended topic, so here is my suggested update to contributor 
> guidelines:
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
> Many of these suggestions codify norms already widely employed, 
> sometimes in spite of the style guide, but some likely remain 
> contentious. Some potentially contentious things to draw your 
> attention to:
>
>   * Deemphasis of getX() nomenclature, in favour of richer set of
>     prefixes and more succinct simple x() to retrieve where clear
>   * Avoid implementing methods, incl. equals(), hashCode() and
>     toString(), unless actually used
>   * Modified new-line rules for multi-line function calls
>   * External dependency rules (require DISCUSS thread before introducing)
>

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
It’s been a couple of months since I opened this discussion. I think I have integrated the feedback into the google doc. Are there any elements anyone wants to continue discussing, or things I have not fully addressed? I’ll take an absence of response as lazy consensus to commit the changes to the wiki.



From: benedict@apache.org <be...@apache.org>
Date: Monday, 14 March 2022 at 09:41
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Updating our Code Contribution/Style Guide
Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.

Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:

https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo

Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:


  *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
  *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
  *   Modified new-line rules for multi-line function calls
  *   External dependency rules (require DISCUSS thread before introducing)




Re: Updating our Code Contribution/Style Guide

Posted by Stefan Miklosovic <st...@instaclustr.com>.
I agree on not using finals as suggested by Marcus, I have been using
them where I could, sometimes for the sake of having them final to be
consistent with other code but I gladly drop this habit. Too bad Java
doesnt have it like Scala where it is the matter of "var" vs "val".

On Mon, 14 Mar 2022 at 13:00, Marcus Eriksson <ma...@apache.org> wrote:
>
> Looks good
>
> One thing that might be missing is direction on if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the method.
>
> /Marcus
>
> On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org wrote:
> > Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
> >
> > Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
> >
> > https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
> >
> > Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
> >
> >
> >   *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
> >   *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
> >   *   Modified new-line rules for multi-line function calls
> >   *   External dependency rules (require DISCUSS thread before introducing)
> >
> >
> >

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
Agreed, how about a section like so:

Variable Mutability
As a general norm, parameters and variables should be treated as immutable and not be re-used. Where possible, variables that are mutated within a loop should be declared in the loop guard or body. Sometimes it is necessary for clarity to declare mutable variables outside of these contexts, but these should be scoped to the narrowest reasonable code block, with explicit code blocks utilised as necessary for clarity.

As a result of this norm, use of the final keyword within a method body is prohibited.

We could instead say “discouraged”, but I am not aware of any context where it is helpful today.

From: Marcus Eriksson <ma...@apache.org>
Date: Monday, 14 March 2022 at 12:00
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide
Looks good

One thing that might be missing is direction on if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the method.

/Marcus

On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org wrote:
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>
>
>   *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>   *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>   *   Modified new-line rules for multi-line function calls
>   *   External dependency rules (require DISCUSS thread before introducing)
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
I think it is fine to count generated implementations of interfaces as interfaces, even if they are not defined. If you would like to explicitly mention this, that is fine. Though, if I’m perfectly honest, I do not find that mocking improves testing in many cases (instead making it more tightly coupled and brittle). But that is a separate discussion.

> Having interfaces encourages better unit tests IMHO.

Having unnecessary and unused interfaces encourages messier code, IMHO. Premature abstraction is bad. Introduce interfaces, methods or indeed any concept as and when you need them, for testing or otherwise.

> For the instance() / getInstance() methods - I know it is an additional effort, but on the other hand it has many advantages because you can replace the singleton for testing

Again, do this as necessary. I think for public instances this is a fine recommendation, but for private uses it should not be prescribed, only used if there is an explicit benefit.

> And the continuation indent - currently, when I have IntelliJ configured with provided formatting setup, I get something like this

Ah, I thought you meant for lambdas. I’m not sure how best to specify a continuation indent, or in which contexts it applies – only when there is no other indentation? Conversely, the following works quite nicely. Typically I try to ensure the start of the line is as succinct as possible to permit clean indentation follow-up.


method("aaaaaaaaaaaaa",

       "bbbbbbbbbbbbb",

       "ccccccccccccc"
)



EndpointState removedState = endpointStateMap.stream(endpoint)

                                             .map()…




From: Jacek Lewandowski <le...@gmail.com>
Date: Monday, 14 March 2022 at 16:45
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide
Regarding interfaces, mocks created by Mockito are not really the implementations. We also cannot predict tests which will be written in the future. Having interfaces encourages better unit tests IMHO.

An addendum for exception handling guidelines sounds like a good idea.

For the instance() / getInstance() methods - I know it is an additional effort, but on the other hand it has many advantages because you can replace the singleton for testing - replace with a newly created instance for a certain test case

And the continuation indent - currently, when I have IntelliJ configured with provided formatting setup, I get something like this:

method(
"aaaaaaaaaaaaa",
"bbbbbbbbbbbbb",
"ccccccccccccc"
)
or


EndpointState removedState = endpointStateMap
.remove(endpoint);


I know it is preferred to move to the previous line, but sometimes it makes the line much too long due to some nested calls or something else.



On Mon, Mar 14, 2022 at 4:02 PM benedict@apache.org<ma...@apache.org> <be...@apache.org>> wrote:
Hi Jacek,


> Sometimes, although it would be just a single implementation, interface can make sense for testing purposes - for mocking in particular

This would surely mean there are two implementations, one of which is in the test tree? I think this is therefore already covered.


> For exception handling, perhaps we should explicitly mention in the guideline that we should always handle Exception or Throwable (which is frequently being catched in the code) by methods from Throwables, which would properly deal with InterruptedException

I do not think this properly handles InterruptedException – InterruptedException that are not to be handled directly should now really be handled by propagating UncheckedInterruptedException, which is very different to catching all Throwable. In many cases InterruptedException should be handled explicitly, however.

I do not think catching Exception or Throwable is the correct solution in most cases either – we should ideally only do so at the top level at which we want broad unforeseen problems to be handled, or where we need to take specific actions to handle exception, in which case we should ideally always rethrow the Throwable unmolested. I can see some benefit from explicitly outlining these cases, as it is not trivial to handle exceptions cleanly and correctly. We could perhaps create an exception handling addendum, perhaps in a separate page, that goes into greater detail?


> I found it useful to access singletons by getInstance() method rather than directly



This can be beneficial for public use cases, but for private use cases it is oftentimes unhelpful to pollute the code. Also note that the document explicitly proposes avoiding getX, so we would instead have e.g. a method called instance(). Happy to add a section for this.



>- "...If a line wraps inside a method call, try to group natural parameters together on a single line..." while I'm generally ok with that approach, putting each argument in a new line, makes it easier for git / review / automatic merge



I personally prefer to optimise for readability, and 10+ lines of single short parameters badly pollutes a page of code IMO. If there is no consensus on this we can put it to an indicative vote.



>- imports - why mix org.apache.cassandra with other imports (log4j, google, etc.)? I know that order is used for a while, but I was always curious why we do that?



I would be happy to revisit these, as we have not been consistent about imports in the codebase. I do not know why they are as they are.


> - define continuation indent - currently it is 0 characters

An opening brace introduces any necessary indentation (from the start of the line, which is perfect for legibility). I am somewhat inlined to declare that braces must be used if the lambda cannot fit on the declaring line.

From: Jacek Lewandowski <le...@gmail.com>>
Date: Monday, 14 March 2022 at 14:27
To: dev@cassandra.apache.org<ma...@cassandra.apache.org> <de...@cassandra.apache.org>>
Subject: Re: Updating our Code Contribution/Style Guide
Hi,

I was looking at the document and have some thoughts:


- Sometimes, although it would be just a single implementation, interface can make sense for testing purposes - for mocking in particular

- For exception handling, perhaps we should explicitly mention in the guideline that we should always handle Exception or Throwable (which is frequently being catched in the code) by methods from Throwables, which would properly deal with InterruptedException

- I found it useful to access singletons by getInstance() method rather than directly (public final static field). When we use getInstance() method, we may go further and make getInstance return the instance from a provider, which would by default return the value of final field. However in tests, it could return the value of some mutable static field from a custom provider. This way we would be able to easily switch the singleton which is impossible without reloading a class at the moment

- "...If a line wraps inside a method call, try to group natural parameters together on a single line..." while I'm generally ok with that approach, putting each argument in a new line, makes it easier for git / review / automatic merge

- imports - why mix org.apache.cassandra with other imports (log4j, google, etc.)? I know that order is used for a while, but I was always curious why we do that?

- format configurations for IDEs - it seems like IntelliJ can import Eclipse formatter configuration, so maybe one configuration could be enough

- define continuation indent - currently it is 0 characters

- for unit test assertions - prefer AssertJ assertions over standard junit or hamcrest - maybe forbid them? AssertJ has much better descriptions of failing assertions



I hope that we can enforce the rules using checkstyle, otherwise this effort may have little effect. For the transition, perhaps checkstyle could run on CircleCI just for the modified files?



Thanks,

jacek














- - -- --- ----- -------- -------------
Jacek Lewandowski


On Mon, Mar 14, 2022 at 1:10 PM Josh McKenzie <jm...@apache.org>> wrote:

we should add Python code style guides to it
Strongly agree. We're hurting ourselves by treating our python as a 2nd class citizen.

if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the met
Why not both (i.e. use final where possible and refactor when at length / doing too much)? The benefits of immutability are generally well recognized as are the benefits of keeping methods to reasonable lengths and complexity.


On Mon, Mar 14, 2022, at 7:59 AM, Marcus Eriksson wrote:
Looks good

One thing that might be missing is direction on if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the method.

/Marcus

On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org<ma...@apache.org> wrote:
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>
>
>   *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>   *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>   *   Modified new-line rules for multi-line function calls
>   *   External dependency rules (require DISCUSS thread before introducing)
>
>
>



Re: Updating our Code Contribution/Style Guide

Posted by Jacek Lewandowski <le...@gmail.com>.
Regarding interfaces, mocks created by Mockito are not really the
implementations. We also cannot predict tests which will be written in the
future. Having interfaces encourages better unit tests IMHO.

An addendum for exception handling guidelines sounds like a good idea.

For the instance() / getInstance() methods - I know it is an additional
effort, but on the other hand it has many advantages because you can
replace the singleton for testing - replace with a newly created instance
for a certain test case

And the continuation indent - currently, when I have IntelliJ configured
with provided formatting setup, I get something like this:

method(
"aaaaaaaaaaaaa",
"bbbbbbbbbbbbb",
"ccccccccccccc"
)

or

EndpointState removedState = endpointStateMap

.remove(endpoint);


I know it is preferred to move to the previous line, but sometimes it makes
the line much too long due to some nested calls or something else.



On Mon, Mar 14, 2022 at 4:02 PM benedict@apache.org <be...@apache.org>
wrote:

> Hi Jacek,
>
>
>
> > Sometimes, although it would be just a single implementation, interface
> can make sense for testing purposes - for mocking in particular
>
>
>
> This would surely mean there are two implementations, one of which is in
> the test tree? I think this is therefore already covered.
>
>
>
> > For exception handling, perhaps we should explicitly mention in the
> guideline that we should always handle Exception or Throwable (which is
> frequently being catched in the code) by methods from Throwables, which
> would properly deal with InterruptedException
>
>
>
> I do not think this properly handles InterruptedException –
> InterruptedException that are not to be handled directly should now really
> be handled by propagating UncheckedInterruptedException, which is very
> different to catching all Throwable. In many cases InterruptedException
> should be handled explicitly, however.
>
>
>
> I do not think catching Exception or Throwable is the correct solution in
> most cases either – we should ideally only do so at the top level at which
> we want broad unforeseen problems to be handled, or where we need to take
> specific actions to handle exception, in which case we should ideally
> always rethrow the Throwable unmolested. I can see some benefit from
> explicitly outlining these cases, as it is not trivial to handle exceptions
> cleanly and correctly. We could perhaps create an exception handling
> addendum, perhaps in a separate page, that goes into greater detail?
>
>
>
> > I found it useful to access singletons by getInstance() method rather
> than directly
>
>
>
> This can be beneficial for *public* use cases, but for private use cases
> it is oftentimes unhelpful to pollute the code. Also note that the document
> explicitly proposes avoiding getX, so we would instead have e.g. a method
> called instance(). Happy to add a section for this.
>
>
>
> >- "...If a line wraps inside a method call, try to group natural
> parameters together on a single line..." while I'm generally ok with that
> approach, putting each argument in a new line, makes it easier for git /
> review / automatic merge
>
>
>
> I personally prefer to optimise for readability, and 10+ lines of single
> short parameters badly pollutes a page of code IMO. If there is no
> consensus on this we can put it to an indicative vote.
>
>
>
> >- imports - why mix org.apache.cassandra with other imports (log4j,
> google, etc.)? I know that order is used for a while, but I was always
> curious why we do that?
>
>
>
> I would be happy to revisit these, as we have not been consistent about
> imports in the codebase. I do not know why they are as they are.
>
>
>
> > - define continuation indent - currently it is 0 characters
>
>
>
> An opening brace introduces any necessary indentation (from the start of
> the line, which is perfect for legibility). I am somewhat inlined to
> declare that braces must be used if the lambda cannot fit on the declaring
> line.
>
>
>
> *From: *Jacek Lewandowski <le...@gmail.com>
> *Date: *Monday, 14 March 2022 at 14:27
> *To: *dev@cassandra.apache.org <de...@cassandra.apache.org>
> *Subject: *Re: Updating our Code Contribution/Style Guide
>
> Hi,
>
>
>
> I was looking at the document and have some thoughts:
>
>
>
> - Sometimes, although it would be just a single implementation, interface
> can make sense for testing purposes - for mocking in particular
>
> - For exception handling, perhaps we should explicitly mention in the
> guideline that we should always handle Exception or Throwable (which is
> frequently being catched in the code) by methods from Throwables, which
> would properly deal with InterruptedException
>
> - I found it useful to access singletons by getInstance() method rather
> than directly (public final static field). When we use getInstance()
> method, we may go further and make getInstance return the instance from a
> provider, which would by default return the value of final field. However
> in tests, it could return the value of some mutable static field from a
> custom provider. This way we would be able to easily switch the singleton
> which is impossible without reloading a class at the moment
>
> - "...If a line wraps inside a method call, try to group natural
> parameters together on a single line..." while I'm generally ok with that
> approach, putting each argument in a new line, makes it easier for git /
> review / automatic merge
>
> - imports - why mix org.apache.cassandra with other imports (log4j,
> google, etc.)? I know that order is used for a while, but I was always
> curious why we do that?
>
> - format configurations for IDEs - it seems like IntelliJ can import
> Eclipse formatter configuration, so maybe one configuration could be enough
>
> - define continuation indent - currently it is 0 characters
>
> - for unit test assertions - prefer AssertJ assertions over standard junit
> or hamcrest - maybe forbid them? AssertJ has much better descriptions of
> failing assertions
>
>
>
> I hope that we can enforce the rules using checkstyle, otherwise this
> effort may have little effect. For the transition, perhaps checkstyle could
> run on CircleCI just for the modified files?
>
>
>
> Thanks,
>
> jacek
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> - - -- --- ----- -------- -------------
> Jacek Lewandowski
>
>
>
>
>
> On Mon, Mar 14, 2022 at 1:10 PM Josh McKenzie <jm...@apache.org>
> wrote:
>
> we should add Python code style guides to it
>
> Strongly agree. We're hurting ourselves by treating our python as a 2nd
> class citizen.
>
>
>
> if we should avoid making method parameters and local variables `final` -
> this is inconsistent over the code base, but I'd prefer not having them. If
> the method is large enough that we might mistakenly reuse
> parameters/variables, we should probably refactor the met
>
> Why not both (i.e. use final where possible and refactor when at length /
> doing too much)? The benefits of immutability are generally well recognized
> as are the benefits of keeping methods to reasonable lengths and complexity.
>
>
>
>
>
> On Mon, Mar 14, 2022, at 7:59 AM, Marcus Eriksson wrote:
>
> Looks good
>
>
>
> One thing that might be missing is direction on if we should avoid making
> method parameters and local variables `final` - this is inconsistent over
> the code base, but I'd prefer not having them. If the method is large
> enough that we might mistakenly reuse parameters/variables, we should
> probably refactor the method.
>
>
>
> /Marcus
>
>
>
> On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org wrote:
>
> > Our style guide hasn’t been updated in about a decade, and I think it is
> overdue some improvements that address some shortcomings as well as modern
> facilities such as streams and lambdas.
>
> >
>
> > Most of this was put together for an effort Dinesh started a few years
> ago, but has languished since, in part because the project has always
> seemed to have other priorities. I figure there’s never a good time to
> raise a contended topic, so here is my suggested update to contributor
> guidelines:
>
> >
>
> >
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
> >
>
> > Many of these suggestions codify norms already widely employed,
> sometimes in spite of the style guide, but some likely remain contentious.
> Some potentially contentious things to draw your attention to:
>
> >
>
> >
>
> >   *   Deemphasis of getX() nomenclature, in favour of richer set of
> prefixes and more succinct simple x() to retrieve where clear
>
> >   *   Avoid implementing methods, incl. equals(), hashCode() and
> toString(), unless actually used
>
> >   *   Modified new-line rules for multi-line function calls
>
> >   *   External dependency rules (require DISCUSS thread before
> introducing)
>
> >
>
> >
>
> >
>
>
>
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by "benedict@apache.org" <be...@apache.org>.
Hi Jacek,


> Sometimes, although it would be just a single implementation, interface can make sense for testing purposes - for mocking in particular

This would surely mean there are two implementations, one of which is in the test tree? I think this is therefore already covered.


> For exception handling, perhaps we should explicitly mention in the guideline that we should always handle Exception or Throwable (which is frequently being catched in the code) by methods from Throwables, which would properly deal with InterruptedException

I do not think this properly handles InterruptedException – InterruptedException that are not to be handled directly should now really be handled by propagating UncheckedInterruptedException, which is very different to catching all Throwable. In many cases InterruptedException should be handled explicitly, however.

I do not think catching Exception or Throwable is the correct solution in most cases either – we should ideally only do so at the top level at which we want broad unforeseen problems to be handled, or where we need to take specific actions to handle exception, in which case we should ideally always rethrow the Throwable unmolested. I can see some benefit from explicitly outlining these cases, as it is not trivial to handle exceptions cleanly and correctly. We could perhaps create an exception handling addendum, perhaps in a separate page, that goes into greater detail?


> I found it useful to access singletons by getInstance() method rather than directly



This can be beneficial for public use cases, but for private use cases it is oftentimes unhelpful to pollute the code. Also note that the document explicitly proposes avoiding getX, so we would instead have e.g. a method called instance(). Happy to add a section for this.



>- "...If a line wraps inside a method call, try to group natural parameters together on a single line..." while I'm generally ok with that approach, putting each argument in a new line, makes it easier for git / review / automatic merge



I personally prefer to optimise for readability, and 10+ lines of single short parameters badly pollutes a page of code IMO. If there is no consensus on this we can put it to an indicative vote.



>- imports - why mix org.apache.cassandra with other imports (log4j, google, etc.)? I know that order is used for a while, but I was always curious why we do that?



I would be happy to revisit these, as we have not been consistent about imports in the codebase. I do not know why they are as they are.


> - define continuation indent - currently it is 0 characters

An opening brace introduces any necessary indentation (from the start of the line, which is perfect for legibility). I am somewhat inlined to declare that braces must be used if the lambda cannot fit on the declaring line.

From: Jacek Lewandowski <le...@gmail.com>
Date: Monday, 14 March 2022 at 14:27
To: dev@cassandra.apache.org <de...@cassandra.apache.org>
Subject: Re: Updating our Code Contribution/Style Guide
Hi,

I was looking at the document and have some thoughts:


- Sometimes, although it would be just a single implementation, interface can make sense for testing purposes - for mocking in particular

- For exception handling, perhaps we should explicitly mention in the guideline that we should always handle Exception or Throwable (which is frequently being catched in the code) by methods from Throwables, which would properly deal with InterruptedException

- I found it useful to access singletons by getInstance() method rather than directly (public final static field). When we use getInstance() method, we may go further and make getInstance return the instance from a provider, which would by default return the value of final field. However in tests, it could return the value of some mutable static field from a custom provider. This way we would be able to easily switch the singleton which is impossible without reloading a class at the moment

- "...If a line wraps inside a method call, try to group natural parameters together on a single line..." while I'm generally ok with that approach, putting each argument in a new line, makes it easier for git / review / automatic merge

- imports - why mix org.apache.cassandra with other imports (log4j, google, etc.)? I know that order is used for a while, but I was always curious why we do that?

- format configurations for IDEs - it seems like IntelliJ can import Eclipse formatter configuration, so maybe one configuration could be enough

- define continuation indent - currently it is 0 characters

- for unit test assertions - prefer AssertJ assertions over standard junit or hamcrest - maybe forbid them? AssertJ has much better descriptions of failing assertions



I hope that we can enforce the rules using checkstyle, otherwise this effort may have little effect. For the transition, perhaps checkstyle could run on CircleCI just for the modified files?



Thanks,

jacek














- - -- --- ----- -------- -------------
Jacek Lewandowski


On Mon, Mar 14, 2022 at 1:10 PM Josh McKenzie <jm...@apache.org>> wrote:

we should add Python code style guides to it
Strongly agree. We're hurting ourselves by treating our python as a 2nd class citizen.

if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the met
Why not both (i.e. use final where possible and refactor when at length / doing too much)? The benefits of immutability are generally well recognized as are the benefits of keeping methods to reasonable lengths and complexity.


On Mon, Mar 14, 2022, at 7:59 AM, Marcus Eriksson wrote:
Looks good

One thing that might be missing is direction on if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the method.

/Marcus

On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org<ma...@apache.org> wrote:
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
>
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
>
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
>
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
>
>
>   *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>   *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>   *   Modified new-line rules for multi-line function calls
>   *   External dependency rules (require DISCUSS thread before introducing)
>
>
>



Re: Updating our Code Contribution/Style Guide

Posted by Jacek Lewandowski <le...@gmail.com>.
Hi,

I was looking at the document and have some thoughts:

- Sometimes, although it would be just a single implementation, interface
can make sense for testing purposes - for mocking in particular

- For exception handling, perhaps we should explicitly mention in the
guideline that we should always handle Exception or Throwable (which is
frequently being catched in the code) by methods from Throwables, which
would properly deal with InterruptedException

- I found it useful to access singletons by getInstance() method rather
than directly (public final static field). When we use getInstance()
method, we may go further and make getInstance return the instance from a
provider, which would by default return the value of final field. However
in tests, it could return the value of some mutable static field from a
custom provider. This way we would be able to easily switch the singleton
which is impossible without reloading a class at the moment

- "...If a line wraps inside a method call, try to group natural parameters
together on a single line..." while I'm generally ok with that approach,
putting each argument in a new line, makes it easier for git / review /
automatic merge

- imports - why mix org.apache.cassandra with other imports (log4j, google,
etc.)? I know that order is used for a while, but I was always curious why
we do that?

- format configurations for IDEs - it seems like IntelliJ can import
Eclipse formatter configuration, so maybe one configuration could be enough

- define continuation indent - currently it is 0 characters

- for unit test assertions - prefer AssertJ assertions over standard junit
or hamcrest - maybe forbid them? AssertJ has much better descriptions of
failing assertions


I hope that we can enforce the rules using checkstyle, otherwise this
effort may have little effect. For the transition, perhaps checkstyle could
run on CircleCI just for the modified files?


Thanks,

jacek









- - -- --- ----- -------- -------------
Jacek Lewandowski


On Mon, Mar 14, 2022 at 1:10 PM Josh McKenzie <jm...@apache.org> wrote:

> we should add Python code style guides to it
>
> Strongly agree. We're hurting ourselves by treating our python as a 2nd
> class citizen.
>
> if we should avoid making method parameters and local variables `final` -
> this is inconsistent over the code base, but I'd prefer not having them. If
> the method is large enough that we might mistakenly reuse
> parameters/variables, we should probably refactor the met
>
> Why not both (i.e. use final where possible and refactor when at length /
> doing too much)? The benefits of immutability are generally well recognized
> as are the benefits of keeping methods to reasonable lengths and complexity.
>
>
> On Mon, Mar 14, 2022, at 7:59 AM, Marcus Eriksson wrote:
>
> Looks good
>
> One thing that might be missing is direction on if we should avoid making
> method parameters and local variables `final` - this is inconsistent over
> the code base, but I'd prefer not having them. If the method is large
> enough that we might mistakenly reuse parameters/variables, we should
> probably refactor the method.
>
> /Marcus
>
> On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org wrote:
> > Our style guide hasn’t been updated in about a decade, and I think it is
> overdue some improvements that address some shortcomings as well as modern
> facilities such as streams and lambdas.
> >
> > Most of this was put together for an effort Dinesh started a few years
> ago, but has languished since, in part because the project has always
> seemed to have other priorities. I figure there’s never a good time to
> raise a contended topic, so here is my suggested update to contributor
> guidelines:
> >
> >
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
> >
> > Many of these suggestions codify norms already widely employed,
> sometimes in spite of the style guide, but some likely remain contentious.
> Some potentially contentious things to draw your attention to:
> >
> >
> >   *   Deemphasis of getX() nomenclature, in favour of richer set of
> prefixes and more succinct simple x() to retrieve where clear
> >   *   Avoid implementing methods, incl. equals(), hashCode() and
> toString(), unless actually used
> >   *   Modified new-line rules for multi-line function calls
> >   *   External dependency rules (require DISCUSS thread before
> introducing)
> >
> >
> >
>
>
>

Re: Updating our Code Contribution/Style Guide

Posted by Josh McKenzie <jm...@apache.org>.
> we should add Python code style guides to it
> 
Strongly agree. We're hurting ourselves by treating our python as a 2nd class citizen.

> if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the met
Why not both (i.e. use final where possible and refactor when at length / doing too much)? The benefits of immutability are generally well recognized as are the benefits of keeping methods to reasonable lengths and complexity.


On Mon, Mar 14, 2022, at 7:59 AM, Marcus Eriksson wrote:
> Looks good
> 
> One thing that might be missing is direction on if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the method.
> 
> /Marcus
> 
> On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org wrote:
> > Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
> > 
> > Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
> > 
> > https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
> > 
> > Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
> > 
> > 
> >   *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
> >   *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
> >   *   Modified new-line rules for multi-line function calls
> >   *   External dependency rules (require DISCUSS thread before introducing)
> > 
> > 
> > 
> 

Re: Updating our Code Contribution/Style Guide

Posted by Marcus Eriksson <ma...@apache.org>.
Looks good

One thing that might be missing is direction on if we should avoid making method parameters and local variables `final` - this is inconsistent over the code base, but I'd prefer not having them. If the method is large enough that we might mistakenly reuse parameters/variables, we should probably refactor the method.

/Marcus

On Mon, Mar 14, 2022 at 09:41:35AM +0000, benedict@apache.org wrote:
> Our style guide hasn’t been updated in about a decade, and I think it is overdue some improvements that address some shortcomings as well as modern facilities such as streams and lambdas.
> 
> Most of this was put together for an effort Dinesh started a few years ago, but has languished since, in part because the project has always seemed to have other priorities. I figure there’s never a good time to raise a contended topic, so here is my suggested update to contributor guidelines:
> 
> https://docs.google.com/document/d/1sjw0crb0clQin2tMgZLt_ob4hYfLJYaU4lRX722htTo
> 
> Many of these suggestions codify norms already widely employed, sometimes in spite of the style guide, but some likely remain contentious. Some potentially contentious things to draw your attention to:
> 
> 
>   *   Deemphasis of getX() nomenclature, in favour of richer set of prefixes and more succinct simple x() to retrieve where clear
>   *   Avoid implementing methods, incl. equals(), hashCode() and toString(), unless actually used
>   *   Modified new-line rules for multi-line function calls
>   *   External dependency rules (require DISCUSS thread before introducing)
> 
> 
>