You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@druid.apache.org by Suneet Saldanha <su...@imply.io> on 2020/01/02 17:52:36 UTC

Re: Test naming in Druid

Apologies for the delay in responding to this, and for the long response :)

I like Clint's suggestion of adding this as a best practice. This way we
can try it doesn't prevent people from naming tests more specifically. And
if the community sees value in it, usage will uptick over time and we can
think about adding an automatic way to enforce this if there's enough
value.
I agree that this style may not work in all cases, however, in my
experience - I've seen it work for a lot of functional tests as well.
I think you picked good examples - here's how I would name the tests

   - CalciteQueryTest - I think as written this has the first 2 parts of
   the suggested format. For example `testSelectConstantExpression` ->
   `test_selectConstantExpression_withoutTables_shouldReturnConstant` I think
   the "function" under test could also be `selectRegexExpression` if the devs
   believed that other constant expressions were special cases that needed
   testing.
   - GroupByQueryRunnerTest - similar structure here as well. `testGroupBy`
   -> `test_groupBy_aggregateByDay_shouldReturnCorrectResult`
   `testGroupByOnMissingColumn` ->
   `test_groupBy_onMissingColumnWithRowCount_shouldReturnAllRows`
   - SelectorFilterTest -> `testSingleValueStringColumnWithoutNulls` - this
   looks like it tests 4 things. This definition would encourage 4 small tests
   instead of 1 test that tests 4 things
   `test_dimFilter_singleValueStringColumnWithoutNullsMatchesNulls_shouldReturnNothing`,
   `test_dimFilter_singleValueStringColumnWithoutNullsMatchesEmptyString_shouldReturnNothing`.
   I also acknowledge that this may not always make sense - for example:
   `testSingleValueStringColumnWithNulls` The special handling here for
   NullHandling.replaceWithDefault() makes it difficult to use this format.

Instead of `functionUnderTest`, I think of it as `componentUnderTest` maybe
this makes it easier to describe functional tests.

Thanks again for being open to trying this out in your PR! In my next PR, I
can try tests without underscores - since there is concern from at least 2
people - to see if we can still achieve explicit and easy to read test
names without the underscores

On Mon, Dec 30, 2019 at 1:33 PM Gian Merlino <gi...@apache.org> wrote:

> FYI: In https://github.com/apache/druid/pull/9111 I've named the new test
> methods in this style, minus the "expectedResult" part. While I'm still not
> sure that this style works in all cases (as mentioned in my previous mail),
> I can see that it's useful for unit tests. For example,
> HashJoinSegmentStorageAdapterTest (
>
> https://github.com/apache/druid/pull/9111/files#diff-0fffbdcb587f7a95008a66a7291a7256
> ).
>
> Curious what you all think.
>
>
> On Tue, Dec 24, 2019 at 12:38 AM Clint Wylie <cw...@apache.org> wrote:
>
> > I strongly agree with Gian about underscores being ugly, though I think
> > your suggested formatting would be just as readable (to me at least)
> using
> > just camel-case and would also remain consistent with the styling of the
> > majority of Druids existing tests. I share similar concerns with the
> others
> > on how well it holds up as a convention beyond simple unit tests. I also
> > don't really feel that the utility this naming adds is worth treating it
> as
> > a hard rule, at least at this point. In my experience, the most useful
> > thing in determining why a test has failed and what my code did to do it
> is
> > generally either the stack trace of the error or the mismatched values in
> > the test expectation. The test name doesn't really matter to me at least,
> > and is more of just an anchor point to navigate to a position in a file
> to
> > let me know which test(s) to re-run, where as the other information is
> > usually all it takes to clue me into what I part of the changes in my PR
> > likely caused the test to fail.
> >
> > Because of this, and more importantly that there will be no possible way
> > that tooling can enforce such a naming rule, I think it instead should
> just
> > be captured as best practices in CONTRIBUTING.md or something like that,
> as
> > just a suggestion, so that we don't have to nag at people to enforce
> this.
> >
> > On Mon, Dec 23, 2019 at 7:15 PM Jonathan Wei <jo...@apache.org> wrote:
> >
> > > I think the suggested format could be useful when applicable, I would
> > > consider expanding "function under test" to "area under test" to
> > > accommodate tests that have a wider scope than a single method.
> > >
> > > For cases where that format may not be applicable, perhaps that means
> > it's
> > > a test where additional javadocs/comments explaining what the test
> > > does/expects is a good idea (that would be enough info in my view).
> > >
> > > Since I don't think the proposed format applies universally, I would
> > prefer
> > > starting it off as a suggestion/best practice instead of as a hard
> > > requirement and seeing how that goes.
> > >
> > > I'm indifferent to the use of underscores.
> > >
> > >
> > >
> > > On Mon, Dec 23, 2019 at 3:19 PM Gian Merlino <gi...@apache.org> wrote:
> > >
> > > > Suneet,
> > > >
> > > > Sometimes it's hard to understand how things would improve without an
> > > > example. Could you point to a test file that you think would be
> > improved
> > > by
> > > > this change? Also, there are some test files that I would struggle to
> > fit
> > > > into this framework. It seems best suited to simple single-method
> unit
> > > > tests. What would you suggest doing for these examples?
> > > >
> > > > - CalciteQueryTest: Tests the SQL layer. There's one function per SQL
> > > query
> > > > being tested, but there is no specific function under test, and the
> > > > expected result is too complex to include in a method name.
> > > > - GroupByQueryRunnerTest: Tests the groupBy query engine. Similar
> > > structure
> > > > and similar issue to CalciteQueryTest.
> > > > - SelectorFilterTest: Tests the selector filter. But no specific
> > function
> > > > is being tested. The "assertFilterMatches" helper is used to create
> > test
> > > > cases that say 'this filter should match these rows'.
> > > >
> > > > Other notes:
> > > >
> > > > - IMO, the underscores are ugly, but acceptable if they buy us
> > something.
> > > > - I'm not sure that the "expectedResult" part of the name is going to
> > > work
> > > > well. It's often something complex (like an array, or object, or long
> > > > string) that would be awkward to include in a method name. And with
> > JUnit
> > > > assertions, the expected result is right there in the method call
> > anyway.
> > > >
> > > > Looking forward to hearing your thoughts. Thanks for thinking about
> how
> > > to
> > > > make our testing better.
> > > >
> > > > Gian
> > > >
> > > > On Mon, Dec 23, 2019 at 11:06 AM Suneet Saldanha <
> > > suneet.saldanha@imply.io
> > > > >
> > > > wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I've started writing tests in the druid repo and would like to
> > propose
> > > a
> > > > > new test naming structure for the project. Inspired by this thread
> -
> > > > > https://www.mail-archive.com/dev@druid.apache.org/msg02426.html
> but
> > > > > focused
> > > > > only on the naming of tests. I'd like to propose we start using the
> > > > format
> > > > > below
> > > > >
> > > > > test_<functionUnderTest>_<conditions>_<expectedResult>
> > > > >
> > > > > This makes it a lot easier for devs to name tests in a way that's
> > > easily
> > > > > understood by someone else without having to read the test to know
> > > what's
> > > > > going on. Here's some of my rationale:
> > > > >
> > > > >    - Explicit - the name tells you everything you need to know
> about
> > > the
> > > > >    test
> > > > >    - Forces you to write one test per condition/ expected result
> > > > >    - Underscores make it easy to delineate the different components
> > of
> > > > the
> > > > >    test
> > > > >    - Minimal effort to think of a short name that correctly
> captures
> > > > >    everything about the test while still being different from all
> the
> > > > other
> > > > >    tests
> > > > >
> > > > > Happy to hear feedback / concerns about this approach.
> > > > >
> > > > > Suneet
> > > > >
> > > >
> > >
> >
>

Re: Test naming in Druid

Posted by Jihoon Son <ji...@apache.org>.
I also gave the suggested format a try without underscores in my PR:
https://github.com/apache/druid/pull/9138.
I find it pretty useful and also see how underscores can make the test name
more readable.
However, the underscores look pretty ugly to me as well while the test
names without underscores are just ok to read.

Jihoon

On Thu, Jan 2, 2020 at 9:53 AM Suneet Saldanha <su...@imply.io>
wrote:

> Apologies for the delay in responding to this, and for the long response :)
>
> I like Clint's suggestion of adding this as a best practice. This way we
> can try it doesn't prevent people from naming tests more specifically. And
> if the community sees value in it, usage will uptick over time and we can
> think about adding an automatic way to enforce this if there's enough
> value.
> I agree that this style may not work in all cases, however, in my
> experience - I've seen it work for a lot of functional tests as well.
> I think you picked good examples - here's how I would name the tests
>
>    - CalciteQueryTest - I think as written this has the first 2 parts of
>    the suggested format. For example `testSelectConstantExpression` ->
>    `test_selectConstantExpression_withoutTables_shouldReturnConstant` I
> think
>    the "function" under test could also be `selectRegexExpression` if the
> devs
>    believed that other constant expressions were special cases that needed
>    testing.
>    - GroupByQueryRunnerTest - similar structure here as well. `testGroupBy`
>    -> `test_groupBy_aggregateByDay_shouldReturnCorrectResult`
>    `testGroupByOnMissingColumn` ->
>    `test_groupBy_onMissingColumnWithRowCount_shouldReturnAllRows`
>    - SelectorFilterTest -> `testSingleValueStringColumnWithoutNulls` - this
>    looks like it tests 4 things. This definition would encourage 4 small
> tests
>    instead of 1 test that tests 4 things
>
>  `test_dimFilter_singleValueStringColumnWithoutNullsMatchesNulls_shouldReturnNothing`,
>
>  `test_dimFilter_singleValueStringColumnWithoutNullsMatchesEmptyString_shouldReturnNothing`.
>    I also acknowledge that this may not always make sense - for example:
>    `testSingleValueStringColumnWithNulls` The special handling here for
>    NullHandling.replaceWithDefault() makes it difficult to use this format.
>
> Instead of `functionUnderTest`, I think of it as `componentUnderTest` maybe
> this makes it easier to describe functional tests.
>
> Thanks again for being open to trying this out in your PR! In my next PR, I
> can try tests without underscores - since there is concern from at least 2
> people - to see if we can still achieve explicit and easy to read test
> names without the underscores
>
> On Mon, Dec 30, 2019 at 1:33 PM Gian Merlino <gi...@apache.org> wrote:
>
> > FYI: In https://github.com/apache/druid/pull/9111 I've named the new
> test
> > methods in this style, minus the "expectedResult" part. While I'm still
> not
> > sure that this style works in all cases (as mentioned in my previous
> mail),
> > I can see that it's useful for unit tests. For example,
> > HashJoinSegmentStorageAdapterTest (
> >
> >
> https://github.com/apache/druid/pull/9111/files#diff-0fffbdcb587f7a95008a66a7291a7256
> > ).
> >
> > Curious what you all think.
> >
> >
> > On Tue, Dec 24, 2019 at 12:38 AM Clint Wylie <cw...@apache.org> wrote:
> >
> > > I strongly agree with Gian about underscores being ugly, though I think
> > > your suggested formatting would be just as readable (to me at least)
> > using
> > > just camel-case and would also remain consistent with the styling of
> the
> > > majority of Druids existing tests. I share similar concerns with the
> > others
> > > on how well it holds up as a convention beyond simple unit tests. I
> also
> > > don't really feel that the utility this naming adds is worth treating
> it
> > as
> > > a hard rule, at least at this point. In my experience, the most useful
> > > thing in determining why a test has failed and what my code did to do
> it
> > is
> > > generally either the stack trace of the error or the mismatched values
> in
> > > the test expectation. The test name doesn't really matter to me at
> least,
> > > and is more of just an anchor point to navigate to a position in a file
> > to
> > > let me know which test(s) to re-run, where as the other information is
> > > usually all it takes to clue me into what I part of the changes in my
> PR
> > > likely caused the test to fail.
> > >
> > > Because of this, and more importantly that there will be no possible
> way
> > > that tooling can enforce such a naming rule, I think it instead should
> > just
> > > be captured as best practices in CONTRIBUTING.md or something like
> that,
> > as
> > > just a suggestion, so that we don't have to nag at people to enforce
> > this.
> > >
> > > On Mon, Dec 23, 2019 at 7:15 PM Jonathan Wei <jo...@apache.org>
> wrote:
> > >
> > > > I think the suggested format could be useful when applicable, I would
> > > > consider expanding "function under test" to "area under test" to
> > > > accommodate tests that have a wider scope than a single method.
> > > >
> > > > For cases where that format may not be applicable, perhaps that means
> > > it's
> > > > a test where additional javadocs/comments explaining what the test
> > > > does/expects is a good idea (that would be enough info in my view).
> > > >
> > > > Since I don't think the proposed format applies universally, I would
> > > prefer
> > > > starting it off as a suggestion/best practice instead of as a hard
> > > > requirement and seeing how that goes.
> > > >
> > > > I'm indifferent to the use of underscores.
> > > >
> > > >
> > > >
> > > > On Mon, Dec 23, 2019 at 3:19 PM Gian Merlino <gi...@apache.org>
> wrote:
> > > >
> > > > > Suneet,
> > > > >
> > > > > Sometimes it's hard to understand how things would improve without
> an
> > > > > example. Could you point to a test file that you think would be
> > > improved
> > > > by
> > > > > this change? Also, there are some test files that I would struggle
> to
> > > fit
> > > > > into this framework. It seems best suited to simple single-method
> > unit
> > > > > tests. What would you suggest doing for these examples?
> > > > >
> > > > > - CalciteQueryTest: Tests the SQL layer. There's one function per
> SQL
> > > > query
> > > > > being tested, but there is no specific function under test, and the
> > > > > expected result is too complex to include in a method name.
> > > > > - GroupByQueryRunnerTest: Tests the groupBy query engine. Similar
> > > > structure
> > > > > and similar issue to CalciteQueryTest.
> > > > > - SelectorFilterTest: Tests the selector filter. But no specific
> > > function
> > > > > is being tested. The "assertFilterMatches" helper is used to create
> > > test
> > > > > cases that say 'this filter should match these rows'.
> > > > >
> > > > > Other notes:
> > > > >
> > > > > - IMO, the underscores are ugly, but acceptable if they buy us
> > > something.
> > > > > - I'm not sure that the "expectedResult" part of the name is going
> to
> > > > work
> > > > > well. It's often something complex (like an array, or object, or
> long
> > > > > string) that would be awkward to include in a method name. And with
> > > JUnit
> > > > > assertions, the expected result is right there in the method call
> > > anyway.
> > > > >
> > > > > Looking forward to hearing your thoughts. Thanks for thinking about
> > how
> > > > to
> > > > > make our testing better.
> > > > >
> > > > > Gian
> > > > >
> > > > > On Mon, Dec 23, 2019 at 11:06 AM Suneet Saldanha <
> > > > suneet.saldanha@imply.io
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > I've started writing tests in the druid repo and would like to
> > > propose
> > > > a
> > > > > > new test naming structure for the project. Inspired by this
> thread
> > -
> > > > > > https://www.mail-archive.com/dev@druid.apache.org/msg02426.html
> > but
> > > > > > focused
> > > > > > only on the naming of tests. I'd like to propose we start using
> the
> > > > > format
> > > > > > below
> > > > > >
> > > > > > test_<functionUnderTest>_<conditions>_<expectedResult>
> > > > > >
> > > > > > This makes it a lot easier for devs to name tests in a way that's
> > > > easily
> > > > > > understood by someone else without having to read the test to
> know
> > > > what's
> > > > > > going on. Here's some of my rationale:
> > > > > >
> > > > > >    - Explicit - the name tells you everything you need to know
> > about
> > > > the
> > > > > >    test
> > > > > >    - Forces you to write one test per condition/ expected result
> > > > > >    - Underscores make it easy to delineate the different
> components
> > > of
> > > > > the
> > > > > >    test
> > > > > >    - Minimal effort to think of a short name that correctly
> > captures
> > > > > >    everything about the test while still being different from all
> > the
> > > > > other
> > > > > >    tests
> > > > > >
> > > > > > Happy to hear feedback / concerns about this approach.
> > > > > >
> > > > > > Suneet
> > > > > >
> > > > >
> > > >
> > >
> >
>