You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@druid.apache.org by Roman Leventov <le...@apache.org> on 2019/10/04 16:44:26 UTC

Looking out of the Druid's development bubble at modern Java testing practices

A few days ago I've stumbled upon a blog post "Modern Best Practices for
Testing in Java" (
https://phauer.com/2019/modern-best-practices-testing-java/) and learned a
lot of new things from it.

In a lot of cases, how testing is done in Druid goes against these best
practices:
 - "Given, When, Then" structure: in Druid, rarely followed or clarified
via blank lines.
 - Use the Prefixes “actual*” and “expected*: not in Druid, usually it's
something1 / something2 and what is expected and what is actual is unclear
(and sometimes confused in assert() args)
 - Heavily Use Helper Functions: used sometimes in Druid, but by far not
enough,  in my perception
 - Don’t Extend Existing Tests To “Just Test One More Tiny Thing: not
everywhere, but in a lot of places this practice is abused in Druid tests
 - Insert Test Data Right In The Test Method:  test data is often inserted
in @Before in Druid
 - Favor Composition Over Inheritance: there are some hierarchies in Druid
 - Don’t Write Too Much Logic: Druid tests feel like *all*
boilerplate/noise/scaffolding logic
 - Don’t Use In-Memory Databases For Tests: Druid does toy database Derby
for tests :)
 - Use AssertJ: Druid doesn't use AssertJ (I've just added AssertJ to
dependencies in https://github.com/apache/incubator-druid/pull/8564 in one
module)
 - Use JUnit5: ¯\_(ツ)_/¯
 - Mock Remote Service: in Druid, we don't use OkHttp's MockWebServer
 - Use Awaitility for Asserting Asynchronous Code: in Druid, we use ad-hoc
while loops with sleep(10), sleep(100), etc.
 - Separate Asynchronous Execution and Actual Logic: I think, this is one
of the most important points. In Druid, in almost all cases, execution
management (thread pools), dependency injection (and dependencies in
general), and the "core" logic are alloyed in a single class. This makes
testing the core logic very painful with a lot of mocking and setup
ceremony, breaking frequently. Following the Humble Object pattern (
http://xunitpatterns.com/Humble%20Object.html) would make writing *and
supporting* much less laborious.

These things cannot be fixed in one day because test code has a lot of
inertia, but I believe to improve developer productivity all active Druid
developers should start using most of these practices in all new tests
being written, demand other contributors align their tests with these
practices in code reviews, and take some time to think about strategic ways
to refactor a bulk of existing tests with little effort
(semi-automatically) that can at the same time improve their quality even
by a little.

In my experience, wrestling with tests is a huge (often the biggest) part
of changing existing code in Druid, so we pay a huge productivity tax by
not paying enough attention to the quality of the test codebase.

Re: Looking out of the Druid's development bubble at modern Java testing practices

Posted by Gian Merlino <gi...@apache.org>.

I like the "Complete Vertical Slide" recommendation. It goes against the
wisdom of having focused unit tests, but I think in my experience, the
tests that shake out the most bugs (and are most robust to refactoring)
have been ones that wrap together a lot of layers.

One thing I didn't see in the article, but also like, is declarative tests
(like the ones in CalciteQueryTest). I think they're short, effective at
catching bugs, easy to read, and easy to write more of. They would be even
better if they used builder-style methods instead of positional arguments.

I don't think the "Don’t Use In-Memory Databases" recommendation applies to
us. The idea is that you should test with something similar to production.
But Druid is super-pluggable, and it supports a ton of options for
databases, deep storages, and other integrations in production. We use
Derby for metadata and local fs for deep storage in most tests, which _are_
supported production options. It's just that there are _also other_ options
that we are not testing in unit tests. This has caused regressions in the
past and I suppose calls for a test matrix. In the long run, we might
benefit from bringing some of this stuff "in house". Imagine metadata
stores were no longer pluggable, but instead there was only one supported
option that lived on the Coordinator. It would be an effort to develop
that, but then testing becomes simpler since one dimension of the matrix
goes away.

On Fri, Oct 4, 2019 at 9:44 AM Roman Leventov <le...@apache.org> wrote:

> A few days ago I've stumbled upon a blog post "Modern Best Practices for
> Testing in Java" (
> https://phauer.com/2019/modern-best-practices-testing-java/) and learned a
> lot of new things from it.
>
> In a lot of cases, how testing is done in Druid goes against these best
> practices:
>  - "Given, When, Then" structure: in Druid, rarely followed or clarified
> via blank lines.
>  - Use the Prefixes “actual*” and “expected*: not in Druid, usually it's
> something1 / something2 and what is expected and what is actual is unclear
> (and sometimes confused in assert() args)
>  - Heavily Use Helper Functions: used sometimes in Druid, but by far not
> enough,  in my perception
>  - Don’t Extend Existing Tests To “Just Test One More Tiny Thing: not
> everywhere, but in a lot of places this practice is abused in Druid tests
>  - Insert Test Data Right In The Test Method:  test data is often inserted
> in @Before in Druid
>  - Favor Composition Over Inheritance: there are some hierarchies in Druid
>  - Don’t Write Too Much Logic: Druid tests feel like *all*
> boilerplate/noise/scaffolding logic
>  - Don’t Use In-Memory Databases For Tests: Druid does toy database Derby
> for tests :)
>  - Use AssertJ: Druid doesn't use AssertJ (I've just added AssertJ to
> dependencies in https://github.com/apache/incubator-druid/pull/8564 in one
> module)
>  - Use JUnit5: ¯\_(ツ)_/¯
>  - Mock Remote Service: in Druid, we don't use OkHttp's MockWebServer
>  - Use Awaitility for Asserting Asynchronous Code: in Druid, we use ad-hoc
> while loops with sleep(10), sleep(100), etc.
>  - Separate Asynchronous Execution and Actual Logic: I think, this is one
> of the most important points. In Druid, in almost all cases, execution
> management (thread pools), dependency injection (and dependencies in
> general), and the "core" logic are alloyed in a single class. This makes
> testing the core logic very painful with a lot of mocking and setup
> ceremony, breaking frequently. Following the Humble Object pattern (
> http://xunitpatterns.com/Humble%20Object.html) would make writing *and
> supporting* much less laborious.
>
> These things cannot be fixed in one day because test code has a lot of
> inertia, but I believe to improve developer productivity all active Druid
> developers should start using most of these practices in all new tests
> being written, demand other contributors align their tests with these
> practices in code reviews, and take some time to think about strategic ways
> to refactor a bulk of existing tests with little effort
> (semi-automatically) that can at the same time improve their quality even
> by a little.
>
> In my experience, wrestling with tests is a huge (often the biggest) part
> of changing existing code in Druid, so we pay a huge productivity tax by
> not paying enough attention to the quality of the test codebase.
>

Re: Looking out of the Druid's development bubble at modern Java testing practices

Posted by Roman Leventov <le...@gmail.com>.

Nothing in particular. I think just reading the article. And also actually
paying attention to test quality. In the past, on many occasions, I've
skipped reading the test code in pull-requests completely or glanced over
it very cursory which is a bad practice.

The questions that developers and reviewers should ask themselves are
 - Will it be easy to add more tests to this class?
 - Will it be easy to support existing and new tests?
 - Will the test break (or would require rework) when somebody changes
implementation internals, e. g. due to mocking not working properly?

On Fri, 4 Oct 2019 at 21:57, Jad Naous <ja...@imply.io> wrote:

> That's a great link, thank you! What do you think we should change in the
> process to start following these patterns? Should the committers be more
> strict on what gets committed?
>
> On Fri, Oct 4, 2019 at 9:44 AM Roman Leventov <le...@apache.org> wrote:
>
> > A few days ago I've stumbled upon a blog post "Modern Best Practices for
> > Testing in Java" (
> > https://phauer.com/2019/modern-best-practices-testing-java/) and
> learned a
> > lot of new things from it.
> >
> > In a lot of cases, how testing is done in Druid goes against these best
> > practices:
> >  - "Given, When, Then" structure: in Druid, rarely followed or clarified
> > via blank lines.
> >  - Use the Prefixes “actual*” and “expected*: not in Druid, usually it's
> > something1 / something2 and what is expected and what is actual is
> unclear
> > (and sometimes confused in assert() args)
> >  - Heavily Use Helper Functions: used sometimes in Druid, but by far not
> > enough,  in my perception
> >  - Don’t Extend Existing Tests To “Just Test One More Tiny Thing: not
> > everywhere, but in a lot of places this practice is abused in Druid tests
> >  - Insert Test Data Right In The Test Method:  test data is often
> inserted
> > in @Before in Druid
> >  - Favor Composition Over Inheritance: there are some hierarchies in
> Druid
> >  - Don’t Write Too Much Logic: Druid tests feel like *all*
> > boilerplate/noise/scaffolding logic
> >  - Don’t Use In-Memory Databases For Tests: Druid does toy database Derby
> > for tests :)
> >  - Use AssertJ: Druid doesn't use AssertJ (I've just added AssertJ to
> > dependencies in https://github.com/apache/incubator-druid/pull/8564 in
> one
> > module)
> >  - Use JUnit5: ¯\_(ツ)_/¯
> >  - Mock Remote Service: in Druid, we don't use OkHttp's MockWebServer
> >  - Use Awaitility for Asserting Asynchronous Code: in Druid, we use
> ad-hoc
> > while loops with sleep(10), sleep(100), etc.
> >  - Separate Asynchronous Execution and Actual Logic: I think, this is one
> > of the most important points. In Druid, in almost all cases, execution
> > management (thread pools), dependency injection (and dependencies in
> > general), and the "core" logic are alloyed in a single class. This makes
> > testing the core logic very painful with a lot of mocking and setup
> > ceremony, breaking frequently. Following the Humble Object pattern (
> > http://xunitpatterns.com/Humble%20Object.html) would make writing *and
> > supporting* much less laborious.
> >
> > These things cannot be fixed in one day because test code has a lot of
> > inertia, but I believe to improve developer productivity all active Druid
> > developers should start using most of these practices in all new tests
> > being written, demand other contributors align their tests with these
> > practices in code reviews, and take some time to think about strategic
> ways
> > to refactor a bulk of existing tests with little effort
> > (semi-automatically) that can at the same time improve their quality even
> > by a little.
> >
> > In my experience, wrestling with tests is a huge (often the biggest) part
> > of changing existing code in Druid, so we pay a huge productivity tax by
> > not paying enough attention to the quality of the test codebase.
> >
>
>
> --
> Jad Naous
> Imply | VP R&D
> 650-521-3425
> jad.naous@imply.io
>  @jadtnaous <https://twitter.com/jadtnaous>
>

Re: Looking out of the Druid's development bubble at modern Java testing practices

Posted by Jad Naous <ja...@imply.io>.

That's a great link, thank you! What do you think we should change in the
process to start following these patterns? Should the committers be more
strict on what gets committed?

On Fri, Oct 4, 2019 at 9:44 AM Roman Leventov <le...@apache.org> wrote:

> A few days ago I've stumbled upon a blog post "Modern Best Practices for
> Testing in Java" (
> https://phauer.com/2019/modern-best-practices-testing-java/) and learned a
> lot of new things from it.
>
> In a lot of cases, how testing is done in Druid goes against these best
> practices:
>  - "Given, When, Then" structure: in Druid, rarely followed or clarified
> via blank lines.
>  - Use the Prefixes “actual*” and “expected*: not in Druid, usually it's
> something1 / something2 and what is expected and what is actual is unclear
> (and sometimes confused in assert() args)
>  - Heavily Use Helper Functions: used sometimes in Druid, but by far not
> enough,  in my perception
>  - Don’t Extend Existing Tests To “Just Test One More Tiny Thing: not
> everywhere, but in a lot of places this practice is abused in Druid tests
>  - Insert Test Data Right In The Test Method:  test data is often inserted
> in @Before in Druid
>  - Favor Composition Over Inheritance: there are some hierarchies in Druid
>  - Don’t Write Too Much Logic: Druid tests feel like *all*
> boilerplate/noise/scaffolding logic
>  - Don’t Use In-Memory Databases For Tests: Druid does toy database Derby
> for tests :)
>  - Use AssertJ: Druid doesn't use AssertJ (I've just added AssertJ to
> dependencies in https://github.com/apache/incubator-druid/pull/8564 in one
> module)
>  - Use JUnit5: ¯\_(ツ)_/¯
>  - Mock Remote Service: in Druid, we don't use OkHttp's MockWebServer
>  - Use Awaitility for Asserting Asynchronous Code: in Druid, we use ad-hoc
> while loops with sleep(10), sleep(100), etc.
>  - Separate Asynchronous Execution and Actual Logic: I think, this is one
> of the most important points. In Druid, in almost all cases, execution
> management (thread pools), dependency injection (and dependencies in
> general), and the "core" logic are alloyed in a single class. This makes
> testing the core logic very painful with a lot of mocking and setup
> ceremony, breaking frequently. Following the Humble Object pattern (
> http://xunitpatterns.com/Humble%20Object.html) would make writing *and
> supporting* much less laborious.
>
> These things cannot be fixed in one day because test code has a lot of
> inertia, but I believe to improve developer productivity all active Druid
> developers should start using most of these practices in all new tests
> being written, demand other contributors align their tests with these
> practices in code reviews, and take some time to think about strategic ways
> to refactor a bulk of existing tests with little effort
> (semi-automatically) that can at the same time improve their quality even
> by a little.
>
> In my experience, wrestling with tests is a huge (often the biggest) part
> of changing existing code in Druid, so we pay a huge productivity tax by
> not paying enough attention to the quality of the test codebase.
>


-- 
Jad Naous
Imply | VP R&D
650-521-3425
jad.naous@imply.io
 @jadtnaous <https://twitter.com/jadtnaous>