Posted to dev@drill.apache.org by François Méthot <fm...@gmail.com> on 2017/04/20 18:23:27 UTC

Jars for BaseTestQuery

Hi,

I need to develop unit tests for our storage plugins and, if possible, I
would like to borrow from the tests in "TestCsvHeader.java" and other
classes in that package.

Those tests depend on the BaseTestQuery, DrillTest and ExecTest classes, which
are not packaged in the Drill release (please correct me if I am wrong).

Are those jars shared somewhere for storage plugin developers who rely on
the pre-built jars?

Thanks
Francois

Re: Jars for BaseTestQuery

Posted by François Méthot <fm...@gmail.com>.
Hi Paul,

  Thanks for your detailed comments. I am keeping your email as a reference.

We tend to favor system-level testing first because such tests are easier to
maintain when refactoring code. After trying the system-level tests,
we tried a pure JUnit approach and realized that some of the
builders/constructors/factories for objects (DrillBuf, BufferLedger) that are
required by some functions are package-private.

I will revisit our test revamp based on your comments once 1.11 is released
(DRILL-5323 and DRILL-5318).


François






Re: Jars for BaseTestQuery

Posted by Paul Rogers <pr...@mapr.com>.
Hi François,

You raised two issues; I’ll address both.

First, it is true that Maven’s model is that test code is not packaged; it is visible only to the Maven module in which the test code resides. As you point out, this is an inconvenience in multi-module projects such as Drill. Drill gets around the problem by minimizing unit testing; most testing outside of java-exec is done via system tests: running all of Drill and throwing queries at it.

Drill, at present, has no support for reusing tests outside of their home module. It would be great if someone volunteers to solve the problem. Here are two references: [1], [2]

Second, you mentioned you want to unit test a storage plugin. Here, it is necessary to understand how Drill’s usage of the term “unit test” differs from common industry usage. In the industry, a “unit test” would be one where you test your reader in isolation. Specifically, you’d give it an operator definition (the so-called “physical operator” or “sub scan POP” in Drill). You’d then grab data and verify that the returned data batches are correct.
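
As a rough illustration of what testing a reader in isolation looks like, here is a plain-Java sketch. It does not use Drill's API: BatchReader, CsvLineReader and ReaderIsolationCheck are hypothetical names invented for this example, and a real Drill reader would produce value vectors rather than lists of strings. The point is only the shape of the test: construct the reader directly, pull batches from it, and compare them with expected data, with no Drillbit involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a storage plugin's reader; NOT a Drill interface.
interface BatchReader {
    /** Returns the next batch of rows, or an empty list once the input is exhausted. */
    List<List<String>> nextBatch(int maxRows);
}

// Hypothetical reader under test: splits comma-separated lines into columns.
final class CsvLineReader implements BatchReader {
    private final Iterator<String> lines;

    CsvLineReader(List<String> input) {
        this.lines = input.iterator();
    }

    @Override
    public List<List<String>> nextBatch(int maxRows) {
        List<List<String>> batch = new ArrayList<>();
        while (batch.size() < maxRows && lines.hasNext()) {
            // A limit of -1 keeps trailing empty columns instead of dropping them.
            batch.add(Arrays.asList(lines.next().split(",", -1)));
        }
        return batch;
    }
}

final class ReaderIsolationCheck {
    /** Drains the reader and returns every non-empty batch it produced, in order. */
    static List<List<List<String>>> readAll(BatchReader reader, int maxRows) {
        List<List<List<String>>> batches = new ArrayList<>();
        List<List<String>> batch;
        while (!(batch = reader.nextBatch(maxRows)).isEmpty()) {
            batches.add(batch);
        }
        return batches;
    }

    public static void main(String[] args) {
        // The reader is built directly from in-memory input: no server, no SQL.
        BatchReader reader = new CsvLineReader(Arrays.asList("a,1", "b,2", "c,3"));
        List<List<List<String>>> batches = readAll(reader, 2);

        // Three rows with a batch size of two should yield two batches.
        if (batches.size() != 2) {
            throw new AssertionError("expected 2 batches, got " + batches.size());
        }
        System.out.println(batches);
    }
}
```

Because the reader is driven directly, a failure points at the reader itself rather than at one of the layers a full query would pass through.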

Similarly, for the planning side of the plugin, you’d let Drill plan the query, then verify that the plan JSON is as you expect it to be.

Drill, however, uses “unit test” to mean a system-level test written using JUnit. That is, most Drill tests run a query and examine the results. The BaseTestQuery class you mentioned is a JUnit test, but it is a system-level test: it starts up an embedded Drillbit to which you can send queries. It has helper classes that let you examine the results of the entire query (not just of your reader). If you construct the correct SQL, your query can include nothing but a scan and the screen operator. Still, this approach introduces many layers between your test and your reader. (I call it trying to fix a watch while wearing oven mitts.)

There are two recent additions to Drill’s test tools that may be of interest. First, we have a simpler way to run system tests based on a “cluster test fixture”. BaseTestQuery provides very poor control over boot-time configuration, but the test fixture gives you much better control. Plus, the new fixture lets you reuse the “TestBuilder” classes from BaseTestQuery while also providing very easy ways to run queries, time results and so on. Check out the package-info in [3] and the example test in [4]. Unfortunately, this code has the same Maven packaging issues as described above.

Of course, even the simplified test fixture is still a system test. We are in the process of checking in a new set of “sub-operator” unit test fixtures that enable true unit tests: you test only your code. See DRILL-5323 and DRILL-5318. Those PRs will be followed by a complete set of tests for the sort operator. I can point you to my personal dev branch if you want a preview.

With these tools, you can set up to run just your own reader, then set up expected results and validate that things work as expected. Unit tests let you verify behavior at a very fine grain: verify each kind of column data type, verify filters you wish to push and so on. This is important because Drill suffers from a very large number of minor bugs: bugs that are hard to find using system tests, but which become obvious when using true unit tests.

The in-flight version of the test framework was built for an “internal” operator (the sort). Some work will be required to extend the tests to work with a reader (and to refactor the reader so it does not depend on a running Drillbit). This is a worthwhile effort that I can help with if you want to go this route.

Thanks,

- Paul

[1] http://stackoverflow.com/questions/14722873/sharing-src-test-classes-between-modules-in-a-multi-module-maven-project
[2] http://maven.apache.org/guides/mini/guide-attached-tests.html
[3] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/package-info.java
[4] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/ExampleTest.java

