Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/25 17:39:07 UTC

[GitHub] [druid] jihoonson opened a new pull request #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory

jihoonson opened a new pull request #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory
URL: https://github.com/apache/druid/pull/9564
 
 
   ### Description
   
   We currently have a decent number of tests for aggregators, but even with them, we have had a hard time writing good tests that verify all edge cases. A recent example is a couple of bugs found in SQL-compatible null handling mode. I think this is because we had to write every test from scratch due to the lack of a good framework.
   
   This PR adds a new framework, `AggregateTestBase`, which generates random data including nulls for all data types. The data can be stored in either `IncrementalIndex` or `QueryableIndex` depending on a parameter, so the caller can test reading from both of them.
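
As a rough illustration of the idea (not the actual `AggregateTestBase` code), the sketch below generates seeded random long values with nulls and computes a reference sum that an aggregator's result could be compared against. The class and method names (`NullableLongDataSketch`, `generate`, `expectedSum`), the `nullRatio` parameter, and the `sqlCompatibleNulls` flag are all hypothetical. The null semantics follow standard SQL, where a sum over only NULLs is NULL; returning 0 in the other mode is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class NullableLongDataSketch {
    // Generates `size` values from a fixed seed so that failures are reproducible.
    // Each value is null with probability `nullRatio` (hypothetical parameter).
    static List<Long> generate(long seed, int size, double nullRatio) {
        Random random = new Random(seed);
        List<Long> values = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            if (random.nextDouble() < nullRatio) {
                values.add(null);
            } else {
                values.add((long) random.nextInt(1000));
            }
        }
        return values;
    }

    // Reference result for a long sum: nulls are skipped.
    // If every input is null, the result is null in the SQL-compatible mode
    // and 0 otherwise (assumed semantics for this sketch).
    static Long expectedSum(List<Long> values, boolean sqlCompatibleNulls) {
        long sum = 0;
        boolean sawValue = false;
        for (Long value : values) {
            if (value != null) {
                sum += value;
                sawValue = true;
            }
        }
        if (!sawValue && sqlCompatibleNulls) {
            return null;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Long> rows = generate(42L, 10, 0.3);
        System.out.println("rows = " + rows);
        System.out.println("expected sum (SQL-compatible) = " + expectedSum(rows, true));
        System.out.println("expected sum (default) = " + expectedSum(rows, false));
    }
}
```

Seeding the generator keeps the random data deterministic, so a failing edge case can be replayed with the same seed.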
   
   To use the same data generator as our benchmarks in `AggregateTestBase`, I renamed some classes, including `BenchmarkSchemaInfo`, `BenchmarkColumnSchema`, `BenchmarkDataGenerator`, and `BenchmarkColumnValueGenerator`, and moved them to `druid-processing`.
   
   Some example tests are found in `LongSumAggregateTest`. This class tests only `aggregate()` and `bufferAggregate()` using the data generated by `AggregateTestBase`. There are also a couple of new tests in `LongSumAggregatorFactoryTest`.
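
To show why both code paths are worth testing, here is a minimal sketch, assuming a simplified model in which one path keeps its running sum in a local variable and the other keeps it in a slot of a `ByteBuffer`. The names `HeapVsBufferSumSketch`, `heapSum`, and `bufferSum` are hypothetical and are not Druid's aggregator classes; the point is only that both paths should agree on the same input.

```java
import java.nio.ByteBuffer;

public class HeapVsBufferSumSketch {
    // On-heap style: the running sum is kept in a local variable.
    static long heapSum(long[] values) {
        long sum = 0;
        for (long value : values) {
            sum += value;
        }
        return sum;
    }

    // Buffer style: the running sum lives in 8 bytes at `position` in `buffer`.
    // Absolute get/put calls are used so the buffer's own position is not moved.
    static long bufferSum(long[] values, ByteBuffer buffer, int position) {
        buffer.putLong(position, 0L); // initialize the slot
        for (long value : values) {
            buffer.putLong(position, buffer.getLong(position) + value);
        }
        return buffer.getLong(position);
    }

    public static void main(String[] args) {
        long[] values = {5L, 7L, 11L};
        ByteBuffer buffer = ByteBuffer.allocate(32);
        long onHeap = heapSum(values);
        long inBuffer = bufferSum(values, buffer, 8);
        System.out.println("heap = " + onHeap + ", buffer = " + inBuffer);
        if (onHeap != inBuffer) {
            throw new IllegalStateException("heap and buffer sums disagree");
        }
    }
}
```

A test written this way can feed the same generated rows to both paths and assert that the results match.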
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] himanshug edited a comment on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory

Posted by GitBox <gi...@apache.org>.
himanshug edited a comment on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory
URL: https://github.com/apache/druid/pull/9564#issuecomment-604735785
 
 
   > Would you tell me more about this
   
   I was referring to all the extra code it took to test the LongSum aggregator in this PR. :)
   
   > Thanks for pointing it out. My original intention was to add a low-level framework for aggregator testing without using the query framework (because it's high-level!), but I admit they are kind of similar. Let me check if they can be merged.
   
   `AggregationTestHelper` lets you do more of a "pseudo integration testing". This type of testing is especially useful for complex aggregators, for example to ensure that the intermediate sketch data representations are handled properly in various combinations. It also automatically tests most of the low-level things as well.
   However, I think it is missing the data generation part. If we aid it with data generation, it can test all the different scenarios in which an aggregator implementation gets exercised inside a Druid cluster. It would be nice if tests like `DoubleMeanAggregationTest` could use the generic data generation framework rather than creating their own limited dummy data sets; that would enable covering even more code paths.
   
   Further, low-level UTs are nice to have to flush out other corner cases that the high-level tests above can't exercise.


[GitHub] [druid] himanshug commented on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory

Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory
URL: https://github.com/apache/druid/pull/9564#issuecomment-604735785
 
 
   > Would you tell me more about this
   
   I was referring to all the extra code it took to test the LongSum aggregator in this PR. :)
   
   > Thanks for pointing it out. My original intention was to add a low-level framework for aggregator testing without using the query framework (because it's high-level!), but I admit they are kind of similar. Let me check if they can be merged.
   
   `AggregationTestHelper` lets you do more of a "pseudo integration testing". This type of testing is especially useful for complex aggregators, for example to ensure that the intermediate sketch data representations are handled properly in various combinations.
   However, I think it is missing the data generation part. If we aid it with data generation, it can test all the different scenarios in which an aggregator implementation gets exercised inside a Druid cluster. It would be nice if tests like `DoubleMeanAggregationTest` could use the generic data generation framework rather than creating their own limited dummy data sets; that would enable covering even more code paths.
   
   Further, low-level UTs are nice to have to flush out other corner cases that the high-level tests above can't exercise.


[GitHub] [druid] jihoonson commented on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory
URL: https://github.com/apache/druid/pull/9564#issuecomment-604642637
 
 
   > however, even after this there is significant effort required to thoroughly test a new aggregator that only has even primitive type intermediate form :)
   
   @himanshug thanks for taking a look. Would you tell me more about this? I would like to make things better in this PR if possible.
   
   > FYI to contributors coming here: there is another common utility `AggregationTestHelper` that serves the purpose of writing tests that tries to simulate use of aggregator in real cluster .
   
   Thanks for pointing it out. My original intention was to add a low-level framework for aggregator testing without using the query framework (because it's high-level!), but I admit they are kind of similar. Let me check if they can be merged. 


[GitHub] [druid] himanshug commented on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory

Posted by GitBox <gi...@apache.org>.
himanshug commented on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory
URL: https://github.com/apache/druid/pull/9564#issuecomment-604608102
 
 
   +1, since it improves the current state. However, even after this, significant effort is required to thoroughly test a new aggregator, even one that only has a primitive-type intermediate form :)
   
   FYI to contributors coming here: there is another common utility, `AggregationTestHelper`, that serves the purpose of writing tests that try to simulate the use of an aggregator in a real cluster.


[GitHub] [druid] himanshug edited a comment on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory

Posted by GitBox <gi...@apache.org>.
himanshug edited a comment on issue #9564: Framework for aggregate testing; Example tests with LongSumAggregatorFactory
URL: https://github.com/apache/druid/pull/9564#issuecomment-604608102
 
 
   +1, since it improves the current state. However, even after this, significant effort is required to thoroughly test a new aggregator, even one that only has a primitive-type intermediate form :)
   
   FYI to contributors coming here: there is another common utility, `AggregationTestHelper`, that serves the purpose of writing tests that try to simulate the use of an aggregator in a real cluster.
