Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/03 13:00:00 UTC

[jira] [Assigned] (KUDU-2619) Track Java test failures in the flaky test dashboard

     [ https://issues.apache.org/jira/browse/KUDU-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke reassigned KUDU-2619:
---------------------------------

    Assignee: Grant Henke

> Track Java test failures in the flaky test dashboard
> ----------------------------------------------------
>
>                 Key: KUDU-2619
>                 URL: https://issues.apache.org/jira/browse/KUDU-2619
>             Project: Kudu
>          Issue Type: Improvement
>          Components: java, test
>    Affects Versions: n/a
>            Reporter: Adar Dembo
>            Assignee: Grant Henke
>            Priority: Major
>
> Right now our flaky test tracking infrastructure only covers C++ tests built on GTest; we should extend it to include Java tests too.
> I spent some time on this recently and I wanted to collect my notes in one place.
> For reference, here's how C++ test reporting works:
> # The build-and-test.sh script rebuilds thirdparty dependencies, builds Kudu, and invokes all test suites, optionally using dist-test. After all tests have been run, it also collects any dist-test artifacts and logs so that all of the test results are available in one place.
> # The run-test.sh script is effectively the C++ "test runner". It is responsible for running a test binary, retrying it if it fails, and calling report-test.sh after each run (success or failure). Importantly, report-test.sh is invoked once per test binary (not once per individual test), and on success we don't wait for the script to finish, because we don't care as much about collecting successes.
> # The report-test.sh script collects some basic information about the test environment (such as the git hash used and whether ASAN or TSAN was enabled), then uses curl to send the information to the test result server; a rough sketch of this payload follows the list.
> # The test result server stores each test run result in a database and queries that database to produce the flaky test dashboard.
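>
> For illustration, here's a minimal sketch (in Python) of the kind of record report-test.sh assembles and POSTs. The real script is bash driving curl; the endpoint path, field names, and the use of the requests library here are all assumptions.
>
> {code:python}
> # Illustrative sketch only: report-test.sh is really a bash script driving
> # curl. The endpoint path and field names below are hypothetical.
> import subprocess
> import requests
>
> def report_test_run(result_server, test_name, log_path, status):
>     """Send one test-run result to the test result server."""
>     record = {
>         "test_name": test_name,              # the test binary's name
>         "status": status,                    # "success" or "failure"
>         # Basic environment info, as described above:
>         "git_hash": subprocess.check_output(
>             ["git", "rev-parse", "HEAD"]).decode().strip(),
>         "build_config": "TSAN",              # e.g. ASAN/TSAN/plain
>     }
>     with open(log_path, "rb") as f:
>         # One HTTP round trip per test binary, mirroring how run-test.sh
>         # invokes report-test.sh once per binary.
>         requests.post(result_server + "/add_result",
>                       data=record, files={"log": f})
> {code}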
> There are several problems to solve if we're going to replicate this for Java:
> # There's no equivalent to run-test.sh. The entry point for running the Java test suite is Gradle, but in dist-test, the test invocation is actually done via 'java org.junit.runner.JUnitCore'. Note that C++ test reporting is currently also incompatible with dist-test, so the Java tests aren't unique in that respect. A sketch of what such a Java runner might look like follows this list.
> # It'd be some work to replace report-test.sh with a Java equivalent.
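>
> To make the first problem concrete, here's a hypothetical sketch (Python, all names assumed) of a Java "runner" that invokes tests the way dist-test does. It only runs and retries; reporting is deliberately left out, anticipating the proposal below.
>
> {code:python}
> # Hypothetical Java test runner: runs one JUnit class via JUnitCore (as
> # dist-test does) and retries on failure, keeping a log per attempt.
> import subprocess
>
> def run_java_test(test_class, classpath, log_dir, max_attempts=3):
>     for attempt in range(1, max_attempts + 1):
>         proc = subprocess.run(
>             ["java", "-cp", classpath,
>              "org.junit.runner.JUnitCore", test_class],
>             capture_output=True)
>         # Keep a distinct log per attempt so retries stay visible to the
>         # aggregator later.
>         with open(f"{log_dir}/{test_class}.{attempt}.log", "wb") as log:
>             log.write(proc.stdout + proc.stderr)
>         if proc.returncode == 0:
>             return True
>     return False
> {code}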
> My thinking is that we should move test reporting from run-test.sh into build-and-test.sh:
> # It's a good separation of concerns. The test "runners" are responsible for running and maybe retrying tests, while the test "aggregator" (build-and-test.sh) is responsible for reporting.
> # It's more performant. You can imagine building a test_result_server.py endpoint for reporting en masse (sketched after this list), which would cut down on the number of round trips. That's especially important if we start reporting individual test results (as opposed to test _suite_ results).
> # It means the reporting logic need only be written once.
> # It was always a bit surprising to find reporting logic buried in run-test.sh; that made sense for rapid prototyping, but it doesn't belong there long-term.
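>
> As a rough sketch of the en-masse endpoint idea (Flask and this schema are assumptions, not the actual test_result_server.py):
>
> {code:python}
> # Sketch of a batch-reporting endpoint. Flask and the schema are assumptions;
> # the real test_result_server.py may be structured quite differently.
> import sqlite3
> from flask import Flask, request
>
> app = Flask(__name__)
> db = sqlite3.connect("test_results.db", check_same_thread=False)
> db.execute("""CREATE TABLE IF NOT EXISTS results (
>                   test_name TEXT, status TEXT,
>                   git_hash TEXT, build_config TEXT)""")
>
> @app.route("/add_results_batch", methods=["POST"])
> def add_results_batch():
>     # One request carries an entire run's worth of results, replacing
>     # the per-test round trips of the current scheme.
>     rows = [(r["test_name"], r["status"], r["git_hash"], r["build_config"])
>             for r in request.get_json()]
>     db.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
>     db.commit()
>     return "ok"
> {code}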
> So the remaining problem is ensuring that, after all tests have run, we have the correct JUnit XML and log files for every test run, including retries. That is more tractable, and doable in dist-test environments too.
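>
> A sketch of what that aggregation step could look like, again in Python with assumed paths and field names: walk the collected JUnit XML reports (retries included) and send everything to the batch endpoint sketched above in one round trip.
>
> {code:python}
> # Sketch of the aggregator step in build-and-test.sh: scan the JUnit XML
> # reports collected after all tests (and retries) have run, then report
> # everything in one batch. Directory layout and field names are assumptions.
> import glob
> import xml.etree.ElementTree as ET
> import requests
>
> def collect_and_report(report_dir, result_server, git_hash, build_config):
>     results = []
>     # Gradle writes one TEST-<class>.xml report file per suite run.
>     for path in glob.glob(f"{report_dir}/**/TEST-*.xml", recursive=True):
>         suite = ET.parse(path).getroot()
>         for case in suite.iter("testcase"):
>             failed = (case.find("failure") is not None or
>                       case.find("error") is not None)
>             # Report individual test results, not just suite results.
>             results.append({
>                 "test_name": f"{case.get('classname')}.{case.get('name')}",
>                 "status": "failure" if failed else "success",
>                 "git_hash": git_hash,
>                 "build_config": build_config,
>             })
>     # A single round trip for the whole run, per the batching idea above.
>     requests.post(result_server + "/add_results_batch", json=results)
> {code}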


