Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/07/03 04:44:02 UTC
[jira] [Commented] (DRILL-5155) TestDrillbitResilience unit test is not resilient
[ https://issues.apache.org/jira/browse/DRILL-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071915#comment-16071915 ]
Paul Rogers commented on DRILL-5155:
------------------------------------
Additional issues: after enabling the managed version of the external sort, two tests within the test suite behave nondeterministically.
When run in the debugger (directly in Eclipse, or via a remote debug session when launched from Maven), the tests pass. When run as part of the Drill test suite, or as a standalone test in Maven, the tests fail.
{code}
TestDrillbitResilience.interruptingBlockedMergingRecordBatch:784->interruptingBlockedFragmentsWaitingForData:814->assertCancelledWithoutException:545->assertStateCompleted:531 Query state is incorrect (expected: CANCELED, actual: FAILED) AND/OR
Exception thrown: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AssertionError
{code}
And:
{code}
memoryLeaksWhenCancelled(org.apache.drill.exec.server.TestDrillbitResilience) Time elapsed: 50.019 sec <<< ERROR!
java.lang.Exception: test timed out after 50000 milliseconds
{code}
Sometimes the following fails, though most often it works:
{code}
Running org.apache.drill.exec.server.TestDrillbitResilience#failsAfterMSorterSorting
org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /172.30.1.212:58698 <--> /172.30.1.212:31013 (user client) closed unexpectedly. Drillbit down?
{code}
In another instance, a test failed because of a *negative* memory leak: the test reported a leak of -500 bytes because the starting value was greater than the ending value.
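To illustrate how a leak check can come out negative, here is a minimal sketch. It is not Drill's allocator or the test's actual code: {{LeakCheckSketch}}, {{allocate}}, {{release}}, {{allocatedBytes}} and {{leakedBytes}} are all hypothetical names. It also assumes one possible cause, which the observation above does not confirm: memory held over from an earlier test is released only after the next test has taken its baseline.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an allocator's usage counter; not Drill's API. */
public class LeakCheckSketch {
    private final AtomicLong allocated = new AtomicLong();

    public void allocate(long bytes) { allocated.addAndGet(bytes); }
    public void release(long bytes) { allocated.addAndGet(-bytes); }
    public long allocatedBytes() { return allocated.get(); }

    /** Naive leak figure: bytes in use after the test minus bytes in use before. */
    public static long leakedBytes(long start, long end) {
        return end - start;
    }

    public static void main(String[] args) {
        LeakCheckSketch allocator = new LeakCheckSketch();

        // Assumption: a buffer from an earlier test has not been released yet.
        allocator.allocate(500);

        // The baseline therefore includes the stray 500 bytes.
        long start = allocator.allocatedBytes();

        // The current test's own allocations are balanced.
        allocator.allocate(4096);
        allocator.release(4096);

        // The earlier test's cleanup finishes while this test is running.
        allocator.release(500);

        long end = allocator.allocatedBytes();
        System.out.println("leaked = " + leakedBytes(start, end)); // prints "leaked = -500"
    }
}
```

Under that assumption, the negative figure comes from state carried over between tests rather than from the test being measured, which would fit the cross-test interactions described in the issue.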
The conclusion is that the Drillbit is very fragile; the tests pass, but likely due to luck. Change anything and the tests fail.
> TestDrillbitResilience unit test is not resilient
> -------------------------------------------------
>
> Key: DRILL-5155
> URL: https://issues.apache.org/jira/browse/DRILL-5155
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.9.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> The unit test {{TestDrillbitResilience}} plays quite rough with a set of Drillbits, forcing a number of error conditions to see if the Drillbits can recover. The test cases are good, but they interact with each other to make the test as a whole quite fragile. The failure of any one test tends to cause others to fail. When tests are run individually, they may pass. But when run as a suite, they fail due to cross-interactions.
> Restructure the test to make the tests more independent so that one test does not change the state of the cluster expected by a different test.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)