You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@druid.apache.org by GitBox <gi...@apache.org> on 2018/07/21 00:20:53 UTC

[GitHub] leventov commented on issue #6028: Error in SqlMetadataRuleManagerTest

leventov commented on issue #6028: Error in SqlMetadataRuleManagerTest
URL: https://github.com/apache/incubator-druid/issues/6028#issuecomment-406756225
 
 
   I think the problem is that `SQLMetadataRuleManager.poll()` is scheduled to be called asynchronously in a dedicated executor in `SQLMetadataRuleManager.start()`. `poll()` makes an inner join of the rules table with itself, that appears to be enough to cause a deadlock when the drop of this table is called in parallel (it is called in `SQLMetadataRuleManagerTest.cleanup()`.
   
   The problem was introduced in #5554, where the `SQLMetadataRuleManagerTest.testMultipleStopAndStart()` was added which calls `start()` and `stop()` repeatedly. Before, `SQLMetadataRuleManager.start()` was not ever called in the context of `SQLMetadataRuleManagerTest`.
   
   Extra coincidental problem that leads to this condition is that `stop()` (that pairs `start()` calls in `testMultipleStopAndStart()` is not actually synchronous at the moment. By the time when `stop()` exists, some `poll()` may still be executed. To ensure this is not the case, `exec.awaitTermination()` should be called. But this is awkward, because we don't know what timeout we should use. Alternatively, `poll()` could be synchronized together with `start()` and `stop()` (the `started` flag should be checked under the lock in `poll()`).
   
   Extra:
    - There is no point of `future` and `exec` being volatile.
    - `future.cancel` is redundant before `shutdownNow()`.
    - There is a race on `retryStartTime` updates in `poll()`.
    - It feels to me that `SQLMetadataRuleManager` mixes two separate abstractions - one of scheduled polling, and another of the access to the database itself (including the implementation of `poll()`). But I'm uncertain about this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@druid.apache.org
For additional commands, e-mail: dev-help@druid.apache.org