You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Ryan McMahon (JIRA)" <ji...@apache.org> on 2018/10/29 22:59:02 UTC

[jira] [Updated] (GEODE-5568) Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve worst-case performance

     [ https://issues.apache.org/jira/browse/GEODE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McMahon updated GEODE-5568:
--------------------------------
    Summary: Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve worst-case performance  (was: testCacheOpAfterQueryCancel in QueryMonitorDUnitTest fails intermittently)

> Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve worst-case performance
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-5568
>                 URL: https://issues.apache.org/jira/browse/GEODE-5568
>             Project: Geode
>          Issue Type: Bug
>          Components: tests
>            Reporter: Michael Oleske
>            Assignee: Bill
>            Priority: Major
>              Labels: pull-request-available, swat
>          Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> h2. Original Description
> (original title of this ticket was *testCacheOpAfterQueryCancel in QueryMonitorDUnitTest fails intermittently*)
> *We* should remove flakiness from test
> *Before* we add more features around Query Monitor
> *Because* flaky tests do not inspire confidence
> *Notes*
> When running [PR 2311|https://github.com/apache/geode/pull/2311], a test failed that passed on a rerun (see [here|http://files.apachegeode-ci.info/builds/geode-pr-2311/test-results/distributedTest/1534096152/classes/org.apache.geode.cache.query.dunit.QueryMonitorDUnitTest.html#testCacheOpAfterQueryCancel])
> The build artifacts are available [here|http://files.apachegeode-ci.info/builds/geode-pr-2311/test-artifacts/1534096152/distributedtestfiles-geode-pr-2311.tgz]
> Since it is a timing issue, it could be a few things so not sure on best path forward.
> h2. Update October, 2018
> After seeing more failures on October 2, we decided to fix this bug. We discovered an actual product bug in the {{QueryMonitor}}. We also discovered that the "hot path" of of monitoring and then un-monitoring a query scaled poorly: time complexity was O(N) where N is the number of queries currently being monitored. As a result, our fix entailed mostly rewriting the {{QueryMonitor}} class to use a more appropriate (off-the-shelf) data structure.
> See [GEODE-5568 Analysis and Results|https://docs.google.com/document/d/1FFKH2XJMshCE-dBDXtMCYSylr73H54w4TODj4BRW9Y0/edit#] for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)