You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/10/31 02:08:00 UTC

[jira] [Commented] (GEODE-5568) Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve performance

    [ https://issues.apache.org/jira/browse/GEODE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669504#comment-16669504 ] 

ASF subversion and git services commented on GEODE-5568:
--------------------------------------------------------

Commit 80614978deefe7528c2135810129a7ad780521fb in geode's branch refs/heads/develop from [~mcmellawatt]
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=8061497 ]

GEODE-5568: Rewrite QueryMonitor to use ScheduledThreadPoolExecutor (#2744)

* GEODE-5568: Rewrite QueryMonitor to use ScheduledThreadPoolExecutor

Eliminate notify/wait bugs and improve hot-path performance.

Co-authored-by: Bill Burcham <bb...@pivotal.io>
Co-authored-by: Ryan McMahon <rm...@pivotal.io>

> Rewrite QueryMonitor to use ScheduledThreadPoolExecutor to eliminate notify/wait bugs and improve performance
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-5568
>                 URL: https://issues.apache.org/jira/browse/GEODE-5568
>             Project: Geode
>          Issue Type: Bug
>          Components: tests
>            Reporter: Michael Oleske
>            Assignee: Bill
>            Priority: Major
>              Labels: pull-request-available, swat
>          Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> h2. Original Description
> (original title of this ticket was *testCacheOpAfterQueryCancel in QueryMonitorDUnitTest fails intermittently*)
> *We* should remove flakiness from test
> *Before* we add more features around Query Monitor
> *Because* flaky tests do not inspire confidence
> *Notes*
> When running [PR 2311|https://github.com/apache/geode/pull/2311], a test failed that passed on a rerun (see [here|http://files.apachegeode-ci.info/builds/geode-pr-2311/test-results/distributedTest/1534096152/classes/org.apache.geode.cache.query.dunit.QueryMonitorDUnitTest.html#testCacheOpAfterQueryCancel])
> The build artifacts are available [here|http://files.apachegeode-ci.info/builds/geode-pr-2311/test-artifacts/1534096152/distributedtestfiles-geode-pr-2311.tgz]
> Since it is a timing issue, it could be a few things so not sure on best path forward.
> h2. Update October, 2018
> After seeing more failures on October 2, we decided to fix this bug. We discovered an actual product bug in the {{QueryMonitor}}. We also discovered that the "hot path" of of monitoring and then un-monitoring a query scaled poorly: time complexity was O(N) where N is the number of queries currently being monitored. As a result, our fix entailed mostly rewriting the {{QueryMonitor}} class to use a more appropriate (off-the-shelf) data structure.
> See [GEODE-5568 Analysis and Results|https://docs.google.com/document/d/1FFKH2XJMshCE-dBDXtMCYSylr73H54w4TODj4BRW9Y0/edit#] for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)