You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2021/09/16 06:19:42 UTC

[GitHub] [geode] albertogpz opened a new pull request #6874: GEODE-9602: QueryObserver improvements.

albertogpz opened a new pull request #6874:
URL: https://github.com/apache/geode/pull/6874


   - Make QueryObserverHolder thread-safe
   - Allow having an observer per query by means of setting the observer
     in the query at the start of the execution.
   - Invoke beforeIterationEvaluation and afterIterationEvaluation callbacks when
     query is using indexes.
   
   <!-- Thank you for submitting a contribution to Apache Geode. -->
   
   <!-- In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken: 
   -->
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   <!-- Note:
   Please ensure that once the PR is submitted, check Concourse for build issues and
   submit an update to your PR as soon as possible. If you need help, please send an
   email to dev@geode.apache.org.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] lgtm-com[bot] commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-920975763


   This pull request **fixes 2 alerts** when merging d32b31c32e7ab421dccce4d5615ab373477446b2 into d0113fc2eb3a2eedfa1464aab733c7af148d4539 - [view on LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-ff48f3008155734fd5da713aab3527fff35bc9f0)
   
   **fixed alerts:**
   
   * 2 for Dereferenced variable may be null


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] jinmeiliao commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
jinmeiliao commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r714305143



##########
File path: geode-gfsh/src/main/java/org/apache/geode/management/internal/cli/functions/DataCommandFunction.java
##########
@@ -237,6 +238,9 @@ private DataCommandResult select(InternalCache cache, Object principal, String q
       if (queryObserver != null) {
         QueryObserverHolder.reset();
       }
+      if (tracedQuery.isTraced()) {

Review comment:
       this block is already inside the `if (tracedQuery.isTraced())` condition, is this `if` necessary?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund edited a comment on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund edited a comment on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941266748


   @albertogpz @jhuynh1 `AcceptorImpl` is the class that runs server-side threads to execute requests from clients. When the thread starts, you could check a global in `QueryObserverHolder` and then set the `ThreadLocal` on each thread that it starts.
   
   It might even be possible to support multiple QueryObservers and invoke them all for each query.
   
   So here are some suggestions from Jason and me:
   
   Have the test set a global in the server(s). When the server starts a `AcceptorImpl` thread to process that query, set that same `ThreadLocal` on that thread.
   
   The query execution then adds the instance of `QueryObserver` to the `ExecutionContext`.
   
   Don't forget to call `ThreadLocal.remove()` after a query completes.
   
   `OrderByComparator` has a call to `QueryObserverHolder.getInstance()`. Change that to just fetch the observer from the `context` field in that class.
   
   It might even be possible to avoid the `ThreadLocal` by using a unique observer instance per execution context. I envisioned stuffing a unique instance of `QueryObserver` into the `ThreadLocal `that is then garbage collected after the query finishes and `ThreadLocal.remove` has been called.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941266748


   @albertogpz @jhuynh1 AcceptorImpl is the class that runs server-side threads to execute requests from clients. When the thread starts, you could check a global in QueryObserverHolder and then set the ThreadLocal on each thread that it starts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] mhansonp commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
mhansonp commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r712413479



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       It looks to me like you have that covered Alberto. By passing around context, it looks to me like you have addressed many if not all the thread safety issues. I tend to agree with Kirk about preferring and atomic to a synchronize, though.

##########
File path: geode-core/src/integrationTest/java/org/apache/geode/cache/query/internal/QueryTraceJUnitTest.java
##########
@@ -366,7 +396,11 @@ public void testTraceOnLocalRegionWithSmallTracePrefixNoComments() throws Except
     assertTrue(((DefaultQuery) query).isTraced());
 
     SelectResults results = (SelectResults) query.execute();
-    assertTrue(QueryObserverHolder.getInstance() instanceof IndexTrackingQueryObserver);
+

Review comment:
       These tests seem awfully repetitive. Is there a reason to check the test hook in every test?  From my perspective it seems like a waste of CPU. Just my opinion though.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r713066718



##########
File path: geode-core/src/integrationTest/java/org/apache/geode/cache/query/internal/QueryTraceJUnitTest.java
##########
@@ -366,7 +396,11 @@ public void testTraceOnLocalRegionWithSmallTracePrefixNoComments() throws Except
     assertTrue(((DefaultQuery) query).isTraced());
 
     SelectResults results = (SelectResults) query.execute();
-    assertTrue(QueryObserverHolder.getInstance() instanceof IndexTrackingQueryObserver);
+

Review comment:
       It is the way to check that the `trace` directive in the query has taken effect when set or that trace is not being done when it is not specified.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] agingade commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
agingade commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r711286483



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       It looks like this is not going to achieve what you are looking for:
   > The QueryObserverHolder class is not thread-safe. 
   As Kirk pointed out, you have to make this as a ThreadLocal and set/reset while in use.
   Other option could be making this as non-static...If the query context is available in all the places where query-observer is getting used, you can think of creating a new QueryObserver and setting it on the context. 
   QueryContext.setObserver(new QueryObserver()); And no static implementation/use of query observer.
   
   I did not see any test changes where concurrent queries are using QueryObserver...It will show if the query observers are cross referenced...
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r710580895



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/index/AbstractIndex.java
##########
@@ -1826,9 +1828,14 @@ void addValuesToCollection(Collection result, CompiledValue iterOp, RuntimeItera
                     // Compare the value in index with value in RegionEntry.
                     ok = verifyEntryAndIndexValue(entry, o1, context);
                   }
-                  if (ok && runtimeItr != null) {
-                    runtimeItr.setCurrent(o1);
-                    ok = QueryUtils.applyCondition(iterOp, context);
+                  try {
+                    if (ok && runtimeItr != null) {
+                      runtimeItr.setCurrent(o1);
+                      observer.beforeIterationEvaluation(iterOp, o1);
+                      ok = QueryUtils.applyCondition(iterOp, context);
+                    }
+                  } finally {
+                    observer.afterIterationEvaluation(ok);

Review comment:
       I think you should try to extract the blocks that look like this to their own private methods. There is some repetition involved that you may be able to reduce down to one method. The code calling these methods will also become more readable.

##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {

Review comment:
       Rather than using `synchronized`, you'd be better off making `_instance` a `private static final AtomicReference<QueryObserver>`. The use of `synchronized` will cause at least some JVM implementations to pin every thread hitting it to the same CPU.
   
   Using atomics is generally always a better approach if you can use it.

##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       I've wanted to make this class thread-safe too. There is one problem with doing so though. `getInstance()` is in the path of every query execution and index evaluate call. If you make it `synchronized`, then there will be a significant performance penalty inflicted on every query and index evaluation.
   
   To see this, look at the call hierarchy of constructing a new ExecutionConext. Every new instance will invoke:
   ```java
   private QueryObserver observer = QueryObserverHolder.getInstance();
   ```
   
   In order to make this class thread-safe, I think we need to solve the problem of how to get queries and index evaluations from hitting those `synchronized` blocks. I'm not as familiar with querying or indexing so you may need to chat about it further with Anil or others.
   
   Another problem is that the new `private QueryObserver observer` field in `ExecutionContext` is itself not thread-safe, so the setter and getter you added are not thread-safe either.
   
   I think your intention is to have a test invoke `setObserver` for one or more direct invocations of a query. If that's true, then you're probably better off introducing a `ThreadLocal` to `ExecutionContext` for this purpose. Then the execution would check that `ThreadLocal` for a non-null value and then use it. If it's null, the execution would fallback to using the usual `observer` stored in the field.
   
   Then you could make the field itself `final` in the hopes that it would be used multiple times. But again, I suspect that every query invocation creates a new `ExecutionContext` which will perform poorly because of the `synchronized` in `QueryObserverHolder`.
   

##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/index/CompactRangeIndex.java
##########
@@ -803,8 +809,13 @@ private void addToResultsFromEntries(Object lowerBoundKey, Object upperBoundKey,
               if (runtimeItr != null) {
                 runtimeItr.setCurrent(value);
               }
-              if (ok && runtimeItr != null && iterOps != null) {
-                ok = QueryUtils.applyCondition(iterOps, context);
+              try {
+                if (ok && runtimeItr != null) {
+                  observer.beforeIterationEvaluation(iterOps, value);
+                  ok = QueryUtils.applyCondition(iterOps, context);
+                }
+              } finally {
+                observer.afterIterationEvaluation(ok);

Review comment:
       This class also has several try-finally blocks like that this may benefit by being extracted as a new private method both for readability and to hopefully reduce some of the repetition in these blocks.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] onichols-pivotal commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
onichols-pivotal commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-1030573406


   this PR appears to be abandoned, can it be closed?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] lgtm-com[bot] commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-924100441


   This pull request **fixes 2 alerts** when merging 96a15bc4e382f049367ecfccf9be58289c01aaf1 into cf1b35c083d526a02851c06f44a6eda78963d16f - [view on LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-4cca61b5304c4f93e41f22c9bceb6d27dc31f80a)
   
   **fixed alerts:**
   
   * 2 for Dereferenced variable may be null


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r712233082



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       > It looks like this is not going to achieve what you are looking for:
   > 
   > > The QueryObserverHolder class is not thread-safe.
   > 
   > As Kirk pointed out, you have to make this as a ThreadLocal and set/reset while in use.
   > Other option could be making this as non-static...If the query context is available in all the places where query-observer is getting used, you can think of creating a new QueryObserver and setting it on the context.
   > QueryContext.setObserver(new QueryObserver()); And no static implementation/use of query observer.
   > 
   That's actually what I tried to do with this proposal. Am I missing something here?
   > I did not see any test changes where concurrent queries are using QueryObserver...It will show if the query observers are cross referenced...
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-1031193201


   > this PR appears to be abandoned, can it be closed?
   
   It is not yet abandoned. It is pending from a discussion with @kirklund about it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-934499121


   > Maybe the way to go here is to rewrite `QueryObserverHolder` as a class that hold a `ThreadLocal<QueryObserverHolder>` and then have ALL other product classes such as `DefaultQuery` just use `QueryObserverHolder` as an API for accessing that ThreadLocal.
   
   I like that idea. The problem is that there are test cases that count on setting the observer globally and then running a query from the client. Take a look for example at: `SelectStarQueryDUnitTets::testSelectStarQueryForPartitionedRegion`. In these cases, you cannot set the thread local observer in the thread that will executy the query because the query is executed remotely from the client.
   
   I have been able to change the test cases in `QueryUsingFunctionContextDUnitTest` and set the observer in the function executing the queries instead of globally in the servers but this approach is not valid for the test cases previously mentioned.
   
   I am also worried about the use done by the `QueryObserver` to provide `trace` information in queries. It is not possible to set a query observer and at the same time have trace information.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] agingade commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
agingade commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r711286483



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       It looks like this is not going to achieve what you are looking for:
   > The QueryObserverHolder class is not thread-safe. 
   
   As Kirk pointed out, you have to make this as a ThreadLocal and set/reset while in use.
   Other option could be making this as non-static...If the query context is available in all the places where query-observer is getting used, you can think of creating a new QueryObserver and setting it on the context. 
   QueryContext.setObserver(new QueryObserver()); And no static implementation/use of query observer.
   
   I did not see any test changes where concurrent queries are using QueryObserver...It will show if the query observers are cross referenced...
   
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r712232163



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       > I've wanted to make this class thread-safe too. There is one problem with doing so though. `getInstance()` is in the path of every query execution and index evaluate call. If you make it `synchronized`, then there will be a significant performance penalty inflicted on every query and index evaluation.
   > 
   > To see this, look at the call hierarchy of constructing a new ExecutionConext. Every new instance will invoke:
   > 
   > ```java
   > private QueryObserver observer = QueryObserverHolder.getInstance();
   > ```
   > 
   > In order to make this class thread-safe, I think we need to solve the problem of how to get queries and index evaluations from hitting those `synchronized` blocks on every invocation. I'm not as familiar with querying or indexing so you may need to chat about it further with Anil or others.
   > 
   I would expect this to be a problem if the above calls are done many times per query. But in this PR, it is proposed to invoke it just once, to get the value and then set it in the `ExecutionContext`, right before the query is executed. Later, when the query needs the value of the observer, it would get it from the `ExecutionContext` using a non-synchronized call given that all the calls to get the observer have been changed from `QueryObserverHolder.getInstance()` to `context.getObserver()`.
   
   > Another problem is that the new `private QueryObserver observer` field in `ExecutionContext` is itself not thread-safe, so the setter and getter you added are not thread-safe either.
   > 
   I am assuming that the `ExecutionContext` is not shared between threads and therefore it does not need to be thread-safe. I am wrong here?
   > I think your intention is to have a test invoke `setObserver` for one or more direct invocations of a query. If that's true, then you're probably better off introducing a `ThreadLocal` to `ExecutionContext` for this purpose. Then the execution would check that `ThreadLocal` for a non-null value and then use it. If it's null, the execution would fallback to using the usual `observer` stored in the field.
   > 
   Actually, my intention was just to make this class thread-safe but not extend it to have new usages. Nevertheless, the class is documented and used so that the observer can be set from any thread and be available to any thread in the VM.
   I have proposed the current solution but have also considered the use of a `ThreadLocal` variable, not in the `ExecutionContext` class (which did not have an `observer` field) but in the `QueryObserverHolder` class to be used in a way similar to what you are proposing.
   But I think the lifecycle of the `ThreadLocal` variable would be complex. It should be reset after the query execution. But who would do it?
   
   > Then you could make the field itself `final` in the hopes that it would be used multiple times. But again, I suspect that every query invocation creates a new `ExecutionContext` which will perform poorly because of the `synchronized` in `QueryObserverHolder`.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-933700627


   Maybe the way to go here is to rewrite `QueryObserverHolder` as a class that hold a `ThreadLocal<QueryObserverHolder>` and then have ALL other product classes such as `DefaultQuery` just use `QueryObserverHolder` as an API for accessing that ThreadLocal.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r711216002



##########
File path: geode-core/src/integrationTest/java/org/apache/geode/cache/query/internal/QueryTraceJUnitTest.java
##########
@@ -296,8 +319,11 @@ public void testTraceOnLocalRegionWithTracePrefixNoComments() throws Exception {
     assertTrue(((DefaultQuery) query).isTraced());
 
     SelectResults results = (SelectResults) query.execute();
-    assertTrue(QueryObserverHolder.getInstance() instanceof IndexTrackingQueryObserver);
-    // The query should return all elements in region.

Review comment:
       This comment seems to have been accidentally removed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] lgtm-com[bot] commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-924059455


   This pull request **introduces 1 alert** and **fixes 2** when merging 52fe699ad956c0c2cbb8c739b27ce6831c56a310 into 19f55add07d6a652911dc5a5e2116fcf7bf7b2f7 - [view on LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-f76a226a06d575b6d6471e937ab11c21d0a63a48)
   
   **new alerts:**
   
   * 1 for Inconsistent synchronization of getter and setter
   
   **fixed alerts:**
   
   * 2 for Dereferenced variable may be null


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941266748


   @albertogpz @jhuynh1 AcceptorImpl is the class that runs server-side threads to execute requests from clients. When the thread starts, you could check a global in QueryObserverHolder and then set the ThreadLocal on each thread that it starts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz edited a comment on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz edited a comment on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-934499121


   > Maybe the way to go here is to rewrite `QueryObserverHolder` as a class that hold a `ThreadLocal<QueryObserverHolder>` and then have ALL other product classes such as `DefaultQuery` just use `QueryObserverHolder` as an API for accessing that ThreadLocal.
   
   I like that idea.
   I did an experiment with it but saw many test cases fail: https://github.com/apache/geode/pull/6882
   (https://concourse.apachegeode-ci.info/builds/84932)
   
   The problem is that there are test cases that count on setting the observer globally and then running a query from the client. Take a look for example at: `SelectStarQueryDUnitTets::testSelectStarQueryForPartitionedRegion`. In these cases, you cannot set the thread local observer in the thread that will execute the query because the query is executed remotely from the client.
   
   I have been able to change the test cases in `QueryUsingFunctionContextDUnitTest` and set the observer in the function executing the queries instead of globally in the servers but this approach is not valid for the test cases previously mentioned.
   
   I am also worried about the use done by the `QueryObserver` to provide `trace` information in queries. It is not possible to set a query observer and at the same time have trace information.
   
   We could also go for having two variables in the `QueryObserverHolder`: a thread local one and the current global one. Setting the instance could set the value in the global one and getting the instance could check if the thread local has a value. If it does, it would return it. Otherwise, the thread local one would be set to the global one value and then be returned. That way, during a query execution, only the first call to `getInstance()` should require synchronization.
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r714547058



##########
File path: geode-gfsh/src/main/java/org/apache/geode/management/internal/cli/functions/DataCommandFunction.java
##########
@@ -237,6 +238,9 @@ private DataCommandResult select(InternalCache cache, Object principal, String q
       if (queryObserver != null) {
         QueryObserverHolder.reset();
       }
+      if (tracedQuery.isTraced()) {

Review comment:
       It is not. This finally block is outside any `if (tracedQuery.isTraced())` condition.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r710531334



##########
File path: geode-core/src/integrationTest/java/org/apache/geode/cache/query/internal/QueryObserverCallbacksTest.java
##########
@@ -243,6 +245,59 @@ public void beforeAggregationsAndGroupByShouldBeCalledForAggregateFunctions() th
     verify(myQueryObserver, times(queries.size())).beforeAggregationsAndGroupBy(any());
   }
 
+  @Test
+  public void testBeforeAndAfterIterationEvaluateNoWhere() throws Exception {
+    Query query = queryService.newQuery(
+        "select count(*) from " + SEPARATOR + "portfolio p");
+
+    query.execute();
+    verify(myQueryObserver, times(0)).beforeIterationEvaluation(any(), any());
+    verify(myQueryObserver, times(0)).afterIterationEvaluation(any());

Review comment:
       You can also use `never()`:
   ```java
       verify(myQueryObserver, never()).beforeIterationEvaluation(any(), any());
       verify(myQueryObserver, never()).afterIterationEvaluation(any());
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] jhuynh1 commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
jhuynh1 commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941355730


   To add to Kirk's prior suggestion (some suggestions are probably duplicated as we are currently discussing)  The main difference between the two is the use of vs possible removal of the threadlocal
   
   We might be able to avoid the thread local entirely if we add a method to the QueryObserver interface such as createObserver() where it is expected to pass back a unique instance of an observer.  Improvements to reuses/pool these can be made to reduce garbage later.
   
   Where we currently do start query, perhaps call this new createObserver() method and stuff it into the execution context. (https://github.com/apache/geode/blob/7dd614387e9f3133de77ff17675d6bf00051276c/geode-core/src/main/java/org/apache/geode/cache/query/internal/DefaultQuery.java#L342)
   
   Wherever we currently do QueryObserverHolder.getInstance(), instead do context.getObserver()
   
   There are a few locations of QueryObserverHolder.getInstance() like OrderbyComparator (https://github.com/apache/geode/blob/7dd614387e9f3133de77ff17675d6bf00051276c/geode-core/src/main/java/org/apache/geode/cache/query/internal/OrderByComparator.java#L116), that have a context present as a class variable that can be used.
   
   The execution context and observer should be gc'd eventually once the execution context is no longer being used.  There is probably more I am missing but thought I'd put this down as a suggestion


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund edited a comment on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund edited a comment on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941266748






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] albertogpz edited a comment on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
albertogpz edited a comment on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-934499121


   > Maybe the way to go here is to rewrite `QueryObserverHolder` as a class that hold a `ThreadLocal<QueryObserverHolder>` and then have ALL other product classes such as `DefaultQuery` just use `QueryObserverHolder` as an API for accessing that ThreadLocal.
   
   I like that idea.
   I did an experiment with it but saw many test cases fail:
   https://concourse.apachegeode-ci.info/builds/84932
   
   The problem is that there are test cases that count on setting the observer globally and then running a query from the client. Take a look for example at: `SelectStarQueryDUnitTets::testSelectStarQueryForPartitionedRegion`. In these cases, you cannot set the thread local observer in the thread that will execute the query because the query is executed remotely from the client.
   
   I have been able to change the test cases in `QueryUsingFunctionContextDUnitTest` and set the observer in the function executing the queries instead of globally in the servers but this approach is not valid for the test cases previously mentioned.
   
   I am also worried about the use done by the `QueryObserver` to provide `trace` information in queries. It is not possible to set a query observer and at the same time have trace information.
   
   We could also go for having two variables in the `QueryObserverHolder`: a thread local one and the current global one. Setting the instance could set the value in the global one and getting the instance could check if the thread local has a value. If it does, it would return it. Otherwise, the thread local one would be set to the global one value and then be returned. That way, during a query execution, only the first call to `getInstance()` should require synchronization.
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund edited a comment on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund edited a comment on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941266748


   @albertogpz @jhuynh1 `AcceptorImpl` is the class that runs server-side threads to execute requests from clients. When the thread starts, you could check a global in `QueryObserverHolder` and then set the `ThreadLocal` on each thread that it starts.
   
   It might even be possible to support multiple QueryObservers and invoke them all for each query.
   
   So here are some suggestions from Jason and me:
   
   Have the test set a global in the server(s). When the server starts a `AcceptorImpl` thread to process that query, set that same `ThreadLocal` on that thread.
   
   The query execution then adds the instance of `QueryObserver` to the `ExecutionContext`.
   
   Don't forget to call `ThreadLocal.remove()` after a query completes.
   
   `OrderByComparator` has a call to `QueryObserverHolder.getInstance()`. Change that class to pass in a `QueryObserver` to the constructor and store it in a `private final` field. Then in `compare` use the field instead of calling `getInstance`.
   
   It might even be possible to avoid the `ThreadLocal` by using a unique observer instance per execution context. I envisioned stuffing a unique instance of `QueryObserver` into the `ThreadLocal `that is then garbage collected after the query finishes and `ThreadLocal.remove` has been called.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund edited a comment on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund edited a comment on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941266748


   @albertogpz @jhuynh1 AcceptorImpl is the class that runs server-side threads to execute requests from clients. When the thread starts, you could check a global in QueryObserverHolder and then set the ThreadLocal on each thread that it starts.
   
   It might even be possible to support multiple QueryObservers and invoke them all for each query.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on a change in pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
kirklund commented on a change in pull request #6874:
URL: https://github.com/apache/geode/pull/6874#discussion_r710577674



##########
File path: geode-core/src/main/java/org/apache/geode/cache/query/internal/QueryObserverHolder.java
##########
@@ -54,26 +54,26 @@
   /**
    * Set the given observer to be notified of query events. Returns the current observer.
    */
-  public static QueryObserver setInstance(QueryObserver observer) {
+  public static synchronized QueryObserver setInstance(QueryObserver observer) {
     Support.assertArg(observer != null, "setInstance expects a non-null argument!");
     QueryObserver oldObserver = _instance;
     _instance = observer;
     return oldObserver;
   }
 
-  public static boolean hasObserver() {
+  public static synchronized boolean hasObserver() {
     return _instance != NO_OBSERVER;
   }
 
   /** Return the current QueryObserver instance */
-  public static QueryObserver getInstance() {
+  public static synchronized QueryObserver getInstance() {

Review comment:
       I've wanted to make this class thread-safe too. There is one problem with doing so though. `getInstance()` is in the path of every query execution and index evaluate call. If you make it `synchronized`, then there will be a significant performance penalty inflicted on every query and index evaluation.
   
   To see this, look at the call hierarchy of constructing a new ExecutionConext. Every new instance will invoke:
   ```java
   private QueryObserver observer = QueryObserverHolder.getInstance();
   ```
   
   In order to make this class thread-safe, I think we need to solve the problem of how to get queries and index evaluations from hitting those `synchronized` blocks on every invocation. I'm not as familiar with querying or indexing so you may need to chat about it further with Anil or others.
   
   Another problem is that the new `private QueryObserver observer` field in `ExecutionContext` is itself not thread-safe, so the setter and getter you added are not thread-safe either.
   
   I think your intention is to have a test invoke `setObserver` for one or more direct invocations of a query. If that's true, then you're probably better off introducing a `ThreadLocal` to `ExecutionContext` for this purpose. Then the execution would check that `ThreadLocal` for a non-null value and then use it. If it's null, the execution would fallback to using the usual `observer` stored in the field.
   
   Then you could make the field itself `final` in the hopes that it would be used multiple times. But again, I suspect that every query invocation creates a new `ExecutionContext` which will perform poorly because of the `synchronized` in `QueryObserverHolder`.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] jhuynh1 commented on pull request #6874: GEODE-9602: QueryObserver improvements.

Posted by GitBox <gi...@apache.org>.
jhuynh1 commented on pull request #6874:
URL: https://github.com/apache/geode/pull/6874#issuecomment-941355730


   To add to Kirk's prior suggestion (some suggestions are probably duplicated as we are currently discussing)  The main difference between the two is the use of vs possible removal of the threadlocal
   
   We might be able to avoid the thread local entirely if we add a method to the QueryObserver interface such as createObserver() where it is expected to pass back a unique instance of an observer.  Improvements to reuses/pool these can be made to reduce garbage later.
   
   Where we currently do start query, perhaps call this new createObserver() method and stuff it into the execution context. (https://github.com/apache/geode/blob/7dd614387e9f3133de77ff17675d6bf00051276c/geode-core/src/main/java/org/apache/geode/cache/query/internal/DefaultQuery.java#L342)
   
   Wherever we currently do QueryObserverHolder.getInstance(), instead do context.getObserver()
   
   There are a few locations of QueryObserverHolder.getInstance() like OrderbyComparator (https://github.com/apache/geode/blob/7dd614387e9f3133de77ff17675d6bf00051276c/geode-core/src/main/java/org/apache/geode/cache/query/internal/OrderByComparator.java#L116), that have a context present as a class variable that can be used.
   
   The execution context and observer should be gc'd eventually once the execution context is no longer being used.  There is probably more I am missing but thought I'd put this down as a suggestion


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org