Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/09/10 15:17:49 UTC

[GitHub] [solr] markrmiller commented on a change in pull request #230: SOLR-15555 Improved caching on FilterQuery

markrmiller commented on a change in pull request #230:
URL: https://github.com/apache/solr/pull/230#discussion_r706260246



##########
File path: solr/core/src/java/org/apache/solr/search/CaffeineCache.java
##########
@@ -114,9 +126,12 @@ public Object init(Map<String, String> args, Object persistence, CacheRegenerato
     str = args.get(MAX_RAM_MB_PARAM);
     int maxRamMB = str == null ? -1 : Double.valueOf(str).intValue();
     maxRamBytes = maxRamMB < 0 ? Long.MAX_VALUE : maxRamMB * 1024L * 1024L;
-    str = args.get(CLEANUP_THREAD_PARAM);
-    cleanupThread = str != null && Boolean.parseBoolean(str);
-    if (cleanupThread) {
+    cleanupThread = Boolean.parseBoolean(args.get(CLEANUP_THREAD_PARAM));
+    async = Boolean.parseBoolean(args.get(ASYNC_PARAM));
+    if (async) {
+      // We record futures in the map to decrease bucket-lock contention, but need computation handled in same thread
+      executor = Runnable::run;
+    } else if (cleanupThread) {
       executor = ForkJoinPool.commonPool();

Review comment:
       You narrowly dodge this being part of this issue, so this is more of an FYI, but I'll second Ben's comments on this common thread pool usage.
    
    It's a trappy and confusing use of the common pool, presented as if it only affects a "cleanup" thread.
    
    This is in fact a general executor for Caffeine, not simply a cleanup thread pool. In most cases that is not a big deal, but the common pool is extremely easy to push into thread starvation if you perform any blocking tasks on it. There is an ugly workaround where you can indicate to the pool that you are going to block so it can compensate while you do so, but in general this pool is meant entirely for non-blocking, async-type tasks. Where it gets trappy is that it is also used to do the loading work for the async cache, which will often be doing IO, which is blocking. In this issue, async will use the current thread, so that case is avoided, but the code remains unclear and trappy, and Caffeine may shift where that executor is used in future versions, given that it is not a "cleanup" executor.
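    
    For reference only, a minimal sketch of the "indicate to the pool that you are going to block" workaround mentioned above, using the standard java.util.concurrent.ForkJoinPool.ManagedBlocker API (the helper name and the blocking call are hypothetical, not anything in this PR):
    
        import java.util.concurrent.Callable;
        import java.util.concurrent.ForkJoinPool;
    
        final class CommonPoolBlocking {
          // Runs a blocking call from a common-pool worker while letting the pool
          // compensate with a spare worker, so other tasks are not starved.
          static <T> T callWhileBlocking(Callable<T> blockingCall) throws InterruptedException {
            final Object[] result = new Object[1];
            ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
              private boolean done;
    
              @Override
              public boolean block() throws InterruptedException {
                try {
                  result[0] = blockingCall.call(); // e.g. an IO-bound cache load
                } catch (Exception e) {
                  throw new RuntimeException(e);
                }
                done = true;
                return true;
              }
    
              @Override
              public boolean isReleasable() {
                return done;
              }
            });
            @SuppressWarnings("unchecked")
            T value = (T) result[0];
            return value;
          }
        }
    
    Even with that, a dedicated executor (or the current-thread executor used for the async case here) is the less trappy choice for IO-bound work.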

##########
File path: solr/core/src/java/org/apache/solr/response/transform/ChildDocTransformerFactory.java
##########
@@ -179,7 +179,11 @@ public static BitSetProducer getCachedBitSetProducer(final SolrQueryRequest requ
     SolrCache<Query, BitSetProducer> parentCache = request.getSearcher().getCache(CACHE_NAME);
     // lazily retrieve from solr cache
     if (parentCache != null) {
-      return parentCache.computeIfAbsent(query, QueryBitSetProducer::new);
+      try {
+        return parentCache.computeIfAbsent(query, QueryBitSetProducer::new);
+      } catch (IOException e) {
+        throw new AssertionError("Can't happen");

Review comment:
       Not a fan of this myself. Impls change, new impls are added, and it's declared to throw an IOException. Personally, I'd just turn it into a runtime IO exception.
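    
    For illustration, a minimal sketch of that runtime IO exception alternative, using java.io.UncheckedIOException (an assumption about one way to do it, not what the PR does):
    
        try {
          return parentCache.computeIfAbsent(query, QueryBitSetProducer::new);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }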

##########
File path: solr/solr-ref-guide/src/caches-warming.adoc
##########
@@ -79,6 +79,9 @@ Reasonable values, depending on the query volume and patterns, may lie somewhere
 The `maxRamMB` attribute limits the maximum amount of memory a cache may consume.
 When both `size` and `maxRamMB` limits are specified the `maxRamMB` limit will take precedence and the `size` limit will be ignored.
 
+The `async` attribute determines whether the cache stores direct results (`async=false`, the default) or indirect references to the computation (`async=true`). Enabling `async` will use slightly more memory per cache entry, but may reduce the CPU cycles spent generating results or doing garbage collection. The most noticeable improvements will be seen when many queries race to compute the same results before they are inserted into the cache. In some use cases, increasing autowarming may be a better alternative to enabling an async cache. Additionally, an async cache will not prevent a data race for time-limited queries, as those may return differing sets of partial results.

Review comment:
       Nothing urgent here, but personally I'd like it to also be a little more clear that this issue can help solve the problem described in the JIRA. It's somewhat referenced with the mention of autowarming, but with no history of that issue, I'd have no idea that using this would help prevent that problem, or why autowarming is mentioned as another approach alongside the async option.
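    
    For context, a hedged sketch of roughly how the new attribute would appear on a cache definition in solrconfig.xml (the other attribute values are placeholders, not recommendations):
    
        <filterCache class="solr.CaffeineCache"
                     size="512"
                     initialSize="512"
                     autowarmCount="0"
                     async="true"/>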

##########
File path: solr/core/src/test/org/apache/solr/search/TestSolrCachePerf.java
##########
@@ -124,11 +129,10 @@ private void doTestGetPutCompute(Map<String, SummaryStatistics> ratioStats, Map<
                 }
               }
               Thread.yield();
-              stopLatch.countDown();
+              stopLatch.countDown(); // Does this need to be in a finally block?

Review comment:
       I would put it in one - there is no timeout on the await below on that latch.
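    
    A minimal sketch of what that would look like (paraphrasing the surrounding test code, not an exact copy):
    
        try {
          // ... get/put/compute work against the cache ...
          Thread.yield();
        } finally {
          stopLatch.countDown(); // always release the waiter, even if the work throws
        }
    
    If a hung worker should not hang the test forever, the waiting side could also use stopLatch.await(timeout, unit) and assert the result.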

##########
File path: solr/benchmark/src/java/org/apache/solr/bench/search/FilterCache.java
##########
@@ -0,0 +1,148 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.bench.search;
+
+import static org.apache.solr.bench.generators.SourceDSL.integers;
+
+import java.io.IOException;
+import java.net.HttpURLConnection;
+import java.net.URI;
+import java.nio.charset.StandardCharsets;
+import org.apache.solr.bench.BaseBenchState;
+import org.apache.solr.bench.Docs;
+import org.apache.solr.bench.MiniClusterState;
+import org.apache.solr.bench.generators.SolrGen;
+import org.apache.solr.client.solrj.SolrQuery;
+import org.apache.solr.client.solrj.SolrServerException;
+import org.apache.solr.client.solrj.request.CollectionAdminRequest;
+import org.apache.solr.client.solrj.request.QueryRequest;
+import org.openjdk.jmh.annotations.Benchmark;
+import org.openjdk.jmh.annotations.Fork;
+import org.openjdk.jmh.annotations.Level;
+import org.openjdk.jmh.annotations.Measurement;
+import org.openjdk.jmh.annotations.Param;
+import org.openjdk.jmh.annotations.Scope;
+import org.openjdk.jmh.annotations.Setup;
+import org.openjdk.jmh.annotations.State;
+import org.openjdk.jmh.annotations.TearDown;
+import org.openjdk.jmh.annotations.Threads;
+import org.openjdk.jmh.annotations.Warmup;
+import org.quicktheories.core.RandomnessSource;
+import org.quicktheories.impl.Constraint;
+
+@Fork(value = 1)

Review comment:
       Overall, great benchmark. I do have some lesser issues with it, but I don't want to bog down this issue with them, and I would not deal with anything in the benchmark here, as it would hold up getting the issue into the release.
    
    - On my desktop, the default run is fairly long. I have not finished the module doc, but I'd like to push default runs to be 10 minutes or less at most. You can easily crank things up or down as desired, but someone should be able to kick off a benchmark and get relatively reasonable feedback within a reasonable time, and if they need to really engage with a benchmark they can alter settings on the command line. With a lot of benchmarks, if we don't manage this, running them all by default will take a very, very long time. This benchmark takes a bit over 20 minutes on my desktop.
    - I don't think Threads.MAX is a great default. For one, it can often be simpler and less messy to start looking at multi-threaded benchmarks with a handful of threads rather than a ton. But more importantly, Threads.MAX takes hyper-threaded cores into account - on my desktop that means 32 threads, and at parts of the benchmark run that can really clobber my machine for random small stretches. That's not a bad configuration to try, but you can set up a run like that easily via the command line - it would be nice if the default run did not clobber my machine and just used a reasonably low number of threads (see the sketch after this list).
    - On my initial inspection, the results are not very consistent. I have some ideas on why that may be, but we want consistent results across runs.
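    
    As a sketch of the kind of lighter default suggested above (the iteration counts, times, and thread count are illustrative placeholders, not tuned values):
    
        @Fork(value = 1)
        @Threads(4) // a handful of threads by default; scale up via the command line
        @Warmup(iterations = 2, time = 15)
        @Measurement(iterations = 3, time = 30)
        public class FilterCache {
          // benchmark methods unchanged
        }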




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


