You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/09/23 08:01:32 UTC

[GitHub] [lucene-solr] jpountz commented on a change in pull request #1909: LUCENE-9539: Remove caches from SortingCodecReader

jpountz commented on a change in pull request #1909:
URL: https://github.com/apache/lucene-solr/pull/1909#discussion_r493279172



##########
File path: lucene/core/src/java/org/apache/lucene/index/SortingCodecReader.java
##########
@@ -510,4 +457,52 @@ public LeafMetaData getMetaData() {
     return metaData;
   }
 
+  // we try to cache the last used DV or Norms instance since during merge
+  // this instance is used more than once. We could in addition to this single instance
+  // also cache the fields that are used for sorting since we do the work twice for these fields
+  private String cachedField;
+  private Object cachedObject;
+  private boolean cacheIsNorms;
+
+  private <T> T getOrCreateNorms(String field, IOSupplier<T> supplier) throws IOException {
+    return getOrCreate(field, true, supplier);
+  }
+
+  @SuppressWarnings("unchecked")
+  private synchronized  <T> T getOrCreate(String field, boolean norms, IOSupplier<T> supplier) throws IOException {
+    if ((field.equals(cachedField) && cacheIsNorms == norms) == false) {
+      assert assertCreatedOnlyOnce(field, norms);
+      cachedObject = supplier.get();
+      cachedField = field;
+      cacheIsNorms = norms;
+
+    }
+    assert cachedObject != null;
+    return (T) cachedObject;
+  }
+
+  private final Map<String, Integer> cacheStats = new HashMap<>(); // only with assertions enabled
+  private boolean assertCreatedOnlyOnce(String field, boolean norms) {
+    assert Thread.holdsLock(this);
+    // this is mainly there to make sure we change anything in the way we merge we realize it early
+    Integer timesCached = cacheStats.compute(field + "N:" + norms, (s, i) -> i == null ? 1 : i.intValue() + 1);
+    if (timesCached > 1) {
+      assert norms == false :"[" + field + "] norms must not be cached twice";

Review comment:
       I think we might cache norms twice if full-text is indexed, as we'd pull norms once for merging norms, and another time to index impacts in postings for the same field.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org