Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2022/10/24 15:43:04 UTC

[GitHub] [solr] tboeghk opened a new pull request, #1118: Issue/solr 16489

tboeghk opened a new pull request, #1118:
URL: https://github.com/apache/solr/pull/1118

   # Description
   
   Under heavy machine load, accessing any Solr CaffeineCache can lead to threads spinning in an endless loop. In our setup we had machines spinning at 70% CPU load for hours without receiving any requests.
   
   # Solution
   
   The problem is caused by delegating the `SolrCache#put` method to `Cache#asMap#put`. Under heavy machine load with concurrent read and write access to the same key, the `ConcurrentHashMap#put` method can get stuck in an endless loop and never terminate.
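   
   To illustrate the two calls involved, here is a minimal sketch that uses a plain `java.util.concurrent.ConcurrentHashMap` instead of Caffeine's `asMap()` view (an assumption made so the example needs no dependencies; the class and method names are hypothetical and not part of the PR). Both calls store the value, but `put` returns the previous mapping while `compute` returns the newly stored one:
   
   ```java
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.ConcurrentMap;

   // Hypothetical demo class; a plain ConcurrentHashMap stands in for cache.asMap().
   class PutVersusCompute {
     // Stores val with put() and returns what put() reports: the previous value (or null).
     static String viaPut(ConcurrentMap<String, String> map, String key, String val) {
       return map.put(key, val);
     }

     // Stores val with compute() and returns what compute() reports: the new value.
     static String viaCompute(ConcurrentMap<String, String> map, String key, String val) {
       return map.compute(key, (k, v) -> val);
     }

     public static void main(String[] args) {
       ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
       map.put("q", "old");
       System.out.println(viaPut(map, "q", "new"));       // the previous mapping
       System.out.println(viaCompute(map, "q", "newer")); // the mapping just stored
     }
   }
   ```
   
   Code that relies on the return value of `put` (for example, a variable named `old`) would therefore see a different value after switching to `compute`.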
   
   # Tests
   
   The behavior happens under heavy load when the cache is under congestion and read / write access is happening in different threads. We have had this small fix live for a couple of weeks now without any stuck threads.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1299799324

   The clauses should not be modifiable from outside and I don't know any code doing this. BooleanQuery is designed to be immutable; the problem is more that there seems to be a getter that returns the non-wrapped list.
   But I don't think this is the problem here. I have the feeling one of the inner queries is the bad guy. Can you post the full contents of the clauses as a tree?
   




[GitHub] [solr] tboeghk commented on a diff in pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
tboeghk commented on code in PR #1118:
URL: https://github.com/apache/solr/pull/1118#discussion_r1003649518


##########
solr/core/src/java/org/apache/solr/search/CaffeineCache.java:
##########
@@ -275,7 +275,7 @@ public V computeIfAbsent(K key, IOFunction<? super K, ? extends V> mappingFuncti
   @Override
   public V put(K key, V val) {
     inserts.increment();
-    V old = cache.asMap().put(key, val);
+    V old = cache.asMap().compute(key, (k, v) -> val);

Review Comment:
   I wanted to stay in line with the other methods, which also call `cache.asMap()`.





[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290182129

   Would a stack trace help? To reproduce the problem I would need to deploy a cluster on our side, as we were not able to trigger the bug with a test case. However, with a Gatling load test we were able to reproduce it reliably. Setting everything up takes me about 4 hours. We were able to pinpoint the endless loop by attaching to the debug endpoint and pausing all threads.




[GitHub] [solr] risdenk commented on a diff in pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
risdenk commented on code in PR #1118:
URL: https://github.com/apache/solr/pull/1118#discussion_r1003541142


##########
solr/core/src/java/org/apache/solr/search/CaffeineCache.java:
##########
@@ -275,7 +275,7 @@ public V computeIfAbsent(K key, IOFunction<? super K, ? extends V> mappingFuncti
   @Override
   public V put(K key, V val) {
     inserts.increment();
-    V old = cache.asMap().put(key, val);
+    V old = cache.asMap().compute(key, (k, v) -> val);

Review Comment:
   Does this need to use `.asMap()`? Can we just use `cache.get(key, mapping_function)`? This looks to be the same concept without trying to access the underlying map.
   
   https://www.javadoc.io/static/com.github.ben-manes.caffeine/caffeine/3.1.1/com.github.benmanes.caffeine/com/github/benmanes/caffeine/cache/Cache.html#get(K,java.util.function.Function)
   
   If using `asMap()`, `compute` looks correct based on https://www.javadoc.io/static/com.github.ben-manes.caffeine/caffeine/3.1.1/com.github.benmanes.caffeine/com/github/benmanes/caffeine/cache/Cache.html#asMap(), but it looks like we don't even need `asMap()`.





[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289718016

   > The threads got stuck in this [infinite loop](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java#L2215-L2293) in `BoundedLocalCache`. I'll ask @DennisBerger1984 and @mpetris for a detailed stack trace tomorrow.
   
   Actually this loop sometimes hits the assertion recently added in JDK 17. I'd rewrite that infinite loop to something more "CAS" like.




[GitHub] [solr] tboeghk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
tboeghk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290139900

   @uschindler the PR avoids the `put()` method, which leads to the infinite loop, by using the `compute()` method with a remapping function instead. This does not reach the infinite loop.




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295276855

   Query is by default an immutable object. So there should not be a separate cache key.
   If you found a buggy query with unstable hashcode/equals, please report this at Lucene. This is a bug. Please no workarounds here!




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1300772734

   Yes, this helps. Interestingly, the "clauses" of a BooleanQuery are non-mutable: the list is already wrapped with Collections.unmodifiableList(). So nobody can change the clauses after calling BooleanQuery.getClauses().
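   
   As a small illustration of the wrapping mentioned here (a sketch using only `java.util`; the class and method names are hypothetical and this is not Lucene's code), a list wrapped with `Collections.unmodifiableList()` rejects structural changes made through the wrapper:
   
   ```java
   import java.util.ArrayList;
   import java.util.Collections;
   import java.util.List;

   // Hypothetical demo class; strings stand in for query clauses.
   class UnmodifiableClauses {
     // Returns true if adding to the wrapped list is rejected.
     static boolean rejectsAdd() {
       List<String> clauses =
           Collections.unmodifiableList(new ArrayList<>(List.of("a", "b")));
       try {
         clauses.add("c"); // structural modification through the wrapper
         return false;
       } catch (UnsupportedOperationException expected) {
         return true;
       }
     }

     public static void main(String[] args) {
       System.out.println(rejectsAdd());
     }
   }
   ```
   
   Note that the wrapper only guards the list itself. If an element inside the list is mutable, its state (and therefore its `hashCode`) can still change.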




[GitHub] [solr] renekrie commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
renekrie commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311668608

   Thanks @DennisBerger1984! There is a smell related to multiMatchTie - https://github.com/querqy/querqy/issues/384. This FieldBoost implementation should be used under the hood with multiMatchTie under certain conditions and be referenced from queries, and it is missing hashCode/equals (though I cannot see SingleFieldBoost in your object graph). That should still result in just a cache miss and not trigger a loop. Maybe for your test, you could just remove the `multiMatchTie` param as a first step.




[GitHub] [solr] renekrie commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
renekrie commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1312007008

   > For current lucene queries the problem may occur when you pass a BytesRef (it is a ref) only while building the query. If you change the bytesRef afterwards it fails.
   
   I briefly checked this but as far as I can see we are always using a fresh BytesRef. Manipulations at the Querqy level (= before going to the analysis chain) happen via character arrays anyway and they aren't being used any longer once a Lucene query has been produced.
   
   @DennisBerger1984 Can I ask you to disable query rewriters and/or use empty rule sets with rewriters still enabled, to narrow down the problem?
   




[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290173830

   If I remember correctly the "prior" in line 2203 of BoundedLocalCache in caffeineCache 3.3.1 was always non null and had the keyReference DEAD_STRONG_KEY such that the isAlive() check always returned false in line 2248. The thread spins in the enclosing for loop forever.




[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290876283

   Hi @ben-manes,
   I can confirm that. We've bumped the version in our gradle build and I just verified that in the resulting docker image the actual jar is 3.1.1.




[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289828195

   > Actually this loop sometimes hits the assertion recently added in JDK 17. I'd rewrite that infinite loop to something more "CAS" like.
   
   Unfortunately the challenge with this method is that `ConcurrentHashMap` is pessimistic in its write operations by locking before searching for the entry in the hash bin (or segment in Java 5-7). Doug had [concerns](https://markmail.org/message/dixewqcvbbs5gzi2) about a full prescreen due to interactions with biased locking, safepoints, and GC pressure. He did eventually add a partial prescreen for the head of the bin, but that cannot be relied upon as the table is designed to prefer a high collision rate (cheap/poor hash offset by red-black tree bins).
   
   A CAS-style approach would be a [putIfAbsent loop](https://github.com/ben-manes/concurrentlinkedhashmap/blob/cc3e11603e8a91185c1748633be2c703e218219e/src/main/java/com/googlecode/concurrentlinkedhashmap/ConcurrentLinkedHashMap.java#L716-L750), so it would lock even if unsuccessful. In that case the loop could be removed, as locking is the costly work, so a [compute](https://github.com/ben-manes/caffeine/blob/c5cc2d4f04a317111705108a06b7babe3ce9dc6f/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java#L1609-L1701) would be preferable. Some users have assumed that `putIfAbsent` is a cheap read if present (e.g. https://github.com/apache/openwhisk/pull/2797), so the cache tries to read and falls back to locking only when appropriate.
   
   You can think of this as a [TTAS loop](https://en.wikipedia.org/wiki/Test_and_test-and-set), which has a similar speedup in comparison to the more straightforward TAS approach.
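   
   For readers unfamiliar with the term, a test-and-test-and-set spin lock can be sketched as follows. This is a generic illustration built on `AtomicBoolean`, not Caffeine's code; `TtasLock` and `countWithThreads` are hypothetical names chosen for this example.
   
   ```java
   import java.util.concurrent.atomic.AtomicBoolean;

   // Hypothetical TTAS spin lock, for illustration only.
   class TtasLock {
     private final AtomicBoolean locked = new AtomicBoolean(false);

     // Spin on a cheap read first ("test"), and only attempt the atomic
     // write ("test-and-set") once the lock looks free.
     void lock() {
       for (;;) {
         while (locked.get()) {
           Thread.onSpinWait(); // Java 9+ hint that we are busy-waiting
         }
         if (!locked.getAndSet(true)) {
           return; // previous value was false, so we acquired the lock
         }
       }
     }

     void unlock() {
       locked.set(false);
     }

     // Increments a shared counter from several threads under the lock.
     static int countWithThreads(int threads, int perThread) {
       TtasLock lock = new TtasLock();
       int[] counter = {0};
       Thread[] workers = new Thread[threads];
       for (int i = 0; i < threads; i++) {
         workers[i] = new Thread(() -> {
           for (int j = 0; j < perThread; j++) {
             lock.lock();
             try {
               counter[0]++;
             } finally {
               lock.unlock();
             }
           }
         });
         workers[i].start();
       }
       for (Thread t : workers) {
         try {
           t.join();
         } catch (InterruptedException e) {
           Thread.currentThread().interrupt();
           throw new IllegalStateException(e);
         }
       }
       return counter[0];
     }

     public static void main(String[] args) {
       System.out.println(countWithThreads(4, 1000));
     }
   }
   ```
   
   A plain TAS lock would call `getAndSet(true)` on every iteration; the extra read keeps waiting threads from issuing a write while the lock is held.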




[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311486518

   Hey should we provide additional information like tests without querqy plugin or is it sufficient to wait for ticket #1146?
   What do you think?




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311946613

   A completely missing equals/hashCode should not bring problems, because the cache would simply not work. For the bug to occur the hashcode or equals must suddenly change for an existing query.
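   
   The failure mode described here can be reproduced in miniature with a plain `java.util.HashMap`. The sketch below is an illustration under assumptions: `MutableKey` is a hypothetical, deliberately broken key class (not a Lucene or Querqy query), and a `HashMap` stands in for the cache, so it shows only the lost entry and not the spinning observed in Caffeine.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   import java.util.Objects;

   // Hypothetical demo class showing a key whose hashCode changes while it is in a map.
   class MutableKeyDemo {
     // Deliberately broken: equals/hashCode depend on a mutable field.
     static final class MutableKey {
       String term;

       MutableKey(String term) {
         this.term = term;
       }

       @Override
       public boolean equals(Object o) {
         return o instanceof MutableKey && Objects.equals(term, ((MutableKey) o).term);
       }

       @Override
       public int hashCode() {
         return Objects.hashCode(term);
       }
     }

     // Returns whether the entry is still found via the same key instance
     // after the key was mutated while residing in the map.
     static boolean foundAfterMutation() {
       Map<MutableKey, String> cache = new HashMap<>();
       MutableKey key = new MutableKey("dame");
       cache.put(key, "cached result");
       key.term = "damen"; // hashCode changes while the key is inside the map
       return cache.containsKey(key);
     }

     // The entry is still physically stored, it just cannot be located anymore.
     static int sizeAfterMutation() {
       Map<MutableKey, String> cache = new HashMap<>();
       MutableKey key = new MutableKey("dame");
       cache.put(key, "cached result");
       key.term = "damen";
       return cache.size();
     }

     public static void main(String[] args) {
       System.out.println(foundAfterMutation());
       System.out.println(sizeAfterMutation());
     }
   }
   ```
   
   The map still holds the entry, but a lookup with the mutated key hashes to a different value and misses it, which is the kind of inconsistency that a cache's internal bookkeeping cannot recover from.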




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295767609

   Did we in the meantime figure out which of the query classes caused the issue? WrappedQuery.setQuery() was never used. It would be good to get a full toString dump of the query that caused the issue.
   Did you make a screenshot while debugging?




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290126484

   Hi Ben,
   thanks for the insight. My comment was not directly about the type of loop (CAS, TTAS, ...); it was more about the coding style. Hotspot sometimes tries to optimize loops where it thinks it has found some common pattern. My proposal would be to maybe form the loop in a way that follows a common pattern (like the mentioned CAS, or a `while` with the condition up-front, ...). This one is a `for(;;)` with very strange exit conditions, and I had the feeling it causes some bugs in Hotspot and maybe also triggers the infinite loop. In Lucene's bug we had some similar problems where loop invariants were incorrectly guessed.
   
   Of course, if this is a bug in your code leading to the infinite loop, then it is not a Hotspot problem. What I do not understand: how does the current PR fix the bug?




[GitHub] [solr] tboeghk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
tboeghk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289465929

   We ran into the bug using JDK11, JDK17 and JDK19. I'll try to dig out the stack trace / thread dump the threads got stuck in!




[GitHub] [solr] tboeghk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
tboeghk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289463719

   We came across this bug using JDK11 and Solr 8.11.2, so this is unlikely the same as https://github.com/ben-manes/caffeine/issues/797.




[GitHub] [solr] tboeghk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
tboeghk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1298315534

   @ben-manes we can confirm that your change with `3.1.2-SNAPSHOT` is working! 
   
   We removed our fix from the SolrCache and can confirm that no threads inside Solr are spinning anymore. We see the following exceptions in the Solr log (I had to remove some key details as they are sensitive):
   
   ```
   java.lang.IllegalStateException: An invalid state was detected that occurs if the key's equals or hashCode was modified while it resided in the map. This violation of the Map contract can lead to non-deterministic behavior (key: ((indexed_dimensions:dame^5000.0 | MatchNoDocsQuery("") | MatchNoDocsQuery("") | selling_points:dame^1000.0 | MatchNoDocsQuery("") | MatchNoDocsQuery("") | indexed_filter_values:dame^10000.0 | target_group:dame^50000.0 | [...] text:trigema^10000.0)~0.2)~3).
   at com.github.benmanes.caffeine.cache.Caffeine.requireState(Caffeine.java:204)
   at com.github.benmanes.caffeine.cache.BoundedLocalCache.requireIsAlive(BoundedLocalCache.java:288)
   at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$put$8(BoundedLocalCache.java:2293)
   at java.base/java.util.concurrent.ConcurrentHashMap.computeIfPresent(ConcurrentHashMap.java:1852)
   at com.github.benmanes.caffeine.cache.BoundedLocalCache.put(BoundedLocalCache.java:2292)
   at com.github.benmanes.caffeine.cache.BoundedLocalCache.put(BoundedLocalCache.java:2212)
   at com.github.benmanes.caffeine.cache.LocalAsyncCache$AsMapView.put(LocalAsyncCache.java:689)
   at org.apache.solr.search.CaffeineCache.put(CaffeineCache.java:267)
   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1437)
   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:596)
   [...]
   ```
   
   We'll check what exact `Query` implementation is inserted into the cache.




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289716808

   > Is this perhaps related to the JIT loop unrolling bug ([ben-manes/caffeine#797](https://github.com/ben-manes/caffeine/issues/797), [SOLR-16463](https://issues.apache.org/jira/browse/SOLR-16463))? (/cc @uschindler)
   
   Unlikely. The JIT loop unrolling bug is according to my knowledge a false assert. The generated code looks correct, it just hits an assertion recently added to Hotspot (JDK 17).
   
   The problem here seems to have existed since JDK 11.




[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295810270

   I'd need to investigate this on Tuesday. I think Querqy was also involved, so let me check that again. I'll take screenshots, stack dumps, etc. to really pinpoint the problem.




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295310470

   This query must die; please make the fields final and remove the setter: https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/WrappedQuery.java
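   
   The general shape of that change can be sketched with a generic wrapper. This is a hypothetical `ImmutableWrapper` class written for illustration, not the actual `WrappedQuery` and not a proposed patch: the wrapped object is held in a `final` field, there is no setter, and `equals`/`hashCode` are derived only from that field.
   
   ```java
   import java.util.Objects;

   // Hypothetical immutable wrapper; T stands in for the wrapped query type.
   final class ImmutableWrapper<T> {
     private final T wrapped; // final: cannot be reassigned after construction

     ImmutableWrapper(T wrapped) {
       this.wrapped = Objects.requireNonNull(wrapped);
     }

     T getWrapped() {
       return wrapped;
     }

     @Override
     public boolean equals(Object o) {
       return o instanceof ImmutableWrapper
           && wrapped.equals(((ImmutableWrapper<?>) o).wrapped);
     }

     @Override
     public int hashCode() {
       return wrapped.hashCode();
     }

     public static void main(String[] args) {
       ImmutableWrapper<String> a = new ImmutableWrapper<>("q");
       ImmutableWrapper<String> b = new ImmutableWrapper<>("q");
       System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
     }
   }
   ```
   
   The wrapper's `equals`/`hashCode` are only as stable as those of the wrapped object, so this helps only if the wrapped object is itself immutable.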




[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1299808362

   For BooleanQuery we have many tests in Lucene core and its own cache implementation. So there are three cases where it may break:
   - somebody calls getClauses() and modifies them (adding entries). We should prevent that by making the clauses unmodifiable in Lucene. I'll open an issue.
   - somebody modified one query inside the clauses. I would need to get all clauses of the above example.
   - the WrappedQuery itself is the issue. I would start there, see above.




[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290842612

   Thanks @DennisBerger1984. 
   
   > We came across this bug using JDK11 and Solr 8.11.2
   >  If I remember correctly ... in caffeineCache 3.3.1 
   
   I see that Solr 8.11.2 is [using](https://github.com/apache/lucene-solr/blob/17dee71932c683e345508113523e764c3e4c80fa/lucene/ivy-versions.properties#L39) 2.9.2, whereas [currently](https://github.com/apache/solr/blob/8dec9cf416ab37e73076337a5a2bfa7e9c5d7aff/versions.props#L8) it is using 3.1.1. Can we confirm that your failure happens using 3.3.1? I don't recall any related bug fixes, but it would help us narrow in on the problem.


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by "ben-manes (via GitHub)" <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1431546094

   PR is not necessary but was the underlying bug fixed? The issue was corrupting the cache by modifying the key’s equality while inside of a ConcurrentHashMap. The upgrade just yells if detected, but it was a usage error.
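
A minimal sketch of the usage error described above, on a plain `ConcurrentHashMap` rather than Caffeine. `MutableKey` is a hypothetical class, not a Solr or Lucene query; it only shows how an entry becomes unreachable once the key's `hashCode()`/`equals()` result changes after insertion.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

final class MutableKeyDemo {
    // Hypothetical key whose equality depends on a field that can still change.
    static final class MutableKey {
        String value;

        MutableKey(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(value, ((MutableKey) o).value);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(value);
        }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new ConcurrentHashMap<>();
        MutableKey key = new MutableKey("a");
        map.put(key, "cached");

        // The key's equality changes while it is stored inside the map.
        key.value = "b";

        // Lookup with the mutated key misses: the stored hash no longer matches.
        System.out.println(map.get(key));
        // Lookup with the original value misses too: equals() now compares against "b".
        System.out.println(map.get(new MutableKey("a")));
        // The entry is still counted, but can no longer be found by key.
        System.out.println(map.size());
    }
}
```

Both lookups miss while `size()` still reports one entry, which is the kind of corrupted mapping discussed in this thread.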


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289458504

   Is this perhaps related to the JIT loop unrolling bug (https://github.com/ben-manes/caffeine/issues/797, [SOLR-16463](https://issues.apache.org/jira/browse/SOLR-16463))? (/cc @uschindler)
   
   If you think this is instead a bug in Caffeine, please try to provide a reproducer so that we can isolate a fix. When I run the [stress](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/test/java/com/github/benmanes/caffeine/cache/Stresser.java) and [Lincheck](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/test/java/com/github/benmanes/caffeine/lincheck/AbstractLincheckCacheTest.java) tests it is fine, but these can be tricky. The `put` is optimized as a retry loop instead of a pessimistic compute (2x faster), but only in a microbenchmark, so that is likely not important in an actual workload, and correctness is more important.
   
   It looks like `asMap().compute` is the correct replacement as `Cache.get` is the same as `Map.computeIfAbsent` and this inserts or replaces an existing value. 
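
The difference between these map operations can be sketched on a plain `ConcurrentHashMap`. This does not use Caffeine; it only relies on the `ConcurrentMap` interface that `asMap()` exposes, and `putViaCompute` is a hypothetical helper name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PutVersusCompute {
    // Insert-or-replace, like Map.put, but expressed as a compute.
    // The value must be non-null: returning null from compute removes the mapping.
    static <K, V> void putViaCompute(ConcurrentMap<K, V> map, K key, V value) {
        map.compute(key, (k, oldValue) -> value);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();

        // computeIfAbsent only inserts; an existing value is left untouched.
        map.computeIfAbsent("q", k -> 1);
        map.computeIfAbsent("q", k -> 2);
        System.out.println(map.get("q")); // 1

        // compute inserts or replaces, which matches the semantics of put.
        putViaCompute(map, "q", 3);
        System.out.println(map.get("q")); // 3
    }
}
```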


[GitHub] [solr] tboeghk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
tboeghk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289479812

   The threads got stuck in this [infinite loop](https://github.com/ben-manes/caffeine/blob/master/caffeine/src/main/java/com/github/benmanes/caffeine/cache/BoundedLocalCache.java#L2215-L2293) in `BoundedLocalCache`. I'll ask @DennisBerger1984 and @mpetris for a detailed stack trace tomorrow.


[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1300604173

   @uschindler does this help?
   
   ![1](https://user-images.githubusercontent.com/69144692/199523879-cdddb7e3-f23c-4c5d-9cde-7099f35dd870.png)
   ![2](https://user-images.githubusercontent.com/69144692/199523888-e922fda3-086d-4a01-afdd-e68606457ae2.png)
   ![3](https://user-images.githubusercontent.com/69144692/199523899-d8cd980a-4e81-4f70-97bd-953020b426dd.png)
   


[GitHub] [solr] debe commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
debe commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1312067940

   @renekrie the querqy.rewriters are completely empty already during these tests


Re: [PR] [SOLR-16489] CaffeineCache puts thread into infinite loop [solr]

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1951205542

   I think we should close this PR. The underlying fixed impl now yells if an issue was detected and no longer hangs.
   
   If the warning message is still printed, it's a broken equals/hashcode and should be taken care of separately.


Re: [PR] [SOLR-16489] CaffeineCache puts thread into infinite loop [solr]

Posted by "uschindler (via GitHub)" <gi...@apache.org>.
uschindler closed pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop
URL: https://github.com/apache/solr/pull/1118


[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311608651

   Hi René, qboost.similarityScore is off, multiMatchTie is set to 0.2. I'm currently running a test with defType=edismax to further drill down.


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311949363

   For current Lucene queries the problem may occur when you pass a BytesRef (it is a ref) only while building the query. If you change the BytesRef afterwards, it fails.
   
   IMHO queries should clone the terms passed in. Solr has the big problem of over-reusing mutable references, although allocation is really no problem anymore, as the JVM can handle that with escape analysis and won't allocate instances. Actually "reusing" is bad in newer Java versions, as it will populate the heap with useless temporary (but reused) objects.
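
A minimal sketch of why cloning helps, assuming a hypothetical `TermKey` class that stands in for a query holding term bytes (it is not Lucene's `BytesRef` or any Lucene query). The `copy` flag exists only to show both behaviours side by side.

```java
import java.util.Arrays;

final class TermKeyDemo {
    // Hypothetical stand-in for a query that keeps the bytes of a term.
    static final class TermKey {
        private final byte[] bytes;

        TermKey(byte[] bytes, boolean copy) {
            // With copy == false the caller's buffer is shared and can change under us.
            this.bytes = copy ? bytes.clone() : bytes;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof TermKey && Arrays.equals(bytes, ((TermKey) o).bytes);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(bytes);
        }
    }

    public static void main(String[] args) {
        byte[] buffer = {'f', 'o', 'o'};
        TermKey shared = new TermKey(buffer, false);
        TermKey cloned = new TermKey(buffer, true);
        int sharedHash = shared.hashCode();
        int clonedHash = cloned.hashCode();

        // The caller reuses its buffer after the keys were built.
        buffer[0] = 'b';

        System.out.println("shared key stable: " + (sharedHash == shared.hashCode()));
        System.out.println("cloned key stable: " + (clonedHash == cloned.hashCode()));
    }
}
```

Only the key that cloned its input keeps a stable hash code, so only that one remains safe to use as a cache key after the buffer is reused.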


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1300831545

   DisjunctionMaxQuery unfortunately has a getDisjuncts that returns a mutable Multiset (a Lucene collection subclass): https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxQuery.java


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295306427

   WrappedQuery is a Solr class. Many of the queries there are old and were never checked for final fields and correctness.
   This would be a good job to start fixing the - sorry to be harsh - heavily broken legacy query bullshit in Solr. Solr should only use Lucene queries and not add those useless wrappers causing trouble.
   In Lucene every query only has final fields and setters are forbidden. We had some problematic ones but this was fixed long ago.


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295287661

   There is no clone method for queries, sorry. I don't know what the best way is to solve this. What type of query was causing this?
   
   But the bug with infinite loop should be fixed, so it throws some exception.


[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1299722332

   ![Screenshot_20221102_084057](https://user-images.githubusercontent.com/69144692/199430736-8210ff3c-0eb6-40d5-889d-aaad853d1bcb.png)
   It's a WrappedQuery Object.


[GitHub] [solr] risdenk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295311727

   @uschindler to make sure I understand - anything in Solr that extends from https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/Query.java should have no setters and final fields. effectively immutable. makes sense to me
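
The rule can be sketched as follows. `ImmutableWrapper` is a hypothetical class, not Solr's `WrappedQuery`, and it wraps a `String` instead of a `Query` to stay self-contained.

```java
import java.util.Objects;

// Hypothetical wrapper; only illustrates the "final fields, no setters" rule.
final class ImmutableWrapper {
    private final String wrapped;

    ImmutableWrapper(String wrapped) {
        this.wrapped = Objects.requireNonNull(wrapped);
    }

    String getWrapped() {
        return wrapped;
    }

    // Instead of a setter, return a new instance and leave this one untouched.
    ImmutableWrapper withWrapped(String other) {
        return new ImmutableWrapper(other);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ImmutableWrapper && wrapped.equals(((ImmutableWrapper) o).wrapped);
    }

    @Override
    public int hashCode() {
        return wrapped.hashCode();
    }

    public static void main(String[] args) {
        ImmutableWrapper original = new ImmutableWrapper("title:solr");
        int before = original.hashCode();
        ImmutableWrapper changed = original.withWrapped("title:lucene");
        System.out.println(before == original.hashCode());
        System.out.println(original.equals(changed));
    }
}
```

Because no field can change after construction, the result of `equals` and `hashCode` for a given instance cannot change while it sits in a cache.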


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1325654602

   Released caffeine 3.1.2 with this error detection


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1299803390

   Anyway, we should figure out whether the WrappedQuery is changing its hashCode, because this one was explicitly mutable. But the setter is never called from Solr (dead code according to @risdenk, right?)


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289463410

   > ConcurrentHashMap#put method does not terminate an endless loop.
   
   This could be a livelock if Solr is performing recursive writes (e.g. a write within a computation). That is not supported and should fail fast with an error. The last case that I know of that did not error early was fixed in jdk12, but prior versions could spin forever. That [last case](https://github.com/openjdk/jdk/commit/53d3a4f50cef5712d42c7adcab75fed07354af15#diff-96692a9a81b55896a6d82cca1cf4f5835bcc146e774026c9f8cfc395d2f6beda) was related to modifying the map's size counters.
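
The fail-fast behaviour for a recursive write can be observed with a small sketch. It assumes JDK 9 or later, where a nested `computeIfAbsent` on the same key of an otherwise empty `ConcurrentHashMap` is detected and rejected; on older JDKs the same call may spin instead, so it should not be run there. `RecursiveUpdateDemo` is a hypothetical name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RecursiveUpdateDemo {
    // Returns true if the nested write on the same key was rejected with an exception.
    // Assumption: JDK 9 or later; older JDKs may loop forever on this call.
    static boolean recursiveWriteFailsFast() {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        try {
            // The mapping function writes to the same key of the same map.
            map.computeIfAbsent("key", k -> map.computeIfAbsent("key", k2 -> 1));
            return false;
        } catch (IllegalStateException e) {
            // ConcurrentHashMap reports the detected recursive update.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("recursive write rejected: " + recursiveWriteFailsFast());
    }
}
```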


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1292941058

   If you do come by a stacktrace please let me know. I am not having luck in a review of the code, but a fresh set of eyes is always welcome. If you can reproduce then look for other exceptions, like out of memory error or stackoverflow, which could leave an invalid state. In those cases the jvm may not be recoverable, e.g. ExitOnOutOfMemoryError flag.


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1296493503

   I have [added](https://github.com/ben-manes/caffeine/commit/9cd509c69344e304eb3ff99fe61cd8810957e6e8) detection for the scenario when the key's equality has changed and corrupted the mapping. For put, putIfAbsent, replace, remove, and computes this will throw an exception, whereas clear and eviction will log at the error level.
   
   The put's loop remains indefinite, but after a spin wait threshold it will fall back to a map computation. For the common case it adds no overhead, for a taxed system this yields so that the remover thread may be granted a time slice to finish its work, and for the corruption case it gives us a chance to terminate with a validation error.
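
The idea of an optimistic retry that gives up after a threshold and falls back to a map computation can be sketched on a plain `ConcurrentMap`. This is not Caffeine's implementation, which works on its own internal entry state; `BoundedRetryPut` and `MAX_ATTEMPTS` are hypothetical names, and the threshold value is arbitrary.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class BoundedRetryPut {
    // Hypothetical spin threshold; the value is arbitrary.
    static final int MAX_ATTEMPTS = 100;

    static <K, V> void put(ConcurrentMap<K, V> map, K key, V value) {
        // Optimistic path: read, then try to swap in the new value without a compute.
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            V current = map.get(key);
            boolean done = (current == null)
                    ? map.putIfAbsent(key, value) == null
                    : map.replace(key, current, value);
            if (done) {
                return;
            }
            // Another thread changed the entry in between; hint the CPU and retry.
            Thread.onSpinWait();
        }
        // Pessimistic fallback: the map applies the write atomically for this key.
        map.compute(key, (k, oldValue) -> value);
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        put(map, "q", 1);
        put(map, "q", 2);
        System.out.println(map.get("q"));
    }
}
```

The loop is bounded, so a writer that keeps losing the race still terminates through the `compute` call.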
   
   These changes are available in `3.1.2-SNAPSHOT` from the sonatype repository.


[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311820952

   When using defType=edismax the problem doesn't occur. Keeping querqy as defType but removing multiMatchTie doesn't help. But to be honest multiMatchTie was disabled anyway I think.


[GitHub] [solr] janhoy commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by "janhoy (via GitHub)" <gi...@apache.org>.
janhoy commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1431533348

   This PR is no longer needed when we have the upgraded CaffeineCache version, or not?


[GitHub] [solr] risdenk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1387471564

   @tboeghk @DennisBerger1984 @mpetris what is the status here? Did you end up finding the Query impl that is changing hashcode/equals? Is this PR necessary still?


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1289464453

   Can you try jdk12? Does the stacktrace look to be related to the linked patch?


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1290160892

   We will need to see the stacktrace and debug details. Caffeine’s loop retries only if the entry was a stale read and once locked is determined to have been already removed from the map. That state transition can only happen in the lock and within a map removal operation. That leaves a small window for a stale read, with the caveat of Map.remove(key). That method discards a refresh, which could block if something weird happened during registration. Since refresh isn’t being used here, I cannot deduce any reason for a bug in Caffeine’s code. As ConcurrentHashMap also loops forever, and was the originally claimed source, we need the details to assess a possible cause.


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1300811132

   This is a non-Lucene or non-Solr query from the external Querqy plugin.
   
   https://github.com/querqy/querqy/blob/1dbafde2eddc0938b309bbe955f127148671fd41/querqy-for-lucene/querqy-lucene/src/main/java/querqy/lucene/rewrite/FieldBoostTermQueryBuilder.java
   
   It looks fine, too. It also implements equals and hashCode.


[GitHub] [solr] risdenk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1329636763

   I opened SOLR-16562 to upgrade to Caffeine 3.1.2 which will detect infinite loops (not fix the issue).


[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by "DennisBerger1984 (via GitHub)" <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1443078199

   All in all it's very hard for us to reproduce the bug reliably. I saw a couple of "An invalid state was detected" messages two weeks ago and thought I had found an offending query, but unfortunately they've disappeared and the last week was free of those messages. I've scanned around 1TB and 100 million log entries.


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295291167

   You may temporarily use toString as the cache key. But this does not work safely for all query types.


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1299774688

   Lucene’s BooleanQuery clauseSets is externally mutable. The getClauses method returns a mutable collection (hashset) so the caller might add to it. I don’t see an occurrence, but some unexpected inner mutability is an easy mistake. The idea in the related issue above of using `@Immutable` to assert deeply would be a nice goal.


[GitHub] [solr] DennisBerger1984 commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by "DennisBerger1984 (via GitHub)" <gi...@apache.org>.
DennisBerger1984 commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1408569741

   Hi @risdenk, I've been rather busy lately. In the meantime we upgraded to Caffeine 3.1.2 and removed our workaround. Since then we have not encountered the problem again, but neither have we seen any contract-violation messages from Caffeine. So we didn't find the cause, sadly.


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295315124

   > @uschindler to make sure I understand - anything in Solr that extends from https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/Query.java should have no setters and final fields. effectively immutable. makes sense to me
   > 
   > I filed https://issues.apache.org/jira/browse/SOLR-16509 to fix these.
   
   Yes. Thanks. The Lucene query cache also uses queries as keys. They should really be immutable!
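
   A rough sketch of the "effectively immutable" shape described above: final fields, no setters, and `equals`/`hashCode` derived only from those fields. `TermFilter` is a hypothetical stand-in that does not extend Lucene's `Query`, so the example compiles without Lucene on the classpath.

```java
import java.util.Objects;

/** Hypothetical query-like value object; it does not extend Lucene's Query. */
final class TermFilter {
  private final String field;
  private final String value;

  TermFilter(String field, String value) {
    this.field = Objects.requireNonNull(field);
    this.value = Objects.requireNonNull(value);
  }

  /** Returns a modified copy instead of mutating this instance. */
  TermFilter withValue(String newValue) {
    return new TermFilter(field, newValue);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof TermFilter)) {
      return false;
    }
    TermFilter other = (TermFilter) o;
    return field.equals(other.field) && value.equals(other.value);
  }

  @Override
  public int hashCode() {
    return Objects.hash(field, value);
  }

  @Override
  public String toString() {
    return field + ":" + value;
  }
}
```

   Because no field can change after construction, the hash code of an instance that is already stored as a cache key stays stable for as long as the entry lives.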


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295313752

   The superclass is also broken, but that is not problematic, as its fields are mutable only to control caching. But all of that should really be fixed!
   We should also review the other Query subclasses in Solr for immutability and correct equals/hashCode.


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295894185

   Reviewing the Query hierarchy, I see mutations that impact equality. Here's the list that I found, but being unfamiliar with the code I might have missed some cases. Hope it helps.
   - LTRQuery's `scoringQuery` is mutable (`originalQuery`)
   - ExportQuery's `mainQuery`
   - PointSetQuery's `maxDocFreq`
   - GraphQuery's `traversalFilter`, `maxDepth`, etc
   - LTRScoringQuery's `originalQuery`
   - Feature's `name`, `index`, etc
     - Subtypes: FieldLengthFeature, FieldValueFeature, OriginalScoreFeature, SolrFeature, ValueFeature
   - LTRInterleavingQuery's `rerankingQueries`


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295301391

   I believe it was a `WrappedQuery` holding a complex BooleanQuery, but we did not look closely enough. I only recall that the key equality was complex, with hash map comparisons, and that when we inspected the key it had setters and non-final fields. We didn't think of capturing screenshots of the object states when debugging. Dennis might be able to offer more details.
   
   I will think more about how to handle the infinite loop in this scenario. A retry is needed after a stale read, e.g. the reader might look up the entry, be context switched while the entry is evicted, and then observe it to be dead. That's fine, and a subsequent re-read should cause it to see a fresh result. I may be able to handle this case a little better, but since it can happen safely it is tricky.
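
   To make the retry idea concrete, here is a simplified model of a reader that re-reads when it observes a dead entry. This is not Caffeine's code: `RetryingReader`, `Node`, and the `maxAttempts` bound are assumptions for illustration only, and bounding the retries is just one conceivable way to turn an endless spin into a visible error.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of a read path that retries on dead entries. */
final class RetryingReader {

  /** Hypothetical cache node with a liveness flag that eviction would clear. */
  static final class Node {
    final String value;
    private volatile boolean alive = true;

    Node(String value) {
      this.value = value;
    }

    void markDead() {
      alive = false;
    }

    boolean isAlive() {
      return alive;
    }
  }

  /**
   * Returns the value for the key, or null on a miss. A dead node normally
   * disappears once eviction completes, so the read is retried. If the node
   * is still reachable after maxAttempts, the map is assumed to be corrupted.
   */
  static String read(ConcurrentHashMap<String, Node> data, String key, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Node node = data.get(key);
      if (node == null) {
        return null; // miss: the entry was removed
      }
      if (node.isAlive()) {
        return node.value; // hit
      }
      Thread.yield(); // dead entry observed: give the evicting thread time, then re-read
    }
    throw new IllegalStateException(
        "dead entry still reachable after " + maxAttempts + " attempts");
  }
}
```

   In the healthy case the loop exits on the first or second pass. Only a dead node that is never unlinked from the map, as suspected in this thread, reaches the exception.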


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295282499

   Thanks @uschindler. I think we need to confirm this hypothesis first. Is there an easy way to make an immutable copy for us to test with? The actual fix might be different, but this would be the easiest way to verify our guess.


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295272732

   Thanks to @DennisBerger1984's help, we met to step through the debug session of the cache in this invalid state. Dennis captured this [jstack output](https://github.com/apache/solr/files/9890391/stack.txt). We confirmed the findings that the filterCache is stuck in an infinite loop due to retrieving a dead entry. When pairing on a code walkthrough, we came up with a scenario that could explain this problem and would require some Solr experts to advise us on.
   
   In short, is the cache key stable in the filter cache? It is an `org.apache.lucene.search.Query` object and appears to be mutable. Can a query be modified after it was inserted into the cache? If so, then the cache might fail to remove it during eviction, eagerly mark it as dead, and rediscover it later through a lookup. Because modifiable keys violate the Map contract, this case is not detected or handled.
   
   During eviction, Caffeine uses a `computeIfPresent` to remove the victim entry. If the entry is not found in the map, then we assume that this was due to an explicit, concurrent removal. This allows the cache to eagerly decrement the current size and avoids unnecessary additional evictions to stay within bounds. If it was a concurrent removal, then a `RemovalTask` is queued and will no-op when run. However, if the key's hash/equals were modified, then eviction would instead leave a dead entry in the map, and a subsequent read might discover it through a map lookup.
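
   The map-level half of this hypothesis can be reproduced in isolation. The following is a simplified sketch and not Caffeine's eviction code: `MutableKey` and `StaleEvictionDemo` are hypothetical names, and a plain `ConcurrentHashMap` stands in for the cache's backing map. It shows that `computeIfPresent` never invokes its remapping function once the key's hash has changed, so the entry stays in the map.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical key whose equals/hashCode depend on a mutable field. */
final class MutableKey {
  private String field;

  MutableKey(String field) {
    this.field = field;
  }

  void setField(String field) {
    this.field = field;
  }

  @Override
  public boolean equals(Object o) {
    return (o instanceof MutableKey) && Objects.equals(field, ((MutableKey) o).field);
  }

  @Override
  public int hashCode() {
    return Objects.hashCode(field);
  }
}

final class StaleEvictionDemo {
  /**
   * Models only the removal step: returns true if the remapping function
   * never ran and the entry is still present in the map afterwards.
   */
  static boolean leavesOrphanedEntry() {
    ConcurrentHashMap<MutableKey, String> data = new ConcurrentHashMap<>();
    MutableKey key = new MutableKey("category:books");
    data.put(key, "cached doc set");

    // The key is modified while it is still stored in the map.
    key.setField("category:music");

    AtomicBoolean removerRan = new AtomicBoolean(false);
    data.computeIfPresent(key, (k, v) -> {
      removerRan.set(true);
      return null; // returning null would remove the mapping
    });

    return !removerRan.get() && data.size() == 1;
  }

  public static void main(String[] args) {
    System.out.println("orphaned entry left behind: " + leavesOrphanedEntry());
  }
}
```

   From the caller's point of view, the failed lookup is indistinguishable from "someone else already removed it", which is why an eviction routine built on this call could treat the entry as gone while it is still physically in the map.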
   
   @DennisBerger1984 suggested that the filter cache could use an immutable cache key object. Given the subtypes, is there an easy way to do this? Would using `toString()` result in a key with the right uniqueness properties? If we can apply this change then Dennis will retest and we'll see if our hypothesis holds.
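
   One way such a test key could look is a wrapper that snapshots the string form once at insertion time. This is a sketch under assumptions: `SnapshotKey` is a hypothetical class, a `StringBuilder` stands in for a mutable query, and whether `toString()` is unique enough for every query type is exactly the open question, so this would only suit a diagnostic experiment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical immutable key that snapshots a string form of the source object. */
final class SnapshotKey {
  private final String repr;

  SnapshotKey(Object source) {
    this.repr = String.valueOf(source); // captured once, never updated
  }

  @Override
  public boolean equals(Object o) {
    return (o instanceof SnapshotKey) && repr.equals(((SnapshotKey) o).repr);
  }

  @Override
  public int hashCode() {
    return repr.hashCode();
  }

  @Override
  public String toString() {
    return repr;
  }
}

final class SnapshotKeyDemo {
  /** Returns true if the entry can still be removed after the source object changed. */
  static boolean survivesSourceMutation() {
    StringBuilder source = new StringBuilder("category:books"); // stand-in for a mutable query
    Map<SnapshotKey, String> cache = new ConcurrentHashMap<>();
    SnapshotKey key = new SnapshotKey(source);
    cache.put(key, "cached doc set");

    source.append(" AND inStock:true"); // later mutation of the source object

    return "cached doc set".equals(cache.remove(key));
  }

  public static void main(String[] args) {
    System.out.println("entry removable after mutation: " + survivesSourceMutation());
  }
}
```

   The trade-off is that two distinct queries with the same string form would collide on one key, so a wrapper like this can confirm or rule out the mutation hypothesis but is not a general fix.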
   
   


[GitHub] [solr] risdenk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1295966527

   @ben-manes thanks, your list matches mine. I've been fixing most of these in https://github.com/apache/solr/pull/1146 as part of https://issues.apache.org/jira/browse/SOLR-16509


[GitHub] [solr] ben-manes commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
ben-manes commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1300803918

   Oh yes, I did not notice that `Collection<Query> getClauses(Occur occur)` is package-private. It returns a mutable collection, but since it is not public, that is safe (maybe it is package-private for testing).


[GitHub] [solr] risdenk commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
risdenk commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1300573313

   https://github.com/apache/solr/search?q=setWrappedQuery is definitely not used in Solr main.


[GitHub] [solr] uschindler commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
uschindler commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311504622

   > Hey should we provide additional information like tests without querqy plugin or is it sufficient to wait for ticket #1146? What do you think?
   
   If you could temporarily disable querqy it would be a good test. I was not able to find all the queries mentioned. At least we could rule out or confirm that querqy is the problem.
   
   The cleanup work in #1146 is good to have, but as WrappedQuery's setQuery was not called from code, it was in fact immutable.
   
   I still have some concerns about the DisjunctionMaxQueries (they are created by Querqy). This is a Lucene class, and it is not fully sealed against changes to its clauses. Maybe Querqy modifies them; Solr at least does not.


[GitHub] [solr] renekrie commented on pull request #1118: [SOLR-16489] CaffeineCache puts thread into infinite loop

Posted by GitBox <gi...@apache.org>.
renekrie commented on PR #1118:
URL: https://github.com/apache/solr/pull/1118#issuecomment-1311573308

   @DennisBerger1984 Are you using the newish `multiMatchTie` as a Querqy query parameter (related to https://github.com/querqy/querqy/issues/281)? Also, are you setting `uq.similarityScore` or `qboost.similarityScore` (see https://querqy.org/docs/querqy/more-about-queries.html#reference)? I'm trying to think of a place where we manipulate queries more heavily. On the other hand, we never touch anything after the 'parse' step and we never read from the cache, so I can't see how we could touch anything that has already been put into the cache.

