Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2019/11/15 17:58:00 UTC

[jira] [Reopened] (SOLR-13898) Non-atomic use of SolrCache get / put

     [ https://issues.apache.org/jira/browse/SOLR-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris M. Hostetter reopened SOLR-13898:
---------------------------------------

New test TestSolrCachePerf.testGetPutCompute seems to be overly brittle in its math comparisons? (Note also all the ERROR logging suggesting the cache lifecycle is broken.)

[http://fucit.org/solr-jenkins-reports/job-data/apache/Lucene-Solr-NightlyTests-8.x/269]
{noformat}
   [junit4] Suite: org.apache.solr.search.TestSolrCachePerf
   [junit4]   2> 1397484 INFO  (SUITE-TestSolrCachePerf-seed#[3444241481A864AF]-worker) [     ] o.a.s.SolrTestCaseJ4 Created dataDir: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/checkout/solr/build/solr-core/test/J2/temp/solr.search.TestSolrCachePerf_3444241481A864AF-001/data-dir-91-001
   [junit4]   2> 1397484 WARN  (SUITE-TestSolrCachePerf-seed#[3444241481A864AF]-worker) [     ] o.a.s.SolrTestCaseJ4 startTrackingSearchers: numOpens=32 numCloses=32
   [junit4]   2> 1397484 INFO  (SUITE-TestSolrCachePerf-seed#[3444241481A864AF]-worker) [     ] o.a.s.SolrTestCaseJ4 Using PointFields (NUMERIC_POINTS_SYSPROP=true) w/NUMERIC_DOCVALUES_SYSPROP=true
   [junit4]   2> 1397485 INFO  (SUITE-TestSolrCachePerf-seed#[3444241481A864AF]-worker) [     ] o.a.s.SolrTestCaseJ4 Randomized ssl (false) and clientAuth (false) via: @org.apache.solr.util.RandomizeSSL(reason=, value=NaN, ssl=NaN, clientAuth=NaN)
   [junit4]   2> 1397485 INFO  (SUITE-TestSolrCachePerf-seed#[3444241481A864AF]-worker) [     ] o.a.s.SolrTestCaseJ4 SecureRandom sanity checks: test.solr.allowed.securerandom=null & java.security.egd=file:/dev/./urandom
   [junit4]   2> 1397487 INFO  (TEST-TestSolrCachePerf.testGetPutCompute-seed#[3444241481A864AF]) [     ] o.a.s.SolrTestCaseJ4 ###Starting testGetPutCompute
   [junit4]   2> 1398323 ERROR (Finalizer) [     ] o.a.s.u.ConcurrentLFUCache ConcurrentLFUCache was not destroyed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
   [junit4]   2> 1398323 ERROR (Finalizer) [     ] o.a.s.u.ConcurrentLFUCache ConcurrentLFUCache was not destroyed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
   [junit4]   2> 1398323 ERROR (Finalizer) [     ] o.a.s.u.ConcurrentLFUCache ConcurrentLFUCache was not destroyed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
   [junit4]   2> 1398323 ERROR (Finalizer) [     ] o.a.s.u.ConcurrentLFUCache ConcurrentLFUCache was not destroyed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
      ...skip ~200 duplicate ERROR lines...
   [junit4]   2> 1414149 INFO  (TEST-TestSolrCachePerf.testGetPutCompute-seed#[3444241481A864AF]) [     ] o.a.s.SolrTestCaseJ4 ###Ending testGetPutCompute
   [junit4]   2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestSolrCachePerf -Dtests.method=testGetPutCompute -Dtests.seed=3444241481A864AF -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.x/test-data/enwiki.random.lines.txt -Dtests.locale=en-SG -Dtests.timezone=America/Swift_Current -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 16.7s J2 | TestSolrCachePerf.testGetPutCompute <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: compute ratio (0.9992) should be higher or equal from get/put (0.9992000000000001)
   [junit4]    >        at __randomizedtesting.SeedInfo.seed([3444241481A864AF:B4ECA31C59217C30]:0)
   [junit4]    >        at org.apache.solr.search.TestSolrCachePerf.lambda$testGetPutCompute$0(TestSolrCachePerf.java:75)
   [junit4]    >        at java.util.HashMap.forEach(HashMap.java:1289)
   [junit4]    >        at org.apache.solr.search.TestSolrCachePerf.testGetPutCompute(TestSolrCachePerf.java:73)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)

{noformat}
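
For illustration, the two ratios in the assertion message above differ only in the last bits of the double representation, so a strict {{>=}} check fails on rounding noise alone. A minimal sketch of a tolerance-based comparison follows. The class name, the helper {{atLeast}} and the {{EPSILON}} value are hypothetical and are not taken from TestSolrCachePerf; this is not necessarily how the test should be (or was) fixed.

```java
public class RatioCompareSketch {
    // Hypothetical tolerance; the right value depends on how the ratios are computed.
    public static final double EPSILON = 1e-9;

    /** Returns true if actual is greater than or equal to expected, allowing eps of rounding slack. */
    public static boolean atLeast(double actual, double expected, double eps) {
        return actual >= expected - eps;
    }

    public static void main(String[] args) {
        // The two values reported in the failure message above.
        double computeRatio = 0.9992;
        double getPutRatio = 0.9992000000000001;

        // Strict comparison: fails on a difference in the last bits.
        System.out.println(computeRatio >= getPutRatio);                    // prints false
        // Tolerant comparison: treats the two ratios as equal.
        System.out.println(atLeast(computeRatio, getPutRatio, EPSILON));    // prints true
    }
}
```

A tolerance like this only hides representation noise; whether the two ratios are expected to be exactly equal in this test is a separate question.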

> Non-atomic use of SolrCache get / put
> -------------------------------------
>
>                 Key: SOLR-13898
>                 URL: https://issues.apache.org/jira/browse/SOLR-13898
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.3
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 8.4
>
>         Attachments: SOLR-13898.patch, SOLR-13898.patch, SOLR-13898.patch
>
>
> As pointed out by [~ben.manes] in SOLR-13817, the Solr code base uses a similar pattern of non-atomic get / put calls to SolrCache-s in many key places. In a multi-threaded environment this leads to cache misses and unnecessary extra computation when competing threads each discover a missing value, compute it independently and then update the cache.
> Some of these places are known performance bottlenecks where efficient caching is especially important, such as {{SolrIndexSearcher}}, {{SolrDocumentFetcher}}, {{UninvertedField}} and join queries.
> I propose adding {{SolrCache.computeIfAbsent(key, mappingFunction)}}, which will atomically retrieve an existing value or compute it and update the cache. This will also require changing how {{SolrCache.get(...)}} is used in many components.
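
The difference between the two patterns described above can be sketched as follows. This is only an illustration under stated assumptions: a plain {{java.util.concurrent.ConcurrentHashMap}} stands in for a SolrCache, the class and method names are hypothetical, and a {{CyclicBarrier}} is used purely to force the race so that the outcome is deterministic. It does not reproduce the attached patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheAtomicitySketch {

    /** Non-atomic get / put: every thread sees a miss before any thread stores a value. */
    public static int nonAtomicComputations(int threads) throws InterruptedException {
        ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
        AtomicInteger computations = new AtomicInteger();
        // The barrier forces all lookups to happen before any put (demonstration only).
        CyclicBarrier allLookedUp = new CyclicBarrier(threads);
        Runnable task = () -> {
            Integer cached = cache.get("key");
            try {
                allLookedUp.await();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
            if (cached == null) {
                computations.incrementAndGet(); // stands in for the expensive computation
                cache.put("key", 42);
            }
        };
        runAll(task, threads);
        return computations.get();
    }

    /** Atomic variant: the mapping function runs at most once per key. */
    public static int atomicComputations(int threads) throws InterruptedException {
        ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
        AtomicInteger computations = new AtomicInteger();
        Runnable task = () -> cache.computeIfAbsent("key", k -> {
            computations.incrementAndGet();
            return 42;
        });
        runAll(task, threads);
        return computations.get();
    }

    // Starts the given number of threads on the same task and waits for all of them.
    private static void runAll(Runnable task, int threads) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(task);
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(nonAtomicComputations(4)); // prints 4
        System.out.println(atomicComputations(4));    // prints 1
    }
}
```

In real code the race is timing-dependent rather than forced, so the number of redundant computations for the get / put pattern varies between one and the number of competing threads.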


