Posted to issues@lucene.apache.org by "Daniel Doubrovkine (Jira)" <ji...@apache.org> on 2022/03/03 14:28:00 UTC
[jira] [Comment Edited] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500789#comment-17500789 ]
Daniel Doubrovkine edited comment on LUCENE-10428 at 3/3/22, 2:27 PM:
----------------------------------------------------------------------
I agree with closing. The loop can't happen anymore, and we can open a new issue when we see new data pointing to a bug elsewhere.
was (Author: dblock):
I agree with the above. The loop can't happen anymore, and we can open a new issue when we see new data pointing to a bug elsewhere.
> getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop
> -----------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-10428
> URL: https://issues.apache.org/jira/browse/LUCENE-10428
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/query/scoring, core/search
> Reporter: Ankit Jain
> Priority: Major
> Attachments: Flame_graph.png
>
> Time Spent: 5.5h
> Remaining Estimate: 0h
>
> Customers complained about high CPU usage on an Elasticsearch cluster in production. We noticed that a few search requests had been stuck for a long time:
> {code:java}
> % curl -s localhost:9200/_cat/tasks?v
> indices:data/read/search[phase/query] AmMLzDQ4RrOJievRDeGFZw:569205 AmMLzDQ4RrOJievRDeGFZw:569204 direct 1645195007282 14:36:47 6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:502075 emjWc5bUTG6lgnCGLulq-Q:502074 direct 1645195037259 14:37:17 6.2h
> indices:data/read/search[phase/query] emjWc5bUTG6lgnCGLulq-Q:583270 emjWc5bUTG6lgnCGLulq-Q:583269 direct 1645201316981 16:21:56 4.5h
> {code}
> Flame graphs indicated that CPU time was mostly spent in the *getMinCompetitiveScore method of MaxScoreSumPropagator*. Live JVM debugging showed that the org.apache.lucene.search.MaxScoreSumPropagator.scoreSumUpperBound method was being invoked around 4 million times per second.
> We obtained the values of some parameters from live debugging:
> {code:java}
> minScoreSum = 3.5541441
> minScore + sumOfOtherMaxScores (params[0] scoreSumUpperBound) = 3.554144322872162
> returnObj scoreSumUpperBound = 3.5541444
> Math.ulp(minScoreSum) = 2.3841858E-7
> {code}
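The reported Math.ulp(minScoreSum) of 2.3841858E-7 is the size of a float ulp in the range [2, 4), which suggests that minScoreSum is held as a float in the running code. The sketch below checks this with java.lang.Math only, using the values quoted above. The class and method names are illustrative and do not come from Lucene.

```java
public class UlpCheck {
    // ulp of the reported minScoreSum when it is held as a float
    static float floatUlp() {
        return Math.ulp(3.5541441f);
    }

    // ulp of the same decimal literal when it is held as a double
    static double doubleUlp() {
        return Math.ulp(3.5541441d);
    }

    // true if casting the reported double sum to float lands on the
    // float immediately above minScoreSum
    static boolean sumRoundsToNextFloat() {
        return (float) 3.554144322872162 == Math.nextUp(3.5541441f);
    }

    public static void main(String[] args) {
        System.out.println("float ulp:  " + floatUlp());
        System.out.println("double ulp: " + doubleUlp());
        System.out.println("sum rounds to next float: " + sumRoundsToNextFloat());
    }
}
```

The float ulp is 2^-22 and the double ulp is 2^-51, so a step taken with the wrong type differs by roughly nine orders of magnitude.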
> Example code snippet:
> {code:java}
> double sumOfOtherMaxScores = 3.554144322872162;
> double minScoreSum = 3.5541441;
> float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
> while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
>     minScore -= Math.ulp(minScoreSum);
>     System.out.printf("%.20f, %.100f\n", minScore, Math.ulp(minScoreSum));
> }
> {code}
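The snippet above does not define scoreSumUpperBound, so it cannot be run as posted. The following is a minimal sketch for experimenting with it, not the Lucene implementation. It assumes scoreSumUpperBound is only a cast to float (the real method may add an error margin), and it adds a hypothetical iteration cap so that a non-converging loop returns instead of spinning. Two variants are shown: one with minScoreSum as a float, matching the float-sized ulp seen in the debugger, and one with minScoreSum as a double, as declared in the snippet.

```java
public class BoundedLoopSketch {
    // Assumption: stand-in for the real method; it only rounds the sum to float.
    static float scoreSumUpperBound(double sum) {
        return (float) sum;
    }

    // minScoreSum as a float. Returns the number of iterations used,
    // or -1 if the cap was reached without converging.
    static int iterationsFloat(float minScoreSum, double sumOfOtherMaxScores, int cap) {
        float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
        int iters = 0;
        while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
            if (iters == cap) {
                return -1;
            }
            minScore -= Math.ulp(minScoreSum); // step is a float ulp
            iters++;
        }
        return iters;
    }

    // minScoreSum as a double, as in the posted snippet. Same return convention.
    static int iterationsDouble(double minScoreSum, double sumOfOtherMaxScores, int cap) {
        float minScore = (float) (minScoreSum - sumOfOtherMaxScores);
        int iters = 0;
        while (scoreSumUpperBound(minScore + sumOfOtherMaxScores) > minScoreSum) {
            if (iters == cap) {
                return -1;
            }
            minScore -= Math.ulp(minScoreSum); // step is a double ulp
            iters++;
        }
        return iters;
    }

    public static void main(String[] args) {
        double sumOfOtherMaxScores = 3.554144322872162;
        System.out.println(iterationsFloat(3.5541441f, sumOfOtherMaxScores, 1000));
        System.out.println(iterationsDouble(3.5541441, sumOfOtherMaxScores, 1000));
    }
}
```

Under these assumptions the float variant exits without iterating for the quoted values, while the double variant reaches the cap: a double ulp is too small to change the float minScore, so the loop condition never changes. This describes only the sketch and does not establish what happened in the production JVM.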