Posted to issues@lucene.apache.org by "Kevin Risden (Jira)" <ji...@apache.org> on 2022/04/26 17:46:00 UTC

[jira] [Comment Edited] (LUCENE-10534) MinFloatFunction / MaxFloatFunction exists check can be slow

    [ https://issues.apache.org/jira/browse/LUCENE-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528331#comment-17528331 ] 

Kevin Risden edited comment on LUCENE-10534 at 4/26/22 5:45 PM:
----------------------------------------------------------------

Here is a small part of the flamegraph highlighting the exists call from MaxFloatFunction:

 !flamegraph.png|height=250,width=250!

I think part of the issue might actually be the implementation of exists for FloatFieldSource and LongFieldSource after LUCENE-7407:
* https://github.com/apache/lucene/blame/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/FloatFieldSource.java#L84
* https://github.com/apache/lucene/blame/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/LongFieldSource.java#L96

Both of these fetch the value through getValueForDoc even though exists only needs to check whether the doc IDs match. The exists method never looks at the returned value at all. This shows up in the flamegraph under exists as well.

 !flamegraph_getValueForDoc.png|height=250,width=250!
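
To make the point about exists concrete, here is a minimal, self-contained sketch. It does not use Lucene's classes: SparseFloatValues is a hypothetical stand-in for a forward-only doc-values iterator, and existsViaValue / existsDirect are illustrative names only. It assumes the pattern described above (exists calls the value getter, discards the result, then compares doc IDs) and contrasts it with a variant that only advances the iterator.

```java
public class ExistsSketch {
    /** Hypothetical stand-in for a forward-only doc-values iterator (not Lucene's API). */
    static final class SparseFloatValues {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;
        private final int[] docs;      // sorted doc IDs that have a value
        private final float[] values;  // values[i] belongs to docs[i]
        private int idx = -1;          // -1 means "not positioned yet"
        int decodeCount = 0;           // how many times a value was actually read

        SparseFloatValues(int[] docs, float[] values) {
            this.docs = docs;
            this.values = values;
        }

        int docID() {
            if (idx < 0) return -1;
            return idx >= docs.length ? NO_MORE_DOCS : docs[idx];
        }

        /** Moves to the first doc ID >= target and returns it. */
        int advance(int target) {
            idx = Math.max(idx, 0);
            while (idx < docs.length && docs[idx] < target) idx++;
            return docID();
        }

        float floatValue() {
            decodeCount++;
            return values[idx];
        }
    }

    /** Mirrors the described getter: advance, then read the value or fall back to 0f. */
    static float valueForDoc(SparseFloatValues v, int doc) {
        if (doc > v.docID()) v.advance(doc);
        return v.docID() == doc ? v.floatValue() : 0f;
    }

    /** exists as described in the comment: the value is read and then thrown away. */
    static boolean existsViaValue(SparseFloatValues v, int doc) {
        valueForDoc(v, doc);
        return v.docID() == doc;
    }

    /** Alternative sketch: only advance and compare doc IDs, never read the value. */
    static boolean existsDirect(SparseFloatValues v, int doc) {
        if (doc > v.docID()) v.advance(doc);
        return v.docID() == doc;
    }

    public static void main(String[] args) {
        SparseFloatValues a = new SparseFloatValues(new int[] {1, 3}, new float[] {0f, 2.5f});
        SparseFloatValues b = new SparseFloatValues(new int[] {1, 3}, new float[] {0f, 2.5f});
        for (int doc = 0; doc < 5; doc++) {
            boolean viaValue = existsViaValue(a, doc);
            boolean direct = existsDirect(b, doc);
            System.out.println(doc + ": " + viaValue + " / " + direct);
        }
        System.out.println("value reads via getter: " + a.decodeCount);
        System.out.println("value reads direct:     " + b.decodeCount);
    }
}
```

Both variants agree on every document; the only difference is that the direct one never reads a value. How much that saves in Lucene itself depends on the real doc-values implementation, which this sketch does not model.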



> MinFloatFunction / MaxFloatFunction exists check can be slow
> ------------------------------------------------------------
>
>                 Key: LUCENE-10534
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10534
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Kevin Risden
>            Assignee: Kevin Risden
>            Priority: Minor
>         Attachments: flamegraph.png, flamegraph_getValueForDoc.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> MinFloatFunction (https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MinFloatFunction.java) and MaxFloatFunction (https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MaxFloatFunction.java) both check whether values exist. This is needed because the underlying value source returns 0.0f both for a valid value of 0.0f and for a document that has no value.
> Even though this check was changed to anyExists and short-circuits as soon as any value is found, the worst case is that no value is found, which requires checking all the way through to the raw data. The check is only needed when 0.0f is returned and we need to determine whether it is a valid value or the not-found case.
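
The last sentence of the description suggests deferring the exists check until a source actually returns 0.0f. A rough sketch of that idea follows. It is not the Lucene implementation: FloatSource, MapSource, and max are hypothetical names invented for illustration, and the sketch assumes a missing value always reads as exactly 0.0f, as the description states.

```java
import java.util.Map;

public class LazyExistsMax {
    /** Hypothetical minimal view of a value source (not Lucene's FunctionValues). */
    interface FloatSource {
        float floatVal(int doc);
        boolean exists(int doc);
    }

    /** Map-backed source that returns 0.0f for missing docs and counts exists() calls. */
    static final class MapSource implements FloatSource {
        private final Map<Integer, Float> values;
        int existsCalls = 0;

        MapSource(Map<Integer, Float> values) {
            this.values = values;
        }

        @Override
        public float floatVal(int doc) {
            return values.getOrDefault(doc, 0.0f);
        }

        @Override
        public boolean exists(int doc) {
            existsCalls++;
            return values.containsKey(doc);
        }
    }

    /** Max over the sources that have a value for doc; 0.0f if none do. */
    static float max(int doc, FloatSource... sources) {
        boolean found = false;
        float best = Float.NEGATIVE_INFINITY;
        for (FloatSource s : sources) {
            float v = s.floatVal(doc);
            // A non-zero result can only be a real value, so exists() is
            // consulted only for the ambiguous 0.0f case.
            if (v != 0.0f || s.exists(doc)) {
                best = found ? Math.max(best, v) : v;
                found = true;
            }
        }
        return found ? best : 0.0f;
    }

    public static void main(String[] args) {
        MapSource a = new MapSource(Map.of(0, -2.0f, 1, 0.0f, 3, -3.0f));
        MapSource b = new MapSource(Map.of(0, -5.0f));
        System.out.println(max(0, a, b)); // both sources non-zero, exists() not needed
        System.out.println(max(1, a, b)); // a holds a real 0.0f, b is missing
        System.out.println(max(2, a, b)); // no source has a value
        System.out.println(max(3, a, b)); // a is negative, b is missing and must be skipped
        System.out.println("exists calls: " + (a.existsCalls + b.existsCalls));
    }
}
```

Document 3 shows why the check cannot simply be dropped: without it, the missing 0.0f from b would win over the real value -3.0f from a.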


