Posted to issues@lucene.apache.org by "jianping weng (Jira)" <ji...@apache.org> on 2022/03/06 09:36:00 UTC

[jira] [Comment Edited] (LUCENE-10425) count aggregation optimization inside one segment in log scenario

    [ https://issues.apache.org/jira/browse/LUCENE-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501913#comment-17501913 ] 

jianping weng edited comment on LUCENE-10425 at 3/6/22, 9:35 AM:
-----------------------------------------------------------------

> I'm not sure #687 actually helps compared to what we are already doing.

[~jpountz] Hi, is [#687|https://github.com/apache/lucene/pull/687] OK now? When an ascending index sort is enabled, it can speed up getting the min/max doc ID, instead of using a doc-values binary search.


> count aggregation optimization inside one segment in log scenario
> -----------------------------------------------------------------
>
>                 Key: LUCENE-10425
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10425
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/search
>            Reporter: jianping weng
>            Priority: Major
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> In log scenarios, we usually want to know the number of documents in each time interval. One possible optimization is to sort the documents within a segment in ascending order by the @timestamp field. Then we can use this PR [https://github.com/apache/lucene/pull/687] to find the min/max docId in one time interval.
> If there is no other filter query, the doc count of one time interval is (max docId - min docId + 1).
> If there is exactly one additional term filter query, we can use this PR [https://github.com/apache/lucene/pull/688] to get the difference between the index positions reached when we call advance(minId) and advance(maxId); that difference is also the doc count of the time interval.
>  
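The counting idea in the description can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from PR #687 or #688: the segment is modelled as a long[] of timestamps indexed by docId (already sorted ascending), the term's postings as a sorted int[] of docIds, and the class and method names (IntervalCountSketch, docIdRange, countAll, countWithTerm) are hypothetical. The interval is treated as inclusive on both ends, so the second lookup advances to maxDocId + 1.

```java
public class IntervalCountSketch {

    /** Index of the first element >= key, or sorted.length if there is none. */
    public static int lowerBound(long[] sorted, long key) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Same as above for a sorted postings list of docIds. */
    public static int lowerBound(int[] sorted, int key) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Returns {minDocId, maxDocId} for timestamps in [from, to], or null if no doc matches. */
    public static int[] docIdRange(long[] timestampsByDocId, long from, long to) {
        int min = lowerBound(timestampsByDocId, from);
        // First docId whose timestamp is > to, minus one (guarding against overflow of to + 1).
        int end = (to == Long.MAX_VALUE)
                ? timestampsByDocId.length
                : lowerBound(timestampsByDocId, to + 1);
        int max = end - 1;
        return min > max ? null : new int[] {min, max};
    }

    /** No other filter: the count is maxDocId - minDocId + 1. */
    public static int countAll(long[] timestampsByDocId, long from, long to) {
        int[] r = docIdRange(timestampsByDocId, from, to);
        return r == null ? 0 : r[1] - r[0] + 1;
    }

    /** One term filter: difference of the positions reached in the postings list. */
    public static int countWithTerm(long[] timestampsByDocId, int[] postings, long from, long to) {
        int[] r = docIdRange(timestampsByDocId, from, to);
        if (r == null) return 0;
        return lowerBound(postings, r[1] + 1) - lowerBound(postings, r[0]);
    }

    public static void main(String[] args) {
        long[] ts = {100, 105, 110, 110, 120, 130, 145, 150}; // docId 0..7, sorted by timestamp
        int[] postings = {1, 3, 4, 7};                        // docIds containing the term
        System.out.println(countAll(ts, 110, 130));                // docIds 2..5
        System.out.println(countWithTerm(ts, postings, 110, 130)); // docIds 3 and 4
    }
}
```

Neither count visits the matching documents one by one; each needs only the two boundary lookups, which is the saving the description is after.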


