Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2021/10/07 08:01:00 UTC

[jira] [Resolved] (LUCENE-10125) Investigate indexing throughput regression on NYC Taxis between 2021-04-12 and 2021-05-24

     [ https://issues.apache.org/jira/browse/LUCENE-10125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-10125.
-----------------------------------
    Fix Version/s: main (9.0)
       Resolution: Fixed

I couldn't find an easy way to improve it. IndexingChain allows the schema definition of a field to be split across multiple Field instances, e.g. one that only adds points and another one that adds doc values, which makes it challenging to improve.
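
To illustrate the point about split schema definitions, here is a minimal sketch in plain Java. It is not Lucene's IndexingChain code: the names SplitSchemaSketch, Part, FieldInstance and collectSchema are hypothetical stand-ins. It only shows that when two instances with the same field name each contribute one part, the full schema of that field is known only after every instance in the document has been visited.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitSchemaSketch {
    // Hypothetical: the kinds of data a single field instance may contribute.
    enum Part { POINTS, DOC_VALUES }

    // Hypothetical stand-in for a Field instance: a name plus the one part it adds.
    static final class FieldInstance {
        final String name;
        final Part part;

        FieldInstance(String name, Part part) {
            this.name = name;
            this.part = part;
        }
    }

    // Union the parts contributed by every instance that shares a field name.
    // The schema of a field is complete only once the whole document is scanned.
    static Map<String, EnumSet<Part>> collectSchema(List<FieldInstance> doc) {
        Map<String, EnumSet<Part>> schema = new LinkedHashMap<>();
        for (FieldInstance f : doc) {
            schema.computeIfAbsent(f.name, k -> EnumSet.noneOf(Part.class)).add(f.part);
        }
        return schema;
    }

    public static void main(String[] args) {
        // Same field name, two instances: one adds points, the other doc values.
        List<FieldInstance> doc = List.of(
            new FieldInstance("fare", Part.POINTS),
            new FieldInstance("fare", Part.DOC_VALUES));
        System.out.println(collectSchema(doc));
    }
}
```

In this toy model, looking at the first "fare" instance alone says nothing about doc values, so any per-field shortcut has to wait for, or revisit, the later instances.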

There might be other opportunities for speedup, but the main sources of the regression seem to have been addressed and the indexing rate is almost back on par with where it was in April, so I'll close this issue.

> Investigate indexing throughput regression on NYC Taxis between 2021-04-12 and 2021-05-24
> -----------------------------------------------------------------------------------------
>
>                 Key: LUCENE-10125
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10125
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: main (9.0)
>
>         Attachments: LUCENE-10125_hack.patch
>
>          Time Spent: 7h
>  Remaining Estimate: 0h
>
> There's been a significant drop in indexing throughput between 2021-04-12 and 2021-05-24 on the NYC Taxis benchmark. Unfortunately, several suspect changes were merged during that period, so we might need to git bisect to figure out which one is responsible for the regression. Interestingly, the sorted index looks less affected than the non-sorted indexes.
> https://home.apache.org/~mikemccand/lucenebench/sparseResults.html#index_throughput


