Posted to issues@maven.apache.org by "Tamas Cservenak (Jira)" <ji...@apache.org> on 2022/10/14 12:44:00 UTC

[jira] [Assigned] (MINDEXER-151) Speed up Index update from remote

     [ https://issues.apache.org/jira/browse/MINDEXER-151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Cservenak reassigned MINDEXER-151:
----------------------------------------

    Assignee: Tamas Cservenak

> Speed up Index update from remote
> ---------------------------------
>
>                 Key: MINDEXER-151
>                 URL: https://issues.apache.org/jira/browse/MINDEXER-151
>             Project: Maven Indexer
>          Issue Type: Improvement
>            Reporter: Tamas Cservenak
>            Assignee: Tamas Cservenak
>            Priority: Major
>             Fix For: 7.0.0
>
>
> Currently, running the BasicUsageExample from the examples performs a "full" update, and that full update (going from an "empty" index to an "up to date" index) takes 15 minutes or more. Yes, the Central index is huge, but there is room for improvement.
> Steps performed during an update:
>  * the properties file is downloaded
>  * the GZ file(s) are downloaded (which ones depends on whether the update is incremental or full)
>  * the GZ files are processed into a temporary Lucene index
>  * the index of the target indexing context (the one being updated) is "replaced" by, or merged with, the temporary Lucene index, depending on the case
> Downloading the files takes several seconds; it is the processing of the raw GZIP records into the Lucene index that takes a long time. This can be improved.
> The work done here is somewhat intertwined with MINDEXER-150 as well, as the duplicated code (incremental download handling, consuming the downloaded GZ files) should be reused, not duplicated across modules.
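
For illustration only, the update steps described in the issue can be sketched with plain JDK classes. This is not the Maven Indexer API: the class UpdateStepsSketch, the "chainId" property key, the writeUTF-based record format, and the list standing in for the temporary Lucene index are all assumptions made up for this sketch. It only shows where the full-vs-incremental decision and the record-by-record GZ processing (the slow step named above) sit in the flow.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.Consumer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class UpdateStepsSketch {

    // Steps 1-2: compare local and remote properties to choose full vs incremental.
    // The "chainId" key is hypothetical; the real properties file layout may differ.
    static boolean needsFullUpdate(Properties local, Properties remote) {
        String localChain = local.getProperty("chainId");
        return localChain == null || !localChain.equals(remote.getProperty("chainId"));
    }

    // Step 3: stream records out of one GZ chunk into a sink.
    // The sink stands in for the temporary Lucene index; the record format
    // (a sequence of writeUTF strings) is invented for this sketch.
    static int processChunk(InputStream gz, Consumer<String> sink) throws IOException {
        int count = 0;
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(gz))) {
            while (true) {
                String record;
                try {
                    record = in.readUTF();
                } catch (EOFException endOfChunk) {
                    break; // no more records in this chunk
                }
                sink.accept(record);
                count++;
            }
        }
        return count;
    }

    // Builds a small in-memory GZ chunk so the sketch runs without any download.
    static byte[] sampleChunk(String... records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(bytes))) {
            for (String r : records) {
                out.writeUTF(r);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Properties local = new Properties();   // empty: nothing indexed yet
        Properties remote = new Properties();
        remote.setProperty("chainId", "abc");
        System.out.println("full update needed: " + needsFullUpdate(local, remote));

        List<String> tempIndex = new ArrayList<>();
        long start = System.nanoTime();
        int n = processChunk(
                new ByteArrayInputStream(sampleChunk("g:a:1.0", "g:b:2.0")), tempIndex::add);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(n + " records processed in " + micros + " us");
    }
}
```

Timing the processChunk call separately from the download, as main does here, is one simple way to confirm which step dominates before attempting any optimization.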



--
This message was sent by Atlassian Jira
(v8.20.10#820010)