Posted to issues@maven.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/10/25 11:08:58 UTC
[jira] [Commented] (MINDEXER-99) improve performance loss introduced in MINDEXER-77
[ https://issues.apache.org/jira/browse/MINDEXER-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15604990#comment-15604990 ]
ASF GitHub Bot commented on MINDEXER-99:
----------------------------------------
GitHub user tstupka opened a pull request:
https://github.com/apache/maven-indexer/pull/12
resolve performance loss due to lucene 4.8.1 - upgrade to lucene 5.5.3 and additional fixes
since lucene was upgraded to 4.8.1, the indexer takes 2.5x longer than with lucene 3.6. This seems to be the cumulative effect of smaller performance regressions introduced in individual lucene releases after 3.6 - see also https://issues.apache.org/jira/browse/MINDEXER-99
see the individual commits in this request. Each one implements a specific improvement. With all of them applied, the resulting performance is comparable to the performance before the above-mentioned upgrade.
#0bb9484 - upgrading lucene from 4.8.1 to 5.5.3
performance was improved in lucene 5.x.
with 5.5.3 the indexer works significantly faster than with 4.8.1
#3cfa430 - avoid rebuilding groups after reading index
after generating the index, it is currently re-read one more time to extract a distinct list of allGroups and rootGroups, even though that info was already available but was thrown away.
#8b98a49 - improve reading from zip file
#4062146 - do not unnecessarily force merge on index writer
merge is very expensive - let's trust lucene to merge when it sees fit.
the final index size without force merges was 910 MB, compared to 900 MB with force merges.
the time improvement is approx. 30%
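To illustrate the idea behind #3cfa430 (avoid rebuilding groups after reading index), here is a minimal sketch of collecting the group lists while records are being read, so no second pass over the index is needed. It is not the code from the pull request: the class GroupCollector and its methods are hypothetical, and it assumes that a root group is the part of the groupId before the first dot.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of maven-indexer: gathers group names
// during the first read so the index does not have to be re-read.
public class GroupCollector {
    private final Set<String> allGroups = new TreeSet<>();
    private final Set<String> rootGroups = new TreeSet<>();

    // Call once for every record as it is read from the index.
    public void accept(String groupId) {
        if (groupId == null || groupId.isEmpty()) {
            return; // nothing to record
        }
        allGroups.add(groupId);
        // Assumption: the root group is everything before the first dot.
        int dot = groupId.indexOf('.');
        rootGroups.add(dot > -1 ? groupId.substring(0, dot) : groupId);
    }

    public Set<String> getAllGroups() {
        return allGroups;
    }

    public Set<String> getRootGroups() {
        return rootGroups;
    }

    public static void main(String[] args) {
        GroupCollector collector = new GroupCollector();
        List<String> groupIds = Arrays.asList(
                "org.apache.maven", "org.apache.lucene", "junit", "org.apache.maven");
        for (String groupId : groupIds) {
            collector.accept(groupId);
        }
        System.out.println(collector.getAllGroups());
        System.out.println(collector.getRootGroups());
    }
}
```

The sets deduplicate as they go, so the cost is one set insertion per record instead of a full extra read of the generated index.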
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tstupka/maven-indexer lucene_553
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/maven-indexer/pull/12.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12
----
commit 0bb9484e6eaea3e7974d7e5c9b5ab3d6802780e9
Author: Tomas Stupka <to...@oracle.com>
Date: 2016-10-24T15:25:46Z
upgrading lucene from 4.8.1 to 5.5.3
commit 3cfa430d71a8d58a0454966c0dd183e37f5fb067
Author: Tomas Stupka <to...@oracle.com>
Date: 2016-10-25T08:47:19Z
avoid rebuilding groups after reading index
commit 8b98a495186cafe20ee6494719185e74813ea15e
Author: Tomas Stupka <to...@oracle.com>
Date: 2016-10-25T09:01:18Z
improve reading from zip file
commit 40621465f3ebf14a89961d07ded0d17a4d2d61bc
Author: Tomas Stupka <to...@oracle.com>
Date: 2016-10-25T09:50:29Z
do not unnecessarily force merge on index writer
----
> improve performance loss introduced in MINDEXER-77
> --------------------------------------------------
>
> Key: MINDEXER-99
> URL: https://issues.apache.org/jira/browse/MINDEXER-99
> Project: Maven Indexer
> Issue Type: Improvement
> Affects Versions: 6.0
> Reporter: Tomas Stupka
> Fix For: 6.0
>
>
> even though the lucene upgrade from 3.6 to 4.8.1 significantly reduced disk space usage, the unfortunate tradeoff is that the indexer now takes approximately 2.5 times longer - 300s now vs 120s before, measured using the indexer in the current NetBeans dev build.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)