Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2015/03/02 06:58:04 UTC

[jira] [Updated] (LUCENE-6320) speed up checkindex

     [ https://issues.apache.org/jira/browse/LUCENE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-6320:
--------------------------------
    Attachment: LUCENE-6320.patch

Here is a patch. We use codec APIs to do these checks, so the optimizations we already made for merging help a lot (esp. stored fields, norms, docvalues).

When we check postings without deletes, we weren't reusing postingsenum and were clone()'ing for every term.

FieldInfos.get(int) is a CPU hog for stored fields and vectors, since it's called for every field in the doc and we do an O(log N) lookup each time. It's usually wasteful in memory too (always using a TreeMap when in most cases a simple array is smaller and faster).
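
To make the array-vs-TreeMap point concrete, here is a rough sketch of the idea. It is not taken from the patch: FieldMeta, FieldLookup, usesArray() and the "16x" density cutoff are all made-up names and numbers for illustration, not Lucene APIs. The lookup uses a plain array indexed by field number when the numbers are dense enough, and only falls back to a TreeMap when they are sparse.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical stand-in for per-field metadata; not Lucene's FieldInfo. */
final class FieldMeta {
    final String name;
    final int number;

    FieldMeta(String name, int number) {
        this.name = name;
        this.number = number;
    }
}

/** Hypothetical field-number lookup; illustrative only. */
final class FieldLookup {
    // Exactly one of these two is non-null.
    private final FieldMeta[] byNumberTable;                 // dense case: O(1) lookup
    private final SortedMap<Integer, FieldMeta> byNumberMap; // sparse case: O(log N) lookup

    FieldLookup(List<FieldMeta> fields) {
        int max = -1;
        for (FieldMeta f : fields) {
            if (f.number < 0) {
                throw new IllegalArgumentException("negative field number: " + f.number);
            }
            max = Math.max(max, f.number);
        }
        // Assumed heuristic: use the array unless it would be mostly empty slots.
        if (max < 16L * fields.size()) {
            byNumberTable = new FieldMeta[max + 1];
            for (FieldMeta f : fields) {
                byNumberTable[f.number] = f;
            }
            byNumberMap = null;
        } else {
            byNumberTable = null;
            byNumberMap = new TreeMap<>();
            for (FieldMeta f : fields) {
                byNumberMap.put(f.number, f);
            }
        }
    }

    /** Returns the field with this number, or null if there is none. */
    FieldMeta get(int number) {
        if (number < 0) {
            throw new IllegalArgumentException("negative field number: " + number);
        }
        if (byNumberTable != null) {
            return number < byNumberTable.length ? byNumberTable[number] : null;
        }
        return byNumberMap.get(number);
    }

    /** True when the dense array representation was chosen. */
    boolean usesArray() {
        return byNumberTable != null;
    }
}
```

With field numbers 0..N-1 (the common case) every get(int) is a single array read, and the map only exists for indexes whose field numbers have large gaps.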

> speed up checkindex
> -------------------
>
>                 Key: LUCENE-6320
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6320
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-6320.patch
>
>
> This is fairly slow today, very RAM intensive, and has some buggy stuff (e.g. postingsenum reuse bugs). We can do better...



