Posted to dev@lucene.apache.org by "Steve Mason (JIRA)" <ji...@apache.org> on 2016/07/22 13:59:20 UTC

[jira] [Updated] (LUCENE-7391) MemoryIndexReader.fields() performance regression

     [ https://issues.apache.org/jira/browse/LUCENE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Mason updated LUCENE-7391:
--------------------------------
    Attachment: LUCENE-7391.patch

Patch attached. All unit tests pass. I've also run it through our integration test suite (matching 1000 queries against 45000 documents) and verified that those tests pass as well.

It would be good to know why the original code was written this way. I wonder if [~martijn.v.groningen] remembers; it seems to be tied to this comment: https://issues.apache.org/jira/browse/LUCENE-7091?focusedCommentId=15189525&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15189525

If the existing behaviour needs to be preserved then that's fine. If someone can provide me with a test case (or explain one to me), I'll add it to the patch and formulate an alternative solution.

> MemoryIndexReader.fields() performance regression
> -------------------------------------------------
>
>                 Key: LUCENE-7391
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7391
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Steve Mason
>         Attachments: LUCENE-7391.patch
>
>
> While upgrading our codebase from Lucene 4 to Lucene 6 we found a significant performance regression: a 5x slowdown.
> On profiling the code, the method MemoryIndexReader.fields() shows up as one of the hottest methods.
> Looking at the method, it just creates a copy of the inner {{fields}} Map before passing it to {{MemoryFields}}. It does this so that it can filter out fields with {{numTokens <= 0}}.
> The simplest "fix" would be to remove the copying of the map completely and pass {{fields}} directly to {{MemoryFields}}. It's simple and removes any slowdown caused by this method. It does potentially change behaviour, but none of the unit tests seem to test that behaviour, so I wonder whether it's necessary. (I looked at the original ticket, LUCENE-7091, that introduced this code, and I can't find much in the way of an explanation.) I'm going to attach a patch to this effect anyway and we can take things from there.
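
For readers without the source to hand, the two behaviours described above can be sketched roughly as follows. This is an illustration only, not the actual Lucene code or the attached patch: {{FieldsCopySketch}}, {{Info}}, {{copyAndFilter}} and {{passThrough}} are hypothetical names, and the use of a {{TreeMap}} is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration; these are NOT the real Lucene classes.
final class FieldsCopySketch {

    /** Stand-in for the per-field info; only numTokens matters for this sketch. */
    static final class Info {
        final int numTokens;

        Info(int numTokens) {
            this.numTokens = numTokens;
        }
    }

    /** Behaviour described in the issue: build a filtered copy of the map on each call. */
    static Map<String, Info> copyAndFilter(Map<String, Info> fields) {
        Map<String, Info> filtered = new TreeMap<>();
        for (Map.Entry<String, Info> e : fields.entrySet()) {
            // Drop fields with numTokens <= 0, as the description says the copy does.
            if (e.getValue().numTokens > 0) {
                filtered.put(e.getKey(), e.getValue());
            }
        }
        return filtered;
    }

    /** Behaviour proposed in the issue: hand the existing map through without copying. */
    static Map<String, Info> passThrough(Map<String, Info> fields) {
        return fields;
    }
}
```

In this sketch the only observable difference is that a field with zero tokens stays visible in the pass-through variant, which appears to be the potential behaviour change mentioned above.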



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org