Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2015/06/10 05:27:01 UTC

[jira] [Updated] (SOLR-7655) Perf bug- DefaultSolrHighlighter.getSpanQueryScorer triggers MultiFields.getMergedFieldInfos

     [ https://issues.apache.org/jira/browse/SOLR-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-7655:
-------------------------------
    Attachment: SOLR-7655.patch

Here's a patch. It's a little better than the "suggested fix": it handles the case where the Terms returned is null, and if an exception somehow gets thrown, it is logged without being re-thrown.
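
To illustrate the null check and log-without-rethrow handling described above, here is a minimal, self-contained sketch. It is not the attached patch: {{FieldTerms}}, {{TermsLookup}} and {{defaultUsePayloads}} are hypothetical stand-ins so the example runs without Lucene on the classpath, and falling back to {{true}} when nothing can be determined is an assumption.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class PayloadDefaultSketch {
    private static final Logger LOG = Logger.getLogger(PayloadDefaultSketch.class.getName());

    /** Hypothetical stand-in for a per-field terms object; only the payload flag is modeled. */
    interface FieldTerms {
        boolean hasPayloads();
    }

    /** Hypothetical stand-in for whatever hands out per-field terms; may return null. */
    interface TermsLookup {
        FieldTerms terms(String field) throws IOException;
    }

    /** Decides the default for "use payloads" without letting a lookup failure escape. */
    static boolean defaultUsePayloads(TermsLookup lookup, String field) {
        boolean usePayloads = true; // assumed fallback when nothing can be determined
        try {
            FieldTerms terms = lookup.terms(field);
            if (terms != null) { // the field may have no terms at all
                usePayloads = terms.hasPayloads();
            }
        } catch (IOException e) {
            // Log and carry on with the fallback instead of re-throwing.
            LOG.log(Level.WARNING, "Couldn't check for existence of payloads", e);
        }
        return usePayloads;
    }

    public static void main(String[] args) {
        TermsLookup lookup = field -> {
            if (field.equals("title")) {
                return () -> false; // field exists, no payloads
            }
            if (field.equals("broken")) {
                throw new IOException("simulated read failure");
            }
            return null; // unknown field
        };
        System.out.println("title:   " + defaultUsePayloads(lookup, "title"));
        System.out.println("missing: " + defaultUsePayloads(lookup, "missing"));
        System.out.println("broken:  " + defaultUsePayloads(lookup, "broken"));
    }
}
```

In this sketch only the field that has terms can turn payloads off; the null and exception cases both keep the assumed fallback.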

I did a little performance testing on a project I have.  This perf bug seems most pronounced if you attempt to highlight on a ton of fields (e.g. via {{hl.fl=*}}) and there are a lot of Lucene segments.  Furthermore, if you don't have a lot of text to highlight per field, then the overhead here is proportionally higher relative to the overall task.

Precommit is happy and the tests pass.  It'd be nice to get this into 5.2.1, but I'd like to see a +1 from someone first.  What do you think, [~shalinmangar]?  (You're the RM, I believe.)

As a side note... I'm wondering if SlowCompositeReaderWrapper ought to cache FieldInfos too; maybe lazily.

> Perf bug- DefaultSolrHighlighter.getSpanQueryScorer triggers MultiFields.getMergedFieldInfos
> --------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7655
>                 URL: https://issues.apache.org/jira/browse/SOLR-7655
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 5.0
>            Reporter: David Smiley
>            Assignee: David Smiley
>         Attachments: SOLR-7655.patch
>
>
> It appears grabbing the FieldInfos from the SlowCompositeReaderWrapper is slow.  It isn't cached.  The DefaultSolrHighlighter in SOLR-6196 (v5.0) uses it to ascertain if there are payloads.  Instead it can grab it from the Terms instance, which is cached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
