Posted to solr-user@lucene.apache.org by Andrew Clegg <an...@gmail.com> on 2009/11/03 17:06:54 UTC

Highlighting is very slow

Hi everyone,

I'm experimenting with highlighting for the first time, and it seems
shockingly slow for some queries.

For example, this query:

http://server:8080/solr/select/?q=transferase&qt=dismax&version=2.2&start=0&rows=10&indent=on

takes 313ms. But when I add highlighting:

http://server:8080/solr/select/?q=transferase&qt=dismax&version=2.2&start=0&rows=10&indent=on&hl=true&hl.fl=*&fl=id

it takes 305212 ms, which is about five minutes!
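
For anyone wanting to reproduce the comparison, here is a minimal Python sketch that rebuilds the two requests above and times them. The host `server:8080` is the placeholder from the URLs above; `build_url` and `time_request` are hypothetical helper names, not part of Solr, and the timing is client-side wall-clock time rather than Solr's own QTime.

```python
import time
import urllib.request
from urllib.parse import urlencode

BASE = "http://server:8080/solr/select/"  # placeholder host from the post

# Parameters shared by both requests.
base_params = {
    "q": "transferase",
    "qt": "dismax",
    "version": "2.2",
    "start": 0,
    "rows": 10,
    "indent": "on",
}


def build_url(params):
    """Encode params into a Solr select URL, keeping '*' unescaped."""
    return BASE + "?" + urlencode(params, safe="*")


def time_request(url):
    """Fetch url and return the elapsed wall-clock time in milliseconds."""
    started = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    return (time.perf_counter() - started) * 1000.0


plain_url = build_url(base_params)
highlight_url = build_url({**base_params, "hl": "true", "hl.fl": "*", "fl": "id"})

if __name__ == "__main__":
    for label, url in (("plain", plain_url), ("highlighted", highlight_url)):
        print(label, round(time_request(url)), "ms")
```

Note that the second request also adds fl=id, so the response is smaller, and the extra time comes from highlighting rather than from returning more stored data.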

Some of my documents are slightly large -- the 10 hits for that query
contain between 362 bytes and 1.4 megabytes of text each. All fields are
stored and indexed, and most are termvectored. But this doesn't seem
excessively large!

Has anyone else seen this sort of behaviour before? This is with a nightly
from 2009-10-26.

All suggestions would be appreciated. My schema and config files are
attached...

http://old.nabble.com/file/p26160216/schema.xml
http://old.nabble.com/file/p26160216/solrconfig.xml

Thanks (once again),

Andrew.

-- 
View this message in context: http://old.nabble.com/Highlighting-is-very-slow-tp26160216p26160216.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Highlighting is very slow

Posted by Nicolas Dessaigne <ni...@dessaigne.net>.
I'm afraid there is no perfect solution for this problem, as you may always
have very long documents that will result in long response times, even with
a faster implementation (see https://issues.apache.org/jira/browse/SOLR-1268).

The only way to avoid confusion for users and to ensure reasonable response
times is to truncate the indexed field. This way, every document returned
can be highlighted... but you'll miss matches in long documents!

If you don't control the length of the documents and need highlighting, then
either you don't highlight all documents or you don't find all documents. I
think that a pretty large copyfield (maybe 50k?) is usually enough for most
documents to be highlighted, but that depends on your corpus.
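
To make the trade-off concrete, here is a small Python sketch. It is only a simulation of what a length-limited field implies, not Solr's implementation; `can_highlight` is a hypothetical helper and the 50,000-character cutoff is just the "maybe 50k" figure from above.

```python
def can_highlight(text, term, max_chars=50_000):
    """Return True if term occurs within the first max_chars characters.

    Simulates a highlighting field that only holds a truncated copy of
    the original text: matches past the cutoff cannot be highlighted.
    """
    return term.lower() in text[:max_chars].lower()


short_doc = "a transferase enzyme"
# The only match sits beyond the 50,000-character cutoff.
long_doc = "x" * 60_000 + " transferase"

print(can_highlight(short_doc, "transferase"))  # match within the cutoff
print(can_highlight(long_doc, "transferase"))   # match lost to truncation
```

If searching still runs against the full field, the long document is still found but comes back without a snippet. If searching runs against the truncated field too, it is not found at all.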

Good luck ;)
Nicolas



Re: Highlighting is very slow

Posted by Andrew Clegg <an...@gmail.com>.

Nicolas Dessaigne wrote:
> 
> Alternatively, you could use a copyfield with a maxChars limit as your
> highlighting field. Works well in my case.
> 

Thanks for the tip. We did think about doing something similar (only
enabling highlighting for certain shorter fields) but we decided that
perhaps users would be confused if search terms were sometimes
snippeted+highlighted and sometimes not. (A brief run-through with a single
user suggested this, although that's not statistically significant...) So we
decided to avoid highlighting altogether until we can do it across the
board.

Cheers,

Andrew.


Re: Highlighting is very slow

Posted by Nicolas Dessaigne <ni...@dessaigne.net>.
Hi Andrew,

Alternatively, you could use a copyfield with a maxChars limit as your
highlighting field. Works well in my case.

See https://issues.apache.org/jira/browse/SOLR-538

Nicolas


Re: Highlighting is very slow

Posted by Andrew Clegg <an...@gmail.com>.

Indeed -- it actually went slightly slower, but only by a few seconds; I
suspect that's within normal variance.

I'll hold out for the new version then -- it's certainly not mission
critical.

Thanks,

Andrew.




Re: Highlighting is very slow

Posted by Mark Miller <ma...@gmail.com>.
It should be the same speed either way for a term query. The
highlighter is going to be slow in general for a 1 MB+ doc. It
processes a token at a time. The fast vector highlighter is much
faster in those cases and should be in the next release. It handles
fewer query types though.

- Mark

http://www.lucidimagination.com (mobile)


Re: Highlighting is very slow

Posted by Chris Hostetter <ho...@fucit.org>.
: Has anyone else seen this sort of behaviour before? This is with a nightly
: from 2009-10-26.

have you tried hl.usePhraseHighlighter=false ? ...

http://old.nabble.com/Highlighting-performance-between-1.3-and-1.4rc-to26190790.html

...it doesn't seem like it should be affecting you for a simple term 
query, but i'm not sure.
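
As a rough sketch, the suggested parameter can be appended to the original highlighting request like this (Python; the host is the placeholder from the original post, and whether the parameter helps for a simple term query is untested here):

```python
from urllib.parse import urlencode

# Parameters of the slow highlighting request from the original post,
# plus the suggested hl.usePhraseHighlighter=false.
params = {
    "q": "transferase",
    "qt": "dismax",
    "version": "2.2",
    "start": 0,
    "rows": 10,
    "indent": "on",
    "hl": "true",
    "hl.fl": "*",
    "fl": "id",
    "hl.usePhraseHighlighter": "false",
}

# safe="*" keeps the wildcard in hl.fl readable instead of escaping it.
url = "http://server:8080/solr/select/?" + urlencode(params, safe="*")
print(url)
```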



-Hoss