Posted to users@solr.apache.org by "Flowerday, Matthew J" <ma...@gb.unisys.com> on 2021/03/11 06:48:57 UTC

RE: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

Hi Ere

Thanks for the help on this. I have raised SOLR-15246 to cover this.

Many thanks

Matthew

Matthew Flowerday | Consultant | ULEAF
Unisys | 01908 774830| matthew.flowerday@unisys.com 
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17
8LX



THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is for use only by the intended recipient. If you received this
in error, please contact the sender and delete the e-mail and its
attachments from all devices.
   

-----Original Message-----
From: Ere Maijala <er...@helsinki.fi> 
Sent: 04 March 2021 10:20
To: solr-user@lucene.apache.org
Subject: Re: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1

EXTERNAL EMAIL - Be cautious of all links and attachments.

Hi,

Solr uses JIRA for issue tickets. You can find it here:
https://issues.apache.org/jira/browse/SOLR

I'd suggest filing a new bug issue in the SOLR project (note that several
other projects also use this JIRA installation). Here's an example of an
existing highlighter issue for reference:
https://issues.apache.org/jira/browse/SOLR-14019.

See also some brief documentation:

https://cwiki.apache.org/confluence/display/solr/HowToContribute#HowToContribute-JIRAtips(ourissue/bugtracker)

Regards,
Ere

Flowerday, Matthew J wrote on 1.3.2021 at 14.58:
> Hi Ere
>
> Pleased to be of service!
>
> No, I have not filed a JIRA ticket. I am new to interacting with the
> Solr community and am only beginning to 'find my legs'. I am not too
> sure what JIRA is, I am afraid!
>
> Regards
>
> Matthew
>
> -----Original Message-----
> From: Ere Maijala <er...@helsinki.fi>
> Sent: 01 March 2021 12:53
> To: solr-user@lucene.apache.org
> Subject: Re: Potential Slow searching for unified highlighting on Solr 8.8.0/8.8.1
>
> Hi,
>
> Whoa, thanks for the heads-up! You may just have saved me from a whole 
> lot of trouble. Did you file a JIRA ticket already?
>
> Thanks,
> Ere
>
> Flowerday, Matthew J wrote on 1.3.2021 at 14.00:
>> Hi There
>>
>> I just came across a situation where a unified highlighting search
>> under Solr 8.8.0/8.8.1 can take over 20 minutes to run and eventually
>> times out. I resolved it with a config change, but it can catch you
>> out, hence this email.
>>
>> Solr 8.8.0 introduced a new unified highlighting parameter,
>> &hl.fragAlignRatio, which defaults to 0.5 if not set. It attempts to
>> improve the highlighting so that the highlighted text does not appear
>> right at the left edge of the fragment. This works well, but if a
>> search result contains numerous occurrences of the word in question
>> within the record, performance drops sharply!
>>
>> 2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf]
>> o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select
>> params={hl.snippets=2&q=test&hl=on&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&start=20&hl.fl=*&rows=10&_=1614405119134}
>> hits=57008 status=0 QTime=1414320
>>
>> 2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf]
>> o.a.s.s.HttpSolrCall Unable to write response, client closed
>> connection or we are shutting down =>
>> org.eclipse.jetty.io.EofException
>>         at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
>> org.eclipse.jetty.io.EofException: null
>>         at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>>         at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>>         at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>>
>> When I set &hl.fragAlignRatio=0.25, results came back much quicker:
>>
>> 2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes]
>> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
>> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.25&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>> hits=136939 status=0 QTime=87024
>>
>> And with &hl.fragAlignRatio=0.1:
>>
>> 2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes]
>> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
>> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.1&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>> hits=136939 status=0 QTime=69033
>>
>> And with &hl.fragAlignRatio=0.0:
>>
>> 2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes]
>> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select
>> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.0&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>> hits=136939 status=0 QTime=2841
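
To put the three timings above side by side, here is a small Python sketch. It uses only the QTime values from the three log lines on the holmes core and assumes QTime is reported in milliseconds; the variable and function names are illustrative.

```python
# QTime values (milliseconds) copied from the three log lines above,
# keyed by the hl.fragAlignRatio value used for each request.
qtime_ms = {0.25: 87024, 0.1: 69033, 0.0: 2841}

# The fully left-aligned run is used as the reference point.
baseline = qtime_ms[0.0]


def slowdown(ratio):
    """How many times slower a setting was than hl.fragAlignRatio=0.0."""
    return qtime_ms[ratio] / baseline


for ratio in sorted(qtime_ms, reverse=True):
    seconds = qtime_ms[ratio] / 1000
    print(f"hl.fragAlignRatio={ratio}: {seconds:.1f} s, "
          f"{slowdown(ratio):.1f}x the 0.0 run")
```

On these figures, 0.25 and 0.1 come out at roughly 30 and 24 times the 0.0 run. The first log line (QTime=1414320, about 23.6 minutes) is left out because it came from a different core with different parameters, so it is not directly comparable.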
>>
>> I left our setting at 0.0, which is presumably how it behaved in 7.7.1
>> (fully left aligned). I am not sure how many times a word has to
>> occur in a record for performance to drop like this, but if there are
>> too many occurrences it can have a BIG impact.
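
As an illustration of the workaround, the sketch below builds a /select URL that sets hl.fragAlignRatio explicitly per request instead of relying on the 0.5 default. The parameter names are taken from the logs above; the host, port and the helper name build_select_url are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, frag_align_ratio=0.0):
    """Build a /select URL with unified highlighting and an explicit
    hl.fragAlignRatio (0.0 is the fully left-aligned setting)."""
    params = {
        "q": query,
        "hl": "on",
        "hl.method": "unified",
        "hl.fl": "*",
        "hl.snippets": 2,
        "hl.fragAlignRatio": frag_align_ratio,
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and port; "holmes" is the core name from the logs.
print(build_select_url("http://localhost:8983/solr/holmes", "test"))
```

The same value could presumably be set once as a default in the request handler configuration, which is closer to the config change described here.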
>>
>> I also noticed that setting &timeAllowed=90000 did not break out of
>> the query until it finished, perhaps because the query itself
>> finished quickly and it was the highlighting that took the time. It
>> might be an idea to make &timeAllowed cover highlighting as well, so
>> that the query does not run until the Jetty timeout is hit. The
>> machine ran one core at 100% for about 20 minutes!
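
Since timeAllowed did not interrupt the highlighting phase in this report, a caller may want its own deadline. The sketch below is one possible client-side guard using Python's standard library; the function name, the 90-second default (mirroring timeAllowed=90000) and the injectable opener argument are assumptions for the example. It only stops the client from waiting and does not cancel any work on the Solr side.

```python
import socket
import urllib.error
import urllib.request


def fetch_with_deadline(url, timeout_s=90, opener=urllib.request.urlopen):
    """Return the response body, or None if the client-side timeout expires.

    This only bounds how long the client waits; the server may keep
    working on the request after the client has given up.
    """
    try:
        with opener(url, timeout=timeout_s) as response:
            return response.read()
    except socket.timeout:
        # Timed out while waiting for or reading the response.
        return None
    except urllib.error.URLError as err:
        # A timeout while connecting is wrapped in URLError.
        if isinstance(err.reason, socket.timeout):
            return None
        raise
```

The opener parameter exists only so the function can be exercised without a running Solr instance.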
>>
>> Hope this helps.
>>
>> Regards
>>
>> Matthew
>>
>
> --
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland
>

--
Ere Maijala
Kansalliskirjasto / The National Library of Finland