Posted to solr-user@lucene.apache.org by Juan Carlos Serrano <jc...@gmail.com> on 2014/02/19 16:53:07 UTC
Exact fragment length in highlighting
Hello everybody,
I'm using Solr 4.6.1 and I'd like to know if there's a way to set
exactly the number of characters in a fragment used in highlights. If I use
hl.fragsize=70, the length of the fragments I get varies, and I often
get results that are 90 characters long.
Regards and thanks in advance,
Juan Carlos
Re: Exact fragment length in highlighting
Posted by Jason Hellman <jh...@innoventsolutions.com>.
Juan,
Pay close attention to the boundary scanner you’re employing:
http://wiki.apache.org/solr/HighlightingParameters#hl.boundaryScanner
You can be explicit to indicate a type (hl.bs.type) with options such as CHARACTER, WORD, SENTENCE, and LINE. The default is WORD (as the wiki indicates) and I presume this is what you are employing.
Be careful about using explicit characters. I had an interesting case of highlight returns that looked like this:
> This is a highlight
> Here is another highlight
> Yes, another one, etc…
It was a bit maddening trying to figure out why “>” was in the highlight. It turned out the content was XML, and the character boundary clipped the trailing “>” of a tag according to the boundary rules.
In any case, with the right combination of settings you should be able to get a fairly flexible result, depending on what you’re really after.
Jason
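For readers who want to try the boundary scanner settings described above, here is a minimal sketch in Python that only builds the request URL. The core URL and the field name "content" are hypothetical. It also assumes the boundary scanner names from the stock example solrconfig.xml ("simple" and "breakIterator") and that the field is indexed with term vectors, positions and offsets, since the hl.bs.* parameters apply to the FastVectorHighlighter. This may bring fragment lengths closer to hl.fragsize, but it is not a guarantee of exact lengths.

```python
from urllib.parse import urlencode

# Hypothetical core URL; adjust host, port and core name to your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_highlight_query(query, field="content", fragsize=70, bs_type="CHARACTER"):
    """Build a Solr select URL with highlighting and an explicit boundary scanner.

    `field` is a hypothetical field name. `bs_type` is one of the values
    mentioned in the thread: CHARACTER, WORD, SENTENCE or LINE.
    """
    params = {
        "q": query,
        "wt": "json",
        "hl": "true",
        "hl.fl": field,
        "hl.fragsize": fragsize,
        # Boundary scanners are used by the FastVectorHighlighter.
        "hl.useFastVectorHighlighter": "true",
        # Assumes a boundary scanner named "breakIterator" is defined in
        # solrconfig.xml, as in the example configuration.
        "hl.boundaryScanner": "breakIterator",
        "hl.bs.type": bs_type,
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_query("content:solr"))
```

Switching bs_type between CHARACTER and WORD and comparing the returned fragment lengths is a quick way to see how much the boundary rules contribute to the variation.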
Re: Exact fragment length in highlighting
Posted by Ahmet Arslan <io...@yahoo.com>.
Hi Juan,
Are you counting the number of characters in the snippet as rendered in HTML?
I think the pre and post strings (HTML markup that is not displayed) are causing that difference.
Ahmet
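To check whether the markup explains the difference, a small sketch like the following compares the raw snippet length with the visible length. It assumes the highlighter's default pre and post strings, <em> and </em>. If you set hl.simple.pre and hl.simple.post (or hl.tag.pre and hl.tag.post) to something else, pass those values instead. The function name is made up for illustration.

```python
def visible_length(snippet, pre="<em>", post="</em>"):
    """Return the snippet length with the highlight markers removed.

    `pre` and `post` default to the assumed highlighter markers.
    """
    return len(snippet.replace(pre, "").replace(post, ""))


if __name__ == "__main__":
    snippet = "Solr returns <em>highlighted</em> fragments"
    print("raw length:", len(snippet))
    print("visible length:", visible_length(snippet))
```

Each highlighted term adds the length of both markers to the raw string, so a fragment with two highlighted terms and the default markers is 18 characters longer than what is displayed.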