Posted to solr-user@lucene.apache.org by "Martin Frank Hansen (MHQ)" <MH...@kmd.dk> on 2019/06/03 08:06:33 UTC

highlighting not working as expected

Hi,

I am having difficulty getting highlighting to work. For some reason highlighting works on some fields but not on others, even though all of these fields are stored.

An example of a request looks like this: http://localhost/solr/mytest/select?fl=id,doc.Type,Journalnummer,Sagstitel&hl.fl=Sagstitel&hl.simple.post=%3C/b%3E&hl.simple.pre=%3Cb%3E&hl=on&q=rotte

The highlighting comes back empty for all documents, even though I can see several documents whose “Sagstitel” field contains the word “rotte” (Danish for “rat”). What am I missing here?
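
The request above passes every highlighting parameter on the URL. As a rough sketch, the same parameters could instead be set as defaults on the request handler in solrconfig.xml, which keeps test URLs short. The handler name /select and the class solr.SearchHandler are assumptions about this setup; the parameter names and values are taken from the request.

```xml
<!-- Sketch only: assumes a standard /select handler in solrconfig.xml -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- same highlighting parameters as in the example request -->
    <str name="hl">on</str>
    <str name="hl.fl">Sagstitel</str>
    <str name="hl.simple.pre">&lt;b&gt;</str>
    <str name="hl.simple.post">&lt;/b&gt;</str>
  </lst>
</requestHandler>
```

With defaults like these, the request would reduce to /solr/mytest/select?fl=id,doc.Type,Journalnummer,Sagstitel&q=rotte.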

I am using the standard highlighter, configured as below.


<searchComponent class="solr.HighlightComponent" name="highlight">
    <highlighting>
      <!-- Configure the standard fragmenter -->
      <!-- This could most likely be commented out in the "default" case -->
      <fragmenter name="gap"
                  default="true"
                  class="solr.highlight.GapFragmenter">
        <lst name="defaults">
          <int name="hl.fragsize">100</int>
        </lst>
      </fragmenter>

      <!-- A regular-expression-based fragmenter
           (for sentence extraction)
        -->
      <fragmenter name="regex"
                  class="solr.highlight.RegexFragmenter">
        <lst name="defaults">
          <!-- slightly smaller fragsizes work better because of slop -->
          <int name="hl.fragsize">70</int>
          <!-- allow 50% slop on fragment sizes -->
          <float name="hl.regex.slop">0.5</float>
          <!-- a basic sentence pattern -->
          <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
        </lst>
      </fragmenter>

      <!-- Configure the standard formatter -->
      <formatter name="html"
                 default="true"
                 class="solr.highlight.HtmlFormatter">
        <lst name="defaults">
          <str name="hl.simple.pre">&lt;b&gt;</str>
          <str name="hl.simple.post">&lt;/b&gt;</str>
        </lst>
      </formatter>

      <!-- Configure the standard encoder -->
      <encoder name="html"
               class="solr.highlight.HtmlEncoder" />

      <!-- Configure the standard fragListBuilder -->
      <fragListBuilder name="simple"
                       class="solr.highlight.SimpleFragListBuilder"/>

      <!-- Configure the single fragListBuilder -->
      <fragListBuilder name="single"
                       class="solr.highlight.SingleFragListBuilder"/>

      <!-- Configure the weighted fragListBuilder -->
      <fragListBuilder name="weighted"
                       default="true"
                       class="solr.highlight.WeightedFragListBuilder"/>

      <!-- default tag FragmentsBuilder -->
      <fragmentsBuilder name="default"
                        default="true"
                        class="solr.highlight.ScoreOrderFragmentsBuilder">
        <!--
        <lst name="defaults">
          <str name="hl.multiValuedSeparatorChar">/</str>
        </lst>
        -->
      </fragmentsBuilder>

      <!-- multi-colored tag FragmentsBuilder -->
      <fragmentsBuilder name="colored"
                        class="solr.highlight.ScoreOrderFragmentsBuilder">
        <lst name="defaults">
          <str name="hl.tag.pre"><![CDATA[
               <b style="background:yellow">,<b style="background:lawngreen">,
               <b style="background:aquamarine">,<b style="background:magenta">,
               <b style="background:palegreen">,<b style="background:coral">,
               <b style="background:wheat">,<b style="background:khaki">,
               <b style="background:lime">,<b style="background:deepskyblue">]]></str>
          <str name="hl.tag.post"><![CDATA[</b>]]></str>
        </lst>
      </fragmentsBuilder>

      <boundaryScanner name="default"
                       default="true"
                       class="solr.highlight.SimpleBoundaryScanner">
        <lst name="defaults">
          <str name="hl.bs.maxScan">10</str>
          <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
        </lst>
      </boundaryScanner>

      <boundaryScanner name="breakIterator"
                       class="solr.highlight.BreakIteratorBoundaryScanner">
        <lst name="defaults">
          <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
          <str name="hl.bs.type">WORD</str>
          <!-- language and country are used when constructing Locale object.  -->
          <!-- And the Locale object will be used when getting instance of BreakIterator -->
          <str name="hl.bs.language">da</str>
        </lst>
      </boundaryScanner>
    </highlighting>
  </searchComponent>

I hope someone can help. Thanks in advance.

Best regards
Martin



Internal - KMD A/S

Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du KMD’s Privatlivspolitik<http://www.kmd.dk/Privatlivspolitik>, der fortæller, hvordan vi behandler oplysninger om dig.

Protection of your personal data is important to us. Here you can read KMD’s Privacy Policy<http://www.kmd.net/Privacy-Policy> outlining how we process your personal data.

Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst informere afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi dig slette e-mailen i dit system uden at videresende eller kopiere den. Selvom e-mailen og ethvert vedhæftet bilag efter vores overbevisning er fri for virus og andre fejl, som kan påvirke computeren eller it-systemet, hvori den modtages og læses, åbnes den på modtagerens eget ansvar. Vi påtager os ikke noget ansvar for tab og skade, som er opstået i forbindelse med at modtage og bruge e-mailen.

Please note that this message may contain confidential information. If you have received this message by mistake, please inform the sender of the mistake by sending a reply, then delete the message from your system without making, distributing or retaining any copies of it. Although we believe that the message and any attachments are free from viruses and other errors that might affect the computer or it-system where it is received and read, the recipient opens the message at his or her own risk. We assume no responsibility for any loss or damage arising from the receipt or use of this message.

RE: highlighting not working as expected

Posted by "Martin Frank Hansen (MHQ)" <MH...@kmd.dk>.
Hi Edwin,

Thanks for your explanation; it makes sense now.

Best regards

Martin



Re: highlighting not working as expected

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi,

If the field uses the type "string", the value is indexed as a single
token, so a query only matches when it equals the entire field value
exactly, including spaces and upper/lower case.

You can use the type "text" as a start, but further down the road it would
be good to define your own custom fieldType with your own tokenizer and
filters.
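
As a minimal sketch of such a custom type, assuming a schema.xml-style schema: the type name text_sagstitel is hypothetical, and a standard tokenizer plus a lower-case filter is only one plausible starting point, not a setup confirmed anywhere in this thread.

```xml
<!-- Hypothetical analyzed field type: splits on word boundaries and lower-cases -->
<fieldType name="text_sagstitel" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The Sagstitel field from the request, using the analyzed type instead of "string" -->
<field name="Sagstitel" type="text_sagstitel" indexed="true" stored="true"/>
```

Documents have to be reindexed after the field type changes, since the analysis is applied at index time.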

Regards,
Edwin


RE: highlighting not working as expected

Posted by "Martin Frank Hansen (MHQ)" <MH...@kmd.dk>.
Hi again,

I have tested a bit, and I was wondering whether the highlighter requires a field to be of type "text". Whenever I try highlighting on fields of type "string", nothing gets returned.

Best regards

Martin



RE: highlighting not working as expected

Posted by "Martin Frank Hansen (MHQ)" <MH...@kmd.dk>.
Hi Jörn,

Thanks for your input!

I do not use stop words, so that should not be the issue. The encoding of the documents might be an issue, as they come in many different file formats. I will, however, need to test this.

The field is defined as below:

<field name="Sagstitel" type="string" indexed="true" stored="true" />
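
If the exact-match string field has to stay as it is, one possible arrangement is to copy its value into a second, analyzed field and highlight on that one. This is a sketch under assumptions: the field name Sagstitel_txt is hypothetical, and text_general is assumed to exist in the schema, as it does in Solr's default configsets.

```xml
<!-- Existing exact-match field, unchanged -->
<field name="Sagstitel" type="string" indexed="true" stored="true"/>

<!-- Hypothetical analyzed copy, stored so that it can be highlighted -->
<field name="Sagstitel_txt" type="text_general" indexed="true" stored="true"/>
<copyField source="Sagstitel" dest="Sagstitel_txt"/>
```

A request would then use hl.fl=Sagstitel_txt, and the documents would need to be reindexed so that the copy is populated.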

BR

Martin



-----Original Message-----
From: Jörn Franke <jo...@gmail.com>
Sent: 11 June 2019 08:45
To: solr-user@lucene.apache.org
Subject: Re: highlighting not working as expected

Could it be a stop word? What is the exact type definition of those fields? Could this word have been omitted, or stored with the wrong encoding, while the documents were loaded?

> On 03.06.2019 at 10:06, Martin Frank Hansen (MHQ) <MH...@kmd.dk> wrote:
>
> Hi,
>
> I am having some difficulties making highlighting work. For some reason the highlighting feature only works on some fields but not on other fields even though these fields are stored.
>
> An example of a request looks like this: http://localhost/solr/mytest/select?fl=id,doc.Type,Journalnummer,Sagstitel&hl.fl=Sagstitel&hl.simple.post=%3C/b%3E&hl.simple.pre=%3Cb%3E&hl=on&q=rotte
>
> It simply returns an empty set, for all documents even though I can see several documents which have “Sagstitel” containing the word “rotte” (rotte=rat).  What am I missing here?
>
> I am using the standard highlighter as below.
>
>
> <searchComponent class="solr.HighlightComponent" name="highlight">
>    <highlighting>
>      <!-- Configure the standard fragmenter -->
>      <!-- This could most likely be commented out in the "default" case -->
>      <fragmenter name="gap"
>                  default="true"
>                  class="solr.highlight.GapFragmenter">
>        <lst name="defaults">
>          <int name="hl.fragsize">100</int>
>        </lst>
>      </fragmenter>
>
>      <!-- A regular-expression-based fragmenter
>           (for sentence extraction)
>        -->
>      <fragmenter name="regex"
>                  class="solr.highlight.RegexFragmenter">
>        <lst name="defaults">
>          <!-- slightly smaller fragsizes work better because of slop -->
>          <int name="hl.fragsize">70</int>
>          <!-- allow 50% slop on fragment sizes -->
>          <float name="hl.regex.slop">0.5</float>
>          <!-- a basic sentence pattern -->
>          <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
>        </lst>
>      </fragmenter>
>
>      <!-- Configure the standard formatter -->
>      <formatter name="html"
>                 default="true"
>                 class="solr.highlight.HtmlFormatter">
>        <lst name="defaults">
>          <str name="hl.simple.pre">&lt;b&gt;</str>
>          <str name="hl.simple.post">&lt;/b&gt;</str>
>        </lst>
>      </formatter>
>
>      <!-- Configure the standard encoder -->
>      <encoder name="html"
>               class="solr.highlight.HtmlEncoder" />
>
>      <!-- Configure the standard fragListBuilder -->
>      <fragListBuilder name="simple"
>                       class="solr.highlight.SimpleFragListBuilder"/>
>
>      <!-- Configure the single fragListBuilder -->
>      <fragListBuilder name="single"
>                       class="solr.highlight.SingleFragListBuilder"/>
>
>      <!-- Configure the weighted fragListBuilder -->
>     <fragListBuilder name="weighted"
>                       default="true"
>                       class="solr.highlight.WeightedFragListBuilder"/>
>
>      <!-- default tag FragmentsBuilder -->
>      <fragmentsBuilder name="default"
>                        default="true"
>                        class="solr.highlight.ScoreOrderFragmentsBuilder">
>        <!--
>        <lst name="defaults">
>          <str name="hl.multiValuedSeparatorChar">/</str>
>        </lst>
>        -->
>      </fragmentsBuilder>
>
>      <!-- multi-colored tag FragmentsBuilder -->
>      <fragmentsBuilder name="colored"
>                        class="solr.highlight.ScoreOrderFragmentsBuilder">
>        <lst name="defaults">
>          <str name="hl.tag.pre"><![CDATA[
>               <b style="background:yellow">,<b style="background:lawgreen">,
>               <b style="background:aquamarine">,<b style="background:magenta">,
>               <b style="background:palegreen">,<b style="background:coral">,
>               <b style="background:wheat">,<b style="background:khaki">,
>               <b style="background:lime">,<b style="background:deepskyblue">]]></str>
>          <str name="hl.tag.post"><![CDATA[</b>]]></str>
>        </lst>
>      </fragmentsBuilder>
>
>      <boundaryScanner name="default"
>                       default="true"
>                       class="solr.highlight.SimpleBoundaryScanner">
>        <lst name="defaults">
>          <str name="hl.bs.maxScan">10</str>
>          <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
>        </lst>
>      </boundaryScanner>
>
>      <boundaryScanner name="breakIterator"
>                       class="solr.highlight.BreakIteratorBoundaryScanner">
>        <lst name="defaults">
>          <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
>          <str name="hl.bs.type">WORD</str>
>          <!-- language and country are used when constructing Locale object.  -->
>          <!-- And the Locale object will be used when getting instance of BreakIterator -->
>          <str name="hl.bs.language">da</str>
>        </lst>
>      </boundaryScanner>
>    </highlighting>
>  </searchComponent>
>
> Hope that some one can help, thanks in advance.
>
> Best regards
> Martin
>
>
>
> Internal - KMD A/S
>
> Beskyttelse af dine personlige oplysninger er vigtig for os. Her finder du KMD’s Privatlivspolitik<http://www.kmd.dk/Privatlivspolitik>, der fortæller, hvordan vi behandler oplysninger om dig.
>
> Protection of your personal data is important to us. Here you can read KMD’s Privacy Policy<http://www.kmd.net/Privacy-Policy> outlining how we process your personal data.
>
> Vi gør opmærksom på, at denne e-mail kan indeholde fortrolig information. Hvis du ved en fejltagelse modtager e-mailen, beder vi dig venligst informere afsender om fejlen ved at bruge svarfunktionen. Samtidig beder vi dig slette e-mailen i dit system uden at videresende eller kopiere den. Selvom e-mailen og ethvert vedhæftet bilag efter vores overbevisning er fri for virus og andre fejl, som kan påvirke computeren eller it-systemet, hvori den modtages og læses, åbnes den på modtagerens eget ansvar. Vi påtager os ikke noget ansvar for tab og skade, som er opstået i forbindelse med at modtage og bruge e-mailen.
>
> Please note that this message may contain confidential information. If you have received this message by mistake, please inform the sender of the mistake by sending a reply, then delete the message from your system without making, distributing or retaining any copies of it. Although we believe that the message and any attachments are free from viruses and other errors that might affect the computer or it-system where it is received and read, the recipient opens the message at his or her own risk. We assume no responsibility for any loss or damage arising from the receipt or use of this message.

Re: highlighting not working as expected

Posted by Jörn Franke <jo...@gmail.com>.
Could it be a stop word? What is the exact type definition of those fields? Could this word have been omitted, or stored with the wrong encoding, when the documents were loaded?


RE: highlighting not working as expected

Posted by "Martin Frank Hansen (MHQ)" <MH...@kmd.dk>.
Hi David,

Thanks for your response, and sorry for my late reply.

I still get the same result when using hl.method=unified.

Best regards
Martin



-----Original Message-----
From: David Smiley <da...@gmail.com>
Sent: 10 June 2019 16:48
To: solr-user <so...@lucene.apache.org>
Subject: Re: highlighting not working as expected

Please try hl.method=unified and tell us if that helps.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Re: highlighting not working as expected

Posted by David Smiley <da...@gmail.com>.
Please try hl.method=unified and tell us if that helps.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
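
The `hl.method` parameter suggested above can be appended to the request as `&hl.method=unified`, or set as a handler default. A minimal sketch of the latter, assuming `/select` is the handler being queried (the handler name and the decision to set defaults here are assumptions, not taken from the thread):

```xml
<!-- Sketch only: merge these entries into the existing /select definition
     in solrconfig.xml instead of declaring the handler a second time. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the unified highlighter unless the request overrides it -->
    <str name="hl.method">unified</str>
    <!-- Field to highlight, as in the example request -->
    <str name="hl.fl">Sagstitel</str>
  </lst>
</requestHandler>
```

Request parameters still take precedence over these defaults, so a single query can switch back with `hl.method=original`.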



RE: highlighting not working as expected

Posted by "Martin Frank Hansen (MHQ)" <MH...@kmd.dk>.
Hi Edwin,

Yes, the field is defined just like the other fields:

<field name="Sagstitel" type="string" indexed="true" stored="true" />

BR
Martin



-----Original Message-----
From: Zheng Lin Edwin Yeo <ed...@gmail.com>
Sent: 4 June 2019 10:32
To: solr-user@lucene.apache.org
Subject: Re: highlighting not working as expected

Hi Martin,

What fieldType are you using for the field “Sagstitel”? Is it the same as other fields?

Regards,
Edwin


Re: highlighting not working as expected

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi Martin,

What fieldType are you using for the field “Sagstitel”? Is it the same as
the one used by the fields where highlighting works?
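
One way to check this, as a rough sketch only: Solr's Schema API can return
the definition of a single field and of its field type. The Python snippet
below just builds the two lookup URLs. The host and core name (localhost,
mytest) are taken from the request in the original post, the helper names are
made up for illustration, and the field type name "text_general" is a
placeholder, not something stated in this thread.

```python
from urllib.parse import quote, urlencode

# Assumptions: host and core name are copied from the request in the original
# post; adjust them to the actual Solr installation.
SOLR_BASE = "http://localhost/solr"
CORE = "mytest"


def field_definition_url(field, base=SOLR_BASE, core=CORE):
    """Schema API URL for one field; showDefaults adds the field type's defaults."""
    params = urlencode({"showDefaults": "true", "wt": "json"})
    return f"{base}/{core}/schema/fields/{quote(field, safe='')}?{params}"


def field_type_url(type_name, base=SOLR_BASE, core=CORE):
    """Schema API URL for one field type definition."""
    params = urlencode({"wt": "json"})
    return f"{base}/{core}/schema/fieldtypes/{quote(type_name, safe='')}?{params}"


if __name__ == "__main__":
    print(field_definition_url("Sagstitel"))
    # "text_general" is a placeholder; use the type reported for the field.
    print(field_type_url("text_general"))
```

Comparing the type, stored and indexed values returned for “Sagstitel” with
those of a field where highlighting works may show where they differ.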

Regards,
Edwin

On Mon, 3 Jun 2019 at 16:06, Martin Frank Hansen (MHQ) <MH...@kmd.dk> wrote:

> Hi,
>
> I am having some difficulties making highlighting work. For some reason
> highlighting only works on some fields but not on others, even though
> these fields are stored.
>
> An example of a request looks like this:
> http://localhost/solr/mytest/select?fl=id,doc.Type,Journalnummer,Sagstitel&hl.fl=Sagstitel&hl.simple.post=%3C/b%3E&hl.simple.pre=%3Cb%3E&hl=on&q=rotte
>
> It simply returns an empty set for all documents, even though I can see
> several documents in which “Sagstitel” contains the word “rotte”
> (rotte = rat). What am I missing here?
>
> I am using the standard highlighter as below.
>
>
> <searchComponent class="solr.HighlightComponent" name="highlight">
>     <highlighting>
>       <!-- Configure the standard fragmenter -->
>       <!-- This could most likely be commented out in the "default" case -->
>       <fragmenter name="gap"
>                   default="true"
>                   class="solr.highlight.GapFragmenter">
>         <lst name="defaults">
>           <int name="hl.fragsize">100</int>
>         </lst>
>       </fragmenter>
>
>       <!-- A regular-expression-based fragmenter
>            (for sentence extraction)
>         -->
>       <fragmenter name="regex"
>                   class="solr.highlight.RegexFragmenter">
>         <lst name="defaults">
>           <!-- slightly smaller fragsizes work better because of slop -->
>           <int name="hl.fragsize">70</int>
>           <!-- allow 50% slop on fragment sizes -->
>           <float name="hl.regex.slop">0.5</float>
>           <!-- a basic sentence pattern -->
>           <str name="hl.regex.pattern">[-\w ,/\n\&quot;&apos;]{20,200}</str>
>         </lst>
>       </fragmenter>
>
>       <!-- Configure the standard formatter -->
>       <formatter name="html"
>                  default="true"
>                  class="solr.highlight.HtmlFormatter">
>         <lst name="defaults">
>           <str name="hl.simple.pre">&lt;b&gt;</str>
>           <str name="hl.simple.post">&lt;/b&gt;</str>
>         </lst>
>       </formatter>
>
>       <!-- Configure the standard encoder -->
>       <encoder name="html"
>                class="solr.highlight.HtmlEncoder" />
>
>       <!-- Configure the standard fragListBuilder -->
>       <fragListBuilder name="simple"
>                        class="solr.highlight.SimpleFragListBuilder"/>
>
>       <!-- Configure the single fragListBuilder -->
>       <fragListBuilder name="single"
>                        class="solr.highlight.SingleFragListBuilder"/>
>
>       <!-- Configure the weighted fragListBuilder -->
>       <fragListBuilder name="weighted"
>                        default="true"
>                        class="solr.highlight.WeightedFragListBuilder"/>
>
>       <!-- default tag FragmentsBuilder -->
>       <fragmentsBuilder name="default"
>                         default="true"
>                         class="solr.highlight.ScoreOrderFragmentsBuilder">
>         <!--
>         <lst name="defaults">
>           <str name="hl.multiValuedSeparatorChar">/</str>
>         </lst>
>         -->
>       </fragmentsBuilder>
>
>       <!-- multi-colored tag FragmentsBuilder -->
>       <fragmentsBuilder name="colored"
>                         class="solr.highlight.ScoreOrderFragmentsBuilder">
>         <lst name="defaults">
>           <str name="hl.tag.pre"><![CDATA[
>                <b style="background:yellow">,<b style="background:lawgreen">,
>                <b style="background:aquamarine">,<b style="background:magenta">,
>                <b style="background:palegreen">,<b style="background:coral">,
>                <b style="background:wheat">,<b style="background:khaki">,
>                <b style="background:lime">,<b style="background:deepskyblue">]]></str>
>           <str name="hl.tag.post"><![CDATA[</b>]]></str>
>         </lst>
>       </fragmentsBuilder>
>
>       <boundaryScanner name="default"
>                        default="true"
>                        class="solr.highlight.SimpleBoundaryScanner">
>         <lst name="defaults">
>           <str name="hl.bs.maxScan">10</str>
>           <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
>         </lst>
>       </boundaryScanner>
>
>       <boundaryScanner name="breakIterator"
>                        class="solr.highlight.BreakIteratorBoundaryScanner">
>         <lst name="defaults">
>           <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE -->
>           <str name="hl.bs.type">WORD</str>
>           <!-- language and country are used when constructing Locale object.  -->
>           <!-- And the Locale object will be used when getting instance of BreakIterator -->
>           <str name="hl.bs.language">da</str>
>         </lst>
>       </boundaryScanner>
>     </highlighting>
>   </searchComponent>
>
> Hope that someone can help. Thanks in advance.
>
> Best regards
> Martin