Posted to solr-user@lucene.apache.org by Tom Evans <te...@googlemail.com> on 2016/06/16 13:11:24 UTC

Strange highlighting on search

Hi all

I'm investigating a bug whereby every term in the highlighted field
gets marked for highlighting instead of just the words that match the
fulltext portion of the query. This is on Solr 5.5.0, but I didn't see
any bug fixes related to highlighting in 5.5.1 or 6.0 release notes.

The query that triggers it has a NOT clause on a specific field (not
the fulltext field) and also only includes documents where that field
has a value:

q: cosmetics_packaging_fulltext:(Mist) AND ingredient_tag_id:[0 TO *]
AND -ingredient_tag_id:(35223)

This returns the correct results, but the highlighting has matched
every word in the results (see below for debugQuery output). If I
change the query to put the exclusion into an fq, the highlighting is
correct again (and the results are correct):

q: cosmetics_packaging_fulltext:(Mist)
fq: {!cache=false} ingredient_tag_id:[0 TO *] AND -ingredient_tag_id:(35223)
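
For illustration, the q/fq split above could be built as request parameters like this. It is only a sketch: the helper name build_select_params is hypothetical, and the remaining parameters are copied from the debug output further down.

```python
from urllib.parse import urlencode


def build_select_params(term, excluded_tag_id):
    """Build Solr request parameters with the tag restriction in fq, not q."""
    return {
        # Only the fulltext clause stays in the main query.
        "q": f"cosmetics_packaging_fulltext:({term})",
        # The existence check and the exclusion move into an uncached filter.
        "fq": "{!cache=false} ingredient_tag_id:[0 TO *] "
              f"AND -ingredient_tag_id:({excluded_tag_id})",
        "hl": "true",
        "hl.fl": "product",
        "hl.fragsize": "0",
        "hl.simple.pre": "<em>",
        "hl.simple.post": "</em>",
        "fl": "id,product",
        "rows": "5",
        "wt": "json",
    }


params = build_select_params("Mist", 35223)
# URL-encoded form, to be appended to the deployment-specific /select URL.
query_string = urlencode(params)
```

The filter text is kept identical to the fq shown above, so the only difference from the problem query is where the ingredient_tag_id clauses live.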

Is there any way I can make the query and highlighting work as
expected as part of q?
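
One option that might be worth trying is the hl.q parameter, which the Solr highlighting documentation describes as overriding the query used for highlighting. The sketch below is untested against this index and assumes hl.q behaves that way on 5.5.0; the helper name with_highlight_query is hypothetical.

```python
def with_highlight_query(params, highlight_query):
    """Return a copy of params that asks Solr to highlight highlight_query."""
    updated = dict(params)  # leave the caller's dict untouched
    updated["hl.q"] = highlight_query
    return updated


original = {
    # The full query stays in q, including the ingredient_tag_id clauses.
    "q": "cosmetics_packaging_fulltext:(Mist) AND ingredient_tag_id:[0 TO *] "
         "AND -ingredient_tag_id:(35223)",
    "hl": "true",
    "hl.fl": "product",
}
# Assumption: only the fulltext clause should drive highlighting.
params = with_highlight_query(original, "cosmetics_packaging_fulltext:(Mist)")
```

This keeps matching and ordering governed by q while handing the highlighter only the fulltext clause.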

Is there any downside to putting the exclusion part in the fq in terms
of performance? We don't use score at all for our results; we always
order by other parameters.

Cheers

Tom

Query with strange highlighting:

{
  "responseHeader":{
    "status":0,
    "QTime":314,
    "params":{
      "q":"cosmetics_packaging_fulltext:(Mist) AND ingredient_tag_id:[0 TO *] AND -ingredient_tag_id:(35223)",
      "hl":"true",
      "hl.simple.post":"</em>",
      "indent":"true",
      "fl":"id,product",
      "hl.fragsize":"0",
      "hl.fl":"product",
      "rows":"5",
      "wt":"json",
      "debugQuery":"true",
      "hl.simple.pre":"<em>"}},
  "response":{"numFound":10132,"start":0,"docs":[
      {
        "id":"2403841-1498608",
        "product":"Mist"},
      {
        "id":"2410603-1502577",
        "product":"Mist"},
      {
        "id":"5988531-3882415",
        "product":"Ao + Mist"},
      {
        "id":"6020805-3904203",
        "product":"UV Mist Cushion SPF 50+ PA+++"},
      {
        "id":"2617977-1629335",
        "product":"Ultra Radiance Facial Re-Hydrating Mist"}]
  },
  "highlighting":{
    "2403841-1498608":{
      "product":["<em>Mist</em>"]},
    "2410603-1502577":{
      "product":["<em>Mist</em>"]},
    "5988531-3882415":{
      "product":["<em>Ao</em> + <em>Mist</em>"]},
    "6020805-3904203":{
      "product":["<em>UV</em> <em>Mist</em> <em>Cushion</em> <em>SPF</em> <em>50</em>+ <em>PA</em>+++"]},
    "2617977-1629335":{
      "product":["<em>Ultra</em> <em>Radiance</em> <em>Facial</em> <em>Re-Hydrating</em> <em>Mist</em>"]}},
  "debug":{
    "rawquerystring":"cosmetics_packaging_fulltext:(Mist) AND ingredient_tag_id:[0 TO *] AND -ingredient_tag_id:(35223)",
    "querystring":"cosmetics_packaging_fulltext:(Mist) AND ingredient_tag_id:[0 TO *] AND -ingredient_tag_id:(35223)",
    "parsedquery":"+cosmetics_packaging_fulltext:mist +ingredient_tag_id:[0 TO *] -ingredient_tag_id:35223",
    "parsedquery_toString":"+cosmetics_packaging_fulltext:mist +ingredient_tag_id:[0 TO *] -ingredient_tag_id:35223",
    "explain":{
      "2403841-1498608":"\n40.082462 = sum of:\n  39.92971 =
weight(cosmetics_packaging_fulltext:mist in 13983)
[ClassicSimilarity], result of:\n    39.92971 =
score(doc=13983,freq=39.0), product of:\n      0.9882648 =
queryWeight, product of:\n        6.469795 = idf(docFreq=22502,
maxDocs=5342472)\n        0.15275055 = queryNorm\n      40.40386 =
fieldWeight in 13983, product of:\n        6.244998 = tf(freq=39.0),
with freq of:\n          39.0 = termFreq=39.0\n        6.469795 =
idf(docFreq=22502, maxDocs=5342472)\n        1.0 =
fieldNorm(doc=13983)\n  0.15275055 = ingredient_tag_id:[0 TO *],
product of:\n    1.0 = boost\n    0.15275055 = queryNorm\n",
      "2410603-1502577":"\n40.082462 = sum of:\n  39.92971 =
weight(cosmetics_packaging_fulltext:mist in 14023)
[ClassicSimilarity], result of:\n    39.92971 =
score(doc=14023,freq=39.0), product of:\n      0.9882648 =
queryWeight, product of:\n        6.469795 = idf(docFreq=22502,
maxDocs=5342472)\n        0.15275055 = queryNorm\n      40.40386 =
fieldWeight in 14023, product of:\n        6.244998 = tf(freq=39.0),
with freq of:\n          39.0 = termFreq=39.0\n        6.469795 =
idf(docFreq=22502, maxDocs=5342472)\n        1.0 =
fieldNorm(doc=14023)\n  0.15275055 = ingredient_tag_id:[0 TO *],
product of:\n    1.0 = boost\n    0.15275055 = queryNorm\n",
      "5988531-3882415":"\n37.435104 = sum of:\n  37.282352 =
weight(cosmetics_packaging_fulltext:mist in 1062788)
[ClassicSimilarity], result of:\n    37.282352 =
score(doc=1062788,freq=34.0), product of:\n      0.9882648 =
queryWeight, product of:\n        6.469795 = idf(docFreq=22502,
maxDocs=5342472)\n        0.15275055 = queryNorm\n      37.725063 =
fieldWeight in 1062788, product of:\n        5.8309517 =
tf(freq=34.0), with freq of:\n          34.0 = termFreq=34.0\n
6.469795 = idf(docFreq=22502, maxDocs=5342472)\n        1.0 =
fieldNorm(doc=1062788)\n  0.15275055 = ingredient_tag_id:[0 TO *],
product of:\n    1.0 = boost\n    0.15275055 = queryNorm\n",
      "6020805-3904203":"\n30.816679 = sum of:\n  30.663929 =
weight(cosmetics_packaging_fulltext:mist in 1029387)
[ClassicSimilarity], result of:\n    30.663929 =
score(doc=1029387,freq=23.0), product of:\n      0.9882648 =
queryWeight, product of:\n        6.469795 = idf(docFreq=22502,
maxDocs=5342472)\n        0.15275055 = queryNorm\n      31.02805 =
fieldWeight in 1029387, product of:\n        4.7958317 =
tf(freq=23.0), with freq of:\n          23.0 = termFreq=23.0\n
6.469795 = idf(docFreq=22502, maxDocs=5342472)\n        1.0 =
fieldNorm(doc=1029387)\n  0.15275055 = ingredient_tag_id:[0 TO *],
product of:\n    1.0 = boost\n    0.15275055 = queryNorm\n",
      "2617977-1629335":"\n29.453148 = sum of:\n  29.300398 =
weight(cosmetics_packaging_fulltext:mist in 648235)
[ClassicSimilarity], result of:\n    29.300398 =
score(doc=648235,freq=21.0), product of:\n      0.9882648 =
queryWeight, product of:\n        6.469795 = idf(docFreq=22502,
maxDocs=5342472)\n        0.15275055 = queryNorm\n      29.648327 =
fieldWeight in 648235, product of:\n        4.582576 = tf(freq=21.0),
with freq of:\n          21.0 = termFreq=21.0\n        6.469795 =
idf(docFreq=22502, maxDocs=5342472)\n        1.0 =
fieldNorm(doc=648235)\n  0.15275055 = ingredient_tag_id:[0 TO *],
product of:\n    1.0 = boost\n    0.15275055 = queryNorm\n"},
    "QParser":"LuceneQParser",
    "timing":{
      "time":314.0,
      "prepare":{
        "time":0.0,
        "query":{
          "time":0.0},
        "facet":{
          "time":0.0},
        "facet_module":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "debug":{
          "time":0.0}},
      "process":{
        "time":313.0,
        "query":{
          "time":0.0},
        "facet":{
          "time":0.0},
        "facet_module":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "debug":{
          "time":313.0}}}}}