Posted to solr-user@lucene.apache.org by elisabeth benoit <el...@gmail.com> on 2014/10/15 10:12:50 UTC

fuzzy search and edismax: how to avoid summing up scores

Hello all,

We are using Solr 4.2.1 (but plan to switch to Solr 4.10 very soon).

We are trying to do approximate search using the ~ operator.

We use the catchall_light field, which has no stemming (to avoid mixing
fuzzy matching and stemming).

We send a request to Solr using the fuzzy operator on non-"frequent" words,

for instance

q=catchall_light:(lyon 69002~1)

Our handler uses edismax.
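
To make the expansion concrete, here is a minimal Python sketch of how a
fuzzy term such as 69002~1 can turn into several term clauses. It is an
illustration only, not Lucene's implementation: the names edit_distance and
expand_fuzzy and the small term list are hypothetical, plain Levenshtein
distance is used (Lucene may also count transpositions), and the boost
formula 1 - distance / length is an assumption that happens to agree with
the 0.8 boosts visible in the debug output.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def expand_fuzzy(term, max_edits, indexed_terms):
    """Map each indexed term within max_edits of `term` to a boost.

    The boost 1 - distance / min(length) is an assumed formula.
    """
    expanded = {}
    for candidate in indexed_terms:
        d = edit_distance(term, candidate)
        if d <= max_edits:
            expanded[candidate] = 1.0 - d / min(len(term), len(candidate))
    return expanded


# Hypothetical term dictionary: Lyon's nine postal codes plus two unrelated ones.
indexed = ["6900%d" % i for i in range(1, 10)] + ["75001", "13002"]
clauses = expand_fuzzy("69002", 1, indexed)
for term, boost in sorted(clauses.items()):
    print(term, round(boost, 2))
```

Under these assumptions all nine codes 69001 to 69009 match, so a document
containing every one of them satisfies nine clauses instead of one.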

That query gives a higher score to the document Lyon, which has postal codes
69001, 69002, 69003, 69004, ...,

than to other documents having only Lyon and postal code 69002 (the debug
output is below),

but we do not want the scores of all matching postal codes to be summed for
the Lyon document.

Does anyone know if it is possible to change that?

Best regards,
Elisabeth


Here is the debug output for Lyon
(we use idf for that field but want to get rid of it):

15.728481 = (MATCH) sum of:
  1.2349477 = (MATCH) weight(catchall_light:lyon in 707758) [NoTFSimilarity], result of:
    1.2349477 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
      0.13427915 = queryWeight, product of:
        9.196869 = idf(docFreq=2924, maxDocs=10616483)
        0.014600528 = queryNorm
      9.196869 = fieldWeight in 707758, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        9.196869 = idf(docFreq=2924, maxDocs=10616483)
        1.0 = fieldNorm(doc=707758)
  14.493534 = (MATCH) sum of:
    1.576392 = (MATCH) weight(catchall_light:69001^0.8 in 707758) [NoTFSimilarity], result of:
      1.576392 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13569424 = queryWeight, product of:
          0.8 = boost
          11.617237 = idf(docFreq=259, maxDocs=10616483)
          0.014600528 = queryNorm
        11.617237 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.617237 = idf(docFreq=259, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.8904426 = (MATCH) weight(catchall_light:69002 in 707758) [NoTFSimilarity], result of:
      1.8904426 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.16613688 = queryWeight, product of:
          11.378826 = idf(docFreq=329, maxDocs=10616483)
          0.014600528 = queryNorm
        11.378826 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.378826 = idf(docFreq=329, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.460347 = (MATCH) weight(catchall_light:69003^0.8 in 707758) [NoTFSimilarity], result of:
      1.460347 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13060425 = queryWeight, product of:
          0.8 = boost
          11.181466 = idf(docFreq=401, maxDocs=10616483)
          0.014600528 = queryNorm
        11.181466 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.181466 = idf(docFreq=401, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.7109065 = (MATCH) weight(catchall_light:69004^0.8 in 707758) [NoTFSimilarity], result of:
      1.7109065 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.14136517 = queryWeight, product of:
          0.8 = boost
          12.102744 = idf(docFreq=159, maxDocs=10616483)
          0.014600528 = queryNorm
        12.102744 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          12.102744 = idf(docFreq=159, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.5255939 = (MATCH) weight(catchall_light:69005^0.8 in 707758) [NoTFSimilarity], result of:
      1.5255939 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13349001 = queryWeight, product of:
          0.8 = boost
          11.428525 = idf(docFreq=313, maxDocs=10616483)
          0.014600528 = queryNorm
        11.428525 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.428525 = idf(docFreq=313, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.6497903 = (MATCH) weight(catchall_light:69006^0.8 in 707758) [NoTFSimilarity], result of:
      1.6497903 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13881733 = queryWeight, product of:
          0.8 = boost
          11.884614 = idf(docFreq=198, maxDocs=10616483)
          0.014600528 = queryNorm
        11.884614 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.884614 = idf(docFreq=198, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.5892421 = (MATCH) weight(catchall_light:69007^0.8 in 707758) [NoTFSimilarity], result of:
      1.5892421 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13624617 = queryWeight, product of:
          0.8 = boost
          11.66449 = idf(docFreq=247, maxDocs=10616483)
          0.014600528 = queryNorm
        11.66449 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.66449 = idf(docFreq=247, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.5816664 = (MATCH) weight(catchall_light:69008^0.8 in 707758) [NoTFSimilarity], result of:
      1.5816664 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13592105 = queryWeight, product of:
          0.8 = boost
          11.636655 = idf(docFreq=254, maxDocs=10616483)
          0.014600528 = queryNorm
        11.636655 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.636655 = idf(docFreq=254, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
    1.509153 = (MATCH) weight(catchall_light:69009^0.8 in 707758) [NoTFSimilarity], result of:
      1.509153 = score(doc=707758,freq=1.0 = termFreq=1.0), product of:
        0.13276877 = queryWeight, product of:
          0.8 = boost
          11.366777 = idf(docFreq=333, maxDocs=10616483)
          0.014600528 = queryNorm
        11.366777 = fieldWeight in 707758, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          11.366777 = idf(docFreq=333, maxDocs=10616483)
          1.0 = fieldNorm(doc=707758)
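
The arithmetic in this debug output can be re-derived with a short Python
sketch. The docFreq, boost, maxDocs and queryNorm values are copied from the
output above; the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an
assumption that reproduces the printed idf values. The best_only figure is
purely hypothetical: it shows what the score would be if only the best
fuzzy clause counted, and does not correspond to any known Solr setting.

```python
import math

MAX_DOCS = 10616483
QUERY_NORM = 0.014600528

# (term, docFreq, boost) as they appear in the debug output
CLAUSES = [
    ("lyon", 2924, 1.0),
    ("69001", 259, 0.8),
    ("69002", 329, 1.0),
    ("69003", 401, 0.8),
    ("69004", 159, 0.8),
    ("69005", 313, 0.8),
    ("69006", 198, 0.8),
    ("69007", 247, 0.8),
    ("69008", 254, 0.8),
    ("69009", 333, 0.8),
]


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it matches the idf values printed in the output.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def clause_score(doc_freq, boost):
    # queryWeight (boost * idf * queryNorm) times fieldWeight (tf=1 * idf * fieldNorm=1)
    weight = idf(doc_freq)
    return (boost * weight * QUERY_NORM) * weight


scores = {term: clause_score(df, boost) for term, df, boost in CLAUSES}
fuzzy = [score for term, score in scores.items() if term != "lyon"]

summed = scores["lyon"] + sum(fuzzy)     # what the debug output reports
best_only = scores["lyon"] + max(fuzzy)  # hypothetical: best fuzzy clause only

print("summed    =", round(summed, 4))
print("best_only =", round(best_only, 4))
```

The summed value agrees with the 15.728481 at the top of the output up to
float rounding. With only the best fuzzy clause counted, the Lyon document
would score about the same as a document containing just Lyon and 69002.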