Posted to users@solr.apache.org by seez <sh...@gmail.com> on 2021/03/31 15:39:44 UTC

Unified Highlighter and Fuzzy Searches

Hello,

I have the following fuzzy search criteria: 

runnings~0 

The search itself returns the expected results: I see documents that contain
the exact term "runnings". However, the unified highlighter does not honor the
same query and returns no highlights. "runnings~1" does work (with the caveat
that it also matches terms within an edit distance of 1).

So it appears the unified highlighter only supports an edit distance > 0 for
fuzzy searches. This is not an issue with the original or fastVector
highlighters. Is this a real problem, or am I missing something?
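
To make the edit-distance expectation concrete, here is a minimal sketch
(not Lucene code) that computes plain Levenshtein distance over the four
example terms from this thread. The helper names edit_distance and matches
are hypothetical. As far as I know, Lucene's fuzzy matching can also count a
transposition as one edit, which makes no difference for these words.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# Example terms taken from the thread; assumed to be lowercased at index time.
TERMS = ["runnings", "running", "run", "runs"]


def matches(query: str, max_edits: int) -> list:
    """Return the terms within max_edits of the query term."""
    return [t for t in TERMS if edit_distance(query, t) <= max_edits]


print(matches("runnings", 0))  # ['runnings']
print(matches("runnings", 1))  # ['runnings', 'running']
```

So "runnings~0" should behave like an exact term match, and a highlighter
that honors it would mark only "runnings".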





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Unified Highlighter and Fuzzy Searches

Posted by David Smiley <ds...@apache.org>.
This was a bug that was fixed in 8.7:
https://issues.apache.org/jira/browse/LUCENE-9427

I thought perhaps hl.weightMatches=false might work as a workaround, but it
doesn't. So you have to upgrade to 8.7 or later to get the fix.
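
As a rough sketch, assuming you have the Solr version available as a plain
"major.minor.patch" string, a check like the following tells you whether a
given install is at or past 8.7. The function name is hypothetical and the
threshold comes only from the fix version mentioned above.

```python
def has_fuzzy_zero_highlight_fix(solr_version: str) -> bool:
    """True if the version is 8.7 or later (assumed fix version, see above)."""
    major, minor = (int(part) for part in solr_version.split(".")[:2])
    return (major, minor) >= (8, 7)


print(has_fuzzy_zero_highlight_fix("8.6.3"))  # False
print(has_fuzzy_zero_highlight_fix("8.7.0"))  # True
```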

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Apr 1, 2021 at 8:23 AM Srijan <sh...@gmail.com> wrote:

> I tried both Standard and DisMax query parsers and the issue is easily
> reproducible. And forgot to mention earlier, I am trying this on Solr 8.6.3
>
> Just to add more clarity this is what I am doing:
>
> I have say a field called File_Content_Field with the following values
> indexed and stored: "runnings, running, run, runs"
>
> My query is something like this:
>
> q=File_Content_Field:runnings~0&hl.fl=File_Content_Field&hl=on
>
> With original highlighter, I see the following response:
> "highlighting": {"document_id": {"File_Content_Field": ["\n \nTest Dataset
> 1 Running and <em>Runnings</em> and Runs and R\n \n "]
>
>
> With unified highlighter, no highlighting is returned:
>
>
> q=File_Content_Field:runnings~0&hl.fl=File_Content_Field&hl=on&hl.method=unified
>
> "highlighting": {"document_id": {"File_Content_Field": []}
>
>
> However, runnings~1 works as expected (highlights both running and
> runnings)
>
>
q=File_Content_Field:runnings~1&hl.fl=File_Content_Field&hl=on&hl.method=unified
>  "highlighting": {
> "document_id": {"File_Content_Field": [ "\n \nTest Dataset 1
> <em>Running</em> and <em>Runnings</em> and Runs and R\n \n "]
> }
> }

Re: Unified Highlighter and Fuzzy Searches

Posted by Srijan <sh...@gmail.com>.
I tried both the Standard and DisMax query parsers and the issue is easily
reproducible. I forgot to mention earlier that I am trying this on Solr 8.6.3.

To add more clarity, this is what I am doing:

I have a field called File_Content_Field with the following values indexed
and stored: "runnings, running, run, runs"

My query is something like this:

q=File_Content_Field:runnings~0&hl.fl=File_Content_Field&hl=on

With the original highlighter, I see the following response:

"highlighting": {"document_id": {"File_Content_Field": ["\n \nTest Dataset
1 Running and <em>Runnings</em> and Runs and R\n \n "]}}


With the unified highlighter, no highlighting is returned:

q=File_Content_Field:runnings~0&hl.fl=File_Content_Field&hl=on&hl.method=unified

"highlighting": {"document_id": {"File_Content_Field": []}}


However, runnings~1 works as expected (it highlights both running and runnings):

q=File_Content_Field:runnings~1&hl.fl=File_Content_Field&hl=on&hl.method=unified

"highlighting": {
"document_id": {"File_Content_Field": [ "\n \nTest Dataset 1
<em>Running</em> and <em>Runnings</em> and Runs and R\n \n "]
}
}
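
For anyone who wants to reproduce this, here is a small sketch that builds
the query strings used in this report. The helper name highlight_query is
hypothetical; the parameters (q, hl, hl.fl, hl.method) are the ones shown
above. It only builds the query string and does not contact Solr, so you
would still append it to your own collection's select URL.

```python
from urllib.parse import urlencode


def highlight_query(term, max_edits, method=None, field="File_Content_Field"):
    """Return the query string for a fuzzy search with highlighting enabled."""
    params = {
        "q": f"{field}:{term}~{max_edits}",
        "hl.fl": field,
        "hl": "on",
    }
    if method is not None:
        params["hl.method"] = method
    # Keep ':' and '~' unescaped so the output reads like the queries above.
    return urlencode(params, safe=":~")


print(highlight_query("runnings", 0))                    # original highlighter
print(highlight_query("runnings", 0, method="unified"))  # no highlights on 8.6.3
print(highlight_query("runnings", 1, method="unified"))  # highlights as expected
```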




On Thu, Apr 1, 2021 at 12:32 AM David Smiley <ds...@apache.org> wrote:

> I tried this in tests both at the Lucene layer and Solr layer and I'm not
> seeing the failure to highlight for the UH.  What query parser are you
> using?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley

Re: Unified Highlighter and Fuzzy Searches

Posted by David Smiley <ds...@apache.org>.
I tried this in tests both at the Lucene layer and Solr layer and I'm not
seeing the failure to highlight for the UH.  What query parser are you
using?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
