Posted to users@solr.apache.org by Alessandro Benedetti <a....@sease.io> on 2021/12/03 10:52:17 UTC

Re: Highlighting: echo matching query text?

Hi Stephen,
so you want to show in the UI the original token in the inverted index that
caused the match?
This depends on the text analysis configured on the Solr side, and it would
be far from intuitive for your end user.

With stemming you picked the perfect example:
the stem of a term may not even be a word in the language
associated with the field.
I'm not sure that showing the token from the index would help at all with
explainability (compared to not showing it).
Unless the user is informed about the entire text analysis chain (including
the query-time text analysis) and understands it, I guess showing
the token will just complicate the situation even more:
*e.g.*
q: arguing
D1 : argues (Match argu)

Without knowing what index time and query time mean, what stemming means, and
that stemming was applied at both query time and index time, I am not
entirely sure it's going to add that much to the end-user experience.
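
To make the example above concrete, here is a minimal sketch of why the
query term and the document term meet on a token that is not a real word.
The suffix list and the function name toy_stem are assumptions made up for
illustration; this is not the stemmer Solr would actually apply to the field.

```python
# Toy suffix-stripping stemmer (an assumption for illustration only; it is
# NOT the stemmer configured in a real Solr field type).
SUFFIXES = ("ing", "es", "s")


def toy_stem(token):
    """Lowercase the token and strip the first matching suffix."""
    token = token.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


query_term = "arguing"    # what the user typed
document_term = "argues"  # what the document contains

# Both sides reduce to the same token, which is why the document matches.
print(toy_stem(query_term), toy_stem(document_term))  # prints "argu argu"
```

The shared token "argu" is what lives in the inverted index, and it is
exactly the kind of value that would mean little to an end user.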

Aside from my personal observations, I don't think there's anything for this
in the heavily layered highlighting module, but you should be able to pick
one of the implementations and customize it.

Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


On Tue, 30 Nov 2021 at 19:00, Stephen Lewis Bianamara <
stephen.bianamara@gmail.com> wrote:

> Hi SOLR Community,
>
> I am investigating some different options with highlighting, and one
> feature I wanted to build would require mapping a highlighted match back
> to the original matching token. I couldn't find a way to do that in the
> documentation, so I'm guessing that it doesn't exist yet. The application
> for this would be to leverage Solr to understand the query -> field
> matching with many field types with varying matching rules.
>
> The simplest example of what I'd like would be something like this in
> English: A document like
>
>    { "id": "test", "comment_en": "we like dogs" }
>
> ...and a query like "dog OR cat". I'd like highlighting to be able to return
> something like this:
>
> {"highlighting": {
>    "text_en": {
>        "comment_en": ["we like <em source='dog'>dogs</em>" ]}}}
>
> So the essence is that I don't need to know anything about English matching
> rules (in this case plurals) outside of Solr to know how it came to the
> conclusion that this document was a match.
>
> Has anyone come up with a solution to this before? Does anyone know an
> existing feature request for this if not?
>
> Thanks!
> Stephen
>

Re: Highlighting: echo matching query text?

Posted by Stephen Lewis Bianamara <st...@gmail.com>.
Hi Alessandro,

Thanks for your reply, and apologies for being slow to get back to you.

It wouldn't be intended for the UI in the end, but rather for application
logic in post-processing. In your example of *argues* matching *arguing*,
the desired result would actually be something like

 "comment_en": ["in debate club we treat <em source='argues'>arguing</em> like a sport" ]}}}


I'll try taking a look at the highlighters and see if this seems possible
to implement.
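
Until something like this exists inside a highlighter, the post-processing
idea can be approximated on the client side. The sketch below is only an
illustration under stated assumptions: the snippets use plain <em> tags, and
the toy_stem function (hypothetical, made up here) stands in for the field's
real analysis chain, which the client would have to reproduce faithfully.
The name annotate_sources is also hypothetical and not a Solr API.

```python
import re

# Toy stand-in for the field's analysis chain (an assumption; a real
# implementation would need to mirror the analysis configured in Solr).
SUFFIXES = ("ing", "es", "s")


def toy_stem(token):
    """Lowercase the token and strip the first matching suffix."""
    token = token.lower()
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


# Assumes highlights are wrapped in plain <em>...</em> tags.
EM_TAG = re.compile(r"<em>(.*?)</em>")


def annotate_sources(snippet, query_terms):
    """Add source='...' to each <em> whose stem equals a query term's stem."""
    stem_to_query = {toy_stem(term): term for term in query_terms}

    def add_source(match):
        highlighted = match.group(1)
        source = stem_to_query.get(toy_stem(highlighted))
        if source is None:
            # No query term explains this highlight; leave it untouched.
            return match.group(0)
        return "<em source='{}'>{}</em>".format(source, highlighted)

    return EM_TAG.sub(add_source, snippet)


print(annotate_sources("we like <em>dogs</em>", ["dog", "cat"]))
# prints: we like <em source='dog'>dogs</em>
```

This only works when the client-side analysis agrees with the index-side
analysis, which is the knowledge I was hoping to keep inside Solr, so it is
a stopgap rather than the feature itself.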

Thanks,
Stephen

