Posted to issues@solr.apache.org by "Christine Poerschke (Jira)" <ji...@apache.org> on 2022/03/29 11:07:00 UTC

[jira] [Commented] (SOLR-1105) Use a different stored field for highlighting

    [ https://issues.apache.org/jira/browse/SOLR-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514003#comment-17514003 ] 

Christine Poerschke commented on SOLR-1105:
-------------------------------------------

[~dsmiley] wrote on SOLR-16111:
{quote}Can this be used to solve https://issues.apache.org/jira/browse/SOLR-1105 ?
{quote}
Interesting question. Maybe, partly.

So the {{hl.queryFieldPattern}} under SOLR-16111 can be used in a
{code:xml}
<field name="text_indexed_not_stored" type="text" indexed="true" stored="false"/>
<field name="text_stored_not_indexed" type="text" stored="true" indexed="false"/>
{code}
scenario, e.g. if all documents are to be indexed but highlighting, and thus storage, is required only for a subset of documents.

For a request
{code:java}
q=text_indexed_not_stored:foo OR another_indexed_text_field:bar

hl.queryFieldPattern=text_indexed_not_stored

hl.fl=text_stored_not_indexed
{code}
the {{foo}} term (but not the {{bar}} term) is extracted from the query, and any {{foo}} within the {{text_stored_not_indexed}} field is highlighted.
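
To make the foo/bar behaviour concrete, here is a rough sketch in Python (not Solr's Java, and not the actual SOLR-16111 implementation). It assumes the query has already been parsed into (field, term) pairs and that the pattern is a simple glob; {{extract_terms}} and {{highlight}} are hypothetical helper names.

```python
from fnmatch import fnmatchcase


def extract_terms(clauses, field_pattern):
    """Keep only the terms whose query field matches the glob pattern."""
    return [term for field, term in clauses if fnmatchcase(field, field_pattern)]


def highlight(text, terms):
    """Wrap every whitespace-separated word that equals an extracted term."""
    return " ".join(f"<em>{w}</em>" if w.lower() in terms else w for w in text.split())


# q=text_indexed_not_stored:foo OR another_indexed_text_field:bar (pre-parsed by hand)
clauses = [
    ("text_indexed_not_stored", "foo"),
    ("another_indexed_text_field", "bar"),
]

# hl.queryFieldPattern=text_indexed_not_stored
terms = extract_terms(clauses, "text_indexed_not_stored")

# hl.fl=text_stored_not_indexed: the stored text of that field is what gets highlighted
snippet = highlight("foo and bar", terms)
```

Only {{foo}} survives the extraction, so {{bar}} stays unmarked in the snippet even though it occurs in the stored text.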

In this foo/bar scenario both fields have the same type, {{text}}, whereas in the multi-lingual scenario the types differ. An example may help to think it through:
{code:xml}
<field name="title"    type="text"    stored="true"  indexed="true"/>
<field name="title_ru" type="text_ru" stored="false" indexed="true"/>
<field name="title_en" type="text_en" stored="false" indexed="true"/>
<field name="title_de" type="text_de" stored="false" indexed="true"/>
{code}
and
{code:java}
"document" : {
  "title"    : "hello hallo privyet",
  "title_en" : "hello hallo privyet",
  "title_de" : "hello hallo privyet",
  "title_ru" : "hello hallo privyet"
}
{code}
and
{code:java}
q=title_en:hello OR title_de:hallo OR title_ru:privyet OR some_other_indexed_field:foobar

hl.queryFieldPattern=title_*

hl.fl=title
{code}
as the hypothetical schema, document and query. The terms should then be extracted correctly from the {{title_*}} clauses. But when highlighting on the generic {{title}} field, would it depend on the exact analysis chain details and on the search terms whether or not all the terms are correctly highlighted?
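
A toy model of why the analysis chain might matter, again in Python and with loud assumptions: the two "analyzers" below are made-up stand-ins for the {{text}} and {{text_en}} field types (the English one just strips a trailing "s"), and the matching logic is a simplification, not how any Solr highlighter is actually implemented.

```python
def analyze_plain(text):
    """Stand-in for the generic 'text' type: lowercase and split on whitespace."""
    return text.lower().split()


def analyze_en(text):
    """Stand-in for 'text_en': like analyze_plain plus a toy plural stemmer."""
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in analyze_plain(text)]


def highlight(stored_text, query_terms, analyzer):
    """Mark each word whose analyzed form equals one of the query terms."""
    out = []
    for word in stored_text.split():
        token = analyzer(word)[0]
        out.append(f"<em>{word}</em>" if token in query_terms else word)
    return " ".join(out)


stored_title = "Hellos hallo privyet"

# title_en:hellos is analyzed by the English chain at query time
query_terms = set(analyze_en("hellos"))

# Highlighting on the generic field re-analyzes the stored text with the plain chain
plain = highlight(stored_title, query_terms, analyze_plain)

# For comparison: what the language-specific chain would have marked
english = highlight(stored_title, query_terms, analyze_en)
```

In this toy model the stemmed query term no longer equals the unstemmed token from the generic chain, so {{plain}} contains no highlight while {{english}} does. For the literal {{hello}}/{{hallo}}/{{privyet}} terms both chains agree, which is why the answer seems to depend on the terms and the analysis chains involved.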

> Use a different stored field for highlighting
> ---------------------------------------------
>
>                 Key: SOLR-1105
>                 URL: https://issues.apache.org/jira/browse/SOLR-1105
>             Project: Solr
>          Issue Type: Improvement
>          Components: highlighter
>            Reporter: Dmitry Lihachev
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: SOLR-1105-1_4_1.patch, SOLR-1105.patch, SOLR-1105_shared_content_field_1.3.0.patch
>
>
> DefaultSolrHighlighter uses stored field content to highlight. This has a disadvantage: with multilingual indexing the index grows quickly, because several fields have to be stored with the same content. This patch allows DefaultSolrHighlighter to use a "contentField" attribute to look up content in an external field.
> Excerpt from old schema:
> {code:xml}
> <field name="title" type="text" stored="true" indexed="true" />
> <field name="title_ru" type="text_ru" stored="true" indexed="true" />
> <field name="title_en" type="text_en" stored="true" indexed="true" />
> <field name="title_de" type="text_de" stored="true" indexed="true" />
> {code}
> The same after patching; the highlighter will now get the content stored in the "title" field:
> {code:xml}
> <field name="title" type="text" stored="true" indexed="true" />
> <field name="title_ru" type="text_ru" stored="false" indexed="true" contentField="title"/>
> <field name="title_en" type="text_en" stored="false" indexed="true" contentField="title"/>
> <field name="title_de" type="text_de" stored="false" indexed="true" contentField="title"/>
> {code}


