Posted to solr-user@lucene.apache.org by Rajarshi Guha <ra...@gmail.com> on 2012/11/17 02:54:28 UTC

Highlighting and storage overhead

Hi, we're using Solr 3.6 to index and search a number of entities. The
entities have a large number of fields, and to enable full-text search
across all of them I created a catch-all text field, which is indexed.

Initially I stored the field, which allowed me to highlight the matching
fragment in the catch-all field.

However, the field is generally very large, and storing it led to poor
performance. As a result we no longer store it and thus cannot do
highlighting.

My questions are:

1) Is it preferable to have such a catch-all field that collapses multiple
fields? Or is it better to keep the fields separate and use the DisMax
parser?

2) Must fields be stored to support highlighting? If so, what is good
practice when one has many fields and would like to include them all when
running a query *and* support highlighting?

Any pointers would be appreciated.

-- 
Rajarshi Guha | http://blog.rguha.net
NIH Center for Advancing Translational Science

Re: Highlighting and storage overhead

Posted by Rajarshi Guha <ra...@gmail.com>.

Thanks for the pointer. If I were to use (e)dismax, is it possible to
identify the field(s) that matched the query (irrespective of whether the
fields are stored or not)?
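
One possible approach, offered as a sketch and not something confirmed in
this thread: sending debugQuery=true with the request makes Solr return
per-document "explain" text, and that text names the fields whose terms
contributed to the score, whether or not those fields are stored. The
explain text is a debugging aid, not a stable format, so the regular
expression and the hand-written sample below are assumptions for
illustration only.

```python
import re

# Assumption: explain lines contain fragments such as "weight(title:kinase"
# or "fieldWeight(title:kinase". The exact layout varies by Solr version.
_WEIGHT_RE = re.compile(r"[wW]eight\((\w+):")


def matched_fields(explain_text):
    """Return the set of field names mentioned in an explain string."""
    return set(_WEIGHT_RE.findall(explain_text))


# Hand-written sample in the general shape of explain output (illustrative,
# not copied from a real response).
sample_explain = """
0.42 = (MATCH) max of:
  0.42 = (MATCH) weight(title:kinase^3.0 in 17), product of:
    0.30 = (MATCH) fieldWeight(title:kinase in 17), product of:
  0.11 = (MATCH) weight(abstract:kinase in 17), product of:
"""

fields = matched_fields(sample_explain)
print(sorted(fields))
```

Note that debugQuery adds overhead, so this is better suited to diagnosis
than to every production request.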


On Fri, Nov 16, 2012 at 9:10 PM, Otis Gospodnetic <
otis.gospodnetic@gmail.com> wrote:

> Hello,
>
> I prefer individual fields because this allows one to apply different
> query boosting and other nice (e)dismax things on different fields.  With a
> catch-all field you lose that.  Yes, to have highlighting you need to store
> fields you want to use for highlighting.
>
> See http://search-lucene.com/?q=solr+catchall+%22catch+all%22+catch-all
>
>
> Otis
> --
> Performance Monitoring - http://sematext.com/spm/index.html
> Search Analytics - http://sematext.com/search-analytics/index.html
>


-- 
Rajarshi Guha | http://blog.rguha.net
NIH Center for Advancing Translational Science

Re: Highlighting and storage overhead

Posted by Otis Gospodnetic <ot...@gmail.com>.

Hello,

I prefer individual fields because this allows one to apply different query
boosting and other nice (e)dismax things on different fields.  With a
catch-all field you lose that.  Yes, to have highlighting you need to store
fields you want to use for highlighting.
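
As a rough illustration of the per-field approach described above, the
sketch below builds request parameters for an edismax query with a
different boost per field and highlighting on a subset of them. The field
names (title, abstract, description), the boost values, and the helper
name build_edismax_params are hypothetical; defType, qf, hl and hl.fl are
standard Solr request parameters. Only the fields listed in hl.fl need to
be stored, so a large field can stay indexed-only if it is left out.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, boosts, highlight_fields):
    """Build Solr parameters for an edismax query with highlighting.

    boosts maps field name -> boost; highlight_fields lists the stored
    fields to highlight. All field names here are hypothetical.
    """
    # qf takes space-separated "field^boost" entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": qf,
        "hl": "true",
        # hl.fl takes a comma-separated list of fields to highlight.
        "hl.fl": ",".join(highlight_fields),
    }


params = build_edismax_params(
    "kinase inhibitor",
    {"title": 3, "abstract": 1.5, "description": 1},
    ["title", "abstract"],
)

# URL-encode for appending to a select URL such as .../select?<query_string>
query_string = urlencode(params)
print(query_string)
```

Here description is searched but not highlighted, which is one way to keep
storage down while still covering every field in the query.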

See http://search-lucene.com/?q=solr+catchall+%22catch+all%22+catch-all


Otis
--
Performance Monitoring - http://sematext.com/spm/index.html
Search Analytics - http://sematext.com/search-analytics/index.html