Posted to solr-user@lucene.apache.org by S G <sg...@gmail.com> on 2017/12/21 01:09:03 UTC
DocValues for multivalued strings and boolean fields
Hi,
One of our Solr users is trying to set docValues="true" for multivalued
string fields and boolean-type fields.
I am not sure what the performance impact of that would be.
Can docValues negatively affect performance in any way?
We are using Solr 6.5.1 and also experimenting with 7.1.0
Thanks
SG
Re: DocValues for multivalued strings and boolean fields
Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/20/2017 6:09 PM, S G wrote:
> One of our Solr users is trying to set docValues="true" for multivalued
> string fields and boolean-type fields.
>
> I am not sure what the performance impact of that would be.
> Can docValues negatively affect performance in any way?
Adding to what Emir said:
The docValues data will be the same as stored data, but it will be
uncompressed and written in such a way that Lucene can read all values
for one field simply by reading data off the disk; no computations or
seeks within the file are required.
If the field is indexed and stored, then docValues will not be accessed
during normal queries unless there is a sort parameter or a facet
parameter that mentions a field with docValues. If present, docValues
data will be used for sorting and facets; otherwise, indexed values will
be used. Usually, sorting or faceting with docValues uses less memory and
runs faster than the same operation without docValues. If the
machine has insufficient system RAM to effectively cache index data,
performance may not improve.
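
For illustration, here is a minimal Python sketch of the kind of request
that would touch docValues as described above: a query with a sort
parameter and a facet parameter. The field names "is_active" (a
single-valued boolean) and "tags" (a multivalued string) are
hypothetical, and the sketch only builds the query string for the
/select handler; it does not contact a Solr server.

```python
from urllib.parse import urlencode, parse_qs


def build_select_query(sort_field, facet_field, rows=10):
    """Build a query string for Solr's /select handler.

    The sort and facet.field parameters are the ones that would make
    Solr read docValues for the named fields, if those fields have them.
    """
    params = [
        ("q", "*:*"),                    # match all documents
        ("rows", str(rows)),
        ("sort", f"{sort_field} asc"),   # sorting reads the sort field's values
        ("facet", "true"),
        ("facet.field", facet_field),    # faceting reads the facet field's values
    ]
    return urlencode(params)


# Hypothetical field names, chosen only for this example.
query_string = build_select_query("is_active", "tags")
print(query_string)
```

The sort is placed on the single-valued field and the facet on the
multivalued one, since a plain sort on a multivalued field is not
something Solr accepts directly.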
When docValues is added to a field, a complete reindex is required, or
Solr will not work properly.
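
As a rough sketch of what enabling docValues could look like, the Python
below builds Schema API "replace-field" commands for the two kinds of
fields in the question. It assumes a managed schema, field types named
"string" and "boolean" as in the default configsets, and hypothetical
field names "tags" and "is_active". It only prints the JSON that would
be POSTed to /solr/<collection>/schema; the complete reindex mentioned
above is still required afterwards.

```python
import json


def docvalues_field_command(name, field_type, multi_valued):
    """Build a Schema API 'replace-field' command that turns on docValues.

    replace-field expects the full field definition, so the other
    attributes are restated here; indexed/stored values are assumptions.
    """
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": True,
            "stored": True,
            "multiValued": multi_valued,
            "docValues": True,
        }
    }


# Hypothetical fields: a multivalued string and a single-valued boolean.
commands = [
    docvalues_field_command("tags", "string", multi_valued=True),
    docvalues_field_command("is_active", "boolean", multi_valued=False),
]

for command in commands:
    print(json.dumps(command, indent=2))
```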
If a field that already contains docValues has a change in the setting
for multiValued, then that will require a reindex, but you must also
take another step -- completely wiping the index directory before
reloading or restarting. If the wipe doesn't happen in this situation,
then the core is going to completely break and throw exceptions.
Thanks,
Shawn
Re: DocValues for multivalued strings and boolean fields
Posted by Emir Arnautović <em...@sematext.com>.
Hi SG,
Doc values are another file to write, so indexing performance will
suffer. In theory, query performance will suffer as well, because the
alternative is an in-memory structure (fieldCache and fieldValueCache).
In practice it will not, because the in-memory structure requires a
larger heap and takes time and resources to build after each commit or
on the first query, and the doc values files are likely to be cached by
the OS, so reads will not happen at "disk speed".
HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/