Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/12/01 15:54:11 UTC

[jira] [Commented] (SOLR-8344) Replace reading stored fields to instead read from docValues

    [ https://issues.apache.org/jira/browse/SOLR-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033802#comment-15033802 ] 

Yonik Seeley commented on SOLR-8344:
------------------------------------

bq. But are you also arguing for always loading fields from docvalues even if they are stored?

If a client requests fl=a,b,c (and all three fields have docValues *and* are stored), reading them from docValues may be slower *if* the docValues are not cached yet.
The question then becomes: why are they not cached?
- this is a one-off query, and the docValues are not normally used
  -- this is a case we should not be optimizing too much for
- this is going to be a very common query
  -- in this case, we should use docValues anyway; the average latency will drop as things get cached.

If we're requesting a large result set, it probably makes sense to use docValues: every cache miss brings in 4K of that column, so subsequent accesses become less likely to miss (vs. the same scenario in stored fields).  If the sort is by _docid_, then access will even be linear, meaning there will be few cache misses, and OS read-ahead will reduce them even further.
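
As a rough illustration of the page-granularity argument above, here is a minimal Python sketch that counts the distinct 4K pages touched when reading a result set from three column-oriented fields versus from row-oriented stored documents. The 8-byte fixed-width column value, the 1 KB uncompressed stored document, and the names pages_touched and compare are assumptions made for illustration. Real docValues and stored fields are encoded and compressed, so this only hints at the direction of the effect and is no substitute for benchmarking.

```python
import random

PAGE_SIZE = 4096  # bytes per OS page, matching the "4K" figure above


def pages_touched(doc_ids, bytes_per_doc, page_size=PAGE_SIZE):
    """Count distinct pages read when each doc occupies a fixed-width slot."""
    pages = set()
    for doc in doc_ids:
        start = doc * bytes_per_doc
        end = start + bytes_per_doc - 1
        pages.update(range(start // page_size, end // page_size + 1))
    return len(pages)


def compare(num_docs, result_size, num_fields=3, seed=0):
    """Compare pages touched for random and linear (_docid_ order) access.

    Assumed sizes (hypothetical, not measured):
      - column_width: one fixed-width 8-byte value per doc, per field
      - stored_width: one 1 KB stored document per doc
    """
    column_width = 8
    stored_width = 1024
    rng = random.Random(seed)
    random_ids = rng.sample(range(num_docs), result_size)
    linear_ids = range(result_size)
    return {
        # Each requested field lives in its own column, so multiply by num_fields.
        "docvalues_random": num_fields * pages_touched(random_ids, column_width),
        "docvalues_linear": num_fields * pages_touched(linear_ids, column_width),
        # A stored document holds all of its fields together, so one pass suffices.
        "stored_random": pages_touched(random_ids, stored_width),
        "stored_linear": pages_touched(linear_ids, stored_width),
    }


if __name__ == "__main__":
    for name, count in compare(num_docs=1_000_000, result_size=10_000).items():
        print(f"{name}: {count} pages")
```

With these assumed sizes, reading 10,000 consecutive docs touches 60 pages across the three columns versus 2,500 pages of stored documents. The random-access figures depend on the sample, but the column layout packs many more docs into each page that gets pulled in.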

If the index is so massive that the docValues for these three fields can't be cached for the random-access case, then how will docValues compare to stored values?
With a disk seek per doc access, this is going to be a slow system regardless, and a very specialized one (i.e. if one can't effectively cache these fields, then things like sorting/faceting on these fields will be slow as well).

Based on what we know now, it feels like docValues is the right default.
Benchmarking to verify our assumptions would be a good thing.
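
To make the proposed default concrete, here is a minimal Python sketch of the selection rule: for each field in fl, prefer docValues when the field has them, and fall back to stored fields otherwise. FieldInfo, choose_source, and plan_retrieval are hypothetical names used only for illustration. They are not Solr APIs, and this is not a description of how Solr actually implements field retrieval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical stand-in for the schema flags discussed in this issue."""
    name: str
    stored: bool
    doc_values: bool


def choose_source(field):
    """Apply the 'docValues by default' rule to a single field."""
    if field.doc_values:
        return "docValues"
    if field.stored:
        return "stored"
    return None  # the field cannot be returned from either source


def plan_retrieval(fl, schema):
    """Map each field named in an fl parameter to the source it is read from."""
    plan = {}
    for name in fl.split(","):
        name = name.strip()
        plan[name] = choose_source(schema[name])
    return plan


if __name__ == "__main__":
    schema = {
        "a": FieldInfo("a", stored=True, doc_values=True),
        "b": FieldInfo("b", stored=True, doc_values=False),
        "c": FieldInfo("c", stored=False, doc_values=True),
    }
    print(plan_retrieval("a,b,c", schema))
```

Under this rule, a field that is both stored and has docValues (field a in the example) is always read from docValues, which is the case being debated here.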


> Replace reading stored fields to instead read from docValues
> ------------------------------------------------------------
>
>                 Key: SOLR-8344
>                 URL: https://issues.apache.org/jira/browse/SOLR-8344
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Ishan Chattopadhyaya
>
> This issue was discussed in the comments at SOLR-8220. Splitting it out to a separate issue so that we can have a focused discussion on whether/how to do this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)