Posted to solr-user@lucene.apache.org by uyilmaz <uy...@vivaldi.net.INVALID> on 2020/11/04 11:43:02 UTC

when to use stored over docValues and useDocValuesAsStored

Hi,

I make heavy use of streaming expressions and facets, and I export large amounts of data from Solr to Spark for analysis.

Please correct me if I'm wrong:

+ requesting a non-docValues field in a response causes the whole document to be decompressed and read from disk
+ streaming expressions and the export handler require every field they read to have docValues

- docValues increase index size and therefore memory requirements, while stored fields only use disk space
- stored preserves the order of multivalued fields

It seems stored is only useful when I have a multivalued field whose index-time order I care about, and since I will be using the export handler, it will use docValues anyway and lose the order.

So is there any case where I need stored=true?

Best,
ufuk

-- 
uyilmaz <uy...@vivaldi.net>

Re: when to use stored over docValues and useDocValuesAsStored

Posted by Erick Erickson <er...@gmail.com>.
> On Nov 4, 2020, at 6:43 AM, uyilmaz <uy...@vivaldi.net.INVALID> wrote:
> 
> Hi,
> 
> I make heavy use of streaming expressions and facets, and I export large amounts of data from Solr to Spark for analysis.
> 
> Please correct me if I'm wrong:
> 
> + requesting a non-docValues field in a response causes the whole document to be decompressed and read from disk

Non-docValues fields don’t work at all for many stream sources; IIRC only the Topic Stream will work with stored values. Otherwise the read/decompress/extract cycle would be unacceptable performance-wise for large data sets.

> + streaming expressions and the export handler require every field they read to have docValues

Pretty much.
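
As a rough illustration of that requirement, here is a minimal Python sketch that builds a request URL for the /export handler. The host, collection name, and field names are made up for the example, and build_export_url is a hypothetical helper, not part of any Solr client library. /export needs both fl and sort, and every field named in them has to have docValues enabled in the schema.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, fields, sort):
    """Build a URL for Solr's /export handler (hypothetical helper).

    Assumption: every field in `fields` and in `sort` is defined with
    docValues="true" in the schema; this function cannot check that.
    """
    if not fields:
        raise ValueError("/export requires an fl parameter")
    if not sort:
        raise ValueError("/export requires a sort parameter")
    params = {"q": query, "fl": ",".join(fields), "sort": sort}
    return f"{base_url.rstrip('/')}/{collection}/export?{urlencode(params)}"


# Example values only: the host, the "logs" collection, and the field names
# are assumptions made for this sketch.
url = build_export_url(
    "http://localhost:8983/solr", "logs", "*:*", ["id", "user_s"], "id asc"
)
print(url)
```

The helper only validates that fl and sort are present; whether the request succeeds still depends on how the fields are defined in the schema.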

> 
> - docValues increase index size and therefore memory requirements, while stored fields only use disk space

Yes. 

> - stored preserves the order of multivalued fields

Yes.
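
To make the ordering point concrete, here is a small Python sketch that only simulates the behaviour; it does not talk to Solr. It assumes a multivalued string field, where docValues are kept as a sorted set, so values come back sorted and without duplicates. Both function names are made up for the illustration.

```python
def read_as_stored(values):
    """Stored values come back in the order they were indexed."""
    return list(values)


def read_as_sorted_set_doc_values(values):
    """Simulate docValues on a multivalued string field.

    Assumption: the field uses sorted-set docValues, which keep each
    distinct value once, in sorted order.
    """
    return sorted(set(values))


indexed = ["walk", "run", "walk", "jump"]  # example input, index-time order
print(read_as_stored(indexed))
print(read_as_sorted_set_doc_values(indexed))
```

The first call returns the list unchanged; the second loses both the original order and the repeated "walk".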

> 
> It seems stored is only useful when I have a multivalued field whose index-time order I care about, and since I will be using the export handler, it will use docValues anyway and lose the order.

Yes.

> 
> So is there any case where I need stored=true?

Not for export outside of the Topic Stream as above. stored=true is there for things like showing the user the original input and highlighting.

> 
> Best,
> ufuk
> 
> -- 
> uyilmaz <uy...@vivaldi.net>