Posted to solr-user@lucene.apache.org by Vaibhav Bhandari <va...@gmail.com> on 2015/03/18 20:59:43 UTC

High memory usage while querying with sort using cursor

Hi all,

My setup is as follows:

*Collection* size: 32GB, 2 shards, replication factor: 2 (~16GB on each
replica). Number of rows: 250 million
4 *Solr* nodes: RAM: 30GB each. Heap size: 8GB. Version: 4.9.1

Besides the collection in question, the nodes also host some other
collections. The total size of all collections on each node is 30GB (which
is the same as the amount of RAM on each node).

A simple query on the collection: ../select?q=*:* works perfectly fine.

But as soon as I add sorting, it crashes the nodes with OOM:
.../select?q=*:*&sort=unique_id asc&rows=0.

I have tried disabling the filter cache and the query result cache, but
that did not help either.

Any ideas/suggestions?

Thanks,
Vaibhav

Re: High memory usage while querying with sort using cursor

Posted by Vaibhav Bhandari <va...@gmail.com>.
Thanks Chris, that makes a lot of sense.



On Wed, Mar 18, 2015 at 3:16 PM, Chris Hostetter <ho...@fucit.org>
wrote:

>
> : A simple query on the collection: ../select?q=*:* works perfectly fine.
> :
> : But as soon as I add sorting, it crashes the nodes with OOM:
> : .../select?q=*:*&sort=unique_id asc&rows=0.
>
> if you don't have docValues="true" on your unique_id field, then sorting
> requires Solr to build up a large in-memory data structure (formerly known
> as "FieldCache", now just an on-the-fly DocValues structure).
>
> With explicit docValues constructed at index time, a lot of that data can
> just live in the operating system's filesystem cache, and Lucene only has
> to load a small portion of it into the heap.
>
>
>
> -Hoss
> http://www.lucidworks.com/
>

Re: High memory usage while querying with sort using cursor

Posted by Chris Hostetter <ho...@fucit.org>.
: A simple query on the collection: ../select?q=*:* works perfectly fine.
: 
: But as soon as I add sorting, it crashes the nodes with OOM:
: .../select?q=*:*&sort=unique_id asc&rows=0.

if you don't have docValues="true" on your unique_id field, then sorting
requires Solr to build up a large in-memory data structure (formerly known
as "FieldCache", now just an on-the-fly DocValues structure).

With explicit docValues constructed at index time, a lot of that data can
just live in the operating system's filesystem cache, and Lucene only has
to load a small portion of it into the heap.



-Hoss
http://www.lucidworks.com/
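
To illustrate the docValues suggestion above, a field definition in
schema.xml might look roughly like the following. This is a minimal sketch
and is not taken from the thread: only the field name unique_id and the
docValues="true" attribute come from the messages above, while the "string"
field type and the indexed/stored/required settings are assumptions about
the poster's schema. Because docValues are constructed at index time,
documents that are already indexed would need to be reindexed before the
sort can use them.

```xml
<!-- Hypothetical schema.xml fragment; the type and the other attributes
     are assumed, only unique_id and docValues="true" are from the thread. -->
<field name="unique_id"
       type="string"
       indexed="true"
       stored="true"
       required="true"
       docValues="true"/>
```

With a definition along these lines, the sort data for unique_id is written
to disk at index time, so most of it can be served from the operating
system's filesystem cache rather than from the 8GB heap.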