Posted to solr-user@lucene.apache.org by vivek sar <vi...@gmail.com> on 2009/08/03 20:46:04 UTC

Re: Boosting for most recent documents

Hi,

 This is a related question about "getting the latest records first". After trying
a few of the suggested approaches (function query, index-time boosting) for getting
the latest first, I settled on a simple "sort" parameter,

     sort=field+asc

As per wiki, http://wiki.apache.org/solr/SchemaDesign?highlight=(sort),

Lucene would cache "4 bytes * the number of documents" plus the unique
terms for the sorted field in the FieldCache. This is done so subsequent
sort requests can be served from the cache. So the memory usage, if I
have 1 billion records in one Indexer instance, would be, for example,

1) 1 billion records
2) sort on time stamp field (rounded to hour) - for 1 year - 8760
unique terms. (negligible)
3) Total memory requirement for sorting on this single field would be
around 1G documents * 4 bytes = 4GB
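
The estimate above can be sketched in a few lines of Python. This is only a
rough back-of-the-envelope model of the "4 bytes * the number of documents
plus unique terms" rule quoted from the wiki. The function name and the
average term size of 20 bytes are assumptions made for illustration, not
figures from Lucene itself.

```python
def estimate_sort_cache_bytes(num_docs, unique_terms,
                              avg_term_bytes=20, bytes_per_doc=4):
    """Rough FieldCache estimate for sorting on one field.

    avg_term_bytes is a hypothetical average term size, assumed here
    only to show that the term storage is negligible.
    """
    per_doc_array = num_docs * bytes_per_doc      # one 4-byte entry per document
    term_storage = unique_terms * avg_term_bytes  # the unique terms themselves
    return per_doc_array + term_storage


# 1 billion documents, hourly timestamps for one year (365 * 24 = 8760 terms)
total = estimate_sort_cache_bytes(1_000_000_000, 365 * 24)
print(f"{total / 1e9:.2f} GB")
```

Under these assumptions the per-document array dominates, so the total is
driven almost entirely by the number of documents and hardly at all by the
number of unique terms.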

So, even if I run only one sort query a day, 4GB would still be
required at all times. Is there any way to tell Solr/Lucene to
release the memory once the query has been run? Basically I don't want
the cache. I've commented out all the cache parameters in
solrconfig.xml, but the very first time I run the sort query the
memory still jumps by 4GB and stays there.

Is there any way to make Lucene/Solr use less memory for sorting, so
my application can scale (i.e. so the sorting memory requirement isn't
a function of the number of documents)?

Thanks,
-vivek





On Thu, Jul 16, 2009 at 3:10 PM, Chris
Hostetter<ho...@fucit.org> wrote:
>
> :   Does anyone know if Solr supports sorting by internal document ids,
> : i.e, like Sort.INDEXORDER in Lucene? If so, how?
>
> It does not.  In Solr the decision to make "score desc" the default
> sort meant there is no way to request simple docId ordering.
>
> : Also, if anyone has any insight on whether a function query loads up
> : unique terms (like field sorts) in memory or not.
>
> It uses the exact same FieldCache as sorting.
>
>
>
>
> -Hoss
>

Re: Boosting for most recent documents

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Aug 3, 2009 at 2:46 PM, vivek sar<vi...@gmail.com> wrote:
> So, even if I run only one sort query a day, 4GB would still be
> required at all times. Is there any way to tell Solr/Lucene to
> release the memory once the query has been run? Basically I don't want
> the cache. I've commented out all the cache parameters in
> solrconfig.xml, but the very first time I run the sort query the
> memory still jumps by 4GB and stays there.

There is currently no way to tell Lucene not to cache the FieldCache
entry it uses for sorting.
If you call commit though, a new searcher will be opened and the
memory should be released.
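
As an illustration of the commit suggestion above, here is a minimal Python
sketch that builds the URL for a commit request to Solr's update handler.
The base URL (http://localhost:8983/solr) and the helper name are
assumptions for the example. The sketch only constructs the URL and does
not send anything, so adjust it for your own host and core before use.

```python
from urllib.parse import urlencode


def build_commit_url(base_url):
    """Return the update-handler URL that asks Solr to commit.

    base_url is a hypothetical Solr location; a commit causes a new
    searcher to be opened.
    """
    return f"{base_url.rstrip('/')}/update?{urlencode({'commit': 'true'})}"


commit_url = build_commit_url("http://localhost:8983/solr")
print(commit_url)
```

Requesting this URL (for example with curl or urllib.request.urlopen)
after the daily sort query has run would issue the commit. Whether the
memory actually drops afterwards depends on the old searcher being closed
and the JVM garbage collector running.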

-Yonik
http://www.lucidimagination.com