Posted to solr-user@lucene.apache.org by da...@correo.aeat.es on 2014/03/18 08:44:02 UTC

About enableLazyFieldLoading and memory

Hello,

We are running SolrCloud 4.7, but this question also applies to other 
versions, because we have tested this on several installations.

We have a very large index (more than 400K docs) with large documents, but 
our queries do not request the large fields in the fl parameter. Even so, 
we have seen memory grow when the matching documents are large, because of 
the documentCache. This only happens with certain queries that return 
perhaps more than 1000 docs in the response. We have set 
enableLazyFieldLoading to true, so we expected the documentCache to grow 
by roughly the same amount for every document, since we only fetch the ID 
field.

My question is: what exactly does Solr do when enableLazyFieldLoading is 
set to true? Why does the size of the document influence memory 
consumption?


Thanks in advance,

David Dávila Atienza
AEAT - Departamento de Informática Tributaria
Subdirección de Tecnologías de Análisis de la Información e Investigación 
del Fraude
Teléfono: 917681160
Extensión: 30160

Re: About enableLazyFieldLoading and memory

Posted by da...@correo.aeat.es.

That could be an interesting test. Unfortunately, I don't have time to do 
it right now, but maybe in the future.

To avoid this memory consumption we have reduced the size of the 
documentCache, and we no longer have any problems. Besides, the big 
queries that can cause problems are never run twice, so the documentCache 
is not needed for them.

If I find time to check that out, I'll post the results.

Best regards,

David Dávila Atienza
AEAT - Departamento de Informática Tributaria
Subdirección de Tecnologías de Análisis de la Información e Investigación 
del Fraude
Teléfono: 917681160
Extensión: 30160
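
As a rough illustration of why shrinking the cache caps memory use, here is a toy LRU cache in Python. It is a sketch only, not Solr's documentCache implementation; the class name ToyLRUCache and the size of 16 are made up for the example. The point is simply that a bounded cache holds at most max_size documents, however many a large query touches.

```python
from collections import OrderedDict


class ToyLRUCache:
    """Hypothetical bounded cache; evicts the least recently used entry."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        # Mark the entry as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Drop the oldest entries until the cache fits its bound again.
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)


# A query touching 1000 documents leaves only max_size of them cached.
cache = ToyLRUCache(max_size=16)
for doc_id in range(1000):
    cache.put(doc_id, {"id": str(doc_id)})
```

If the large queries are never repeated, as described above, the evicted entries would not have been reused anyway, so a small bound costs little.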



From:    Miguel <mi...@juntadeandalucia.es>
To:      solr-user@lucene.apache.org
Date:    19/03/2014 08:35
Subject: Re: About enableLazyFieldLoading and memory




Re: About enableLazyFieldLoading and memory

Posted by Miguel <mi...@juntadeandalucia.es>.

An interesting test would be to disable compression on stored fields and 
check whether your searcher performs better. Disabling compression should 
increase the size of the stored data, but the searcher should be quicker.

I have read that to disable compression, all you need to do is write a 
new codec that uses a stored fields format which does not compress 
stored fields, such as Lucene40StoredFieldsFormat 
<http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/codecs/lucene40/Lucene40StoredFieldsFormat.html>.

Best regards


Re: About enableLazyFieldLoading and memory

Posted by Shawn Heisey <so...@elyograg.org>.

On 3/18/2014 7:18 AM, david.davila@correo.aeat.es wrote:
> Yes, but if I use enableLazyFieldLoading=true and my queries only request 
> very small fields like ID, the documentCache shouldn't grow, even though 
> my stored fields are very big. Am I wrong?

Since Solr 4.1, stored fields are compressed.  This probably means that
in order to get a tiny field out, it must still retrieve an entire
block of compressed data and decompress it.

The information in the issue that added the compression feature says
that only one compressed block is ever retrieved for a complete document.

https://issues.apache.org/jira/browse/LUCENE-4226

I wonder if perhaps either Solr or Lucene is dropping all the data into
one or more caches even though you only requested the ID, simply because
it is already available after decompression.  This is only a guess, and
I hope I'm wrong.  If this is indeed happening, it would defeat lazy
field loading.  Can someone with a better understanding comment?

Thanks,
Shawn
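
To illustrate the guess above, here is a toy Python model of block-compressed stored fields. It is only a sketch: ToyStoredFields, the JSON encoding, and the two-documents-per-block layout are assumptions made for illustration and say nothing about Lucene's actual format or caching. It shows that reading a tiny field still means inflating the whole block.

```python
import json
import zlib


class ToyStoredFields:
    """Hypothetical store that compresses documents together in blocks."""

    def __init__(self, docs, docs_per_block=2):
        self.docs_per_block = docs_per_block
        self.blocks = []
        for start in range(0, len(docs), docs_per_block):
            chunk = docs[start:start + docs_per_block]
            self.blocks.append(zlib.compress(json.dumps(chunk).encode("utf-8")))
        self.last_decompressed_bytes = 0

    def get_field(self, doc_id, field):
        # The whole block is inflated before any single field can be read.
        raw = zlib.decompress(self.blocks[doc_id // self.docs_per_block])
        self.last_decompressed_bytes = len(raw)
        return json.loads(raw)[doc_id % self.docs_per_block][field]


# Four documents, each with a tiny "id" and a large "body".
docs = [{"id": str(i), "body": "x" * 50_000} for i in range(4)]
store = ToyStoredFields(docs)
doc_id_value = store.get_field(3, "id")
print(doc_id_value, store.last_decompressed_bytes)
```

In this model the memory touched per lookup depends on the size of the block, not on the size of the requested field. Whether the decompressed data is then retained in a cache is a separate question that the sketch does not answer.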

Re: About enableLazyFieldLoading and memory

Posted by da...@correo.aeat.es.

Hi Miguel,

Yes, but if I use enableLazyFieldLoading=true and my queries only request 
very small fields like ID, the documentCache shouldn't grow, even though 
my stored fields are very big. Am I wrong?

Best regards,


David Dávila Atienza
AEAT - Departamento de Informática Tributaria
Subdirección de Tecnologías de Análisis de la Información e Investigación 
del Fraude
Teléfono: 917681160
Extensión: 30160



From:    Miguel <mi...@juntadeandalucia.es>
To:      solr-user@lucene.apache.org
Date:    18/03/2014 14:12
Subject: Re: About enableLazyFieldLoading and memory




Re: About enableLazyFieldLoading and memory

Posted by Miguel <mi...@juntadeandalucia.es>.

Hi David

If you use lazy field loading (enableLazyFieldLoading=true), 
documentCache functionality is somewhat limited: the document stored in 
the documentCache will contain only those fields that were passed to the 
fl parameter.

The documentCache requires memory, and the more stored fields you have in 
the index, the more memory it needs. So if you have many stored fields, 
the documentCache requires a lot of memory.

Best regards
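
As a rough picture of lazy field loading, here is a toy Python sketch. It is not Solr's code; LazyField and load_document are hypothetical names, and the model only assumes that fields named in fl are read eagerly while the others are left as placeholders that load on first access.

```python
class LazyField:
    """Hypothetical placeholder that defers reading a stored value."""

    def __init__(self, loader):
        self._loader = loader
        self._value = None
        self._loaded = False

    @property
    def loaded(self):
        return self._loaded

    def value(self):
        # Read the stored value only the first time it is asked for.
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


def load_document(stored, requested):
    """Return a dict with requested fields eager and all others lazy."""
    doc = {}
    for name in stored:
        if name in requested:
            doc[name] = stored[name]
        else:
            # Bind name as a default argument so each lambda keeps its own field.
            doc[name] = LazyField(lambda n=name: stored[n])
    return doc


stored = {"id": "42", "body": "x" * 50_000}
cached = load_document(stored, requested={"id"})
print(cached["id"], cached["body"].loaded)
```

In this model the cached entry stays small until something asks for the large field. It does not show what the underlying storage has to read in order to produce the small field in the first place.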
