Posted to solr-user@lucene.apache.org by Jay Hill <ja...@gmail.com> on 2009/12/19 20:25:45 UTC

Sort fields all look Strings in field cache, no matter schema type

I'm on a project where I'm trying to determine the size of the field cache.
We're seeing lots of memory problems, and I suspect that the field cache is
extremely large, but I'm trying to get exact counts on what's in the field
cache.

One thing that struck me as odd in the output of the stats.jsp page is that
the field cache always shows a String type for a field, even if it is not a
String. For example, the output below is for a field "cscore" that is a
double:

entry#0 : 'org.apache.lucene.index.ReadOnlyDirectoryReader@6239da8a'=>'cscore',class org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#297347471


The index has 4,292,426 documents, so I would expect the field cache size
for this field to be:
cscore: double (8 bytes) x 4,292,426 docs = 34,339,408 bytes
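
As a rough check of that arithmetic, here is a minimal sketch in plain Java. The class name FieldCacheEstimate, both method names, and the 64-bytes-per-term figure are assumptions made up for illustration; only the int[] order / String[] lookup shape of FieldCache$StringIndex comes from Lucene. It counts raw array payload only and ignores JVM object headers.

```java
public class FieldCacheEstimate {

    // Payload of a double[] holding one value per document.
    static long doubleArrayBytes(long numDocs) {
        return numDocs * Double.BYTES;
    }

    // Payload of a StringIndex-style entry: one int ordinal per document
    // plus one String per unique term. bytesPerTerm is a guessed average
    // cost per String (an assumption, not a measured value).
    static long stringIndexBytes(long numDocs, long uniqueTerms, long bytesPerTerm) {
        return numDocs * Integer.BYTES + uniqueTerms * bytesPerTerm;
    }

    public static void main(String[] args) {
        long numDocs = 4_292_426L;

        // Matches the hand calculation above: 34,339,408 bytes.
        System.out.println("double[]:    " + doubleArrayBytes(numDocs) + " bytes");

        // Worst case for the string index: every document has a distinct value.
        System.out.println("StringIndex: " + stringIndexBytes(numDocs, numDocs, 64) + " bytes");
    }
}
```

If the field really is cached as a StringIndex, the second figure is the one to compare against heap usage, and it grows with the number of unique values rather than only with the document count.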

But can someone explain why a double is using FieldCache$StringIndex please?
No matter what the type of the field is in the schema the field cache stats
always show FieldCache$StringIndex.

Thanks,
-Jay

Re: Sort fields all look Strings in field cache, no matter schema type

Posted by Jay Hill <ja...@gmail.com>.
Oh, forgot to add (just to keep the thread complete), the field is being
used for a sort, so it was able to use TrieDoubleField.

Thanks again,
-Jay


On Sat, Dec 19, 2009 at 12:21 PM, Jay Hill <ja...@gmail.com> wrote:

> This field is of class type solr.SortableDoubleField.
>
> I'm actually migrating a project from Solr 1.1 to 1.4, and am in the
> process of trying to update the schema and solrconfig in stages. Updating
> the field to TrieDoubleField w/ precisionStep=0 definitely helped.
>
> Thanks Yonik!
> -Jay
>
> On Sat, Dec 19, 2009 at 11:37 AM, Yonik Seeley <yonik@lucidimagination.com> wrote:
>
>> On Sat, Dec 19, 2009 at 2:25 PM, Jay Hill <ja...@gmail.com> wrote:
>> > One thing that struck me as odd in the output of the stats.jsp page is that
>> > the field cache always shows a String type for a field, even if it is not a
>> > String. For example, the output below is for a field "cscore" that is a
>> > double:
>>
>> What's the class type of the double?  Older style SortableDouble had
>> to use the string index.  Newer style trie-double based should use a
>> double[].
>>
>> It also matters what the FieldCache entry is being used for... certain
>> things like faceting on single valued fields still use the
>> StringIndex.  I believe the stats component does too.  Sorting and
>> function queries should work as expected.
>>
>> -Yonik
>>
>
>

Re: Sort fields all look Strings in field cache, no matter schema type

Posted by Jay Hill <ja...@gmail.com>.
This field is of class type solr.SortableDoubleField.

I'm actually migrating a project from Solr 1.1 to 1.4, and am in the process
of trying to update the schema and solrconfig in stages. Updating the field
to TrieDoubleField w/ precisionStep=0 definitely helped.

Thanks Yonik!
-Jay



On Sat, Dec 19, 2009 at 11:37 AM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Sat, Dec 19, 2009 at 2:25 PM, Jay Hill <ja...@gmail.com> wrote:
> > One thing that struck me as odd in the output of the stats.jsp page is that
> > the field cache always shows a String type for a field, even if it is not a
> > String. For example, the output below is for a field "cscore" that is a
> > double:
>
> What's the class type of the double?  Older style SortableDouble had
> to use the string index.  Newer style trie-double based should use a
> double[].
>
> It also matters what the FieldCache entry is being used for... certain
> things like faceting on single valued fields still use the
> StringIndex.  I believe the stats component does too.  Sorting and
> function queries should work as expected.
>
> -Yonik
>

Re: Sort fields all look Strings in field cache, no matter schema type

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Sat, Dec 19, 2009 at 2:25 PM, Jay Hill <ja...@gmail.com> wrote:
> One thing that struck me as odd in the output of the stats.jsp page is that
> the field cache always shows a String type for a field, even if it is not a
> String. For example, the output below is for a field "cscore" that is a
> double:

What's the class type of the double?  Older style SortableDouble had
to use the string index.  Newer style trie-double based should use a
double[].

It also matters what the FieldCache entry is being used for... certain
things like faceting on single valued fields still use the
StringIndex.  I believe the stats component does too.  Sorting and
function queries should work as expected.
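
To make the difference between the two layouts concrete, here is a simplified model in plain Java. It is not Lucene's code: the class CacheLayouts, the nested StringIndexModel, the method stringIndexLayout, and the sample terms are all made up for illustration, and the term strings only stand in for whatever sortable encoding the field type actually writes to the index.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class CacheLayouts {

    // Simplified stand-in for a string-index cache entry.
    static final class StringIndexModel {
        final String[] lookup; // sorted unique terms
        final int[] order;     // order[docId] = position of that doc's term in lookup

        StringIndexModel(String[] lookup, int[] order) {
            this.lookup = lookup;
            this.order = order;
        }
    }

    // Builds the string-index layout from one indexed term per document.
    static StringIndexModel stringIndexLayout(String[] perDocTerms) {
        String[] lookup = new TreeSet<>(Arrays.asList(perDocTerms)).toArray(new String[0]);
        int[] order = new int[perDocTerms.length];
        for (int doc = 0; doc < perDocTerms.length; doc++) {
            order[doc] = Arrays.binarySearch(lookup, perDocTerms[doc]);
        }
        return new StringIndexModel(lookup, order);
    }

    public static void main(String[] args) {
        // Direct layout: the cache is just the values, indexed by document id.
        double[] direct = {1.25, 0.50, 1.25, 3.00};
        System.out.println(Arrays.toString(direct));

        // String-index layout for the same four documents.
        StringIndexModel idx = stringIndexLayout(new String[] {"1.25", "0.50", "1.25", "3.00"});
        System.out.println(Arrays.toString(idx.lookup)); // [0.50, 1.25, 3.00]
        System.out.println(Arrays.toString(idx.order));  // [1, 0, 1, 2]
    }
}
```

In this model, sorting by order[docId] gives the same document order as sorting by value, provided the term strings sort the same way the numbers do. The cost is an extra String per unique value on top of the per-document ordinals.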

-Yonik