Posted to solr-user@lucene.apache.org by J Mohamed Zahoor <za...@indix.com> on 2013/05/20 12:43:05 UTC

multiple cache for same field

Hi

Why does the Lucene field cache have multiple entries for the same field, S_24?
It is a dynamic field.


'SegmentCoreReader(owner=_3fgm(4.2.1):C7681)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1174240382

'SegmentCoreReader(owner=_3ffm(4.2.1):C1596758)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#83384344

'SegmentCoreReader(owner=_3fgh(4.2.1):C2301)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1281331764


Also, what about the number at the end? Does it specify the number of entries in that cache bucket?

./zahoor

Re: multiple cache for same field

Posted by J Mohamed Zahoor <za...@indix.com>.
It does not seem to be the memory footprint either; it looks too high for my index.

./zahoor


On 20-May-2013, at 10:55 PM, Jason Hellman <jh...@innoventsolutions.com> wrote:

> Most definitely not the number of unique elements in each segment.  My 32 document sample index (built from the default example docs data) has the following:
> 
> entry#0:
> 'StandardDirectoryReader(segments_b:29 _8(4.2.1):C32)'=>'manu_exact',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1778857102
> 
> There is no chance for there to be 1.8 billion unique elements in that index.


Re: multiple cache for same field

Posted by Jason Hellman <jh...@innoventsolutions.com>.
It is most definitely not the number of unique elements in each segment. My 32-document sample index (built from the default example docs data) has the following:

entry#0:
'StandardDirectoryReader(segments_b:29 _8(4.2.1):C32)'=>'manu_exact',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1778857102

There is no chance for there to be 1.8 billion unique elements in that index.
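
One possible reading of the trailing number, offered as an assumption rather than something confirmed in this thread: the suffix has the shape "ClassName#number", which is what you get when a toString() appends System.identityHashCode(value) to the value's class name. If the cache entry is printed that way, the number is an arbitrary per-object identifier, unrelated to the number of entries, unique terms, or bytes. The sketch below is plain Java with no Lucene dependency; the class name IdentityHashDemo and the describe helper are made up for illustration, and the claim about Lucene should be checked against the 4.2.1 source.

```java
public class IdentityHashDemo {

    // Builds a "ClassName#number" label in the same shape as the suffix of
    // the cache entries above. Whether Lucene builds its suffix this way is
    // an assumption to verify against the FieldCache source.
    public static String describe(Object value) {
        return value.getClass().getName() + "#" + System.identityHashCode(value);
    }

    public static void main(String[] args) {
        double[] small = new double[32];        // e.g. a 32-document segment
        double[] large = new double[1_000_000]; // e.g. a much larger segment

        // The numbers printed differ from run to run and do not track the
        // array lengths, so they say nothing about how much is cached.
        System.out.println(describe(small));
        System.out.println(describe(large));
    }
}
```

Under that reading, a value such as 1778857102 on a 32-document index is unsurprising: it identifies the cached object and does not count anything.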

On May 20, 2013, at 1:20 PM, Erick Erickson <er...@gmail.com> wrote:

> Not sure; I've never had to worry about what they are.


Re: multiple cache for same field

Posted by Erick Erickson <er...@gmail.com>.
Not sure; I've never had to worry about what they are.

On Mon, May 20, 2013 at 12:28 PM, J Mohamed Zahoor <za...@indix.com> wrote:
>
> What is the number at the end?
> Is it the number of unique elements in each segment?
>
> ./zahoor

Re: multiple cache for same field

Posted by J Mohamed Zahoor <za...@indix.com>.
What is the number at the end?
Is it the number of unique elements in each segment?

./zahoor


On 20-May-2013, at 7:37 PM, Erick Erickson <er...@gmail.com> wrote:

> Because the same field is split amongst a number of segments. If you
> look in the index directory, you should see files like _3fgm.* and
> _3ffm.*. Each such group represents one segment. The number of
> segments changes with merging etc.
> 
> Best
> Erick


Re: multiple cache for same field

Posted by Erick Erickson <er...@gmail.com>.
Because the same field is split amongst a number of segments. If you
look in the index directory, you should see files like _3fgm.* and
_3ffm.*. Each such group represents one segment. The number of
segments changes with merging etc.

Best
Erick

On Mon, May 20, 2013 at 6:43 AM, J Mohamed Zahoor <za...@indix.com> wrote:
> Hi
>
> Why does the Lucene field cache have multiple entries for the same field, S_24?
> It is a dynamic field.
>
>
> 'SegmentCoreReader(owner=_3fgm(4.2.1):C7681)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1174240382
>
> 'SegmentCoreReader(owner=_3ffm(4.2.1):C1596758)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#83384344
>
> 'SegmentCoreReader(owner=_3fgh(4.2.1):C2301)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1281331764
>
>
> Also, what about the number at the end? Does it specify the number of entries in that cache bucket?
>
> ./zahoor
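
Erick's point that the field cache holds one entry per segment can be illustrated by grouping the entries from the original post by field and segment. This is only a sketch under stated assumptions: the class FieldCacheEntries and its methods are hypothetical helpers, not part of Lucene; the number after "C" is taken to be the segment's document count (consistent with Jason's 32-document index showing C32); and the size estimate assumes one 8-byte double per document, which ignores any per-object overhead.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldCacheEntries {

    // Captures the segment name, the number after "C" (assumed to be the
    // document count), and the field name from one cache entry line.
    private static final Pattern ENTRY =
        Pattern.compile("owner=(_\\w+)\\([^)]*\\):[cC](\\d+)\\)'=>'([^']+)'");

    /** Returns field -> (segment -> document count), in encounter order. */
    public static Map<String, Map<String, Long>> bySegment(List<String> entries) {
        Map<String, Map<String, Long>> result = new LinkedHashMap<>();
        for (String entry : entries) {
            Matcher m = ENTRY.matcher(entry);
            if (m.find()) {
                result.computeIfAbsent(m.group(3), k -> new LinkedHashMap<>())
                      .put(m.group(1), Long.parseLong(m.group(2)));
            }
        }
        return result;
    }

    /** Rough size estimate, assuming one 8-byte double per document. */
    public static long estimatedBytes(Map<String, Long> segments) {
        long docs = 0;
        for (long count : segments.values()) {
            docs += count;
        }
        return docs * Double.BYTES;
    }

    public static void main(String[] args) {
        // The three entries quoted in the original post.
        List<String> entries = Arrays.asList(
            "'SegmentCoreReader(owner=_3fgm(4.2.1):C7681)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1174240382",
            "'SegmentCoreReader(owner=_3ffm(4.2.1):C1596758)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#83384344",
            "'SegmentCoreReader(owner=_3fgh(4.2.1):C2301)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1281331764");

        for (Map.Entry<String, Map<String, Long>> field : bySegment(entries).entrySet()) {
            System.out.println(field.getKey() + " segments: " + field.getValue());
            System.out.println("estimated bytes: " + estimatedBytes(field.getValue()));
        }
    }
}
```

For the three entries above, this groups S_24 under segments _3fgm, _3ffm and _3fgh, with 7681 + 1596758 + 2301 = 1606740 documents in total, or roughly 12.9 MB at 8 bytes each. After a merge, the same field would show up under fewer, larger segments.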