Posted to solr-user@lucene.apache.org by J Mohamed Zahoor <za...@indix.com> on 2013/05/20 12:43:05 UTC
multiple cache for same field
Hi
Why does the Lucene field cache have multiple entries for the same field, S_24?
It is a dynamic field.
'SegmentCoreReader(owner=_3fgm(4.2.1):C7681)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1174240382
'SegmentCoreReader(owner=_3ffm(4.2.1):C1596758)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#83384344
'SegmentCoreReader(owner=_3fgh(4.2.1):C2301)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1281331764
Also, the number at the end: does it specify the number of entries in that cache bucket?
./zahoor
Re: multiple cache for same field
Posted by J Mohamed Zahoor <za...@indix.com>.
It does not seem to be the memory footprint either: the numbers look too high for my index.
./zahoor
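
The memory-footprint reading can be sanity-checked with rough arithmetic. The sketch below assumes that the `C...` suffix in each reader key is the segment's document count (consistent with the `C32` entry from the 32-document sample index elsewhere in this thread) and that a `DoublesFromArray` entry costs about one 8-byte double per document; the second point is an assumption, not something confirmed in the thread.

```python
# (segment name, document count from the "C..." suffix, trailing number),
# copied from the three S_24 entries in the original post.
ENTRIES = [
    ("_3fgm", 7681, 1174240382),
    ("_3ffm", 1596758, 83384344),
    ("_3fgh", 2301, 1281331764),
]

BYTES_PER_DOUBLE = 8  # assumption: one double per document, overhead ignored


def estimated_bytes(doc_count):
    """Rough size of a per-segment array of doubles, in bytes."""
    return doc_count * BYTES_PER_DOUBLE


for segment, docs, trailing in ENTRIES:
    print(f"{segment}: ~{estimated_bytes(docs):,} bytes estimated, "
          f"trailing number {trailing:,}")
```

Under these assumptions every trailing number is far larger than the estimated array size, and the largest segment carries the smallest trailing number, so the value does not track segment size at all.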
On 20-May-2013, at 10:55 PM, Jason Hellman <jh...@innoventsolutions.com> wrote:
> It is most definitely not the number of unique elements in each segment. My 32-document sample index (built from the default example docs data) has the following:
>
> entry#0:
> 'StandardDirectoryReader(segments_b:29 _8(4.2.1):C32)'=>'manu_exact',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1778857102
>
> That index cannot possibly contain 1.8 billion unique elements.
Re: multiple cache for same field
Posted by Jason Hellman <jh...@innoventsolutions.com>.
It is most definitely not the number of unique elements in each segment. My 32-document sample index (built from the default example docs data) has the following:
entry#0:
'StandardDirectoryReader(segments_b:29 _8(4.2.1):C32)'=>'manu_exact',class org.apache.lucene.index.SortedDocValues,0.5=>org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1778857102
That index cannot possibly contain 1.8 billion unique elements.
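
The same check can be written down as a short sketch. It relies on two points: the `C32` suffix is the document count (which matches the 32-document index described above), and a `SortedDocValues` field holds at most one value per document, so it cannot have more distinct values than documents. The helper name `could_be_unique_count` is made up for this illustration.

```python
import re

# Entry copied from the sample index above.
ENTRY = (
    "'StandardDirectoryReader(segments_b:29 _8(4.2.1):C32)'=>'manu_exact',"
    "class org.apache.lucene.index.SortedDocValues,0.5=>"
    "org.apache.lucene.search.FieldCacheImpl$SortedDocValuesImpl#1778857102"
)

# ":C<digits>)" carries the document count; "#<digits>" at the very end
# is the number being asked about.
doc_count = int(re.search(r":C(\d+)\)", ENTRY).group(1))
trailing = int(re.search(r"#(\d+)$", ENTRY).group(1))


def could_be_unique_count(trailing_number, docs):
    """With at most one value per document, distinct values <= documents."""
    return trailing_number <= docs


print(doc_count, trailing, could_be_unique_count(trailing, doc_count))
```

One plausible reading, which should be verified against the Lucene 4.2.1 source rather than taken from here, is that the text after `#` is the identity hash code of the cached object, printed only to tell instances apart. Every trailing number in this thread fits in a signed 32-bit integer, which is at least consistent with that reading.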
On May 20, 2013, at 1:20 PM, Erick Erickson <er...@gmail.com> wrote:
> Not sure; I have never had to worry about what they are.
Re: multiple cache for same field
Posted by Erick Erickson <er...@gmail.com>.
Not sure; I have never had to worry about what they are.
On Mon, May 20, 2013 at 12:28 PM, J Mohamed Zahoor <za...@indix.com> wrote:
>
> What is the number at the end?
> Is it the number of unique elements in each segment?
>
> ./zahoor
Re: multiple cache for same field
Posted by J Mohamed Zahoor <za...@indix.com>.
What is the number at the end?
Is it the number of unique elements in each segment?
./zahoor
On 20-May-2013, at 7:37 PM, Erick Erickson <er...@gmail.com> wrote:
> Because the values for that field are spread across several segments, and
> the field cache holds a separate entry for each segment. If you look in
> the index directory, you should see files like _3fgm.* and _3ffm.*; each
> such group of files is one segment. The number of segments changes over
> time as segments are merged.
>
> Best
> Erick
Re: multiple cache for same field
Posted by Erick Erickson <er...@gmail.com>.
Because the values for that field are spread across several segments, and
the field cache holds a separate entry for each segment. If you look in
the index directory, you should see files like _3fgm.* and _3ffm.*; each
such group of files is one segment. The number of segments changes over
time as segments are merged.
Best
Erick
On Mon, May 20, 2013 at 6:43 AM, J Mohamed Zahoor <za...@indix.com> wrote:
> Hi
>
> Why does the Lucene field cache have multiple entries for the same field, S_24?
> It is a dynamic field.
>
>
> 'SegmentCoreReader(owner=_3fgm(4.2.1):C7681)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1174240382
>
> 'SegmentCoreReader(owner=_3ffm(4.2.1):C1596758)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#83384344
>
> 'SegmentCoreReader(owner=_3fgh(4.2.1):C2301)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1281331764
>
>
> Also, the number at the end: does it specify the number of entries in that cache bucket?
>
> ./zahoor
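
The per-segment explanation can be read straight off the three entries in the original post. The sketch below parses the reader key of each entry and groups segments by field name. The regular expression is written only for the entry format shown in this thread and is not a general parser for field cache output.

```python
import re
from collections import defaultdict

# The three entries from the original post.
ENTRIES = [
    "'SegmentCoreReader(owner=_3fgm(4.2.1):C7681)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1174240382",
    "'SegmentCoreReader(owner=_3ffm(4.2.1):C1596758)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#83384344",
    "'SegmentCoreReader(owner=_3fgh(4.2.1):C2301)'=>'S_24',double,org.apache.lucene.search.FieldCache.NUMERIC_UTILS_DOUBLE_PARSER=>org.apache.lucene.search.FieldCacheImpl$DoublesFromArray#1281331764",
]

# Captures the segment name, the document count after ":C", and the field name.
PATTERN = re.compile(
    r"owner=(?P<segment>\w+)\([^)]*\):C(?P<docs>\d+)\)'=>'(?P<field>[^']+)'"
)

segments_by_field = defaultdict(list)
for entry in ENTRIES:
    match = PATTERN.search(entry)
    segments_by_field[match["field"]].append((match["segment"], int(match["docs"])))

for field, segments in segments_by_field.items():
    for name, docs in segments:
        # Each segment name corresponds to a group of files such as _3fgm.*
        print(f"{field}: segment {name} ({docs} docs), files {name}.*")
```

All three entries belong to the same field, S_24, but each one is keyed by a different segment reader, which is why the cache lists the field three times.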