Posted to solr-user@lucene.apache.org by Roman Chyla <ro...@gmail.com> on 2013/11/27 22:07:13 UTC

Caches contain deleted docs (?)

Hi,
I'd like to check something I don't understand about the field cache; I
don't know whether it is a bug or a feature.

The following calls return cached per-document values:

FieldCache.DEFAULT.getTerms(reader, idField);
FieldCache.DEFAULT.getInts(reader, idField, false);


The resulting arrays *will* contain entries for deleted docs, so to filter
them out, one has to check liveDocs manually. Is this the expected
behaviour? I don't understand why the cache would bother to load data
for deleted docs. This is on Solr 4.0.
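
To make the manual check concrete, here is a minimal sketch of the filtering
I mean. It does not call Lucene at all: the int[] stands in for the array
returned by getInts(...), and a java.util.BitSet stands in for the Bits
returned by reader.getLiveDocs() (which is null when the segment has no
deletions). The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class LiveDocsFilter {

    // Collects the cached values of live documents only.
    // cached[docId] stands in for the int[] returned by getInts(...).
    // liveDocs stands in for reader.getLiveDocs(); null means the
    // segment has no deletions, so every document is treated as live.
    static List<Integer> liveValues(int[] cached, BitSet liveDocs) {
        List<Integer> result = new ArrayList<>();
        for (int docId = 0; docId < cached.length; docId++) {
            if (liveDocs == null || liveDocs.get(docId)) {
                result.add(cached[docId]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] cached = {10, 20, 30, 40};
        BitSet liveDocs = new BitSet(cached.length);
        liveDocs.set(0, cached.length); // start with every doc live
        liveDocs.clear(1);              // mark doc 1 as deleted
        System.out.println(liveValues(cached, liveDocs));
    }
}
```

The cached array keeps a slot for every doc ID in the segment, so the
deleted doc's value is still there; only the liveDocs check hides it.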

Thanks!

  roman

Re: Caches contain deleted docs (?)

Posted by Roman Chyla <ro...@gmail.com>.
I understand that changes would be expensive, but shouldn't the cache
simply skip the deleted docs, in the same way as the cache for multivalued
fields (which accepts the liveDocs bits)?
Thanks,

  roman


On Wed, Nov 27, 2013 at 6:26 PM, Erick Erickson <er...@gmail.com> wrote:

> Yep, it's expected. Segments are write-once. It's been
> a long-standing design that deleted data will be
> reclaimed on segment merge, but not before. It's
> pretty expensive to change the terms loaded on the
> fly to reflect the data removed with deleted documents.
>
> Best,
> Erick

Re: Caches contain deleted docs (?)

Posted by Erick Erickson <er...@gmail.com>.
Yep, it's expected. Segments are write-once. It's been
a long-standing design that deleted data will be
reclaimed on segment merge, but not before. It's
pretty expensive to change the terms loaded on the
fly to reflect the data removed with deleted documents.
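
As a rough illustration of the write-once idea, the sketch below models a
segment as an immutable value array plus a liveDocs bitmap. It is a toy
model under my own assumptions, not Lucene code: SegmentMergeSketch,
delete() and merge() are hypothetical names. A delete only flips a bit;
the stored values go away only when a merge writes a new segment.

```java
import java.util.BitSet;

public class SegmentMergeSketch {
    // Per-document values, written once when the segment is created.
    final int[] values;
    // One bit per doc ID; a cleared bit marks the doc as deleted.
    final BitSet liveDocs;

    SegmentMergeSketch(int[] values) {
        this.values = values.clone();
        this.liveDocs = new BitSet(values.length);
        this.liveDocs.set(0, values.length); // every doc starts live
    }

    // Deleting a doc never touches the stored values.
    void delete(int docId) {
        liveDocs.clear(docId);
    }

    // A merge copies only live docs into a new segment, which is the
    // point at which the deleted data is actually reclaimed.
    SegmentMergeSketch merge() {
        int[] kept = new int[liveDocs.cardinality()];
        int next = 0;
        for (int docId = 0; docId < values.length; docId++) {
            if (liveDocs.get(docId)) {
                kept[next++] = values[docId];
            }
        }
        return new SegmentMergeSketch(kept);
    }
}
```

Until merge() runs, anything loaded from the old segment still sees all
the original slots, which is why a cache built per segment carries entries
for deleted docs.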

Best,
Erick

