
Posted to dev@lucene.apache.org by Alan Woodward <al...@flax.co.uk> on 2013/03/22 18:36:17 UTC

Opening up FieldCacheImpl

I'm looking at exposing data held externally to an index via a ValueSource, and it would be nice to reuse the machinery in FieldCacheImpl to cache the data per-segment.  However, it's package-private at the moment, which means I can't extend it nicely.  Is there a reason for this?  Or should I put up a JIRA to make it public?

Alan Woodward
www.flax.co.uk

Re: Opening up FieldCacheImpl

Posted by Robert Muir <rc...@gmail.com>.

Note that what fieldcache does is not special, it just has a map and
calls the public SegmentReader.addCoreClosedListener method so that it
gets notifications when something is no longer needed.

I'm not sure we should make FieldCacheImpl public if that's the real
logic you want to reuse.
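
For illustration, the "map plus close listener" pattern described above can be sketched in a few lines. This is only a sketch under stated assumptions: SegmentLike, coreKey() and PerSegmentCache are hypothetical stand-ins invented for the example, not Lucene classes, and the listener is modelled as a plain Runnable rather than Lucene's own listener type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a segment reader; NOT a Lucene class.
interface SegmentLike {
    Object coreKey();                              // identity of the segment's shared core
    void addCoreClosedListener(Runnable onClose);  // run when the core is closed
}

// Holds one cached value per segment core and evicts it when that core closes.
// Entries are strongly referenced; eviction relies on the close notification.
final class PerSegmentCache<V> {
    private final Map<Object, V> cache = new ConcurrentHashMap<>();

    V get(SegmentLike segment, Function<SegmentLike, V> loader) {
        final Object key = segment.coreKey();
        return cache.computeIfAbsent(key, k -> {
            // First request for this core: register eviction, then load the data.
            segment.addCoreClosedListener(() -> cache.remove(key));
            return loader.apply(segment);
        });
    }

    int size() {
        return cache.size();
    }
}
```

A caller would pass a loader that reads the external data for that segment; a second get() for the same core returns the cached value without calling the loader again.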

On Fri, Mar 22, 2013 at 1:36 PM, Alan Woodward <al...@flax.co.uk> wrote:
> I'm looking at exposing data held externally to an index via a ValueSource,
> and it would be nice to reuse the machinery in FieldCacheImpl to cache the
> data per-segment.  However, it's package-private at the moment, which means
> I can't extend it nicely.  Is there a reason for this?  Or should I put up a
> JIRA to make it public?
>
> Alan Woodward
> www.flax.co.uk
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Opening up FieldCacheImpl

Posted by Robert Muir <rc...@gmail.com>.

On Sat, Mar 23, 2013 at 7:25 AM, Alan Woodward <al...@flax.co.uk> wrote:
>> I think instead FieldCache should actually be completely package
>> private and hidden behind a UninvertingFilterReader and accessible via
>> the existing AtomicReader docValues methods.
>
> Aha, right, because SegmentCoreReaders already caches XXXDocValues instances (without using WeakReferences or anything like that).
>
> I should explain my motivation here.  I want to store various scoring factors externally to Lucene, but make them available via a ValueSource to CustomScoreQueries - essentially a generalisation of FileFloatSource to any external data source.  FFS already has a bunch of code copied from FieldCache, which was why my first thought was to open it up a bit and extend it, rather than copy and paste again.
>
> But it sounds as though a nicer way of doing this would be to create a new DocValuesProducer that talks to the external data source, and then access it through the AR docValues methods.  Does that sound plausible?  Is SPI going to make it difficult to pass parameters to a custom DVProducer (data location, host/port, other DV fields to use as primary key lookups, etc)?
>

SPI is not involved if you implement via FilterAtomicReader. It's only
involved for reading things that are actually written into the index.
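
A rough sketch of that filter-reader idea, assuming a wrapper that answers per-document value lookups from external data and delegates everything else. The interfaces below (NumericValues, LeafReaderLike) and the class ExternalValuesReader are hypothetical stand-ins made up for this example; they are not the Lucene API, and a real implementation would extend FilterAtomicReader instead.

```java
import java.util.Map;

// Hypothetical stand-ins for per-document value access; NOT Lucene classes.
interface NumericValues {
    long get(int docId);
}

interface LeafReaderLike {
    NumericValues getNumericValues(String field); // null if the field has no values
}

// Wraps another reader and serves some fields from externally supplied data.
// Constructor arguments carry the configuration, so no SPI lookup is needed.
final class ExternalValuesReader implements LeafReaderLike {
    private final LeafReaderLike in;
    private final Map<String, long[]> external; // field name -> one value per document

    ExternalValuesReader(LeafReaderLike in, Map<String, long[]> external) {
        this.in = in;
        this.external = external;
    }

    @Override
    public NumericValues getNumericValues(String field) {
        final long[] values = external.get(field);
        if (values == null) {
            return in.getNumericValues(field); // not external: delegate to the wrapped reader
        }
        return docId -> values[docId];
    }
}
```

Because the wrapper is constructed directly, parameters such as a data location or host/port can simply be passed to its constructor.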


Re: Opening up FieldCacheImpl

Posted by Alan Woodward <al...@flax.co.uk>.

> I think instead FieldCache should actually be completely package
> private and hidden behind a UninvertingFilterReader and accessible via
> the existing AtomicReader docValues methods.

Aha, right, because SegmentCoreReaders already caches XXXDocValues instances (without using WeakReferences or anything like that).

I should explain my motivation here.  I want to store various scoring factors externally to Lucene, but make them available via a ValueSource to CustomScoreQueries - essentially a generalisation of FileFloatSource to any external data source.  FFS already has a bunch of code copied from FieldCache, which was why my first thought was to open it up a bit and extend it, rather than copy and paste again.

But it sounds as though a nicer way of doing this would be to create a new DocValuesProducer that talks to the external data source, and then access it through the AR docValues methods.  Does that sound plausible?  Is SPI going to make it difficult to pass parameters to a custom DVProducer (data location, host/port, other DV fields to use as primary key lookups, etc)?



Re: Opening up FieldCacheImpl

Posted by Robert Muir <rc...@gmail.com>.

On Fri, Mar 22, 2013 at 6:26 PM, Alan Woodward <al...@flax.co.uk> wrote:
> Actually this would be really nice, wouldn't it.  Add a getFieldCache(String
> field) method to AtomicReader.  You'd have to be able to determine what to
> return depending on the field though - uninverted field, or docvalues, or
> another cached source.

But the cache isn't even on the reader; it's on the SegmentCoreReaders.

>
> FieldCache and DocValues seem like they ought to have a common API, really.

They already do.

I think instead FieldCache should actually be completely package
private and hidden behind a UninvertingFilterReader and accessible via
the existing AtomicReader docValues methods.

Uninverting is a really crazy solution vs. indexing fields the way
they will be used.
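
To make "uninverting" concrete, here is a toy sketch of the general idea: turning an inverted mapping (term to documents) back into a per-document column. It is not FieldCache's actual code. The class Uninverter, its method, and the single-valued-field assumption are all invented for the example.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedMap;

// Toy illustration only; names and data layout are assumptions.
final class Uninverter {
    // postings: term -> IDs of documents containing that term (the inverted form).
    // Returns an array indexed by doc ID holding the ordinal of that document's
    // term in sorted term order, or -1 if the document has no term.
    // Assumes each document has at most one term in the field.
    static int[] uninvert(SortedMap<String, int[]> postings, int maxDoc) {
        int[] ordByDoc = new int[maxDoc];
        Arrays.fill(ordByDoc, -1);
        int ord = 0;
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            for (int doc : entry.getValue()) {
                ordByDoc[doc] = ord;
            }
            ord++;
        }
        return ordByDoc;
    }
}
```

The loop has to visit every term and every posting once per segment, which is work that a field indexed as doc values in the first place would not need at search time.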


Re: Opening up FieldCacheImpl

Posted by Alan Woodward <al...@flax.co.uk>.

Actually this would be really nice, wouldn't it.  Add a getFieldCache(String field) method to AtomicReader.  You'd have to be able to determine what to return depending on the field though - uninverted field, or docvalues, or another cached source.  

FieldCache and DocValues seem like they ought to have a common API, really.  And ValueSource in the function queries package as well.  But that's another issue...

Alan Woodward
www.flax.co.uk

On 22 Mar 2013, at 20:48, Yonik Seeley wrote:

> The ability to cache stuff w/o resorting to weak references would be even nicer!
> Caches right on the segment readers?
> 
> -Yonik
> http://lucidworks.com
> 

Re: Opening up FieldCacheImpl

Posted by Yonik Seeley <yo...@lucidworks.com>.

The ability to cache stuff w/o resorting to weak references would be even nicer!
Caches right on the segment readers?

-Yonik
http://lucidworks.com


Re: Opening up FieldCacheImpl

Posted by "David Smiley (@MITRE.org)" <DS...@mitre.org>.

That would be nice!  There is similar machinery in Solr's ExternalFileField.
In the spatial module I'd like to cache data per-segment; its current cache
sucks, to say the least.  My current plans are to use BinaryDocValues, so I
might not use this proposed machinery after all, but nonetheless I think it's
useful.

~ David


Alan Woodward-2 wrote
> I'm looking at exposing data held externally to an index via a
> ValueSource, and it would be nice to reuse the machinery in FieldCacheImpl
> to cache the data per-segment.  However, it's package-private at the
> moment, which means I can't extend it nicely.  Is there a reason for this? 
> Or should I put up a JIRA to make it public?
> 
> Alan Woodward
> www.flax.co.uk




