Posted to solr-user@lucene.apache.org by Aaron McKee <uc...@gmail.com> on 2009/10/06 18:40:11 UTC

De-basing / re-basing docIDs, or how to effectively pass calculated values from a Scorer or Filter up to (Solr's) QueryComponent.process

(Posted here, per Yonik's suggestion)

In the code I'm working with, I generate a cache of calculated values as 
a by-product within a Filter.getDocIdSet implementation (and within a 
Query-ized version of the filter and its Scorer method). These values 
are keyed off the IndexReader's docID values, since that's all that's 
accessible at that level. Ultimately, however, I need to be able to 
access these values much higher up in the stack (Solr's 
QueryComponent.process method), so that I can inject the dynamic values 
into the response as a fake field. The IDs available here, however, are 
for the entire index and not just relative to the current IndexReader. 
I'm still fairly new to Lucene and I've been scratching my head a bit 
trying to find a reliable way to map these values into the same space, 
without having to hack up too many base classes. I noticed that there 
was a related discussion at:

http://issues.apache.org/jira/browse/LUCENE-1821?focusedCommentId=12745041#action_12745041 


... but also a bit of disagreement on the suggested strategies. Ideally, 
I'm also hoping there's a strategy that won't require me to hack up too 
much of the core product; subclassing IndexSearcher in the way suggested 
would basically require me to change all of the various SearchComponents 
I use in Solr, and that sounds like it'd end up a real maintenance 
nightmare. I was looking at the Collector class as a possible solution, 
since it has knowledge of the docbase, but it looks like I'd then need 
to change every derived collector that the code ultimately uses and, 
including the various anonymous Collectors in Solr, that also looks like 
it'd be a fairly ghoulish solution. I suppose I'm being wishful, or 
lazy, but is there a reasonable and reliable way to do this, without 
having to fork the core code? If not, any suggestion on the best 
strategy to accomplish this, without adding too much overhead every time 
I wanted to up-rev the core Lucene and/or Solr code to the latest version?

Thanks a ton,
Aaron
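
For readers following the thread, the mismatch described above comes down to simple arithmetic: a segment-relative docID plus that segment's docBase gives the index-wide docID, where the docBase is the sum of maxDoc over all preceding segments. The following is only a minimal sketch of that arithmetic. The class DocBases and its methods are hypothetical helpers, not part of Lucene or Solr, and the int array stands in for the maxDoc values of the sub-readers.

```java
// Hypothetical helper for illustration only; not a Lucene or Solr class.
final class DocBases {

    // segmentMaxDocs[i] stands in for the maxDoc of the i-th segment reader.
    // Returns the starting (base) docID of each segment in index-wide space.
    static int[] compute(int[] segmentMaxDocs) {
        int[] bases = new int[segmentMaxDocs.length];
        int base = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            bases[i] = base;            // this segment starts where the previous ones end
            base += segmentMaxDocs[i];
        }
        return bases;
    }

    // Re-base: segment-relative docID to index-wide docID.
    static int toTopLevel(int docBase, int segmentDocId) {
        return docBase + segmentDocId;
    }

    // De-base: index-wide docID back to segment-relative docID.
    static int toSegment(int docBase, int topLevelDocId) {
        return topLevelDocId - docBase;
    }

    public static void main(String[] args) {
        int[] bases = compute(new int[] {100, 50, 25}); // bases are 0, 100, 150
        System.out.println(toTopLevel(bases[1], 7));    // prints 107
    }
}
```

The difficulty raised in the post is not the arithmetic itself but that the docBase is only known to the searcher and its Collector, not to the Filter or Scorer that produces the values.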


Re: De-basing / re-basing docIDs, or how to effectively pass calculated values from a Scorer or Filter up to (Solr's) QueryComponent.process

Posted by Chris Hostetter <ho...@fucit.org>.
: In the code I'm working with, I generate a cache of calculated values as a
: by-product within a Filter.getDocIdSet implementation (and within a Query-ized
: version of the filter and its Scorer method). These values are keyed off the
: IndexReader's docID values, since that's all that's accessible at that level.
: Ultimately, however, I need to be able to access these values much higher up
: in the stack (Solr's QueryComponent.process method), so that I can inject the

my suggestion would be to change your Filter to use the FieldCache to 
look up the uniqueKey for your docid, and base your cache off that ... then 
other uses of your cache (higher up the chain) will have an id that 
makes sense outside the context of a segment reader.




-Hoss
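
The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the thread: UniqueKeyValueCache is a hypothetical class, and the String array stands in for a per-segment array of uniqueKey values indexed by segment-relative docID (in Lucene 2.9 such an array would presumably come from something like FieldCache.DEFAULT.getStrings(reader, "id"), where the field name "id" is assumed).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache for illustration only; not a Lucene or Solr class.
final class UniqueKeyValueCache {

    // Calculated values keyed by uniqueKey rather than by docID.
    private final Map<String, Float> values = new HashMap<String, Float>();

    // Called at the Filter/Scorer level. segmentKeys stands in for the
    // per-segment uniqueKey array, indexed by segment-relative docID.
    void put(String[] segmentKeys, int segmentDocId, float value) {
        values.put(segmentKeys[segmentDocId], value);
    }

    // Called higher up the stack, where the uniqueKey of each returned
    // document is available. Returns null if no value was recorded.
    Float get(String uniqueKey) {
        return values.get(uniqueKey);
    }

    public static void main(String[] args) {
        UniqueKeyValueCache cache = new UniqueKeyValueCache();
        String[] segmentA = {"doc-1", "doc-2"};
        String[] segmentB = {"doc-3"};
        cache.put(segmentA, 1, 0.75f);  // docID 1 in segment A
        cache.put(segmentB, 0, 0.25f);  // docID 0 in segment B
        System.out.println(cache.get("doc-2")); // prints 0.75
        System.out.println(cache.get("doc-3")); // prints 0.25
    }
}
```

Because the key no longer depends on which reader produced it, no docBase bookkeeping is needed at lookup time. The cost is one extra array lookup per cached document in the Filter.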