Posted to dev@lucene.apache.org by Aaron McKee <uc...@gmail.com> on 2009/10/06 17:22:35 UTC
De-basing / re-basing docIDs, or how to effectively pass calculated
values from a Scorer or Filter up to (Solr's) QueryComponent.process
In the code I'm working with, I generate a cache of calculated values as
a by-product within a Filter.getDocIdSet implementation (and within a
Query-ized version of the filter and its Scorer method). These values
are keyed off the IndexReader's docID values, since that's all that's
accessible at that level. Ultimately, however, I need to be able to
access these values much higher up in the stack (Solr's
QueryComponent.process method), so that I can inject the dynamic values
into the response as a fake field. The IDs available here, however, are
for the entire index and not just relative to the current IndexReader.
I'm still fairly new to Lucene and I've been scratching my head a bit
trying to find a reliable way to map these values into the same space,
without having to hack up too many base classes. I noticed that there
was a related discussion at:
http://issues.apache.org/jira/browse/LUCENE-1821?focusedCommentId=12745041#action_12745041
... but also a bit of disagreement on the suggested strategies. Ideally,
I'm also hoping there's a strategy that won't require me to hack up too
much of the core product; subclassing IndexSearcher in the way suggested
would basically require me to change all of the various SearchComponents
I use in Solr, and that sounds like it'd end up a real maintenance
nightmare. I was looking at the Collector class as a possible solution,
since it has knowledge of the docbase, but it looks like I'd then need
to change every derived collector that the code ultimately uses and,
including the various anonymous Collectors in Solr, that also looks like
it'd be a fairly ghoulish solution. I suppose I'm being wishful, or
lazy, but is there a reasonable and reliable way to do this, without
having to fork the core code? If not, any suggestion on the best
strategy to accomplish this, without adding too much overhead every time
I wanted to up-rev the core Lucene and/or Solr code to the latest version?
Thanks a ton,
Aaron
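One way to picture the Collector idea from the message above is a single wrapper that sits in front of whatever collector is already in use, so the existing collectors stay untouched. This is only a sketch under stated assumptions: SegmentCollector is a simplified stand-in written for this example and is not the Lucene Collector class, and RebasingCollector, setNextSegment and rebasedValues are hypothetical names. It assumes the per-segment cache is keyed first by some segment key and then by segment-local docID.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for a per-segment collector callback; NOT the real Lucene class. */
interface SegmentCollector {
    void setNextSegment(Object segmentKey, int docBase);
    void collect(int localDoc);
}

/** Hypothetical wrapper: forwards every call to a delegate and rebases cached values as hits arrive. */
final class RebasingCollector implements SegmentCollector {
    private final SegmentCollector delegate;
    // Values produced at Filter/Scorer level, keyed by segment, then by segment-local docID.
    private final Map<Object, Map<Integer, Float>> perSegment;
    // The same values, re-keyed by index-wide docID (docBase + segment-local docID).
    private final Map<Integer, Float> rebased = new HashMap<Integer, Float>();

    private Map<Integer, Float> current = Collections.emptyMap();
    private int docBase;

    RebasingCollector(SegmentCollector delegate, Map<Object, Map<Integer, Float>> perSegment) {
        this.delegate = delegate;
        this.perSegment = perSegment;
    }

    @Override
    public void setNextSegment(Object segmentKey, int docBase) {
        // Remember the offset of this segment and pick up its cached values, if any.
        this.docBase = docBase;
        Map<Integer, Float> values = perSegment.get(segmentKey);
        current = (values != null) ? values : Collections.<Integer, Float>emptyMap();
        delegate.setNextSegment(segmentKey, docBase);
    }

    @Override
    public void collect(int localDoc) {
        // Copy the cached value (if one exists) into index-wide docID space.
        Float value = current.get(localDoc);
        if (value != null) {
            rebased.put(docBase + localDoc, value);
        }
        delegate.collect(localDoc);
    }

    /** Values keyed by index-wide docID, for use higher up the stack. */
    Map<Integer, Float> rebasedValues() {
        return rebased;
    }
}
```

Whether a wrapper like this can actually be slotted in without touching Solr's anonymous Collectors is exactly the open question of this thread; the sketch only shows the arithmetic and the delegation.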
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: De-basing / re-basing docIDs, or how to effectively pass
calculated values from a Scorer or Filter up to (Solr's) QueryComponent.process
Posted by Earwin Burrfoot <ea...@gmail.com>.
This might still be a Lucene-ish issue.
We already have getSequentialSubReaders() on IR (IndexReader); in my patched
version I augmented this with a public readerIndex() and getSubReaderStarts().
It is pretty much impossible to do any postprocessing on gathered hits
without at least one of these.
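For readers unfamiliar with the idea, here is a minimal, Lucene-free sketch of the arithmetic such accessors would make possible. It assumes only that each sub-reader's maxDoc is known, in order. DocIdRebaser and its methods are hypothetical names for this example; readerIndex merely echoes the method name mentioned above and is not the patched implementation.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Lucene: converts between segment-local and index-wide docIDs. */
final class DocIdRebaser {
    // starts[i] is the docBase of sub-reader i; the last entry is the total document count.
    private final int[] starts;

    DocIdRebaser(int[] subReaderMaxDocs) {
        starts = new int[subReaderMaxDocs.length + 1];
        for (int i = 0; i < subReaderMaxDocs.length; i++) {
            starts[i + 1] = starts[i] + subReaderMaxDocs[i];
        }
    }

    /** Index of the sub-reader that contains the given index-wide docID. */
    int readerIndex(int globalDoc) {
        if (globalDoc < 0 || globalDoc >= starts[starts.length - 1]) {
            throw new IllegalArgumentException("docID out of range: " + globalDoc);
        }
        int pos = Arrays.binarySearch(starts, globalDoc);
        if (pos < 0) {
            // Not an exact start: the owner is the entry just before the insertion point.
            return -(pos + 1) - 1;
        }
        // Exact match: skip empty sub-readers that share the same start.
        while (starts[pos + 1] == globalDoc) {
            pos++;
        }
        return pos;
    }

    /** Re-base: segment-local docID to index-wide docID. */
    int toGlobal(int readerIdx, int localDoc) {
        return starts[readerIdx] + localDoc;
    }

    /** De-base: index-wide docID to segment-local docID. */
    int toLocal(int globalDoc) {
        return globalDoc - starts[readerIndex(globalDoc)];
    }
}
```

With sub-readers holding 10, 0 and 5 documents, the starts are 0, 10, 10 and 15, so index-wide docID 12 belongs to the third sub-reader at local docID 2.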
On Tue, Oct 6, 2009 at 19:50, Yonik Seeley <yo...@lucidimagination.com> wrote:
> Aaron, could you move this to solr-user?
>
> -Yonik
> http://www.lucidimagination.com
--
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
Re: De-basing / re-basing docIDs, or how to effectively pass
calculated values from a Scorer or Filter up to (Solr's) QueryComponent.process
Posted by Yonik Seeley <yo...@lucidimagination.com>.
Aaron, could you move this to solr-user?
-Yonik
http://www.lucidimagination.com