Posted to solr-user@lucene.apache.org by adfel70 <ad...@gmail.com> on 2015/06/10 18:59:55 UTC
Adding applicative cache to SolrSearcher
I am using RankQuery to implement my applicative scorer, which returns a score
based on the value of a specific field (let's call it 'score_field') that is
stored for every document.
The RankQuery creates a collector, and for every collected docId I retrieve
the value of score_field, calculate the score and add the doc id to a
priority queue:
public class MyScorerRankQuery extends RankQuery {
    ...
    @Override
    public TopDocsCollector getTopDocsCollector(int i,
            SolrIndexSearcher.QueryCommand cmd, IndexSearcher searcher) {
        ...
        return new MyCollector(...);
    }
}
public class MyCollector extends TopDocsCollector {
    MyScorer scorer;
    SortedDocValues scoreFieldValues; // initialized in the constructor

    public MyCollector() {
        scorer = new MyScorer();
        // the scorer's API requires start() before every query and close() at the end of the query
        scorer.start();
        AtomicReader r = SlowCompositeReaderWrapper.wrap(searcher.getIndexReader());
        scoreFieldValues = DocValues.getSorted(r, "score_field"); /* THIS CALL IS TIME CONSUMING! */
    }

    @Override
    public void collect(int id) {
        int docID = docBase + id;
        // 1. get the specific field from the doc using DocValues and calculate the score using my scorer
        String value = scoreFieldValues.get(docID).utf8ToString();
        scorer.calcScore(value);
        // 2. add docId and score (ScoreDoc object) into the PriorityQueue.
    }
}
I used DocValues to get the value of score_field. Currently it is being
instantiated in the collector's constructor, which is a performance killer
because it is called for EVERY query, even if the index is static (no
commits). I want to make the DocValues.getSorted() call only when it is
really necessary, but I don't know where to put that code. Is there a place
to plug in that code so that, when a new searcher is opened, I can add this
applicative cache?
--
View this message in context: http://lucene.472066.n3.nabble.com/Adding-applicative-cache-to-SolrSearcher-tp4211012.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Adding applicative cache to SolrSearcher
Posted by adfel70 <ad...@gmail.com>.
Works great, thanks guys!
Missed the leafReader because I looked at IndexSearcher instead of
SolrIndexSearcher...
Re: Adding applicative cache to SolrSearcher
Posted by Chris Hostetter <ho...@fucit.org>.
:
: The problem is SlowCompositeReaderWrapper.wrap(searcher.getIndexReader());
: you hardly ever need to do this, at least because Solr already does it.
Specifically, you should just use...
searcher.getLeafReader().getSortedSetDocValues(your_field_name)
...instead of doing all this wrapping yourself.
If the field is declared with docValues="true", this will all be precomputed at
index time and super fast. If not, then the UninvertingReader logic will
kick in once per searcher. If you want to "pre-warm" it, just configure a
request that exercises your code as part of the firstSearcher and
newSearcher event listeners.
-Hoss
http://www.lucidworks.com/
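The "once per searcher" behaviour mentioned above can be pictured with a small, self-contained Java sketch. It deliberately uses no Solr or Lucene classes: PerSearcherCache, its loader function and the plain Object keys standing in for searcher instances are all hypothetical names introduced only for illustration. The idea is simply that an expensive value is computed the first time a given searcher is seen and then reused until a new searcher is opened.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper (not a Solr API): memoizes one value per searcher instance. */
final class PerSearcherCache<S, V> {
    // Weak keys let an entry be dropped once its searcher is no longer referenced,
    // provided the cached value does not itself hold a strong reference to the key.
    private final Map<S, V> cache = new WeakHashMap<>();
    private final Function<S, V> loader;

    PerSearcherCache(Function<S, V> loader) {
        this.loader = loader;
    }

    /** Runs the loader only the first time a given searcher is seen. */
    synchronized V get(S searcher) {
        return cache.computeIfAbsent(searcher, loader);
    }
}

final class PerSearcherCacheDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        PerSearcherCache<Object, String> cache = new PerSearcherCache<>(searcher -> {
            loads.incrementAndGet(); // stands in for the expensive per-searcher work
            return "values for " + searcher;
        });

        Object searcherA = new Object(); // stand-in for the current searcher
        cache.get(searcherA);
        cache.get(searcherA);            // second query on the same searcher: no reload
        Object searcherB = new Object(); // stand-in for a searcher opened after a commit
        cache.get(searcherB);

        System.out.println("loads = " + loads.get());
    }
}
```

With docValues="true" on the field, a hand-rolled cache like this should not be needed at all; the sketch only shows what "once per searcher" means for code that does keep its own per-searcher state.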
Re: Adding applicative cache to SolrSearcher
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello,
The problem is SlowCompositeReaderWrapper.wrap(searcher.getIndexReader());
you hardly ever need to do this, at least because Solr already does it.
DocValues need to be accessed per segment, through the leaf (atomic) reader
context provided to the collector.
E.g., look at DocTermsIndexDocValues.strVal(int) and
DocTermsIndexDocValues.open(LeafReaderContext, String), or at
TermOrdValComparator and TopFieldCollector.
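
A rough model of per-segment access, written as plain Java with no Lucene dependency: Segment and PerSegmentCollector are hypothetical stand-ins (not Lucene classes) for a leaf reader context and a collector. The collector swaps its value source once per segment, looks values up by segment-local doc id, and adds docBase only when it needs a global id for the hit.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for one index segment (hypothetical, not a Lucene class). */
final class Segment {
    final int docBase;         // global id of this segment's first document
    final String[] scoreField; // stand-in for the segment's 'score_field' doc values

    Segment(int docBase, String... scoreField) {
        this.docBase = docBase;
        this.scoreField = scoreField;
    }
}

/** Stand-in collector that switches its value source once per segment. */
final class PerSegmentCollector {
    private String[] currentValues;
    private int docBase;
    final List<String> collected = new ArrayList<>();

    /** Called once per segment, before any of its documents are collected. */
    void setNextSegment(Segment segment) {
        currentValues = segment.scoreField; // cheap: no index-wide wrapping
        docBase = segment.docBase;
    }

    /** Called for each matching document with a segment-local id. */
    void collect(int localDocId) {
        String value = currentValues[localDocId]; // lookup uses the local id
        int globalDocId = docBase + localDocId;   // global id only for reporting the hit
        collected.add(globalDocId + ":" + value);
    }
}

final class PerSegmentCollectorDemo {
    public static void main(String[] args) {
        PerSegmentCollector collector = new PerSegmentCollector();

        collector.setNextSegment(new Segment(0, "a", "b"));
        collector.collect(1);

        collector.setNextSegment(new Segment(2, "c", "d", "e"));
        collector.collect(0);
        collector.collect(2);

        System.out.println(collector.collected); // prints [1:b, 2:c, 4:e]
    }
}
```

In a real collector the per-segment hook would receive the leaf context from the searcher, and the doc values for 'score_field' would be fetched from that context's reader rather than from an array.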
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>