Posted to solr-user@lucene.apache.org by Sean O'Connor <se...@gmail.com> on 2010/11/10 16:39:59 UTC

SpanQuery basics in Solr QueryComponent(?)

Hi all,
     I seem to be lost in the new flex indexing API. In the older API I 
was able to extend QueryComponent with my custom component, parse a 
restricted-syntax user query into a SpanQuery, and then grab an 
IndexReader. From there I worked with the SpanQuery's spans. For 
reference, my old QueryComponent code looks something like:

     @Override
     public void process(ResponseBuilder rb) throws IOException {
         SolrQueryRequest req = rb.req;
         SolrQueryResponse rsp = rb.rsp;
         SDRQParser qparser = (SDRQParser) rb.getQparser();

         SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
         // custom parser returns SpanQuery (stq below; its assignment is omitted here)
         IndexReader reader = req.getSearcher().getReader();
         Spans spans = stq.getSpans(reader);
         // work with spans here...
     }

     With the new (1.5?) API, I got the warning about wrapping 
IndexReader with SlowMultiReaderWrapper, so I changed my approach above 
to something like:

     SolrIndexReader fullReader = req.getSearcher().getReader();
     // need help avoiding this...?
     IndexReader reader = SlowMultiReaderWrapper.wrap(fullReader);

     I then got an NPE in what seems to be EmptyTerms.toString(). 
Digging in, I noticed that EmptyTerms did not override its parent's 
(TermSpans) toString() method, which seemed to be the cause of the 
problem. Overriding it fixed the NPE, and now I get results (so I 
will look at filing a bug report unless someone mentions otherwise).

     Any hints on how I can/should 'properly' work with spans in Solr? 
Also, are there any introductory documents on MultiFields and 
sub-indexes? In particular, I would like to know how to use MultiFields 
as a better approach than SlowMultiReaderWrapper (thanks for the 
warnings about performance). I cannot seem to find the relevant 
beginner material to avoid using the SMRW. The material I do find seems 
to require that you pass in a 'found' document, or perhaps walk through 
all subReaders?
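
     To make the subReaders question concrete, here is a rough sketch of 
what I imagine "walking through all subReaders" would involve: visit 
each segment in turn and shift its segment-local document ids by a 
running doc base. Segment and collectSpans are hypothetical stand-ins I 
made up for illustration, not Lucene or Solr classes, and I am only 
assuming that the real per-segment API behaves this way:

```java
import java.util.ArrayList;
import java.util.List;

public class PerSegmentSpans {

    /** Hypothetical stand-in for one sub-reader (segment); NOT a Lucene class. */
    static final class Segment {
        final int maxDoc;          // number of documents in this segment
        final int[][] localSpans;  // each entry: {segment-local doc id, start, end}

        Segment(int maxDoc, int[][] localSpans) {
            this.maxDoc = maxDoc;
            this.localSpans = localSpans;
        }
    }

    /** Collects spans from every segment, rebasing local doc ids to top-level ids. */
    static List<int[]> collectSpans(List<Segment> segments) {
        List<int[]> result = new ArrayList<int[]>();
        int docBase = 0; // top-level id of the current segment's first document
        for (Segment segment : segments) {
            for (int[] span : segment.localSpans) {
                result.add(new int[] { docBase + span[0], span[1], span[2] });
            }
            docBase += segment.maxDoc; // the next segment starts after this one
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<Segment>();
        segments.add(new Segment(3, new int[][] { { 1, 0, 2 } }));
        segments.add(new Segment(2, new int[][] { { 0, 4, 6 }, { 1, 1, 3 } }));
        for (int[] span : collectSpans(segments)) {
            System.out.println("doc=" + span[0] + " start=" + span[1] + " end=" + span[2]);
        }
    }
}
```

If that is roughly the right idea, the open question is which real 
classes play the roles of Segment and the doc base.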

     And finally: should I be looking at some existing Solr code to 
guide me? I am having trouble finding the highlighter code which I 
believe uses spans (WeightedSpanTerm??). Is there already code to 
convert user queries to span queries?
Thanks,

Sean