Posted to solr-user@lucene.apache.org by Walter Ravenek <wr...@xs4all.nl> on 2009/07/13 20:37:24 UTC

Get TermVectors for query hits only

Hi all,

When I use the TermVectorComponent, I receive term vectors containing all 
tokens in the documents that meet my search criteria. I would like to get 
the offsets for only those terms that match the search criteria. My 
documents are about 200 K and are in XML. If I have just the offsets for 
the hits, I can easily implement my own highlighting on the client side.

Does anyone know how to go about doing this?
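
For readers following the thread, here is a minimal sketch of the client-side half of this idea. It assumes the term vector response for one document has already been parsed into a plain mapping of term to (start, end) character offsets; that mapping, the function names, and the sample data are all hypothetical and do not reflect the actual TermVectorComponent response layout. It also assumes the query terms have been analyzed the same way as the indexed terms and that the offsets refer to the same text the client holds.

```python
def offsets_for_query_terms(term_offsets, query_terms):
    """Keep only the offsets of terms that appear in the query.

    term_offsets: dict mapping term -> list of (start, end) offsets
                  (hypothetical, already-parsed term vector for one document).
    Returns a list of (start, end, term) tuples sorted by start offset.
    """
    wanted = set(query_terms)
    hits = [
        (start, end, term)
        for term, spans in term_offsets.items()
        if term in wanted
        for start, end in spans
    ]
    return sorted(hits)


def highlight(text, hits, open_tag="<em>", close_tag="</em>"):
    """Wrap each hit in markers, assuming the hits do not overlap."""
    # Work from the last hit to the first so earlier offsets stay valid.
    for start, end, _term in sorted(hits, reverse=True):
        text = text[:start] + open_tag + text[start:end] + close_tag + text[end:]
    return text


# Made-up sample data for illustration only.
text = "solr term vectors and more solr"
doc_vector = {
    "solr": [(0, 4), (27, 31)],
    "term": [(5, 9)],
    "vectors": [(10, 17)],
}

hits = offsets_for_query_terms(doc_vector, ["solr", "vectors"])
highlighted = highlight(text, hits)
```

Filtering on the client like this still transfers the full term vector for each document, which is the cost the question is trying to avoid for large documents.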


Re: Get TermVectors for query hits only

Posted by Walter Ravenek <wr...@xs4all.nl>.
Thanks Grant,

I think I get the idea.


Grant Ingersoll wrote:
> I seem to recall that the Highlighter in Solr is pluggable, so you may 
> want to work at that level instead of on the client side.  Otherwise, 
> you would likely have to implement your own TermVectorMapper and plug 
> it into the TermVectorComponent, which then feeds your client.
>
> For an example of using TermVectorMapper that comes close to, but does 
> not exactly solve, your problem, see 
> http://www.lucidimagination.com/blog/2009/05/26/accessing-words-around-a-positional-match-in-lucene/ but 
> note that it works at the Lucene level.


Re: Get TermVectors for query hits only

Posted by Grant Ingersoll <gs...@apache.org>.
I seem to recall that the Highlighter in Solr is pluggable, so you may  
want to work at that level instead of on the client side.  Otherwise,  
you would likely have to implement your own TermVectorMapper and plug  
it into the TermVectorComponent, which then feeds your client.

For an example of using TermVectorMapper that comes close to, but does  
not exactly solve, your problem, see  
http://www.lucidimagination.com/blog/2009/05/26/accessing-words-around-a-positional-match-in-lucene/  
but note that it works at the Lucene level.
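
To illustrate the mapper idea in language-neutral terms: the reader hands each term of a document's term vector to a callback, and a filtering mapper keeps only the query terms, so the rest never reaches the client. The sketch below is a Python stand-in, not the Lucene or Solr API; the class, the `map` signature, and the `read_term_vector` driver are hypothetical names chosen for illustration.

```python
class QueryTermOffsetMapper:
    """Stand-in for a filtering term vector mapper (not the Lucene API)."""

    def __init__(self, query_terms):
        self.query_terms = set(query_terms)
        self.offsets = {}  # term -> list of (start, end) offsets

    def map(self, term, frequency, offsets, positions=None):
        # Called once per term in the document's term vector.
        # Terms that are not part of the query are dropped immediately.
        if term in self.query_terms:
            self.offsets[term] = list(offsets)


def read_term_vector(stored_vector, mapper):
    """Hypothetical driver playing the role of the index reader."""
    for term, spans in stored_vector.items():
        mapper.map(term, len(spans), spans)


# Made-up sample data for illustration only.
mapper = QueryTermOffsetMapper(["solr"])
read_term_vector({"solr": [(0, 4), (27, 31)], "term": [(5, 9)]}, mapper)
```

Doing the filtering on the server side in this way means only the offsets for the matching terms would need to be serialized in the response.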


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search