Posted to solr-user@lucene.apache.org by anuvenk <an...@hotmail.com> on 2008/01/21 04:04:49 UTC

Term vector

What are term vectors? How do they help with MLT?


Re: Term vector

Posted by Grant Ingersoll <gs...@apache.org>.
Term vectors are, to some extent, the opposite of the inverted index.
They store each term and, optionally, its positions and offsets on a
per-document basis, so that you can say "give me the terms, positions
and offsets for document X". For MLT (MoreLikeThis), they are used to
figure out which terms are the most "important" in a document, so that
a new query can be formed to find other documents that are "more like
this" document. They are also useful for highlighting and for
non-search activities such as clustering.
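
To make the contrast concrete, here is a rough sketch in plain Python. It is
not Lucene or Solr code: the toy corpus, the function names and the simple
tf * idf scoring are all assumptions made for illustration, and the real
MoreLikeThis scoring is more involved. It builds an inverted index (term to
documents) and term vectors (document to terms with positions), then uses
the term vector of one document to pick its "important" terms.

```python
import math
from collections import defaultdict

# Toy corpus; the ids and text are invented for this illustration.
DOCS = {
    "doc1": "solr term vectors store terms per document",
    "doc2": "the inverted index maps terms to documents",
    "doc3": "term vectors help solr find similar documents",
}


def tokenize(text):
    """Whitespace tokenizer; a real analyzer does much more."""
    return text.lower().split()


def build_inverted_index(docs):
    """term -> set of document ids (the usual search-time structure)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def build_term_vectors(docs):
    """document id -> {term: [positions]} (the per-document view)."""
    vectors = {}
    for doc_id, text in docs.items():
        positions = defaultdict(list)
        for pos, term in enumerate(tokenize(text)):
            positions[term].append(pos)
        vectors[doc_id] = dict(positions)
    return vectors


def important_terms(doc_id, vectors, index, num_docs, max_terms=3):
    """Rank one document's terms by a simplified tf * idf score."""
    scores = {}
    for term, positions in vectors[doc_id].items():
        tf = len(positions)  # how often the term occurs in this document
        idf = math.log(num_docs / len(index[term])) + 1.0  # rarer = higher
        scores[term] = tf * idf
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


inverted = build_inverted_index(DOCS)
vectors = build_term_vectors(DOCS)

# "Give me the terms and positions for document X":
doc1_vector = vectors["doc1"]

# Terms that could seed a "more like this" query for doc1:
query_terms = important_terms("doc1", vectors, inverted, len(DOCS))
```

Without the per-document structure, answering "which terms does doc1
contain?" would mean scanning every entry of the inverted index, which is
why the term vector is the natural starting point for picking query terms.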

For more info, see my talk at ApacheCon:
http://cnlp.org/presentations/slides/AdvancedLucene.pdf
Also, search for term vectors on the Lucene user mailing list (you
can do this via Nabble).

-Grant

On Jan 20, 2008, at 10:04 PM, anuvenk wrote:

>
> what are term vectors? How do they help with mlt?

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
http://www.lucenebootcamp.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ