Posted to solr-user@lucene.apache.org by "Domma, Achim" <ac...@uberresearch.com> on 2013/08/19 09:09:03 UTC

Create term vector from text

Hi,

the TermVectorComponent allows me to retrieve data about the terms of a
document, including tf-idf. Is it possible to get this data for an arbitrary
text without storing that text in Solr? As far as I can tell, the
AnalysisComponent comes close, but it does not return the core-specific
frequencies. Presumably the MLT handler has to do something like that
internally. I tried to read the code and found that there is a Lucene
function that creates a query from a text.

Is there some public interface which allows me to access that kind of
functionality? Or do I have to write my own SearchComponent?

cheers,
Achim

Re: Create term vector from text

Posted by Jack Krupansky <ja...@basetechnology.com>.
The Solr Terms Component will give you the terms in the index and the 
document frequency of each.

https://cwiki.apache.org/confluence/display/solr/The+Terms+Component

-- Jack Krupansky
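
To make the suggestion above concrete: once the document frequencies are
available (for example, as returned by the Terms Component), the per-term
weights for an unindexed text can be computed on the client. The following is
only a rough sketch under several assumptions. The function name
tf_df_scores, the regex tokenizer, and the sample frequencies are all
hypothetical. A real setup would need to tokenize the text the same way the
field's analyzer does, and the tf/df ratio used here is just one simple
weighting, not necessarily the one your Solr version reports.

```python
import re
from collections import Counter


def tf_df_scores(text, doc_freqs):
    """Weight each term of `text` by term frequency / document frequency.

    `doc_freqs` maps term -> document frequency in the index. Here it is
    a plain dict; in practice it would be filled from the index.
    """
    # Stand-in tokenizer (assumption): lowercase and split on non-word chars.
    tokens = re.findall(r"\w+", text.lower())
    term_freqs = Counter(tokens)

    scores = {}
    for term, tf in term_freqs.items():
        df = doc_freqs.get(term, 0)
        # Terms unknown to the index have no document frequency; skip them.
        if df > 0:
            scores[term] = tf / df
    return scores


# Hypothetical document frequencies, for illustration only.
doc_freqs = {"solr": 4, "term": 10, "vector": 2}
scores = tf_df_scores("Solr term vector, term by term", doc_freqs)
```

Terms that do not occur in the index are dropped rather than given an
arbitrary weight; whether that is the right choice depends on the use case.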
